Add unload_textual_inversion method #6656
Conversation
@yiyixuxu could you try to give this a review?
thanks for working on this! I left some questions :)
# Fix token ids in tokenizer
key_id = 1
for token_id in tokenizer.added_tokens_decoder:
can you explain why we need this block?
I added this block to make all token ids sequential after one of the added tokens is removed.
thanks for explaining this! @fabiorigano
I'm not very familiar with the use case.
cc @apolinario and @linoytsaban: can you take a look to see if we need to reorder the remaining added tokens after we remove some?
Thanks @fabiorigano! I looked a bit but I'm actually not quite sure why it's necessary to reorder 🤔
hi @linoytsaban, the reordering block keeps the token ids aligned with the indices of the text embeddings in the encoder, so that multiple unload_textual_inversion(<token>)
calls remove the correct text embeddings. If the reordering is not done, a subsequent unload_textual_inversion(<another-token>)
call may fail in the last for loop, because it may remove a different text embedding than expected.
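To illustrate the point above, here is a minimal, self-contained sketch. It is not the diffusers implementation: the plain dict standing in for `tokenizer.added_tokens_decoder`, the list standing in for the encoder's extra embedding rows, and the `unload_token` helper are all hypothetical simplifications, used only to show why ids must be made sequential again after a removal.

```python
def unload_token(added_tokens, embeddings, base_vocab_size, token):
    """Remove `token` and renumber the remaining added tokens so their ids
    stay contiguous and aligned with the parallel embedding rows."""
    # Find the id of the token to remove, then drop both the tokenizer
    # entry and the corresponding embedding row.
    removed_id = next(i for i, t in added_tokens.items() if t == token)
    del added_tokens[removed_id]
    del embeddings[removed_id - base_vocab_size]
    # Reassign sequential ids starting right after the base vocabulary,
    # mirroring the "key_id" loop in the PR. Without this, the surviving
    # ids would have a gap and no longer index the embedding list correctly.
    fixed = {}
    key_id = 0
    for old_id in sorted(added_tokens):
        fixed[base_vocab_size + key_id] = added_tokens[old_id]
        key_id += 1
    return fixed, embeddings

# Usage: three added tokens after a base vocab of 49408 entries.
added = {49408: "<cat>", 49409: "<dog>", 49410: "<bird>"}
emb = ["e_cat", "e_dog", "e_bird"]
added, emb = unload_token(added, emb, 49408, "<dog>")
# "<bird>" is renumbered from 49410 to 49409, so id - base_vocab_size
# still points at the right embedding row on the next unload call.
```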
gotcha! that makes total sense, thanks for explaining! 🤗
looks good to me! thanks for working on this :)
let's wait for a final review from @apolinario
Thanks a lot for this PR @fabiorigano! LGTM :)
@fabiorigano @linoytsaban thank you both! great work as always @fabiorigano :)
@yiyixuxu @linoytsaban thank you for your reviews! :)
* Add unload_textual_inversion
* Fix dicts in tokenizer
* Fix quality
* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Fix variable name after last update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
What does this PR do?
Fixes #6013
Before submitting
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@apolinario @sayakpaul