-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
[Hub] feat: explicitly tag to diffusers when using push_to_hub #6678
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we maybe also use this PR as an opportunity to refactor all the "create model card" functions of the examples, e.g. here:
def save_model_card( |
?
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
That should be in a separate PR, @patrickvonplaten and I feel relatively strongly about it. This is because we prepare the README.mds from the training scripts manually and don't rely on the |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making this PR @sayakpaul! Left a few comments. Main points are:
- if a model card already exists and library_name is not set, let's set it to
diffusers
(not the case at the moment) - maybe have a default template more specific to diffusers, like SetFit is doing
not is_jinja_available(), | ||
reason="Model card tests cannot be performed with Jinja installed.", | ||
) | ||
def test_push_to_hub_library_name(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) I would add an explicit test both for when the model card doesn't exist yet and for when the model card already exists. Maybe not needed to test the full push_to_hub
method but simply the create_and_tag_model_card
helper (or whatever its name :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does 99ce47c work for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't notice that there is a generate_model_card
and a create_model_card
in the hub utils. Should we merge them since they seem to do closely related things? (the difference is generating for a training or generating from anywhere, right?). Naming is misleading in that case (sorry, didn't notice before when I suggest generate_model_card
).
Regarding the test, yes it looks good to me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, but it won't work since tmpdir
is local. What's the best way to test here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Wauplin I merged them. However, I am not sure about the test since tmpdir
is local and ModelCard.load()
would fail there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think tests are good. I've added a comment below to test from existing file with existing library_name that is not diffusers.
But what I meant above is that with this PR the hub utils feels clunky. We now have:
create_model_card
that creates a model card for a training. The method looks outdated and not used anywhere. However it introduces a template that is nice.generate_model_card
that either loads an existing model card or create a new one (from a different template) and addlibrary_name: diffusers
to it. This method is used in the codebase.
Maybe what I would do to solve this (and sorry if it's a revamp of the PR):
- deprecate
create_model_card
(or even remove it completely) if it's not used - add in
hub_utils.py
aload_or_create_model_card
helper that returns aModelCard
object without modifying anything. It's similar to thetry: (ModelCard.load) except EntryNotFound: (ModelCard.from_template)
part - add in
hub_utils.py
apopulate_model_card
that takes as input aModelCard
object and addlibrary_name: diffusers
if doesn't exist yet. - then in the codebase (for example in
pipeline_utils.py
), you would do
model_card = load_or_create_model_card(repo_id, token=token, is_pipeline=True)
populate_model_card(model_card)
model_card.save(os.path.join(save_directory, "README.md"))
WDTY?
(to take with a grain of salt, I'm not expert in diffusers
codebase so I might be missing some parts)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(sorry @sayakpaul I didn't see your comment while posting this message)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. The latest commit should be have reflected these. Let me know if that makes sense.
I have opted to remove create_model_card()
and the related test as it's not really used.
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Lucain <lucainp@gmail.com>
not is_jinja_available(), | ||
reason="Model card tests cannot be performed with Jinja installed.", | ||
) | ||
def test_push_to_hub_library_name(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think tests are good. I've added a comment below to test from existing file with existing library_name that is not diffusers.
But what I meant above is that with this PR the hub utils feels clunky. We now have:
create_model_card
that creates a model card for a training. The method looks outdated and not used anywhere. However it introduces a template that is nice.generate_model_card
that either loads an existing model card or create a new one (from a different template) and addlibrary_name: diffusers
to it. This method is used in the codebase.
Maybe what I would do to solve this (and sorry if it's a revamp of the PR):
- deprecate
create_model_card
(or even remove it completely) if it's not used - add in
hub_utils.py
aload_or_create_model_card
helper that returns aModelCard
object without modifying anything. It's similar to thetry: (ModelCard.load) except EntryNotFound: (ModelCard.from_template)
part - add in
hub_utils.py
apopulate_model_card
that takes as input aModelCard
object and addlibrary_name: diffusers
if doesn't exist yet. - then in the codebase (for example in
pipeline_utils.py
), you would do
model_card = load_or_create_model_card(repo_id, token=token, is_pipeline=True)
populate_model_card(model_card)
model_card.save(os.path.join(save_directory, "README.md"))
WDTY?
(to take with a grain of salt, I'm not expert in diffusers
codebase so I might be missing some parts)
|
||
|
||
class CreateModelCardTest(unittest.TestCase): | ||
@patch("diffusers.utils.hub_utils.get_full_repo_name") | ||
def test_create_model_card(self, repo_name_mock: Mock) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow the discussion here: #6678 (comment).
if not is_jinja_available(): | ||
raise ValueError( | ||
"Modelcard rendering is based on Jinja templates." | ||
" Please make sure to have `jinja` installed before using `create_model_card`." | ||
" To install it, please run `pip install Jinja2`." | ||
) | ||
|
||
if hasattr(args, "local_rank") and args.local_rank not in [-1, 0]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow the discussion here: #6678 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks for all the iterations @sayakpaul! 🔥
Left some minor comments but overall approved it :)
EDIT: MODEL_CARD_TEMPLATE_PATH
is not used anymore in diffusers
. Might be worth removing the constants and deleting model_card_template.md
file from the repo (except if this is considered as a breaking change). So that no one thinks the model card will be created from this template.
Co-authored-by: Lucain <lucainp@gmail.com>
I have addressed this too. If you want to take another look, feel free to. Will merge once @patrickvonplaten takes another look. In an immediate follow-up PR, will standard implementing a model card template. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay! 🎉
Co-authored-by: Lucain <lucainp@gmail.com>
Will fix the failing test |
Alright. Things seem to be pristine now. Let's see what Patrick has to say. |
@patrickvonplaten a gentle ping. |
TBH, I think this PR is exactly abuot refactoring the examples because this is the no.1 use case when people use
If you feel very strongy about it, I'm ok with doing it in a new PR right after, but overall I think the main purpose of the changes here are exactly for the example scripts and therefore they should be updated accordingly |
@yiyixuxu can you also take a look? |
I don't think so and the description makes it quite clear. See the Slack thread too. The purpose of this PR is to provide the diffusers model name when we do push to hub. I anticipate the training scripts would require a lot more changes because none of those share a unified approach when it comes to prepping the model cards. Therefore, I gently request you to reconsider what you suggested. Also how do you propose testing? |
If you feel strongly, ok to merge for me, but let's please not forgot to update the example scripts. Regarding testing, I would first test whether everything works as expected by using it in the example scripts (which is why I propose to change it here) 😅 Right now we know what the model cards looks like when training dreambooth - it would be good to see how the model card looks like when changing this here:
|
That is alright. I am going to merge and my next PR will be about updating the scripts. Thanks a mile for beating it with me! |
thanks for adding this @sayakpaul |
Sure, but the examples differ in how the model cards are created. So, we need to have a solid reference PR first. This is what I am gonna work on next :) |
…ngface#6678) * feat: explicitly tag to diffusers when using push_to_hub * remove tags. * reset repo. * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix: tests * fix: push_to_hub behaviour for tagging from save_pretrained * Apply suggestions from code review Co-authored-by: Lucain <lucainp@gmail.com> * Apply suggestions from code review Co-authored-by: Lucain <lucainp@gmail.com> * import fixes. * add library name to existing model card. * add: standalone test for generate_model_card * fix tests for standalone method * moved library_name to a better place. * merge create_model_card and generate_model_card. * fix test * address lucain's comments * fix return identation * Apply suggestions from code review Co-authored-by: Lucain <lucainp@gmail.com> * address further comments. * Update src/diffusers/pipelines/pipeline_utils.py Co-authored-by: Lucain <lucainp@gmail.com> --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Lucain <lucainp@gmail.com>
What does this PR do?
Explicitly adds the library name to be
diffusers
when usingpush_to_hub
. In a follow-up PR, will add this to the officially maintained training scripts, too.Internal discussion: https://huggingface.slack.com/archives/C04EX6W3QSY/p1704797847482439?thread_ts=1704785760.863359&cid=C04EX6W3QSY. Cc: @osanseviero as originally initiated by him.
Note to the reviewers:
library_name
to the model cards. If there's a need, we can always revisit this. Addingtags
support should be easy now.