Fix: loading DBRX back from saved path #35728

Merged: 4 commits, Jan 28, 2025

@zucchini-nlp (Member) commented Jan 16, 2025

What does this PR do?

Fixes huggingface/trl#2574 and adds a test for it. Also fixes an edge case that occurs when a composite model is loaded with torch_dtype=torch.float16 (i.e. the dtype is neither a string nor a dict) by adding one small condition.
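To illustrate the edge case, here is a minimal sketch of resolving a dtype argument that may arrive as a string, a dict, or a dtype object. It is not the code changed in this PR. `DType` is a stand-in for `torch.dtype` so the sketch runs without PyTorch, `resolve_dtype` is a hypothetical helper, and the per-sub-config dict layout with `""` as the fallback key is an assumption. The sub-config names are illustrative only.

```python
from collections.abc import Mapping
from enum import Enum


class DType(Enum):
    """Stand-in for torch.dtype (assumption: lets the sketch run without PyTorch)."""
    FLOAT32 = "float32"
    FLOAT16 = "float16"
    BFLOAT16 = "bfloat16"


def _to_dtype(value):
    """Accept either a dtype object or its string name."""
    if isinstance(value, DType):
        return value
    return DType(value)  # raises ValueError for an unknown name


def resolve_dtype(spec, sub_config_key="", default=DType.FLOAT32):
    """Hypothetical helper: pick the dtype for one sub-config.

    `spec` may be None, a string, a dtype object, or a dict keyed by
    sub-config name (assumed layout, with "" as the fallback key).
    """
    if spec is None:
        return default
    if isinstance(spec, Mapping):
        # Dict form: look up this sub-config, then the "" fallback.
        return _to_dtype(spec.get(sub_config_key, spec.get("", default)))
    # String or dtype object: the same dtype applies to every sub-config.
    # This is the branch that a "string or dict only" check would miss.
    return _to_dtype(spec)


# Usage: all three input forms resolve to a dtype object.
print(resolve_dtype("float16", "ffn_config"))
print(resolve_dtype(DType.FLOAT16, "ffn_config"))
print(resolve_dtype({"": "bfloat16", "attn_config": "float16"}, "attn_config"))
```

The point of the last branch is that a bare dtype object is treated the same way as its string name, rather than falling through unhandled.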

@qgallouedec (Member) left a comment

The PR does solve huggingface/trl#2574 👍

@ArthurZucker (Collaborator) left a comment

Cool. I don't know whether these are already covered, but it would be good to check that we now handle most cases:

  • from_pretrained of a saved model whose sub configs have a different dtype
  • the same, but with arguments that change the dtype
  • etc. 🤗

@zucchini-nlp (Member, Author) commented

Added a comment to the existing test noting that the tiny model's dtype is fp32 and that we test by loading it in both full and half precision.


@zucchini-nlp zucchini-nlp merged commit b764c20 into huggingface:main Jan 28, 2025
25 checks passed
ArthurZucker pushed a commit that referenced this pull request Jan 30, 2025
* fix dtype as dict for some models + add test

* add comment in tests
bursteratom pushed a commit to bursteratom/transformers that referenced this pull request Jan 31, 2025
* fix dtype as dict for some models + add test

* add comment in tests
Successfully merging this pull request may close these issues.

ValueError: Found unknown kwargs when loading DbrxForCausalLM