Fix LoRa weight merging in export #19

Merged
merged 1 commit into from
Mar 16, 2023


antimatter15
Contributor

I was trying to export the model for use with llama.cpp but noticed that it just copied the weights from the base model verbatim. I suspect there's a "right" way to do this, but this was the approach that worked for me.

@tloen
Owner

tloen commented Mar 16, 2023

Have you tested that the existing code doesn't work? Assuming you're not loading the foundation model in 8-bit, the call to .eval() should merge the LoRA weights into the base weights.

@antimatter15
Contributor Author

antimatter15 commented Mar 16, 2023 via email

@tloen
Owner

tloen commented Mar 16, 2023

eval() should be an alias for train(False), and merge_weights should default to True. I'll look into it tomorrow morning.
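
For readers following along, here is a rough sketch of the behavior described above: a layer whose eval() simply calls train(False), and which folds the low-rank update into the base weight when merge_weights is enabled. Everything here is hypothetical and for illustration only (ToyLoraLinear, matmul, the scaling value, the tiny matrices); it is not the actual peft or loralib code, and the real libraries may differ, for example when the base model is loaded in 8-bit.

```python
# Toy illustration only: a hypothetical class, NOT the peft or loralib source.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


class ToyLoraLinear:
    """Base weight W (out x in) plus low-rank factors B (out x r) and A (r x in)."""

    def __init__(self, weight, lora_a, lora_b, scaling=1.0, merge_weights=True):
        self.weight = [row[:] for row in weight]  # copy so the caller's data is untouched
        self.lora_a = lora_a
        self.lora_b = lora_b
        self.scaling = scaling
        self.merge_weights = merge_weights  # assumed to default to True, as in the thread
        self.merged = False

    def _apply_delta(self, sign):
        # Add (sign=+1) or remove (sign=-1) scaling * (B @ A) from the base weight.
        delta = matmul(self.lora_b, self.lora_a)
        for i, row in enumerate(delta):
            for j, d in enumerate(row):
                self.weight[i][j] += sign * self.scaling * d

    def train(self, mode=True):
        if not self.merge_weights:
            return self
        if mode and self.merged:
            # Back to training: separate the LoRA update from the base weight again.
            self._apply_delta(-1.0)
            self.merged = False
        elif not mode and not self.merged:
            # Entering eval mode: fold the LoRA update into the base weight.
            self._apply_delta(+1.0)
            self.merged = True
        return self

    def eval(self):
        # eval() is just train(False), mirroring torch.nn.Module.
        return self.train(False)


layer = ToyLoraLinear(
    weight=[[1.0, 0.0], [0.0, 1.0]],
    lora_a=[[1.0, 2.0]],      # r x in (rank 1)
    lora_b=[[0.5], [0.25]],   # out x r
    scaling=2.0,
)
layer.eval()
print(layer.weight)  # [[2.0, 2.0], [0.5, 2.0]]
```

If an export script reads the weights while a layer like this is still unmerged (or if merging is skipped), it would see only the base weights, which would match the symptom reported at the top of this thread.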

@tloen tloen merged commit 6681523 into tloen:main Mar 16, 2023
@tloen
Owner

tloen commented Mar 16, 2023

Can't hurt
