Rewrite mergekit-extract-lora #505

Merged
merged 13 commits into from
Feb 7, 2025

@cg123 cg123 commented Feb 7, 2025

Now with better embedding handling, multi-GPU execution, and lazy loading/saving of tensors.

When extracting a LoRA from an 8B model, execution time goes from ~6 minutes down to 40 seconds with --cuda --multi-gpu on an 8-GPU machine.

Additionally, the --sv-epsilon flag sets a tolerance for singular values, so the rank can be opportunistically reduced when the fine-tuned difference is inherently lower rank.
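
As a rough illustration of the idea behind a singular-value tolerance (this is not the code in this PR), here is a minimal sketch of how a rank could be chosen from the singular values of a weight difference. The helper name `effective_rank`, its parameters, and the choice to treat the tolerance as relative to the largest singular value are all assumptions made for the example; the actual flag may interpret the tolerance differently.

```python
def effective_rank(singular_values, max_rank, sv_epsilon=0.0):
    """Pick how many singular values to keep (hypothetical helper).

    singular_values: non-negative singular values of the weight difference.
    max_rank: the requested LoRA rank, used as an upper bound.
    sv_epsilon: tolerance below which a singular value is treated as zero.
    """
    if not singular_values:
        return 0
    # Assumption: the tolerance is relative to the largest singular value.
    threshold = sv_epsilon * max(singular_values)
    # Count the singular values that rise above the tolerance.
    significant = sum(1 for s in singular_values if s > threshold)
    # Never exceed the requested rank; shrink it when fewer values matter.
    return min(max_rank, significant)


# A weight difference with only two significant singular values.
print(effective_rank([5.0, 2.0, 1e-9, 0.0], max_rank=16, sv_epsilon=1e-6))
```

In this sketch the requested rank only acts as a ceiling, so a difference that is already low rank yields a smaller adapter.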

Also reimplements a couple of merge methods using the @easy_define decorator and adds some missing tests.

@cg123 cg123 merged commit a2dda31 into main Feb 7, 2025
8 checks passed
@cg123 cg123 deleted the rewrites branch February 7, 2025 00:51