Fix llama conversion, improve parameter conversion #94
✨ Description
Fixes: #93
#55 added support for rope scaling, but requires the value to be a dict. However, the value can also be `None`, in which case #55 broke conversion.

This could not be solved in an easy, non-hacky way, because there is a variable set of entries on the HF side, which may depend on each other. I handled this by making a converter for the entire rope scaling dict on the HF side, but this has to be mapped to multiple values on the Fast-LLM side, which is also a problem. (Mapping to the whole `rotary` config dict wouldn't work because of rope theta.)

So the solution is to adapt parameter converters to take a variable number of parameters on each side; see the sketch below. Rope scaling is the first use case for this, but I expect more in the future. It's also the same thing we do for weight converters (and for exactly the same reason), so it's a natural next step. This makes converter definitions a bit more complicated, since I had to enforce 2d arrays of tuples to make things less confusing and error-prone, but I think it's worth it in the long run.
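To make the variable-arity design concrete, here is a minimal sketch of what such a converter could look like. All names here (`ParamConverter`, `fast_llm_names`, `export_names`, the rotary field names) are hypothetical illustrations based on the description above, not the actual Fast-LLM API:

```python
import dataclasses


@dataclasses.dataclass
class ParamConverter:
    # Each side is a tuple of config-key paths, so one converter may read or
    # write several parameters on either side (hypothetical names).
    fast_llm_names: tuple[tuple[str, ...], ...]
    export_names: tuple[tuple[str, ...], ...]

    def export_params(self, fast_llm_values: tuple) -> tuple:
        # Identity mapping by default; subclasses override.
        return fast_llm_values

    def import_params(self, export_values: tuple) -> tuple:
        return export_values


class RopeScalingParamConverter(ParamConverter):
    """Map several rotary fields to/from a single HF `rope_scaling` entry."""

    def export_params(self, fast_llm_values: tuple) -> tuple:
        scaling_type, factor = fast_llm_values
        if scaling_type == "default":
            # HF uses `rope_scaling=None` for the unscaled case.
            return (None,)
        return ({"type": scaling_type, "factor": factor},)

    def import_params(self, export_values: tuple) -> tuple:
        (rope_scaling,) = export_values
        if rope_scaling is None:
            return ("default", 1.0)
        return (rope_scaling["type"], rope_scaling["factor"])


# Usage: two Fast-LLM parameters on one side, one HF parameter on the other.
converter = RopeScalingParamConverter(
    fast_llm_names=(("rotary", "type"), ("rotary", "scale_factor")),
    export_names=(("rope_scaling",),),
)
assert converter.export_params(("default", 1.0)) == (None,)
assert converter.import_params((None,)) == ("default", 1.0)
```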
Added the `DEFAULT` tag, which gets replaced with the field's default value during config validation. So far it's used to avoid the ad-hoc `None` handling in the rope config, but I'm sure it will have more uses elsewhere; a sketch of the idea follows below.

Added a plain llama model in the tests; it's the same as mistral but uses the llama converter.
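To illustrate the `DEFAULT` tag mentioned above, here is a minimal sketch of the mechanism, with hypothetical names (`validate` and the field names are assumptions, not the actual Fast-LLM implementation): a sentinel value that config validation swaps for the field's declared default, so converters don't need ad-hoc `None` handling:

```python
# Unique sentinel tag; hypothetical stand-in for Fast-LLM's DEFAULT.
DEFAULT = object()


def validate(config: dict, defaults: dict) -> dict:
    """Replace every DEFAULT tag with the field's declared default value."""
    return {
        key: defaults[key] if value is DEFAULT else value
        for key, value in config.items()
    }


# Usage: when HF `rope_scaling` is None, a converter can import the scale
# factor as DEFAULT and let validation fill in the configured default,
# instead of special-casing None itself.
config = validate(
    {"rotary_type": "default", "scale_factor": DEFAULT},
    defaults={"rotary_type": "none", "scale_factor": 1.0},
)
assert config["scale_factor"] == 1.0
```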
Note that this will force some small changes in unmerged converters (only #5 and #84).
🔍 Type of change
Select all that apply: