
support for llama3 in autoquant #67

Open
CrispStrobe opened this issue Apr 28, 2024 · 3 comments

@CrispStrobe

... would need vocab_type bpe; see here for an illustration:
https://colab.research.google.com/drive/1q1hTxLZOCRf9n0KdxSSu3tD0EI5QufrV?usp=sharing
(I also made a few adaptations for faster running in my use case.)
Thank you and keep up the great work!!
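
For illustration, here is a minimal sketch of the conversion call I mean, assuming a local llama.cpp checkout; the model directory and output file names are hypothetical placeholders:

```python
# Minimal sketch, assuming a local llama.cpp checkout.
# MODEL_DIR and OUT_FILE are hypothetical placeholders.
import subprocess

MODEL_DIR = "models/llama-3-merged"
OUT_FILE = "llama-3-merged.fp16.gguf"

# llama3 ships a BPE tokenizer, so convert.py needs --vocab-type bpe
# instead of the default sentencepiece vocab.
subprocess.run(
    [
        "python", "llama.cpp/convert.py", MODEL_DIR,
        "--outfile", OUT_FILE,
        "--outtype", "f16",
        "--vocab-type", "bpe",
    ],
    check=True,
)
```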

@CrispStrobe changed the title from "support for llama3" to "support for llama3 in autoquant" on Apr 28, 2024
@CrispStrobe
Author

In the meantime, there is also a fix for the pre-tokenizer. I have included it in this Kaggle notebook; of course you can adapt it if you wish.

@mlabonne
Owner

Sorry for the slow response, thanks a lot for opening this issue. I saw a lot of comments about issues with the tokenization in GGUF, so I don't know if it's the right time to update AutoQuant.

I like your improvements in the first notebook. Do you think I should transfer them or should I wait until the situation is fixed?

@CrispStrobe
Author

Indeed, it might be better to wait with regard to the pre-tokenizer. I am not completely sure I understood the procedure for new models like, say, llama3 merges, but my current understanding is illustrated by this updated Kaggle script.
There is now also a problem with older models: some models, like phi2, need convert-hf-to-gguf.py and not convert.py, and after the new pre-tokenizer fix, some of these will not easily work anymore.
I wonder why the script does not simply fall back on a default in such cases; my workaround is to just use an older version for those models.
So at the moment we have at least three cases afaik (a rough sketch of this dispatch follows the list):

  • old models like phi2 ==> older convert-hf-to-gguf.py
  • new bpe models like llama3 ==> newer convert-hf-to-gguf.py with complicated pre-tokenizer handling
  • others ==> convert.py
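
To make the split concrete, here is a rough sketch of the fallback logic I am imagining; the architecture names, script paths, and vocab-size heuristic are my assumptions for illustration, not AutoQuant's actual code:

```python
# Rough sketch of the three-way dispatch described above. The architecture
# sets, script paths, and vocab-size heuristic are illustrative assumptions.
import json
from pathlib import Path

LEGACY_ARCHS = {"PhiForCausalLM"}  # e.g. phi2: needs the older HF converter

def pick_converter(model_dir: str) -> list[str]:
    """Return the conversion command for a given HF model directory."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    arch = config["architectures"][0]

    if arch in LEGACY_ARCHS:
        # case 1: old models like phi2 -> pinned, older convert-hf-to-gguf.py
        return ["python", "llama.cpp-old/convert-hf-to-gguf.py", model_dir]
    if arch == "LlamaForCausalLM" and config.get("vocab_size", 0) > 100_000:
        # case 2: new BPE models like llama3 (large BPE vocab) -> newer
        # convert-hf-to-gguf.py with its pre-tokenizer handling
        return ["python", "llama.cpp/convert-hf-to-gguf.py", model_dir]
    # case 3: everything else -> plain convert.py
    return ["python", "llama.cpp/convert.py", model_dir]
```

In practice one would probably want an explicit override flag rather than the vocab-size heuristic, but this shows the shape of the fallback I mean.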
