📌 AutoAWQ Roadmap #32
Hey Casper, first of all, amazing work! I'm actually really curious: what's the reasoning behind supporting legacy models such as GPT-2, GPT-J, or OPT? In my perception, the newest models like MPT and Llama 2 are orders of magnitude better than the legacy ones.
Supporting older models is on the roadmap because people still use those models and ask for them. However, I do try to focus my efforts on optimizing the newer models.
Can yi-34b be supported? Judging from the benchmark numbers, this model looks really impressive.
Yi is now supported on the main branch.
Can you please implement Phi 1.5 support? Thank you for all the amazing work!
Hi Casper, thank you for your wonderful work! I wonder if there is a tutorial for adding support for a new model? I have noticed that Baichuan is on the roadmap. I would like to try to add support for this model; could you please give me some pointers on how to support a new model?
@xTayEx I do not have a written guide, but here are the steps:
To create the model class, look at the Llama class or the other model classes to see how they are defined.
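The step above can be sketched as follows. This is a rough skeleton of the shape an AutoAWQ model class tends to take, modeled on the existing Llama class; the class, attribute, and method names here are illustrative assumptions, not the exact AutoAWQ API, so check `awq/models/llama.py` in the repo for the real signatures.

```python
# Sketch of a new AutoAWQ model class (names are assumptions; see
# awq/models/llama.py in the AutoAWQ repo for the real interface).

class NewModelAWQForCausalLM:
    # Class name of the transformer block this architecture uses.
    layer_type = "NewModelDecoderLayer"

    @staticmethod
    def get_model_layers(model):
        # Return the list of transformer blocks to quantize.
        return model.model.layers

    @staticmethod
    def move_embed(model, device):
        # Move the embedding layer to the target device before calibration.
        model.model.embed_tokens = model.model.embed_tokens.to(device)

    @staticmethod
    def get_layers_for_scaling(module, input_feat, module_kwargs):
        # Describe which linear layers share an input and should be
        # scaled together (e.g. attention q/k/v, output proj, MLP).
        return [
            dict(
                prev_op=module.input_layernorm,
                layers=[
                    module.self_attn.q_proj,
                    module.self_attn.k_proj,
                    module.self_attn.v_proj,
                ],
                inp=input_feat["self_attn.q_proj"],
            ),
        ]
```

The key idea is that each model class tells AutoAWQ where the transformer blocks live and which groups of linear layers share an input, so the AWQ scaling can be applied per group.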
Phi 1.5 support has been attempted, but they have a very unusual model definition. Until it's been standardized, I am not sure I will support it.
Oh :( Do you mean until a new phi model comes out? What would roughly be the steps to implement it on our own? |
Hi @casper-hansen, first of all, thank you for the amazing work. From my understanding, TheBloke has published an AWQ version of Mixtral 8x7B Instruct. I tried to run inference on it and ran into issues. Would this model be supported? Also, is there a way to contribute with a donation?
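For reference, inference with an AWQ checkpoint via AutoAWQ typically looks like the sketch below. It assumes `autoawq` and `transformers` are installed and a CUDA device is available; note that support for newer architectures such as Mixtral only landed in later AutoAWQ releases, so an older install can fail on exactly this kind of checkpoint.

```python
def run_awq_inference(model_path: str, prompt: str) -> str:
    """Sketch: load an AWQ-quantized checkpoint and generate text.

    Imports are kept inside the function so this file stays importable
    without autoawq/transformers installed; in real code put them at
    the top of the module.
    """
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    # Load the pre-quantized weights and the matching tokenizer.
    model = AutoAWQForCausalLM.from_quantized(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    # Tokenize, generate, and decode.
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
    output_ids = model.generate(input_ids, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

If loading fails on a recent model, upgrading `autoawq` and `transformers` is usually the first thing to try.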
We achieved most items on the roadmap, so closing this for now to focus on other things. |
Optimization
- split_k_iters
- Create tuning section in quant_config (#39)

More models
- gpt-neox model (#41)

Ease of access

Software integration and quality
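The tuning items above refer to AutoAWQ's `quant_config`. A minimal sketch of what such a config typically looks like is below; the key names follow the common AutoAWQ defaults and are assumptions that may differ across versions.

```python
# A typical AutoAWQ quantization config (sketch; key names assume the
# common AutoAWQ defaults and may differ between releases).
quant_config = {
    "w_bit": 4,           # weight bit-width
    "q_group_size": 128,  # group size for per-group quantization
    "zero_point": True,   # asymmetric quantization with zero points
    "version": "GEMM",    # kernel variant (e.g. GEMM vs. GEMV)
}
```

Parameters like `split_k_iters` are kernel-level tuning knobs rather than part of this dict, which is presumably why a dedicated tuning section was proposed in #39.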