AWQ: Activation-aware Weight Quantization??? #1685
https://github.com/mit-han-lab/llm-awq

AWQ: Activation-aware Weight Quantization sounds interesting 🧐🧐🧐
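For anyone curious how the linked method works: the core idea is to derive a per-input-channel scale from calibration activation magnitudes, fold it into the weights before plain round-to-nearest quantization, and search the scaling exponent so that channels seeing large activations lose less precision. The sketch below is my own simplified NumPy illustration, not code from llm-awq (the real implementation quantizes group-wise, searches per layer, and fuses the inverse scale into the preceding op); all function names are made up.

```python
# Simplified, hand-wavy sketch of the AWQ idea -- not the llm-awq implementation.
import numpy as np

def quantize_rtn(w, bits=4):
    """Symmetric round-to-nearest quantization with one scale per output row."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    return np.round(w / scale) * scale

def awq_style_quantize(w, x_calib, bits=4):
    """w: (out, in) weights, x_calib: (tokens, in) calibration activations.
    Returns the dequantized weight with the lowest output error found."""
    act_mag = np.abs(x_calib).mean(axis=0)            # per-input-channel statistic
    y_ref = x_calib @ w.T
    best_err, best_wq = np.inf, None
    for alpha in np.linspace(0.0, 1.0, 5):            # alpha = 0 recovers plain RTN
        s = act_mag ** alpha
        s = s / s.mean()                              # keep scales centered around 1
        w_q = quantize_rtn(w * s, bits) / s           # scale, quantize, un-scale
        err = np.mean((x_calib @ w_q.T - y_ref) ** 2)
        if err < best_err:
            best_err, best_wq = err, w_q
    return best_wq

# Toy comparison on random data with a few "outlier" activation channels.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 512)).astype(np.float32)
x = rng.normal(size=(64, 512)).astype(np.float32)
x[:, :8] *= 10.0                                      # emulate salient channels
err_rtn = np.mean((x @ quantize_rtn(w).T - x @ w.T) ** 2)
err_awq = np.mean((x @ awq_style_quantize(w, x).T - x @ w.T) ** 2)
print(f"RTN MSE: {err_rtn:.4f}   AWQ-style MSE: {err_awq:.4f}")
```

Because alpha = 0 reproduces plain round-to-nearest, the search can only match or improve the error on the calibration set, which is roughly why the approach is attractive at low bit widths.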
Are you interested in implementing this new algorithm in Llama.cpp? The performance with 3 bits seems amazing.
Sorry, I'm not a pro developer 😔
If my understanding is correct, GPTQ and AWQ weights are stored in very similar formats, and they could even be stored in the same format.
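For reference, here is a rough sketch of the kind of storage layout both projects use for a quantized linear layer: 4-bit integers packed eight per 32-bit word, plus per-group scales and zero-points. This is a simplification under my own assumptions (the two repos differ in packing order, interleaving, and metadata), and the function names are hypothetical.

```python
# Generic int4 group-wise layout: qweight (packed), scales, zeros.
# Not the exact llm-awq or GPTQ packing order -- an illustrative sketch only.
import numpy as np

def pack_int4_groupwise(w: np.ndarray, group_size: int = 128):
    """Asymmetric 4-bit quantization of a (out_features, in_features) matrix,
    one scale/zero-point per `group_size` input columns, nibbles packed
    eight per uint32 word."""
    out_f, in_f = w.shape
    assert in_f % group_size == 0 and in_f % 8 == 0
    wg = w.reshape(out_f, in_f // group_size, group_size)

    w_min = np.minimum(wg.min(axis=-1, keepdims=True), 0.0)  # include 0 in range
    w_max = np.maximum(wg.max(axis=-1, keepdims=True), 0.0)
    scales = np.maximum(w_max - w_min, 1e-8) / 15.0           # 4-bit range 0..15
    zeros = np.round(-w_min / scales)                         # integer zero-point
    q = np.clip(np.round(wg / scales) + zeros, 0, 15).astype(np.uint32)

    q = q.reshape(out_f, in_f)
    qweight = np.zeros((out_f, in_f // 8), dtype=np.uint32)
    for i in range(8):                                        # pack 8 nibbles per word
        qweight |= q[:, i::8] << np.uint32(4 * i)

    return (qweight,
            scales.squeeze(-1).astype(np.float16),            # (out_f, n_groups)
            zeros.squeeze(-1).astype(np.uint8))               # (out_f, n_groups)

def dequantize(qweight, scales, zeros, group_size: int = 128):
    """Inverse of the packing above, for checking the round trip."""
    out_f, words = qweight.shape
    in_f = words * 8
    q = np.zeros((out_f, in_f), dtype=np.float32)
    for i in range(8):
        q[:, i::8] = (qweight >> (4 * i)) & 0xF
    g = q.reshape(out_f, in_f // group_size, group_size)
    return ((g - zeros[..., None]) * scales[..., None].astype(np.float32)).reshape(out_f, in_f)
```

A loader that understands this qweight/scales/zeros triple could, in principle, read tensors produced by either method, which is roughly why sharing a format seems feasible.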
Hi everyone, I have made a PR to add AWQ. I would really appreciate comments to make it better, thanks!
This issue was closed because it has been inactive for 14 days since being marked as stale.