
[Bug]: qwen cannot be quantized in vllm #10263

Closed
1 task done
yananchen1989 opened this issue Nov 12, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@yananchen1989

Your current environment

GPU: A10
vLLM version: 0.6.3.post1

Model Input Dumps

No response

🐛 Describe the bug

For the Qwen series, such as Qwen/Qwen2.5-7B-Instruct, it seems that vLLM cannot apply quantization, whether with bitsandbytes or AWQ. Even the Unsloth version, unsloth/Qwen2.5-7B-Instruct-bnb-4bit, does not work.

Error message:
AttributeError: Model Qwen2ForCausalLM does not support BitsAndBytes quantization yet.
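
For reference, a minimal invocation that should reproduce this on 0.6.3.post1 (the exact command wasn't posted, so this sketch assumes the same bitsandbytes flags used later in the thread):

# sketch: in-flight bitsandbytes quantization of the base checkpoint
vllm serve Qwen/Qwen2.5-7B-Instruct --quantization bitsandbytes --load-format bitsandbytes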

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
yananchen1989 added the bug (Something isn't working) label Nov 12, 2024
@jeejeelee
Collaborator

jeejeelee commented Nov 12, 2024

The current release version indeed doesn't support this; it should be supported in the upcoming release, see #8941. In the meantime, you can either build from the main branch or follow the latest-code installation doc at: https://docs.vllm.ai/en/latest/getting_started/installation.html#install-the-latest-code
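
For example, a source build from main looks roughly like this (a sketch; see the linked doc for the authoritative steps):

# build and install vLLM from the main branch (compilation can take a while)
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .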

@yananchen1989
Author

Thanks.

Could you also take a look at the Phi series, such as microsoft/Phi-3.5-mini-instruct? It hits the same issue as Qwen: a bnb quantization error.

@yananchen1989
Author

@jeejeelee

@mgoin
Member

mgoin commented Nov 12, 2024

Hi @yananchen1989, Phi is also supported with bitsandbytes on vLLM main, so please wait for the next release.

I have tested with

vllm serve unsloth/Phi-3.5-mini-instruct-bnb-4bit --quantization bitsandbytes --load-format bitsandbytes
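
Once you're on main, the same pattern should apply to the Qwen checkpoint from the original report (an assumed analogue of the tested command above, not separately verified here):

vllm serve unsloth/Qwen2.5-7B-Instruct-bnb-4bit --quantization bitsandbytes --load-format bitsandbytes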
