
[Bug]: qwen cannot be quantized in vllm #10263

Closed
1 task done
yananchen1989 opened this issue Nov 12, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@yananchen1989

Your current environment

GPU: A10
vLLM version: 0.6.3.post1

Model Input Dumps

No response

🐛 Describe the bug

For the Qwen series, such as Qwen/Qwen2.5-7B-Instruct, it seems that vLLM cannot apply quantization, whether with bitsandbytes or AWQ. Even the Unsloth version, unsloth/Qwen2.5-7B-Instruct-bnb-4bit, does not work.

Error message:
AttributeError: Model Qwen2ForCausalLM does not support BitsAndBytes quantization yet.
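
For reference, a minimal invocation that should reproduce this on 0.6.3.post1 (the exact command wasn't posted, so this sketch assumes the same bitsandbytes flags used later in the thread):

# sketch: in-flight bitsandbytes quantization of the base checkpoint
vllm serve Qwen/Qwen2.5-7B-Instruct --quantization bitsandbytes --load-format bitsandbytes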

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
yananchen1989 added the bug (Something isn't working) label Nov 12, 2024
@jeejeelee
Collaborator

jeejeelee commented Nov 12, 2024

The current release version indeed doesn't support this; it should be supported in the upcoming release, see #8941. In the meantime, you can either build from the main branch or follow the latest-code installation doc at: https://docs.vllm.ai/en/latest/getting_started/installation.html#install-the-latest-code
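
For example, a source build from main looks roughly like this (a sketch; see the linked doc for the authoritative steps):

# build and install vLLM from the main branch (compilation can take a while)
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .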

@yananchen1989
Author

Thanks.

Could you also take a look at the Phi series, such as microsoft/Phi-3.5-mini-instruct? It hits the same issue as Qwen: a bnb quantization error.

@yananchen1989
Author

@jeejeelee

@mgoin
Member

mgoin commented Nov 12, 2024

Hi @yananchen1989, Phi is also supported with bitsandbytes on vLLM main, so please wait for the next release.

I have tested with

vllm serve unsloth/Phi-3.5-mini-instruct-bnb-4bit --quantization bitsandbytes --load-format bitsandbytes
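
Once you're on main, the same pattern should apply to the Qwen checkpoint from the original report (an assumed analogue of the tested command above, not separately verified here):

vllm serve unsloth/Qwen2.5-7B-Instruct-bnb-4bit --quantization bitsandbytes --load-format bitsandbytes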
