[Doc]: Clarify QLoRA (Quantized Model + LoRA) Support in Documentation #13179
Comments
I think this means that the Transformers fallback doesn't support these two features together. For models integrated with vLLM, we support QLoRA. BTW, after #13166 was landed, I think
It would be great if you could point me to a more specific example; my understanding of vLLM/Transformers isn't too deep. Take qwen2 for example: it is integrated (if I understand correctly) here
Could you please provide more detailed information, such as log output and errors?
The documentation does not state otherwise. The documentation explicitly states that quantisation and LoRA are not compatible together with the Transformers fallback.
I see now, thanks for clarifying. It's easy to miss on the docs site that
vllm/docs/source/models/supported_models.md Lines 57 to 59 in 4c0d93f
Ok, we should make that clearer. Thank you for the feedback!
The documentation change in #12960 should help with this.
📚 The doc issue
Two parts of the documentation appear to contradict each other, especially at first glance.
Here, it is explicitly stated that LoRA inference with a quantized model is not supported:
vllm/docs/source/models/supported_models.md Lines 59 to 61 in 4c0d93f
However, here, an example is provided for running offline inference with a quantized model and a LoRA adapter:
vllm/examples/offline_inference/lora_with_quantization_inference.py Lines 3 to 4 in 4c0d93f
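For reference, here is a minimal sketch of the kind of usage that example demonstrates, i.e. offline inference with a quantized base model plus a LoRA adapter. This is not the exact contents of the referenced file; the model name and adapter path are placeholders.

```python
# Sketch: LoRA inference on a quantized base model with vLLM's offline API.
# The model name and adapter path below are placeholders, not taken from the example file.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Load a quantized base model (e.g. an AWQ checkpoint) with LoRA support enabled.
llm = LLM(
    model="TheBloke/TinyLlama-1.1B-Chat-v0.3-AWQ",  # placeholder quantized model
    quantization="awq",
    enable_lora=True,
)

sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

# The LoRA adapter is passed per request; the path is a placeholder for a local adapter.
outputs = llm.generate(
    ["Give me a short introduction to large language models."],
    sampling_params,
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
)

for out in outputs:
    print(out.outputs[0].text)
```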
To resolve this confusion, it would be very helpful to clarify the following points directly (please correct me if I am mistaken):
Edit:
It's easy to miss on the docs site that "LoRA and quantization" is a subsection of "Transformers fallback"; that's why I was confused.
vllm/docs/source/models/supported_models.md Line 43 in 4c0d93f
vllm/docs/source/models/supported_models.md Lines 57 to 59 in 4c0d93f