Added changes to ensure mxint8 compilations of VLMs work. #336
base: main
@@ -1385,7 +1342,7 @@ def from_pretrained(
             model, kv_offload=kv_offload
         )

-        return cls(model, is_tlm=is_tlm, continuous_batching=continuous_batching)
+        return cls(model, is_tlm=is_tlm, continuous_batching=continuous_batching, enable_qnn=enable_qnn)
Please inform the QNN team about this change before merging, so that nothing breaks on their end.
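The concern in this comment is that a new keyword reaches the constructor from the factory method. A minimal sketch of that pattern follows, assuming a stand-in class: `AutoModelSketch`, its constructor defaults, and the dictionary used in place of a loaded model are all hypothetical and are not taken from the repository. It only illustrates that defaulting the new flag to `False` keeps existing callers unaffected.

```python
class AutoModelSketch:
    """Hypothetical stand-in for the class touched by the diff hunk."""

    def __init__(self, model, is_tlm=False, continuous_batching=False, enable_qnn=False):
        self.model = model
        self.is_tlm = is_tlm
        self.continuous_batching = continuous_batching
        # New flag; defaulting to False is an assumption made for this sketch.
        self.enable_qnn = enable_qnn

    @classmethod
    def from_pretrained(cls, name, is_tlm=False, continuous_batching=False, enable_qnn=False):
        # Placeholder for real model loading, which is out of scope here.
        model = {"name": name}
        # Forward every option explicitly so none is silently dropped.
        return cls(
            model,
            is_tlm=is_tlm,
            continuous_batching=continuous_batching,
            enable_qnn=enable_qnn,
        )


# Existing call sites that do not pass the flag keep the default behaviour.
legacy = AutoModelSketch.from_pretrained("some-model")
with_qnn = AutoModelSketch.from_pretrained("some-model", enable_qnn=True)
```

Under these assumptions, downstream code only sees a difference when it opts in by passing `enable_qnn=True`.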
        ctx_len: int = 150,
        full_batch_size: Optional[int] = None,
        kv_cache_batch_size: Optional[int] = None,
        encoder_ctx_len: int = 1500,
Always use meaningful named constants rather than magic numbers in the code.
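One way to address this comment is to lift the literal defaults into module-level constants. The sketch below is illustrative only: the values 150 and 1500 come from the hunk above, while the constant names `DEFAULT_CTX_LEN` and `DEFAULT_ENCODER_CTX_LEN` and the function `compile_sketch` are hypothetical.

```python
from typing import Optional

# Hypothetical names; the values mirror the literals in the reviewed hunk.
DEFAULT_CTX_LEN = 150
DEFAULT_ENCODER_CTX_LEN = 1500


def compile_sketch(
    ctx_len: int = DEFAULT_CTX_LEN,
    full_batch_size: Optional[int] = None,
    kv_cache_batch_size: Optional[int] = None,
    encoder_ctx_len: int = DEFAULT_ENCODER_CTX_LEN,
) -> dict:
    """Return the resolved settings so the defaults are easy to inspect."""
    return {
        "ctx_len": ctx_len,
        "full_batch_size": full_batch_size,
        "kv_cache_batch_size": kv_cache_batch_size,
        "encoder_ctx_len": encoder_ctx_len,
    }


defaults = compile_sketch()
custom = compile_sketch(ctx_len=256)
```

Naming the defaults gives each value a single definition that other call sites and tests can import instead of repeating the literal.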
108866f to 03f0676
e1a9044 to 068d3c6
Modified the modeling files of InternVL and Llava to name the image_embeddings 'vision_embeds'.
Modified the modeling_auto file to incorporate the mxint8 changes for VLMs.
LIMITATIONS: The Processor of a model is expected to always return vision components in 'float16'.
Signed-off-by: quic-dhirajku <quic_dhirajku@quicinc.com>
Signed-off-by: Dhiraj Kumar Sah <quic_dhirajku@quicinc.com>
… out older qnn based changes.
Signed-off-by: quic-dhirajku <quic_dhirajku@quicinc.com>
Signed-off-by: Dhiraj Kumar Sah <quic_dhirajku@quicinc.com>
068d3c6 to dc3b639
Signed-off-by: Dhiraj Kumar Sah <quic_dhirajku@quicinc.com>
Looks good to me. Verified in vLLM.
Modified the modeling files of InternVL and Llava to name the image_embeddings 'vision_embeds'.
Modified the modeling_auto file to incorporate the mxint8 changes for VLMs.
LIMITATIONS: The Processor of a model is expected to always return vision components in 'float16'.
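Callers whose processor returns another dtype could normalise the vision inputs before passing them on. The following is a minimal sketch under stated assumptions, not part of this PR: the helper `ensure_float16` is hypothetical, the key name `pixel_values` is assumed, and NumPy arrays stand in for whatever tensor type the processor actually returns.

```python
import numpy as np


def ensure_float16(inputs, vision_keys=("pixel_values",)):
    """Return a shallow copy of `inputs` with the vision arrays cast to float16.

    `vision_keys` is an assumption; adjust it to the names the processor uses.
    """
    out = dict(inputs)
    for key in vision_keys:
        if key in out and out[key].dtype != np.float16:
            # Cast only the vision components; token ids and masks are untouched.
            out[key] = out[key].astype(np.float16)
    return out


# Hypothetical processor output with float32 image data.
processor_output = {
    "pixel_values": np.zeros((1, 3, 2, 2), dtype=np.float32),
    "input_ids": np.array([[1, 2, 3]], dtype=np.int64),
}
normalised = ensure_float16(processor_output)
```

The helper leaves the original dictionary unchanged, so the uncast inputs remain available for comparison or debugging.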