Support qwenvl model for HPU #793

Open
wants to merge 1 commit into
base: habana_main
from

Conversation


@yingjie-han yingjie-han commented Feb 7, 2025

This PR adds support for Qwen-VL vision inference on HPU.

Issue to solve

The function merge_multimodal_embeddings() in utils.py has a dynamic-shape problem on HPU.

Solution

Flatten the embeddings tensor and use index_put_() to merge the multimodal embeddings in qwen.py, instead of calling merge_multimodal_embeddings() in utils.py.
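The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name merge_with_index_put and its argument names are hypothetical, the way placeholder positions are computed is an assumption, and the snippet was written for CPU tensors, so it says nothing about actual HPU behavior.

```python
import torch


def merge_with_index_put(input_ids, inputs_embeds, image_embeds, image_pad_id):
    """Hypothetical helper: write image embeddings into placeholder slots.

    input_ids:     (batch, seq) token ids
    inputs_embeds: (batch, seq, hidden) text embeddings
    image_embeds:  (num_placeholders, hidden) vision embeddings
    """
    batch_size, seq_length, hidden_size = inputs_embeds.shape

    # Flatten to (batch * seq, hidden) so one index per token is enough.
    flat_ids = input_ids.reshape(-1)
    flat_embeds = inputs_embeds.reshape(-1, hidden_size).clone()

    # Assumption: placeholder positions are found by comparing against the pad id.
    positions = torch.nonzero(flat_ids == image_pad_id, as_tuple=True)[0]

    # In-place scatter of the image rows into the flattened embeddings.
    flat_embeds.index_put_((positions,), image_embeds.to(flat_embeds.dtype))

    return flat_embeds.reshape(batch_size, seq_length, hidden_size)
```

The number of rows in image_embeds has to match the number of placeholder tokens in input_ids; the sketch does not validate this.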

Test

Single image
python examples/offline_inference/vision_language.py -m qwen_vl

Multiple images
python examples/offline_inference/vision_language_multi_image.py -m qwen_vl_chat

@yingjie-han
Author

@michalkuligowski @jikunshang @PatrykWo could you please review the code?

inputs_embeds = merge_multimodal_embeddings(
    input_ids, inputs_embeds, multimodal_embeddings,
    self.transformer.visual.image_pad_id)
batch_size, seq_length, hidden_size = inputs_embeds.shape


Please resolve the merge conflicts.


@yingjie-han yingjie-han Feb 26, 2025


@michalkuligowski The merge conflicts have been resolved. Please review. Thanks.

inputs_embeds = merge_multimodal_embeddings(
    input_ids, inputs_embeds, multimodal_embeddings,
    self.transformer.visual.image_pad_id)
batch_size, seq_length, hidden_size = inputs_embeds.shape


This shouldn't be in the model definition. Please try fixing the merge_multimodal_embeddings method instead. You can check whether the platform is HPU and call your implementation in that case.
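One way to read this suggestion is a platform check inside the shared merge helper, so the model file stays unchanged. The sketch below is only an illustration under stated assumptions: merge_embeddings, _merge_default, _merge_hpu, and the is_hpu flag are hypothetical names, the default path is a generic boolean-mask assignment rather than the actual utils.py code, and in the real code base the flag would come from whatever platform check the project provides.

```python
import torch


def _merge_default(input_ids, inputs_embeds, mm_embeds, placeholder_id):
    # Generic path (assumption): boolean-mask assignment.
    out = inputs_embeds.clone()
    out[input_ids == placeholder_id] = mm_embeds.to(out.dtype)
    return out


def _merge_hpu(input_ids, inputs_embeds, mm_embeds, placeholder_id):
    # HPU-oriented path (assumption): flatten, then index_put_.
    hidden_size = inputs_embeds.shape[-1]
    flat = inputs_embeds.reshape(-1, hidden_size).clone()
    positions = torch.nonzero(
        input_ids.reshape(-1) == placeholder_id, as_tuple=True)[0]
    flat.index_put_((positions,), mm_embeds.to(flat.dtype))
    return flat.reshape(inputs_embeds.shape)


def merge_embeddings(input_ids, inputs_embeds, mm_embeds, placeholder_id,
                     is_hpu=False):
    # is_hpu is a hypothetical stand-in for a real platform check.
    impl = _merge_hpu if is_hpu else _merge_default
    return impl(input_ids, inputs_embeds, mm_embeds, placeholder_id)
```

With this shape, callers such as the model's forward pass keep calling a single function, and both branches are expected to return the same values for the same inputs.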

Labels
New Model: Issue or PR to enable a new model

3 participants