Mllama(single + dual) + InternVL(single) + Llava (single) #267
Conversation
1. Mllama single QPC support added
2. Simplified generate inputs for single and dual QPC
Added support for Llava model single QPC
Force-pushed from b9c0bc1 to b3a5d22
class QEFFTransformersBase(QEFFBaseModel):
Why is this entire part showing up as a diff? Is it possible to clean up the PR?
Because we moved the QEFFAutoModelForCausalLM class from the top of the file to the bottom, since we wanted to support InternVL via that. As we discussed, should I change it to use QEFFAutoModelImageTextToText?
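For context on why the move inflates the diff: Python evaluates class bodies at import time, so any class referenced inside another class body must already be defined earlier in the module. A minimal sketch with hypothetical names:

```python
class Base:
    pass

class Referenced(Base):
    pass

class Referrer(Base):
    # This attribute is evaluated at import time, so Referenced must be
    # defined earlier in the file; satisfying that ordering means moving
    # whole class bodies, which git renders as a large delete/insert diff.
    dependency = Referenced
```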
Force-pushed from 24fad68 to 81cea10
Can we please add a description for the newly added model argument?
Force-pushed from 2de6982 to f5efdae
Force-pushed from f5efdae to ad594d7
config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True)
config._attn_implementation = "eager"
config.vision_config.use_flash_attn = "false"
model = cls._hf_auto_class.from_pretrained(pretrained_model_name_or_path, config, *args, **kwargs)
Why are we explicitly fetching the config and directly calling from_pretrained() of AutoModelForImageTextToText? We could have called super().from_pretrained(), and it would have handled this internally.
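A minimal sketch of the two loading paths being contrasted here, using the public transformers API (the checkpoint name is just an illustration):

```python
from transformers import AutoConfig, AutoModelForImageTextToText

name = "llava-hf/llava-1.5-7b-hf"  # example checkpoint

# Path taken in the diff: fetch and patch the config explicitly.
config = AutoConfig.from_pretrained(name, trust_remote_code=True)
config._attn_implementation = "eager"
model = AutoModelForImageTextToText.from_pretrained(name, config=config)

# Path the reviewer suggests: pass kwargs and let from_pretrained patch
# the config itself.
model = AutoModelForImageTextToText.from_pretrained(name, attn_implementation="eager")
```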
logger.warning("Updating low_cpu_mem_usage=False")

kwargs.update({"attn_implementation": "eager", "low_cpu_mem_usage": False})
model = cls._hf_auto_class.from_pretrained(pretrained_model_name_or_path, *args, **kwargs)
Instead of making these changes in the derived class's from_pretrained(), shouldn't we make them in the base class and call that from here? That might be better from a design perspective.
We think we can remove QEFFTransformersBase, but we will need a bigger discussion to decide this.
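A sketch of the design alternative under discussion, with hypothetical stand-ins for the repo's classes (the real signatures may differ): the base class owns the shared kwarg normalization, and derived classes delegate to it.

```python
class QEFFTransformersBase:
    _hf_auto_class = None  # e.g. AutoModelForCausalLM in the real code

    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *args, **kwargs):
        # Shared normalization lives in one place.
        kwargs.update({"attn_implementation": "eager", "low_cpu_mem_usage": False})
        return cls._hf_auto_class.from_pretrained(pretrained_model_name_or_path, *args, **kwargs)


class QEFFAutoModelForImageTextToText(QEFFTransformersBase):
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *args, **kwargs):
        # Only model-family-specific handling would live here.
        return super().from_pretrained(pretrained_model_name_or_path, *args, **kwargs)
```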
    pretrained_model_name_or_path,
    *args,
    **kwargs,
):
    if kwargs.get("attn_implementation", None) not in {None, "eager"}:
Why are we not calling super().from_pretrained() here? It already has this piece of code.
Because we need to change config options in this method that are not needed for the other auto classes.
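A hypothetical sketch of the author's point: the override exists because the VLM loader must patch config fields (such as vision_config.use_flash_attn) that the other auto classes never touch, so the shared path does not fit as-is. The diff only shows the guard condition; the NotImplementedError body is an assumption.

```python
from transformers import AutoConfig

class QEFFAutoModelForImageTextToText(QEFFTransformersBase):  # base as sketched above
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *args, **kwargs):
        if kwargs.get("attn_implementation", None) not in {None, "eager"}:
            raise NotImplementedError("Only eager attention is supported")  # assumed body
        # VLM-only config tweaks, mirroring the diff above.
        config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True)
        config._attn_implementation = "eager"
        config.vision_config.use_flash_attn = "false"
        return cls._hf_auto_class.from_pretrained(
            pretrained_model_name_or_path, *args, config=config, **kwargs
        )
```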
Adding generalized infrastructure to support VLMs with dual/single QPC approaches

Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Hem Agnihotri <quic_hemagnih@quicinc.com>
Co-authored-by: Rishin Raj <quic_rishinr@quicinc.com>
Co-authored-by: Amit Raj <quic_amitraj@quicinc.com>
Co-authored-by: Amit Raj <168538872+quic-amitraj@users.noreply.github.com>
Co-authored-by: asmigosw <asmigosw@qti.qualcomm.com>
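For readers arriving at the merged PR, a hypothetical usage sketch of the resulting entry point; treating kv_offload as the switch between the dual- and single-QPC paths is an assumption based on this PR's description, and the checkpoint name is only an example.

```python
from QEfficient import QEFFAutoModelForImageTextToText

# Dual QPC (assumed): vision encoder and language decoder compiled separately.
dual = QEFFAutoModelForImageTextToText.from_pretrained(
    "llava-hf/llava-1.5-7b-hf", kv_offload=True
)

# Single QPC (assumed): the whole model compiled as one program.
single = QEFFAutoModelForImageTextToText.from_pretrained(
    "llava-hf/llava-1.5-7b-hf", kv_offload=False
)
```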