
VLM: Model Tracing Guide #1030

Open: wants to merge 358 commits into main

Conversation

kylesayrs (Collaborator) commented Jan 2, 2025:

Purpose

This guide explains the concepts of tracing as they relate to LLM Compressor and how to modify your model to support recipes which require using the Sequential Pipeline.

Through reading this guide, you will learn

  1. Why tracing is required when compressing with recipes involving the Sequential Pipeline and modifiers such as GPTQModifier (a minimal illustration follows this list)
  2. How to determine if your model is traceable for your dataset
  3. How to modify your model definition to be traceable
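
To make the first point above concrete, here is a minimal, hypothetical sketch (plain `torch.fx`, not code from the guide) of the kind of data-dependent control flow that defeats symbolic tracing and therefore the Sequential Pipeline:

```python
import torch
from torch import nn
from torch.fx import symbolic_trace
from torch.fx.proxy import TraceError

class Toy(nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The branch condition depends on runtime tensor values,
        # which symbolic tracing cannot resolve ahead of time
        if x.sum() > 0:
            return x * 2
        return x - 1

try:
    symbolic_trace(Toy())
except TraceError as err:
    print(err)  # "symbolically traced variables cannot be used as inputs to control flow"
```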

Prerequisites

Changes

  • Add a model tracing guide with pictures: `src/llmcompressor/transformers/tracing/README.md`
  • Add a readme for the sequential pipeline which points to the Tracing Guide: `src/llmcompressor/pipelines/sequential/README.md`
  • Add a debug script to help users debug their models for traceability: `src/llmcompressor/transformers/tracing/debug.py`
    • Add the llm-compressor.attempt_trace entrypoint for ease of use
  • Swap the order of arguments in llava_example.py and pixtral_example.py to match the order of arguments on the modifier

Testing

Use the `llmcompressor.attempt_trace` debug script:

```bash
llmcompressor.attempt_trace \
    --model_id llava-hf/llava-1.5-7b-hf \
    --model_class TraceableLlavaForConditionalGeneration \
    --sequential-targets LlamaDecoderLayer \
    --ignore "re:.*lm_head" "re:vision_tower.*" "re:multi_modal_projector.*" \
    --multimodal_data
```

Stretch

It might be nice if this tracing debug tool also printed the model graph to an SVG.
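
A rough, hedged sketch of what that could look like: `torch.fx` already ships a graph drawer that can emit an SVG once a traced `GraphModule` is available (requires `pydot` and Graphviz). Here, `traced` is assumed to be the `GraphModule` produced by the debug script; the variable name is illustrative, not an existing API of the tool.

```python
from torch.fx.passes.graph_drawer import FxGraphDrawer

# `traced` is an assumed torch.fx.GraphModule obtained from a successful trace
drawer = FxGraphDrawer(traced, "model")
drawer.get_dot_graph().write_svg("model_graph.svg")
```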

kylesayrs self-assigned this on Jan 10, 2025
kylesayrs marked this pull request as ready for review on January 10, 2025
dsikka pushed a commit that referenced this pull request Jan 10, 2025
## Purpose ##
* Allow VLM processors to be used to tokenize datasets with prompt keys

## Postrequisites ##
* #1030

## Changes ##
* Use `text` argument name for tokenizing the prompt column

## Testing ##
* w.r.t. tokenizers, using the `text` kwarg follows the precedent set by
[PretrainedTokenizerBase](https://github.com/huggingface/transformers/blob/main/src/transformers/tokenization_utils_base.py#L2790)
* w.r.t. processors, most processors use the text kwarg

Below are all the models I know to be compatible with this change; I'm assuming that most other processors follow the same standard (a minimal illustration of the shared `text` kwarg follows the list):

1. [llama](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/tokenization_llama.py#L233)
2. [pixtral](https://github.com/huggingface/transformers/blob/main/src/transformers/models/pixtral/processing_pixtral.py#L160)
3. [phi3_vision](https://huggingface.co/microsoft/Phi-3.5-vision-instruct/blob/main/processing_phi3_v.py#L321)
4. [mllama](https://github.com/huggingface/transformers/blob/main/src/transformers/models/mllama/processing_mllama.py#L232)
5. [qwen2_vl](https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_vl/processing_qwen2_vl.py#L71)
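
As a minimal, hedged illustration (model IDs taken from the list above, prompts are placeholders), both a plain tokenizer and a multimodal processor accept the prompt through the same `text` keyword:

```python3
from transformers import AutoProcessor, AutoTokenizer

# Plain tokenizer: `text` is the standard argument name for the prompt
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print(tokenizer(text="What is the capital of France?"))

# Multimodal processor: the first positional argument is often `images`,
# so passing the prompt explicitly as `text=` tokenizes the prompt column;
# images can be passed separately or omitted for text-only calibration data
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")
print(processor(text="What is the capital of France?"))
```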

Example of using VLM processor to tokenize a dataset with prompt key
```python3
from transformers import AutoProcessor
from llmcompressor.transformers import DataTrainingArguments, TextGenerationDataset

models_to_test = [
  "meta-llama/Meta-Llama-3-8B-Instruct",
  "mistralai/Mixtral-8x7B-Instruct-v0.1",
  "Qwen/Qwen2-VL-2B-Instruct",  # fails without changes
  "mgoin/pixtral-12b",  # fails without changes
]

for model_id in models_to_test:
  processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
  
  data_args = DataTrainingArguments(
      dataset="ultrachat-200k",
      splits={"calibration": "test_sft[:1]"}
  )
  
  dataset = TextGenerationDataset.load_from_registry(
      data_args.dataset,
      data_args=data_args,
      split=data_args.splits["calibration"],
      processor=processor,
  )(add_labels=False)
```

```python
    return model_cls


def parse_args():
```

A Collaborator commented:

We have `click` in setup.py; it might be worth using it for the CLI.

kylesayrs (Collaborator Author) commented:

```python
legacy_processing = (
    (input_ids == self.config.image_token_index).sum(1).max() < self.config.image_seq_length
) or (input_ids.shape[-1] == 1 and pixel_values is not None).item()
```
A Collaborator commented:

I read the whole thing.

I like how much time and thought you put into making this doc.

Right now, the audience needs to read until the 3rd paragraph to know what the problem is and when to use tracing -- encoder-decoder models using the GPTQ and SparseGPT modifiers. If we move that to the intro, it will be clearer to the audience whether the doc is applicable to them.

Then a small paragraph introducing what sections 1, 2, and 3 will be helpful for --
1 describes why the previous methods cannot be used and why the sequential pipeline solves the problem, 2 is how to run using the CLI, and 3 is debugging/contribution.

This way I think the audience can have an easier time navigating to the appropriate section by reading less.

kylesayrs (Collaborator Author) commented Jan 13, 2025:

> Right now, the audience needs to read until the 3rd paragraph to know what the problem is and when to use tracing

As for when to use tracing, that's described in the second sentence:

> Through reading this guide, you will learn
> 1. Why tracing is required when compressing with recipes involving the Sequential Pipeline and modifiers such as GPTQModifier

As for what the problem is, that's described in the first section:

> ## 1. Why is Tracing Required? ##

kylesayrs (Collaborator Author) commented Jan 13, 2025:

> Right now, the audience needs to read until the 3rd paragraph to know what the problem is and when to use tracing -- encoder-decoder models using the GPTQ and SparseGPT modifiers

That's incorrect; tracing is used for all model architectures, not just encoder-decoder models. As described in the second paragraph, tracing is used when compressing with recipes involving the Sequential Pipeline and modifiers such as GPTQModifier.

kylesayrs (Collaborator Author) commented Jan 13, 2025:

> Then a small paragraph introducing what sections 1, 2, and 3 will be helpful for
> This way I think the audience can have an easier time navigating to the appropriate section by reading less.

I think the section titles + the list of things you will learn from reading each of the sections is enough context for a reader to go on. For example, if the reader doesn't care about the why, they can skip 1. If the reader doesn't care about what traceability is, they can skip 2. If the reader doesn't care about how to make a model traceable, they can skip 3.

kylesayrs requested a review from horheynm on January 13, 2025.