fix: improve gguf performance with torch.compile #8031

keturn · 2025-05-22T01:12:39Z

Summary

When using torch.compile with a Flux-type GGUF model, tlparse reports this error:

Unsupported method call
Explanation: Dynamo does not know how to trace method __contains__ of class set
Hint: Avoid calling set.__contains__ in your code.
Hint: Please report an issue to PyTorch.

The call is in get_dequantized_tensor, which runs often enough that the failure has a significant effect on performance.

Changing the collection from a set to a list is sufficient to make it compatible.
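
As a rough illustration of why the swap is behavior-preserving, here is a minimal sketch in plain Python. The names `QuantType`, `COMPATIBLE_QTYPES_SET`, `COMPATIBLE_QTYPES_LIST`, and `needs_dequantize` are hypothetical stand-ins, not the identifiers used in InvokeAI, and the sketch does not invoke torch.compile itself. It only shows that a membership test against a small constant gives the same answer for a set and a list.

```python
from enum import IntEnum


class QuantType(IntEnum):
    """Hypothetical stand-in for a GGUF quantization type enum."""
    F32 = 0
    F16 = 1
    Q4_0 = 2
    Q8_0 = 8


# Before: the constant is a set, so `in` dispatches to set.__contains__.
COMPATIBLE_QTYPES_SET = {None, QuantType.F32, QuantType.F16}

# After: the same members in a list, so `in` dispatches to list.__contains__.
COMPATIBLE_QTYPES_LIST = [None, QuantType.F32, QuantType.F16]


def needs_dequantize(qtype, compatible):
    """Return True when qtype is not one of the directly usable types."""
    return qtype not in compatible


# Check every enum member plus None against both containers.
candidates = [None, *QuantType]
results_set = [needs_dequantize(q, COMPATIBLE_QTYPES_SET) for q in candidates]
results_list = [needs_dequantize(q, COMPATIBLE_QTYPES_LIST) for q in candidates]
```

With only a handful of entries, the linear scan of a list costs nothing measurable compared with a hash lookup, so the only practical difference is which `__contains__` implementation the tracer has to handle.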

Related Issues / Discussions

See pytorch/pytorch#145761

QA Instructions

Run a GGUF model.

Merge Plan

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

psychedelicious

Looks like all we do w/ this constant is check if something is in it - no functional reason it needs to be in a set.

pytorch 2.7 does not implement `set.__contains__`, so make this a list instead. See pytorch/pytorch#145761

StrongerXi · 2025-06-05T00:38:14Z

@keturn hi I'm curious how torch.compile is used within InvokeAI?

keturn · 2025-06-05T16:03:24Z

@StrongerXi It's not used anywhere in core at the moment—this issue came up when I tried applying it in an extension for the Chroma model.

That did see some appreciable performance gains, so I tried to take advantage of that experience and do the same thing with Invoke's FLUX model, but that's been a fraught experience. I've been unable to explain why that gives me so much more trouble in compilation and so much less to gain for it, considering Chroma and FLUX are very nearly the same model.

(In fact, since I tried that with FLUX, Chroma's compilation only succeeds on its second invocation, failing with something about UserDefinedObjectVariable not having "proxy" the first time around. It was compiling okay without that issue a week ago and I'm at a loss as to what changed.)

StrongerXi · 2025-06-05T18:37:24Z

> failing with something about UserDefinedObjectVariable not having "proxy" the first time around. It was compiling okay without that issue a week ago and I'm at a loss as to what changed.

That sounds very similar to what we fixed in PyTorch nightly a few months ago, can you give nightly a try? This post has all the context.

keturn · 2025-06-05T19:46:56Z

Unfortunately nightly (2.8.0.dev20250605+cu128) does not seem to improve things. That category of error ("proxy") is still there, but unlike stable 2.7.1, running it a second time fails with a different error (in compute_ancestors) that I hadn't seen before.

StrongerXi · 2025-06-05T20:03:35Z

@keturn thanks, if you can provide a repro and/or a full error message running with TORCHDYNAMO_VERBOSE=1, that'll be awesome. I can help look into this.

keturn · 2025-06-05T21:40:02Z

I'm a long way from a minimal reproduction but I dumped some logs in pytorch/pytorch#155266

keturn requested review from lstein, blessedcoolant, hipsterusername and jazzhaiku as code owners May 22, 2025 01:12

github-actions bot added the python (PRs that change python files) and backend (PRs that change backend files) labels May 22, 2025

psychedelicious approved these changes May 22, 2025

psychedelicious enabled auto-merge (rebase) May 22, 2025 01:35

psychedelicious force-pushed the fix/gguf-compile-set branch from d550559 to 59b7b35 Compare May 22, 2025 01:35

fix: improve gguf performance with torch.compile

4d40b32

pytorch 2.7 does not implement `set.__contains__`, so make this a list instead. See pytorch/pytorch#145761

psychedelicious force-pushed the fix/gguf-compile-set branch from 59b7b35 to 4d40b32 Compare May 22, 2025 03:37

psychedelicious requested a review from maryhipp as a code owner May 22, 2025 03:37

psychedelicious merged commit 8bd52ed into invoke-ai:main May 22, 2025
12 checks passed

keturn deleted the fix/gguf-compile-set branch May 28, 2025 18:14

keturn mentioned this pull request Jun 5, 2025

torch.compile with InvokeAI: 'UserDefinedObjectVariable' object has no attribute 'proxy' / compute_ancestors KeyError op4 pytorch/pytorch#155266

Open
