Upgrade pytorch to 2.6, python to 3.12 #7847

ebr · 2025-03-26T21:16:23Z

Summary

upgrade pytorch to 2.6.0 and unpin some of the strict dependency pins to facilitate upgrading co-dependencies. Some depenencies are now unpinned to help continuously stay up to date; uv.lock will be used for reproducibility.
allow python 3.12 as we're no longer constrained by the torch version
use uv.lock to pin dependencies - this will help with reproducible builds
refactor Dockerfile; get rid of multi-stage build; upgrade to python 3.12. This should make the images smaller, the builds faster, and could potentially allow us to utilize docker layer cache in GHA. Notably, changing any application code will only update only the very last layer, and will not result in rebuilding of the entire virtualenv, which means with proper caching changes between images will be tiny and builds will be super fast
unpin timm and controlnet-aux - see also Change timm and controlnet-aux pins to fix LLaVA model support #7846 - may need to rebase this PR if that one is merged first
add python 3.12 to the test matrix, update docs to recommend python 3.12 and use uv to manage it
remove some obsoleted depenencies that were used by the CLI
reintroduce GPU_DRIVER build arg in CI container build, as it has apparently been removed, meaning our "rocm" and "cpu" docker images were just cuda images for some time

QA Instructions

Usual manual testing - these changes have a large surface area, so we'll rely on the community to help test all edge cases.
special attention should be given to testing the Docker container.

Merge Plan

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

jazzhaiku

LGTM 👍

pyproject.toml

psychedelicious · 2025-03-26T22:45:34Z

How do we regenerate uv.lock and how does it interact with pins in pyproject.toml?

psychedelicious · 2025-03-26T23:04:58Z

We need to be careful about this change and its interaction w/ launcher.

Though the launcher supports "pinning" versions of invoke to different versions of python, that information is hardcoded and we'll need to do a launcher release.

I suspect attempting to install the version of Invoke that enforces python 3.12 without updating the launcher beforehand will result in errors.

Maybe we should figure out how to make the launcher query PyPI to get the python version required by a package?

ebr · 2025-03-27T14:30:42Z

How do we regenerate uv.lock and how does it interact with pins in pyproject.toml?

uv lock re-runs dependency resolution and updates the uv.lock file. pins in pyproject.toml still guide the resolution process and can be as restrictive or loose as we want them to be, but the lockfile pins dependencies to the actual hashes in the index. Works the same way as package-lock.json.

we do still need to test how uv.lock interacts with our use of different indexes, such as for rocm and cpu-only pytorch. if there's breakage, there's a "conflicting extras" syntax we might be able to use to work around it: https://docs.astral.sh/uv/reference/settings/#conflicts

ebr · 2025-03-27T14:42:44Z

We need to be careful about this change and its interaction w/ launcher.

100% agreed, we'd need to enable the launcher to install different python versions

Maybe we should figure out how to make the launcher query PyPI to get the python version required by a package?

depends on whether we want to continue supporting <=3.11 ... but yes, the launcher could easily query pypi for this info:

$ curl -s https://pypi.org/pypi/invokeai/json |jq '.info.requires_python'
"<3.12,>=3.10"

we can also query for any particular release | jq '.releases where releases is a mapping of vX.Y.Z : [ { <pkg_info> }, {...} ]

psychedelicious · 2025-03-31T07:58:49Z

I believe the remaining failing python tests CI failures will be resolved by #7826. Some magic pydantic bs.

psychedelicious · 2025-03-31T08:14:59Z

Rebased on main & realized I didn't update the uv lockfile correctly before, which caused the typegen checks CI workflow to fail. Fixed that now

psychedelicious · 2025-03-31T08:26:49Z

I believe this PR is ready. At least, all CI is passing. We need to check on how it might interact with the launcher and existing installs. So I've marked it as draft to prevent a premature merge.

…itate upgrading co-dependencies. we will use uv.lock to ensure reproducibility

…3.12

…arently been removed

… when loading them In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class. For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error. Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances. Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work. For example, `ConditioningFieldData` has a `conditionings` attribute that may be one some other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.

In pydantic/pydantic#10029, pydantic made an improvement to its generated JSON schemas (OpenAPI schemas). The previous and new generated schemas both meet the schema spec. When we parse the OpenAPI schema to generate node templates, we use some typeguard to narrow schema components from generic OpenAPI schema objects to a node field schema objects. The narrower node field schema objects contain extra data. For example, they contain a `field_kind` attribute that indicates it the field is an input field or output field. These extra attributes are not part of the OpenAPI spec (but the spec allows does allow for this extra data). This typeguard relied on a pydantic implementation detail. This was changed in the linked pydantic PR, which released with v2.9.0. With the change, our typeguard rejects input field schema objects, causing parsing to fail with errors/warnings like `Unhandled input property` in the JS console. In the UI, this causes many fields - mostly model fields - to not show up in the workflow editor. The fix for this is very simple - instead of relying on an implementation detail for the typeguard, we can check if the incoming schema object has any of our invoke-specific extra attributes. Specifically, we now look for the presence of the `field_kind` attribute on the incoming schema object. If it is present, we know we are dealing with an invocation input field and can parse it appropriately.

Ran `uv lock --upgrade-package pydantic`

Doing a dev build so I can test the launcher.

The launcher will query this file to get the pins needed for installation

psychedelicious · 2025-04-04T07:43:36Z

Superseded by #7873, which I needed to create as a branch on the origin repo to do testing of CI changes. I did quite a bit of extra work on it.

ebr requested review from psychedelicious, blessedcoolant, maryhipp, hipsterusername, lstein and jazzhaiku as code owners March 26, 2025 21:16

github-actions bot added CI-CD Continuous integration / Continuous delivery docker Root frontend-deps PRs that change frontend dependencies frontend PRs that change frontend files docs PRs that change docs python-deps PRs that change python dependencies labels Mar 26, 2025

hipsterusername approved these changes Mar 26, 2025

View reviewed changes

psychedelicious requested a review from brandonrising as a code owner March 26, 2025 22:11

github-actions bot added api python PRs that change python files backend PRs that change backend files services PRs that change app services labels Mar 26, 2025

jazzhaiku approved these changes Mar 26, 2025

View reviewed changes

github-actions bot added the python-tests PRs that change python tests label Mar 26, 2025

psychedelicious reviewed Mar 26, 2025

View reviewed changes

pyproject.toml Outdated Show resolved Hide resolved

ebr force-pushed the ebr/torch-260 branch from b16c32c to 99774b2 Compare March 27, 2025 13:49

psychedelicious force-pushed the ebr/torch-260 branch from a197b16 to b728e18 Compare March 31, 2025 08:14

psychedelicious force-pushed the ebr/torch-260 branch from b728e18 to 2bdd3f2 Compare March 31, 2025 08:16

psychedelicious mentioned this pull request Mar 31, 2025

refactor(nodes): invocation registration logic #7826

Merged

4 tasks

psychedelicious added the DO NOT MERGE label Mar 31, 2025

psychedelicious marked this pull request as draft March 31, 2025 08:24

ebr force-pushed the ebr/torch-260 branch from 2bdd3f2 to 5edcccc Compare April 2, 2025 13:28

ebr and others added 18 commits April 3, 2025 08:55

upgrade pytorch and unpin some of the strict dependency pins to facil…

b4c276b

…itate upgrading co-dependencies. we will use uv.lock to ensure reproducibility

use uv.lock to pin dependencies

9a0efb3

refactor Dockerfile; get rid of multi-stage build; upgrade to python …

752b62d

…3.12

update uv.lock

d4655ea

update nodes schema / typegen

4c12f5a

modify docs for python 3.12

fb777b4

remove obsoleted depenencies that were used by the CLI

b2337b5

reintroduce GPU_DRIVER build arg in CI container build, as it has app…

f53ff5f

…arently been removed

resolve conflict between timm version needed by LLaVA and controlnet-aux

c3ffb0f

tests: update tests/test_object_serializer_disk.py

643b71f

chore(ui): typegen

0a13640

chore: remove pydantic pin

52e247c

chore: typegen

df294db

chore: update uv.lock for latest pydantic

6a79b1c

Ran `uv lock --upgrade-package pydantic`

chore: bump version to v5.10.0dev1

eed4260

Doing a dev build so I can test the launcher.

experiment: add pins.json to repo

bea9d03

The launcher will query this file to get the pins needed for installation

psychedelicious force-pushed the ebr/torch-260 branch from 48d5ce6 to bea9d03 Compare April 2, 2025 22:56

psychedelicious mentioned this pull request Apr 3, 2025

chore: support python 3.12, torch 2.6.0, clean up build/deps #7873

Merged

4 tasks

psychedelicious closed this Apr 4, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upgrade pytorch to 2.6, python to 3.12 #7847

Upgrade pytorch to 2.6, python to 3.12 #7847

ebr commented Mar 26, 2025 •

edited

Loading

jazzhaiku left a comment

psychedelicious commented Mar 26, 2025

psychedelicious commented Mar 26, 2025

ebr commented Mar 27, 2025

ebr commented Mar 27, 2025 •

edited

Loading

psychedelicious commented Mar 31, 2025

psychedelicious commented Mar 31, 2025 •

edited

Loading

psychedelicious commented Mar 31, 2025

psychedelicious commented Apr 4, 2025

Upgrade pytorch to 2.6, python to 3.12 #7847

Upgrade pytorch to 2.6, python to 3.12 #7847

Conversation

ebr commented Mar 26, 2025 • edited Loading

Summary

QA Instructions

Merge Plan

Checklist

jazzhaiku left a comment

Choose a reason for hiding this comment

psychedelicious commented Mar 26, 2025

psychedelicious commented Mar 26, 2025

ebr commented Mar 27, 2025

ebr commented Mar 27, 2025 • edited Loading

psychedelicious commented Mar 31, 2025

psychedelicious commented Mar 31, 2025 • edited Loading

psychedelicious commented Mar 31, 2025

psychedelicious commented Apr 4, 2025

ebr commented Mar 26, 2025 •

edited

Loading

ebr commented Mar 27, 2025 •

edited

Loading

psychedelicious commented Mar 31, 2025 •

edited

Loading