enable cpu offloading of new pipelines on XPU & use device agnostic empty to make pipelines work on XPU #11671
base: main
Conversation
```diff
@@ -193,7 +193,7 @@ def __init__(
     def enable_xformers_memory_efficient_attention(self, attention_op: Optional[Callable] = None):
         self.decoder_pipe.enable_xformers_memory_efficient_attention(attention_op)

-    def enable_sequential_cpu_offload(self, gpu_id: Optional[int] = None, device: Union[torch.device, str] = "cuda"):
+    def enable_sequential_cpu_offload(self, gpu_id: Optional[int] = None, device: Union[torch.device, str] = None):
```
Per the discussion in PR #11288, we change the default to None so that CPU offloading can work on other accelerators like XPU without application code changes.
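For illustration, here is a minimal sketch of how a `None` default can be resolved at call time; the helper name `_resolve_offload_device` is hypothetical, not part of diffusers:

```python
from typing import Optional, Union

import torch


def _resolve_offload_device(device: Optional[Union[torch.device, str]]) -> torch.device:
    # Hypothetical helper: when the caller passes device=None (the new default),
    # pick whichever accelerator is available instead of hard-coding "cuda".
    if device is not None:
        return torch.device(device)
    if torch.cuda.is_available():
        return torch.device("cuda")
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    return torch.device("cpu")
```

With a resolution step like this, `pipe.enable_sequential_cpu_offload()` called with no arguments can target XPU on Intel GPUs the same way it targets CUDA on NVIDIA ones.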
With these changes, cases like tests/pipelines/wuerstchen/test_wuerstchen_combined.py::WuerstchenCombinedPipelineFastTests::test_cpu_offload_forward_pass_twice and tests/pipelines/kandinsky2_2/test_kandinsky_combined.py::KandinskyV22PipelineImg2ImgCombinedFastTests::test_cpu_offload_forward_pass_twice pass on XPU.
@a-r-r-o-w @DN6, please help review, thanks very much.
@bot /style
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```diff
-        device_mod = getattr(torch, self.device.type, None)
-        if hasattr(device_mod, "empty_cache") and device_mod.is_available():
-            device_mod.empty_cache()  # otherwise we don't see the memory savings (but they probably exist)
+        empty_device_cache(orig_device_type)
```
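The actual `empty_device_cache` comes from diffusers' utils; as a rough sketch of what such a device-agnostic helper can look like (an assumed reimplementation, not the library's code):

```python
import torch


def empty_device_cache(device_type: str) -> None:
    # Look up the backend module (torch.cuda, torch.xpu, ...) by device type
    # and clear its cache; device types without a cache (e.g. "cpu") are a no-op.
    device_mod = getattr(torch, device_type, None)
    if device_mod is not None and hasattr(device_mod, "empty_cache"):
        device_mod.empty_cache()
```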
@DN6, the original code has a bug: empty_cache is never executed.
The logic is: the module's device type is checked via self.device.type. At that point it is "cuda" or "xpu", so execution enters the if scope and the module is moved to CPU with to(). After that, self.device.type is "cpu", so device_mod resolves to torch.cpu, which has no empty_cache. The hasattr check therefore fails, empty_cache is never called, and the accelerator cache is never cleared.
I changed the code so that it calls empty_cache on the original device; please review in case that is not what you intended in the design.
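To make the ordering bug concrete, here is a self-contained toy (not the diffusers code) contrasting a lookup of the device type after and before the move to CPU:

```python
import torch


def offload_buggy(module: torch.nn.Module) -> None:
    module.to("cpu")
    # Bug: after .to("cpu") the device type is "cpu", so device_mod resolves to
    # torch.cpu, which has no empty_cache -> the accelerator cache is never cleared.
    device_mod = getattr(torch, next(module.parameters()).device.type, None)
    if hasattr(device_mod, "empty_cache") and device_mod.is_available():
        device_mod.empty_cache()


def offload_fixed(module: torch.nn.Module) -> None:
    # Fix: remember the accelerator's device type *before* moving to CPU.
    orig_device_type = next(module.parameters()).device.type
    module.to("cpu")
    device_mod = getattr(torch, orig_device_type, None)
    if device_mod is not None and hasattr(device_mod, "empty_cache"):
        device_mod.empty_cache()  # now actually frees cached accelerator memory
```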
PS: for reference, the PR that changed the code to its current form is #4191.
Signed-off-by: YAO Matrix <matrix.yao@intel.com>