
make enable_sequential_cpu_offload more generic for third-party devices #4191


Merged

merged 2 commits into huggingface:main on Jul 21, 2023

Conversation

@ji-huazhong (Contributor) commented on Jul 21, 2023

What does this PR do?

This PR makes enable_sequential_cpu_offload more generic for third-party devices.


I noticed that in #4114, enable_sequential_cpu_offload was refactored to be more generic for other devices.
But inside enable_sequential_cpu_offload we call torch.cuda.empty_cache to release all unoccupied cached memory, which has no effect on other devices (such as XPU):

```python
if self.device.type != "cpu":
    self.to("cpu", silence_dtype_warnings=True)
    torch.cuda.empty_cache()  # otherwise we don't see the memory savings (but they probably exist)
```

We could replace torch.cuda.empty_cache() with another backend's implementation from outside the pipeline, like

```diff
+ torch.cuda.empty_cache = torch.xpu.empty_cache
  device = torch.device("xpu")
  pipeline.enable_sequential_cpu_offload(device=device)
```

but that looks a little weird.

I think a better way is to:

  • get the torch device module according to the device type first, and
  • then call that module's empty_cache method.

Now we can use enable_sequential_cpu_offload conveniently with XPU, like:

```python
device = torch.device("xpu")
pipeline.enable_sequential_cpu_offload(device=device)
```

Before submitting

Who can review?

@patrickvonplaten and @sayakpaul

@HuggingFaceDocBuilderDev commented on Jul 21, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten (Contributor) left a comment

Works for me!

@pcuenca (Member) left a comment

Nice!

@sayakpaul sayakpaul merged commit e2bbaa4 into huggingface:main Jul 21, 2023
orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023
make enable_sequential_cpu_offload more generic for third-party devices (huggingface#4191)

* make enable_sequential_cpu_offload more generic for third-party devices

* make style
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
make enable_sequential_cpu_offload more generic for third-party devices (huggingface#4191)

* make enable_sequential_cpu_offload more generic for third-party devices

* make style
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
make enable_sequential_cpu_offload more generic for third-party devices (huggingface#4191)

* make enable_sequential_cpu_offload more generic for third-party devices

* make style

5 participants