[Core] add support for using sharded models in the pipeline context #8428
Conversation
You get this error:
I'm not sure how to handle the case where we can pass a model that is loaded with `device_map`.
But we are not doing auto device_map in the pipeline. What restrictions can we impose internally within the pipeline implementation? Perhaps not force hooks when `device_map` is None?
Yes, but I was thinking that the user might do that. Not sure yet. I will test a few combinations to see what error pops out. But I don't think we should create a model with a device_map and use it in a pipeline. In a multi-GPU setup, I think that the above script will fail.
A model within diffusers will most likely always be used with pipelines, so we need to consider it. For a multi-GPU setup, I think we should restrict it, for sure.
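A minimal sketch of the kind of restriction being discussed: a hypothetical guard (`assert_movable` is not an actual diffusers helper) that refuses to move a component accelerate has dispatched, assuming accelerate's convention of storing attached hooks as `_hf_hook` on the module:

```python
import torch.nn as nn

def assert_movable(component: nn.Module) -> None:
    # accelerate's add_hook_to_module stores the hook it attaches on the
    # module as `_hf_hook`; its presence means the module is dispatched.
    if any(hasattr(m, "_hf_hook") for m in component.modules()):
        raise ValueError(
            "This component was loaded with a `device_map` and is managed by "
            "accelerate hooks; it should not be moved with `.to()`."
        )
```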
@SunMarc I will let you run a couple of tests; let me know if the above plan is good.
I tested this feature a bit and I think we should try to make the following work instead:

```python
from diffusers import UNet2DConditionModel, StableDiffusionXLPipeline
import torch

unet = UNet2DConditionModel.from_pretrained(
    "sayakpaul/sdxl-unet-sharded", torch_dtype=torch.float16
)
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet, torch_dtype=torch.float16
).to("cuda")
image = pipeline("dog", num_inference_steps=20).images[0]
```

Right now, it returns an error because of the following loading logic:

```python
for checkpoint_file in checkpoint_files:
    loaded_checkpoint = load_state_dict(checkpoint_file, device_map=device_map)
    if device_map is None:
        model.load_state_dict(loaded_checkpoint, strict=strict)
    unexpected_keys.update(set(loaded_checkpoint.keys()) - model_keys)
```

One way to do that would be the following modification. This would simply load the sharded model on CPU without the hooks, so you can safely move the model, and we would still be able to capture the unexpected keys:

```python
else:  # else let accelerate handle loading and dispatching.
    # Load weights and dispatch according to the device_map
    # by default the device_map is None and the weights are loaded on the CPU
    force_hook = True
    device_map = _determine_device_map(model, device_map, max_memory, torch_dtype)
    if device_map is None and is_sharded:
        # we load the parameters on the cpu
        device_map = {"": "cpu"}
        force_hook = False
    try:
        accelerate.load_checkpoint_and_dispatch(
            model,
            model_file if not is_sharded else sharded_ckpt_cached_folder,
            device_map,
            max_memory=max_memory,
            offload_folder=offload_folder,
            offload_state_dict=offload_state_dict,
            dtype=torch_dtype,
            force_hooks=force_hook,
            strict=True,
        )
    # ... (exception handling elided)
```

Another way would be to make the following logic compatible with loading sharded checkpoints, but we would essentially rewrite what was done in accelerate:

```python
if device_map is None and not is_sharded:
    param_device = "cpu"
    state_dict = load_state_dict(model_file, variant=variant)
    ...
```

LMK what you think @sayakpaul @yiyixuxu! Lastly, about passing a `device_map` when loading the model and then moving the pipeline:

```python
from diffusers import UNet2DConditionModel, StableDiffusionXLPipeline
import torch

unet = UNet2DConditionModel.from_pretrained(
    "sayakpaul/sdxl-unet-sharded", torch_dtype=torch.float16, device_map=0
)
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet, torch_dtype=torch.float16
).to(1)
image = pipeline("dog", num_inference_steps=20).images[0]
```
Thanks for the investigation! I think you already have a good idea; would you mind opening a PR?
@SunMarc, thanks for the long explanation! However, I think the error message we want to catch here is specifically for loading from a deprecated checkpoint, since:
I think it is safe to make the loading strict.
cc @pcuenca here too if you have time!
Hi @yiyixuxu, I'm fine with setting strict to `True`.
This might have some complications, as explained here, with respect to training. I've opened a PR with the solution I proposed in the long comment. Let me know which one you prefer!
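As a side note on what `strict` changes: this is plain `torch.nn.Module.load_state_dict` behavior, not anything diffusers-specific. A minimal standalone illustration:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)
partial = {"weight": torch.zeros(4, 4)}  # deliberately missing "bias"

# strict=False tolerates the mismatch and reports it instead of raising
result = model.load_state_dict(partial, strict=False)
print(result.missing_keys)  # ['bias']

# strict=True raises a RuntimeError for the missing key
model.load_state_dict(partial, strict=True)
```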
@SunMarc when I do the following, along with the changes from your PR and this PR:

```python
from diffusers import UNet2DConditionModel, StableDiffusionXLPipeline
import torch

unet = UNet2DConditionModel.from_pretrained(
    "sayakpaul/sdxl-unet-sharded", torch_dtype=torch.float16, device_map="auto"
)
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet, torch_dtype=torch.float16
).to("cuda")
image = pipeline("a cute dog running on the grass", num_inference_steps=30).images[0]
image.save("dog.png")
```

I still face: `You shouldn't move a model that is dispatched using accelerate hooks.`
Hi @sayakpaul, remove the `device_map` arg from `UNet2DConditionModel.from_pretrained` and it should work as expected!
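Concretely, the working variant would presumably be the same script as above, just without `device_map` on the UNet:

```python
from diffusers import UNet2DConditionModel, StableDiffusionXLPipeline
import torch

# Load the sharded UNet on CPU (no device_map), then let .to() move the pipeline.
unet = UNet2DConditionModel.from_pretrained(
    "sayakpaul/sdxl-unet-sharded", torch_dtype=torch.float16
)
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet, torch_dtype=torch.float16
).to("cuda")
image = pipeline("a cute dog running on the grass", num_inference_steps=30).images[0]
image.save("dog.png")
```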
My bad.
What does this PR do?
After adding support for model sharding through #6396 and #7830, it's now time for us to use a sharded model within a pipeline.
This PR enables that.
TODOs
Currently, the following works:
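(The exact snippet is elided in this capture; presumably it is the sharded-UNet example from the discussion above, i.e. loading the sharded checkpoint without a `device_map` and moving the pipeline to CUDA:)

```python
from diffusers import UNet2DConditionModel, StableDiffusionXLPipeline
import torch

unet = UNet2DConditionModel.from_pretrained(
    "sayakpaul/sdxl-unet-sharded", torch_dtype=torch.float16
)
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet, torch_dtype=torch.float16
).to("cuda")
image = pipeline("dog", num_inference_steps=20).images[0]
```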
However, it prints the following:
You shouldn't move a model that is dispatched using accelerate hooks.
Is this relevant/unsafe in our context?
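One way to answer that empirically, reusing the `pipeline` object from the snippet above: a hedged check that assumes accelerate's convention of storing attached hooks as `_hf_hook` on each module.

```python
# If no submodule of the UNet carries an accelerate hook, moving it with
# `.to()` is harmless and the warning is spurious in this context.
dispatched = any(hasattr(m, "_hf_hook") for m in pipeline.unet.modules())
print(f"unet hook-dispatched: {dispatched}")
```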
Once I get initial approval from @SunMarc, I will proceed with the rest of the TODOs.