[core] Move community AnimateDiff ControlNet to core #8972

a-r-r-o-w · 2024-07-25T16:09:08Z

What does this PR do?

Moves the community implementation of AnimateDiff ControlNet to core. Part of supporting long vid2vid generations here.

Code

import torch
from diffusers import AnimateDiffControlNetPipeline, AutoencoderKL, ControlNetModel, MotionAdapter, LCMScheduler
from diffusers.utils import export_to_gif, load_video

# Additionally, you will need to preprocess videos before they can be used with the ControlNet
# HF maintains just the right package for it: `pip install controlnet_aux`
from controlnet_aux.processor import ZoeDetector

# Download controlnets from https://huggingface.co/lllyasviel/ControlNet-v1-1 to use .from_single_file
# Download Diffusers-format controlnets, such as https://huggingface.co/lllyasviel/sd-controlnet-depth, to use .from_pretrained()
controlnet = ControlNetModel.from_single_file("control_v11f1p_sd15_depth.pth", torch_dtype=torch.float16)

# We use AnimateLCM for this example but one can use the original motion adapters as well (for example, https://huggingface.co/guoyww/animatediff-motion-adapter-v1-5-3)
motion_adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM")

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16)
pipe: AnimateDiffControlNetPipeline = AnimateDiffControlNetPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",
    motion_adapter=motion_adapter,
    controlnet=controlnet,
    vae=vae,
).to(device="cuda", dtype=torch.float16)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")

pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="AnimateLCM_sd15_t2v_lora.safetensors", adapter_name="lcm-lora")
pipe.set_adapters(["lcm-lora"], [0.8])

depth_detector = ZoeDetector.from_pretrained("lllyasviel/Annotators").to("cuda")
video = load_video("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-input-1.gif")
conditioning_frames = []

with pipe.progress_bar(total=len(video)) as progress_bar:
    for frame in video:
        conditioning_frames.append(depth_detector(frame))
        progress_bar.update()

prompt = "a panda, playing a guitar, sitting in a pink boat, in the ocean, mountains in background, realistic, high quality"
negative_prompt = "bad quality, worst quality"

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=len(video),
    num_inference_steps=10,
    guidance_scale=2.0,
    conditioning_frames=conditioning_frames,
    generator=torch.Generator().manual_seed(42),
).frames[0]

export_to_gif(video, "animatediff_controlnet.gif", fps=8)

Source Video	Output Video
raccoon playing a guitar	a panda, playing a guitar, sitting in a pink boat, in the ocean, mountains in background, realistic, high quality
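
The example above builds one conditioning frame per input frame and passes `num_frames=len(video)` alongside them. A minimal sketch of that pairing, using stand-in data so it runs without any models: `prepare_conditioning` is a hypothetical helper, not part of diffusers or controlnet_aux, and the strings and `str.upper` stand in for PIL frames and a depth detector.

```python
from typing import Callable, List, Sequence


def prepare_conditioning(frames: Sequence, detector: Callable, num_frames: int) -> List:
    """Apply `detector` to each frame and check the result against `num_frames`."""
    conditioning_frames = [detector(frame) for frame in frames]
    if len(conditioning_frames) != num_frames:
        raise ValueError(
            f"expected {num_frames} conditioning frames, got {len(conditioning_frames)}"
        )
    return conditioning_frames


# Stand-in data: strings instead of PIL images, upper() instead of a depth detector.
frames = ["frame0", "frame1", "frame2"]
conditioning = prepare_conditioning(frames, str.upper, num_frames=len(frames))
print(conditioning)
```

Checking the count before calling the pipeline surfaces a mismatch early, rather than after the models have been loaded onto the GPU.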

Documentation PR requires merging: https://huggingface.co/datasets/huggingface/documentation-images/discussions/351

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@DN6 @yiyixuxu

HuggingFaceDocBuilderDev · 2024-07-25T16:14:55Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

a-r-r-o-w · 2024-07-25T18:11:01Z

src/diffusers/utils/loading_utils.py

@@ -47,3 +50,77 @@ def load_image(
        image = image.convert("RGB")

    return image
+
+
+def load_video(


Added this here to make it a little easier to deal with videos instead of using imageio (which could perhaps be a separate backend here). Let me know if this is out of scope and should go in a separate PR for a cleaner commit history.

docs/source/en/api/pipelines/animatediff.md

DN6 · 2024-07-26T05:02:24Z

src/diffusers/utils/loading_utils.py

+        if was_tempfile_created:
+            os.remove(video_path)
+
+    elif isinstance(video, list) and all(isinstance(frame, PIL.Image.Image) for frame in video):


If passing a list of PIL images, we would just return the same list back? Why support passing in a list of images then?

I followed the implementation of load_image. The same list might not be returned, since the code that follows either applies the convert_method callback or converts the images to RGB (possibly from RGBA/HSV/etc.).

Hmm seems like there's been a bit of back and forth about it
#6479
#6904

IMO it doesn't make much sense to pass a list of already loaded images into load_video just to run preprocessing. If the user is at a point where they already have this list of images, it should be up to them to preprocess on their own.

I think the conversion to RGB by default should also be removed. A loading function should just return the objects. If additional processing has to be done, it can be done via the convert_method

I see, okay yeah that makes sense. Will remove this change
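
The behaviour agreed on in this thread can be sketched roughly as follows: the loader returns frames untouched unless the caller supplies a `convert_method`. This is a simplified stand-in, not the code in `loading_utils.py`. `load_frames` and the string "frames" are hypothetical; only the `convert_method` name comes from the discussion.

```python
from typing import Callable, List, Optional


def load_frames(
    frames: List[str],
    convert_method: Optional[Callable[[List[str]], List[str]]] = None,
) -> List[str]:
    """Return the loaded frames, applying `convert_method` only if one is given."""
    loaded = list(frames)  # stand-in for decoding frames from a file or URL
    if convert_method is not None:
        # Any extra processing (e.g. a colour-space conversion) is opt-in.
        loaded = convert_method(loaded)
    return loaded


# Without a callback the frames come back unchanged.
raw = load_frames(["rgba-frame-0", "rgba-frame-1"])

# With a callback the caller decides what conversion to apply.
converted = load_frames(
    ["rgba-frame-0", "rgba-frame-1"],
    convert_method=lambda fs: [f.replace("rgba", "rgb") for f in fs],
)
```

Keeping conversion behind the callback means the loader has no hidden default such as a forced RGB conversion, which is the point raised in the review.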

src/diffusers/utils/loading_utils.py

DN6

Left a comment on the load video function, but the rest looks good to me 👍🏽

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

a-r-r-o-w · 2024-07-30T11:21:03Z

Left a comment on the load video function, but the rest looks good to me 👍🏽

thanks, will merge once CI is green!

* initial work draft for freenoise; needs massive cleanup
* fix freeinit bug
* add animatediff controlnet implementation
* revert attention changes
* add freenoise
* remove old helper functions
* add decode batch size param to all pipelines
* make style
* fix copied from comments
* make fix-copies
* make style
* copy animatediff controlnet implementation from #8972
* add experimental support for num_frames not perfectly fitting context length, context stride
* make unet motion model lora work again based on #8995
* copy load video utils from #8972
* copied from AnimateDiff::prepare_latents
* address the case where last batch of frames does not match length of indices in prepare latents
* decode_batch_size->vae_batch_size; batch vae encode support in animatediff vid2vid
* revert sparsectrl and sdxl freenoise changes
* revert pia
* add freenoise tests
* make fix-copies
* improve docstrings
* add freenoise tests to animatediff controlnet
* update tests
* Update src/diffusers/models/unets/unet_motion_model.py
* add freenoise to animatediff pag
* address review comments
* make style
* update tests
* make fix-copies
* fix error message
* remove copied from comment
* fix imports in tests
* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* add animatediff controlnet to core
* make style; remove unused method
* fix copied from comment
* add tests
* changes to make tests work
* add utility function to load videos
* update docs
* update pipeline example
* make style
* update docs with example
* address review comments
* add latest freeinit test from #8969
* LoraLoaderMixin -> StableDiffusionLoraLoaderMixin
* fix docs
* Update src/diffusers/utils/loading_utils.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fix: variable out of scope

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

a-r-r-o-w added 11 commits July 25, 2024 13:53

add animatediff controlnet to core

df554ec

make style; remove unused method

e9bc322

Merge branch 'main' into animatediff/controlnet

764f1fd

fix copied from comment

a182cd2

add tests

9f93ddb

changes to make tests work

8498846

add utility function to load videos

217bf49

update docs

5942811

update pipeline example

18cc43b

make style

f84ffc0

update docs with example

4f1be16

yiyixuxu requested a review from DN6 July 25, 2024 17:46

a-r-r-o-w commented Jul 25, 2024

DN6 reviewed Jul 26, 2024

a-r-r-o-w added 7 commits July 26, 2024 10:38

address review comments

9a69392

Merge branch 'main' into animatediff/controlnet

cbaa1c9

Merge branch 'main' into animatediff/controlnet

98798b9

add latest freeinit test from #8969

3f1e1f1

LoraLoaderMixin -> StableDiffusionLoraLoaderMixin

c141d9d

Merge branch 'main' into animatediff/controlnet

a961b41

fix docs

df14eef

a-r-r-o-w added a commit that referenced this pull request Jul 27, 2024

copy animatediff controlnet implementation from #8972

691facf

a-r-r-o-w added a commit that referenced this pull request Jul 28, 2024

copy load video utils from #8972

7000186

a-r-r-o-w mentioned this pull request Jul 28, 2024

[core] FreeNoise #8948

Merged

DN6 reviewed Jul 30, 2024

DN6 reviewed Jul 30, 2024

a-r-r-o-w and others added 3 commits July 30, 2024 16:48

Update src/diffusers/utils/loading_utils.py

60f960e

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

Merge branch 'main' into animatediff/controlnet

cd99bd7

fix: variable out of scope

fa4e2ec

a-r-r-o-w merged commit e5b94b4 into main Jul 30, 2024
18 checks passed

a-r-r-o-w deleted the animatediff/controlnet branch July 30, 2024 11:40
