-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
[core] Move community AnimateDiff ControlNet to core #8972
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@@ -47,3 +50,77 @@ def load_image( | |||
image = image.convert("RGB") | |||
|
|||
return image | |||
|
|||
|
|||
def load_video( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this here to make it a little more easier to deal with videos instead of using imageio (which maybe could be a separate backend here). Let me know if this would need a separate PR and is out of scope to add here for a cleaner commit history
src/diffusers/utils/loading_utils.py
Outdated
if was_tempfile_created: | ||
os.remove(video_path) | ||
|
||
elif isinstance(video, list) and all(isinstance(frame, PIL.Image.Image) for frame in video): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If passing a list of PIL images, we would just return the same list back? Why support passing in a list of images then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the implementation of load_image. And the same list might not be returned since in the code that follows, convert_method
ensures a callback is called or we convert the images to RGB (possibly from RGBA/HSV/etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm seems like there's been a bit of back and forth about it
#6479
#6904
IMO it doesn't make much sense to pass a list of already loaded images into the load_video
just to run preprocessing. If the user is at a point where they already have this list of images, it should be up to them to preprocess on their own.
I think the conversion to RGB by default should also be removed. A loading function should just return the objects. If additional processing has to be done, it can be done via the convert_method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, okay yeah that makes sense. Will remove this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment on the load video function, but the rest looks good to me 👍🏽
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
thanks, will merge once CI is green! |
* initial work draft for freenoise; needs massive cleanup * fix freeinit bug * add animatediff controlnet implementation * revert attention changes * add freenoise * remove old helper functions * add decode batch size param to all pipelines * make style * fix copied from comments * make fix-copies * make style * copy animatediff controlnet implementation from #8972 * add experimental support for num_frames not perfectly fitting context length, ocntext stride * make unet motion model lora work again based on #8995 * copy load video utils from #8972 * copied from AnimateDiff::prepare_latents * address the case where last batch of frames does not match length of indices in prepare latents * decode_batch_size->vae_batch_size; batch vae encode support in animatediff vid2vid * revert sparsectrl and sdxl freenoise changes * revert pia * add freenoise tests * make fix-copies * improve docstrings * add freenoise tests to animatediff controlnet * update tests * Update src/diffusers/models/unets/unet_motion_model.py * add freenoise to animatediff pag * address review comments * make style * update tests * make fix-copies * fix error message * remove copied from comment * fix imports in tests * update --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* add animatediff controlnet to core * make style; remove unused method * fix copied from comment * add tests * changes to make tests work * add utility function to load videos * update docs * update pipeline example * make style * update docs with example * address review comments * add latest freeinit test from #8969 * LoraLoaderMixin -> StableDiffusionLoraLoaderMixin * fix docs * Update src/diffusers/utils/loading_utils.py Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * fix: variable out of scope --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* initial work draft for freenoise; needs massive cleanup * fix freeinit bug * add animatediff controlnet implementation * revert attention changes * add freenoise * remove old helper functions * add decode batch size param to all pipelines * make style * fix copied from comments * make fix-copies * make style * copy animatediff controlnet implementation from #8972 * add experimental support for num_frames not perfectly fitting context length, ocntext stride * make unet motion model lora work again based on #8995 * copy load video utils from #8972 * copied from AnimateDiff::prepare_latents * address the case where last batch of frames does not match length of indices in prepare latents * decode_batch_size->vae_batch_size; batch vae encode support in animatediff vid2vid * revert sparsectrl and sdxl freenoise changes * revert pia * add freenoise tests * make fix-copies * improve docstrings * add freenoise tests to animatediff controlnet * update tests * Update src/diffusers/models/unets/unet_motion_model.py * add freenoise to animatediff pag * address review comments * make style * update tests * make fix-copies * fix error message * remove copied from comment * fix imports in tests * update --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
What does this PR do?
Moves the community implementation of AnimateDiff ControlNet to core. Part of supporting long vid2vid generations here.
Code
Documentation PR requires merging: https://huggingface.co/datasets/huggingface/documentation-images/discussions/351
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@DN6 @yiyixuxu