[Core] Add Kolors #8812
Conversation
Looks good to me overall!
I wonder if we should call it pipeline_stable_diffusion_xl_kolors? It is sort of a "variant"; let's ask the authors for their preference!
I'm ignoring the failed docs test for now until I finish with all the other stuff. Kind of curious why the docs are the only ones that fail with a circular import.
You guys should make it loadable under DiffusionPipeline, just like Playground.
@s9anus98a We decided not to do that because this pipeline doesn't share the same defaults, and nothing from the SDXL ecosystem works with this model right now: IP-Adapters, LoRAs, and textual inversions (TIs).
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Looks good to me. Just some minor fixes related to docstrings. Could we also add fast tests, please?
Looks absolutely amazing! Just a few missing copied-froms
It seems it's not picking up the selected variant correctly?

```python
import torch
import diffusers
from rich.traceback import install
install()

pipe = diffusers.KolorsPipeline.from_pretrained(
    'Kwai-Kolors/Kolors',
    variant='fp16',
    torch_dtype=torch.float16,
    cache_dir='/mnt/models/Diffusers',
)
```
There are both fp32 and fp16 variants available in the repo. This is the local snapshot state - no unet downloaded at all.
@vladmandic It was the decision of the model authors to have a different repo for the diffusers-compatible model, so the repo is Kwai-Kolors/Kolors-diffusers.
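For reference, a corrected version of the snippet above pointing at the diffusers-compatible repo (a minimal sketch; cache_dir and similar options from the original snippet are omitted):

```python
import torch
import diffusers

# The diffusers-compatible weights live in Kwai-Kolors/Kolors-diffusers,
# not in the original Kwai-Kolors/Kolors repo.
pipe = diffusers.KolorsPipeline.from_pretrained(
    'Kwai-Kolors/Kolors-diffusers',
    variant='fp16',
    torch_dtype=torch.float16,
)
```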
Thanks for pointing it out, it works.
(Yes, both can be monkey-patched in the app; it would just be nice to have an out-of-the-box solution.)
https://huggingface.co/Kwai-Kolors/Kolors-diffusers/tree/main
Currently, only fp16 models are available on the Hugging Face Hub. Should fp32 models also be provided, to allow control via the variant parameter in from_pretrained?
@vladmandic Edit: the VAE won't be changed, since a performance degradation was detected; users will need to replace it themselves.
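For users who do want to swap the VAE themselves, a minimal sketch, assuming the commonly used fp16-fixed SDXL VAE is compatible (this is not something the PR ships or endorses):

```python
import torch
from diffusers import AutoencoderKL, KolorsPipeline

pipe = KolorsPipeline.from_pretrained(
    'Kwai-Kolors/Kolors-diffusers',
    variant='fp16',
    torch_dtype=torch.float16,
)
# Swap in an fp16-safe SDXL VAE; check output quality yourself,
# since a degradation was reported in this thread.
pipe.vae = AutoencoderKL.from_pretrained(
    'madebyollin/sdxl-vae-fp16-fix',
    torch_dtype=torch.float16,
)
```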
We can add them, but AFAIK the model was trained in fp16, so you'd just be wasting VRAM with this. @JincanDeng can you confirm?
Thank you! Looks veryyyyy good. Left a couple of comments.
```python
original_size: Optional[Tuple[int, int]] = None,
crops_coords_top_left: Tuple[int, int] = (0, 0),
target_size: Optional[Tuple[int, int]] = None,
```
Not a blocker for merging, but did you have some time to play around with these params? They are fun.
Now that you mention it, I've never played with these, not even with base SDXL. I'll do some experiments, but for Kolors they change the generation without seeming to do what they should. Maybe because of the training?
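For anyone wanting to try them, a sketch of passing these micro-conditioning inputs at call time, assuming they behave like SDXL's (prompt and values are illustrative):

```python
import torch
from diffusers import KolorsPipeline

pipe = KolorsPipeline.from_pretrained(
    'Kwai-Kolors/Kolors-diffusers',
    variant='fp16',
    torch_dtype=torch.float16,
).to('cuda')

# In SDXL, claiming a small, cropped "original" image nudges the model
# toward low-res, cropped-looking outputs; whether Kolors learned the
# same conditioning depends on how it was trained.
image = pipe(
    'a cat sitting on a windowsill',
    original_size=(512, 512),
    crops_coords_top_left=(128, 128),
    target_size=(1024, 1024),
).images[0]
```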
Could very well be. @JincanDeng do you have any suggestions here?
Yes, the model was trained in fp16.
Can you share that info? An fp16-fixed VAE should be better than having to run the VAE upcast to fp32, no?
That was the answer I got; I don't have more info. I'm also curious about it, so I intend to run some tests when I have time. The human eye probably can't see the difference between them, but to researchers (understandably) the numbers matter more than the VRAM savings or the absence of a visual difference.
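A rough sketch of the kind of numeric test meant above: decode the same latents with fp32 and fp16 copies of the VAE and compare (hypothetical; it assumes fp32 VAE weights are available in the repo):

```python
import torch
from diffusers import AutoencoderKL

# Load the same VAE in both precisions (assumes fp32 weights exist).
vae32 = AutoencoderKL.from_pretrained(
    'Kwai-Kolors/Kolors-diffusers', subfolder='vae'
).to('cuda')
vae16 = AutoencoderKL.from_pretrained(
    'Kwai-Kolors/Kolors-diffusers', subfolder='vae',
    variant='fp16', torch_dtype=torch.float16,
).to('cuda')

latents = torch.randn(1, 4, 128, 128, device='cuda')
with torch.no_grad():
    out32 = vae32.decode(latents).sample
    out16 = vae16.decode(latents.half()).sample.float()

# Raw numeric gap; whether it is visible to the eye is a separate question.
print('max abs diff:', (out32 - out16).abs().max().item())
```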
What does this PR do?
Adds Kolors from the Kwai-Kolors team
- Text to Image
- Image to Image
Fixes #8801
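A hedged usage sketch covering both modes (KolorsPipeline is added by this PR; the img2img class name, the from_pipe reuse, and the input URL are assumptions based on diffusers conventions):

```python
import torch
from diffusers import KolorsPipeline, KolorsImg2ImgPipeline
from diffusers.utils import load_image

# Text to Image
pipe = KolorsPipeline.from_pretrained(
    'Kwai-Kolors/Kolors-diffusers',
    variant='fp16',
    torch_dtype=torch.float16,
).to('cuda')
image = pipe('a portrait of a red panda, studio lighting').images[0]

# Image to Image, reusing the already loaded components
img2img = KolorsImg2ImgPipeline.from_pipe(pipe)
init = load_image('https://example.com/input.png')  # placeholder URL
edited = img2img('make it snowy', image=init, strength=0.6).images[0]
```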
TODO
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@yiyixuxu @sayakpaul @JincanDeng