Support for Kolors #8801
Hi, thanks for your work, it's a nice model. The weights seem to have been saved with errors: the `diffusion_pytorch_model.safetensors` that should be float32 appears to actually be the float16 one, and the float16 variant throws an error. I can open a PR to fix it if you want. Once that is fixed, we can load the model like this:

```python
import torch
# imports assumed to come from the Kwai-Kolors/Kolors repo
from kolors.models.modeling_chatglm import ChatGLMModel
from kolors.models.tokenization_chatglm import ChatGLMTokenizer
from kolors.pipelines.pipeline_stable_diffusion_xl_chatglm_256 import StableDiffusionXLPipeline

text_encoder = ChatGLMModel.from_pretrained(
    "Kwai-Kolors/Kolors", subfolder="text_encoder", torch_dtype=torch.float16
)
tokenizer = ChatGLMTokenizer.from_pretrained("Kwai-Kolors/Kolors", subfolder="text_encoder")
pipe = StableDiffusionXLPipeline.from_pretrained(
    "Kwai-Kolors/Kolors",
    tokenizer=tokenizer,
    text_encoder=text_encoder,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
```

So basically, for this model to work with diffusers without additional dependencies, we'll just need the ChatGLM text encoder to be usable without the Kolors repo code.

cc: @yiyixuxu @sayakpaul
@asomoza We actually don't need to integrate the ChatGLM code directly into transformers. Instead, we can simply utilize the existing infrastructure, similar to the following code snippet:
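A minimal sketch of what such a snippet might look like, using transformers' `trust_remote_code` loading (the `THUDM/chatglm3-6b` model ID is an assumption, borrowed from a later comment in this thread):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load ChatGLM through transformers' remote-code mechanism instead of
# vendoring the modeling files into the transformers library itself.
text_encoder = AutoModel.from_pretrained(
    "THUDM/chatglm3-6b", torch_dtype=torch.float16, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
```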
I keep getting out-of-memory crashes (on free Colab, with under 12 GB of VRAM), even with 4-bit quantization.
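Not from the original comment, but a common way to lower peak VRAM with diffusers pipelines is CPU offloading; a sketch, assuming `pipe` was built as in the snippets above and that accelerate is installed:

```python
# Call this instead of pipe.to("cuda"): only the active submodule stays on
# the GPU, idle ones are moved to CPU. Slower per step, but peak VRAM drops.
pipe.enable_model_cpu_offload()

image = pipe("a photo of an astronaut riding a horse", num_inference_steps=25).images[0]
```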
I think we should support this model, to welcome more models that are inherently multilingual. What do we need to get it in?
Thank you for your suggestion. We've fixed the model's fp16 and fp32 weights on Hugging Face. However, the pipeline still throws an error when loading directly via `from_pretrained`. My running code:

```python
import torch
from kolors.pipelines.pipeline_stable_diffusion_xl_chatglm_256 import StableDiffusionXLPipeline
from kolors.models.tokenization_chatglm import ChatGLMTokenizer
from kolors.models.modeling_chatglm import ChatGLMModel

# ckpt_dir points to the local Kolors checkpoint directory
text_encoder = ChatGLMModel.from_pretrained(ckpt_dir, subfolder="text_encoder", torch_dtype=torch.float16)
tokenizer = ChatGLMTokenizer.from_pretrained(ckpt_dir, subfolder="text_encoder")
pipe = StableDiffusionXLPipeline.from_pretrained(
    ckpt_dir,
    tokenizer=tokenizer,
    text_encoder=text_encoder,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
```

The error is:
Tried using the standard implementation:

```python
import torch
import transformers
import diffusers

text_encoder = transformers.AutoModel.from_pretrained(
    "THUDM/chatglm3-6b", torch_dtype=torch.float16, trust_remote_code=True
)
tokenizer = transformers.AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
pipe = diffusers.StableDiffusionXLPipeline.from_pretrained(
    "Kwai-Kolors/Kolors", tokenizer=tokenizer, text_encoder=text_encoder
)
```

This loads.
The Kolors pipeline is similar-but-different to the SDXL pipeline:

```python
from kolors.models.modeling_chatglm import ChatGLMModel
from kolors.models.tokenization_chatglm import ChatGLMTokenizer
from kolors.pipelines.pipeline_stable_diffusion_xl_chatglm_256 import StableDiffusionXLPipeline
```

but you should not redefine a well-known class name like `StableDiffusionXLPipeline`.
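One way to avoid that shadowing, assuming the `kolors` package from the Kwai-Kolors/Kolors repo is on the path, is to alias the vendored class at import time:

```python
# Alias the Kolors variant so diffusers' StableDiffusionXLPipeline keeps its name.
from kolors.pipelines.pipeline_stable_diffusion_xl_chatglm_256 import (
    StableDiffusionXLPipeline as KolorsStableDiffusionXLPipeline,
)
from diffusers import StableDiffusionXLPipeline  # the well-known class, unshadowed
```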
Model/Pipeline/Scheduler description
Yesterday, Kwai-Kolors published their new model, Kolors, which uses a UNet backbone and ChatGLM3 as the text encoder.
Kolors is a large-scale text-to-image generation model based on latent diffusion, developed by the Kuaishou Kolors team. Trained on billions of text-image pairs, Kolors exhibits significant advantages over both open-source and closed-source models in visual quality, complex semantic accuracy, and text rendering for both Chinese and English characters. Furthermore, Kolors supports both Chinese and English inputs, demonstrating strong performance in understanding and generating Chinese-specific content.
Open source status
Provide useful links for the implementation
Implementation: https://github.com/Kwai-Kolors/Kolors
Weights: https://huggingface.co/Kwai-Kolors/Kolors