
I am encountering a key error with the IP adapter when using the train_lcm_distill_lora_sdxl.py script to train an LCM LoRA for SDXL. After training, when I use it together with SDXL and the IP adapter, it throws an IP adapter key error. #6382

Closed
cjt222 opened this issue Dec 29, 2023 · 11 comments
Labels: bug Something isn't working

@cjt222

cjt222 commented Dec 29, 2023

Describe the bug

I am encountering a key error with the IP adapter when using the train_lcm_distill_lora_sdxl.py script to train an LCM LoRA for SDXL. After training, when I use it together with SDXL and the IP adapter, it throws an IP adapter key error.

Loading adapter weights from state_dict led to unexpected keys not found in the model: ['down_blocks.0.resnets.0.conv1.lora_A_1.default_0.weight', 'down_blocks.0.resnets.0.conv1.lora_B_1.default_0.weight', 'down_blocks.0.resnets.0.time_emb_proj.lora_A_1.default_0.weight', 'down_blocks.0.resnets.0.time_emb_proj.lora_B_1.default_0.weight', 'down_blocks.0.resnets.0.conv2.lora_A_1.default_0.weight', 'down_blocks.0.resnets.0.conv2.lora_B_1.default_0.weight', 'down_blocks.0.resnets.1.conv1.lora_A_1.default_0.weight', 'down_blocks.0.resnets.1.conv1.lora_B_1.default_0.weight', 'down_blocks.0.resnets.1.time_emb_proj.lora_A_1.default_0.weight', 'down_blocks.0.resnets.1.time_emb_proj.lora_B_1.default_0.weight', 'down_blocks.0.resnets.1.conv2.lora_A_1.default_0.weight', 'down_blocks.0.resnets.1.conv2.lora_B_1.default_0.weight', 'down_blocks.0.downsamplers.0.conv.lora_A_1.default_0.weight', 'down_blocks.0.downsamplers.0.conv.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.proj_in.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.proj_in.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.lora_A_1.default_0.weight', 
'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.proj_out.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.proj_out.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.proj_in.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.proj_in.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_k.lora_B_1.default_0.weight', 
'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.proj_out.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.proj_out.lora_B_1.default_0.weight', 'down_blocks.1.resnets.0.conv1.lora_A_1.default_0.weight', 'down_blocks.1.resnets.0.conv1.lora_B_1.default_0.weight', 'down_blocks.1.resnets.0.time_emb_proj.lora_A_1.default_0.weight', 'down_blocks.1.resnets.0.time_emb_proj.lora_B_1.default_0.weight', 'down_blocks.1.resnets.0.conv2.lora_A_1.default_0.weight', 'down_blocks.1.resnets.0.conv2.lora_B_1.default_0.weight', 'down_blocks.1.resnets.0.conv_shortcut.lora_A_1.default_0.weight', 'down_blocks.1.resnets.0.conv_shortcut.lora_B_1.default_0.weight', 'down_blocks.1.resnets.1.conv1.lora_A_1.default_0.weight', 'down_blocks.1.resnets.1.conv1.lora_B_1.default_0.weight', 'down_blocks.1.resnets.1.time_emb_proj.lora_A_1.default_0.weight', 'down_blocks.1.resnets.1.time_emb_proj.lora_B_1.default_0.weight', 'down_blocks.1.resnets.1.conv2.lora_A_1.default_0.weight', 'down_blocks.1.resnets.1.conv2.lora_B_1.default_0.weight', 'down_blocks.1.downsamplers.0.conv.lora_A_1.default_0.weight', 'down_blocks.1.downsamplers.0.conv.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.proj_in.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.proj_in.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_out.0.lora_B_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_out.0.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_v.lora_B_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_v.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn2.to_q.lora_A_1.default_0.weight'

Reproduction

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, StableDiffusionXLPipeline
import torch
import numpy as np
import cv2
from PIL import Image
from diffusers.utils import load_image
from transformers import pipeline, CLIPVisionModelWithProjection
from diffusers import LCMScheduler
from controlnet_aux import CannyDetector
from controlnet_aux import ContentShuffleDetector

lcm_lora_id = "/home/kas/kas_workspace/cjt/save_lcm_sdxl_models/checkpoint-34000"
pipe = StableDiffusionXLPipeline.from_pretrained(
    "/home/kas/kas_workspace/zijunhuang/MODEL_LIBS/models/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16)
# pipe = StableDiffusionXLPipeline.from_pretrained(
#     "/home/kas/style_transfer/ip_adapters/models/RealVisXL_V3.0", torch_dtype=torch.float16)

pipe.to("cuda")
pipe.safety_checker = None

pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter-plus_sdxl_vit-h.safetensors")
pipe.image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter", subfolder="models/image_encoder", torch_dtype=torch.float16).to("cuda")
pipe.set_ip_adapter_scale(0.6)
pipe.load_lora_weights(lcm_lora_id)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

ref_image = load_image("/home/kas/style_transfer/AesPA-Net/style_image/dog.jpg")
# ref_image_2 = load_image("/home/kas/style_transfer/AesPA-Net/style_image/23.jpg")

generator = torch.Generator(device="cpu").manual_seed(330)
images = pipe(
    prompt='Whimsical steampunk-inspired airship soaring through the skies amidst floating islands, best quality, high quality',
    ip_adapter_image=ref_image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
    num_inference_steps=4,
    generator=generator,
    guidance_scale=1,
    clip_skip=1,
).images
images[0].save("ip_adapters_xl.png")

Logs

No response

System Info

  • diffusers version: 0.25.0.dev0
  • Platform: Linux-5.4.0-48-generic-x86_64-with-glibc2.27
  • Python version: 3.10.13
  • PyTorch version (GPU?): 2.1.0+cu121 (True)
  • Huggingface_hub version: 0.20.1
  • Transformers version: 4.36.2
  • Accelerate version: 0.24.1
  • xFormers version: 0.0.22.post7
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@sayakpaul @yiyixuxu

@cjt222 cjt222 added the bug Something isn't working label Dec 29, 2023
@sayakpaul
Member

Can you load your pipeline with the LoRA first and then load the IP adapter?
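
For reference, a minimal sketch of that ordering, reusing the paths and calls from the reproduction above (whether this resolves the key error is not confirmed):

# Sketch of the suggested ordering: LoRA first, IP adapter second.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "/home/kas/kas_workspace/zijunhuang/MODEL_LIBS/models/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16).to("cuda")
# Load the LCM LoRA first ...
pipe.load_lora_weights(lcm_lora_id)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
# ... and only then load the IP adapter.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter-plus_sdxl_vit-h.safetensors")
pipe.set_ip_adapter_scale(0.6)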

@JunruiXiao

JunruiXiao commented Dec 29, 2023

It's a state-dict key error; you can try loading the LCM LoRA as follows:

import torch
from safetensors.torch import load_file

def get_module_kohya_state_dict(module, prefix: str, dtype: torch.dtype, adapter_name: str = "default"):
    # Convert a PEFT-style LoRA state dict into the Kohya naming scheme.
    kohya_ss_state_dict = {}
    for peft_key, weight in module.items():
        kohya_key = peft_key.replace("unet.base_model.model", prefix)
        kohya_key = kohya_key.replace("lora_A", "lora_down")
        kohya_key = kohya_key.replace("lora_B", "lora_up")
        kohya_key = kohya_key.replace(".", "_", kohya_key.count(".") - 2)
        kohya_ss_state_dict[kohya_key] = weight.to(dtype)
        # Set the alpha parameter for each LoRA down-projection
        if "lora_down" in kohya_key:
            alpha_key = f'{kohya_key.split(".")[0]}.alpha'
            kohya_ss_state_dict[alpha_key] = torch.tensor(8).to(dtype)

    return kohya_ss_state_dict

lcm_lora = get_module_kohya_state_dict(load_file(args.lcm_lora_path), "lora_unet", torch.float16)
sd_pipeline.load_lora_weights(lcm_lora, adapter_name="lcmlora")

@sayakpaul
Member

sayakpaul commented Dec 29, 2023

Seems like you're using an older version of the script here.

If you check https://github.com/huggingface/diffusers/blob/main/examples/consistency_distillation/train_lcm_distill_lora_sdxl.py, you'd notice there's no use of get_module_kohya_state_dict() there. I suggest you use that script.

Also, to be fair, there are still a couple of unknowns here:

  • How did you obtain the LCM LoRA checkpoint? What was the training command?
  • We don't have access to the LCM checkpoint you're referring to. We cannot be expected to run training to debug issues 😅

@cjt222
Author

cjt222 commented Dec 29, 2023

The training command is as below:
export MODEL_NAME="/home/kas/general_model/diffusers_workspace/text_to_image/cvitai/dreamshaper_xl1.0"
export OUTPUT_DIR="/home/kas/kas_workspace/cjt/save_lcm_sdxl_models"

DATA_DIR="/home/kas/kas_workspace/zijunhuang/other_data/coyo_part4"
VAE_NAME="/home/kas/general_model/diffusers_workspace/train_sd_xl/text2image/projects/llm_text_encoder/sdxl-vae-fp16-fix"
/home/kas/.conda/envs/control/bin/accelerate launch --mixed_precision="bf16" --multi_gpu --config_file="./distill.yaml" train_lcm_distill_lora_sdxl.py \
  --pretrained_teacher_model=$MODEL_NAME \
  --pretrained_vae_model_name_or_path=$VAE_NAME \
  --output_dir=$OUTPUT_DIR \
  --train_data_dir=$DATA_DIR \
  --mixed_precision=bf16 \
  --cache_dir="./cache_dir" \
  --resolution=1024 \
  --lora_rank=64 \
  --learning_rate=1e-4 --loss_type="huber" --adam_weight_decay=0.0 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1000000 \
  --max_train_samples=3000000 \
  --dataloader_num_workers=8 \
  --checkpointing_steps=1000 --checkpoints_total_limit=10 \
  --train_batch_size=24 \
  --gradient_checkpointing --enable_xformers_memory_efficient_attention \
  --gradient_accumulation_steps=1 \
  --use_8bit_adam \
  --resume_from_checkpoint="latest"

@sayakpaul
Member

Aren't you using peft and the latest version of the script? If you are, then Kohya-related utilities should not be used there. If you follow #5778 you will notice that it has fully worked-out, running examples. What am I missing?

@cjt222
Author

cjt222 commented Dec 29, 2023

I trained the model exactly according to the latest version of the script, following #5778. The pipeline can load the LoRA, but there is a key error when loading the IP adapter.

@cjt222
Author

cjt222 commented Dec 29, 2023

It's a state-dict key error; you can try loading the LCM LoRA as follows:

import torch
from safetensors.torch import load_file

def get_module_kohya_state_dict(module, prefix: str, dtype: torch.dtype, adapter_name: str = "default"):
    # Convert a PEFT-style LoRA state dict into the Kohya naming scheme.
    kohya_ss_state_dict = {}
    for peft_key, weight in module.items():
        kohya_key = peft_key.replace("unet.base_model.model", prefix)
        kohya_key = kohya_key.replace("lora_A", "lora_down")
        kohya_key = kohya_key.replace("lora_B", "lora_up")
        kohya_key = kohya_key.replace(".", "_", kohya_key.count(".") - 2)
        kohya_ss_state_dict[kohya_key] = weight.to(dtype)
        # Set the alpha parameter for each LoRA down-projection
        if "lora_down" in kohya_key:
            alpha_key = f'{kohya_key.split(".")[0]}.alpha'
            kohya_ss_state_dict[alpha_key] = torch.tensor(8).to(dtype)

    return kohya_ss_state_dict

lcm_lora = get_module_kohya_state_dict(load_file(args.lcm_lora_path), "lora_unet", torch.float16)
sd_pipeline.load_lora_weights(lcm_lora, adapter_name="lcmlora")

It still causes an error:

File "/home/kas/style_transfer/ip_adapters/src_repo/ip_adapter_sdxl_lcm.py", line 52, in
pipe.load_lora_weights(lcm_lora_id)
File "/home/kas/style_transfer/ip_adapters/diffusers/src/diffusers/loaders/lora.py", line 1402, in load_lora_weights
raise ValueError("Invalid LoRA checkpoint.")
ValueError: Invalid LoRA checkpoint.

@sayakpaul
Member

The pipeline can load the LoRA, but there is a key error when loading the IP adapter.

You just said the LoRA works fine, but your error message above suggests otherwise. It's getting confusing, I must say.

I would like to ask you to provide a minimal, fully reproducible piece of code with no unknown variables. Otherwise, we won't be able to look into it.

@cjt222
Author

cjt222 commented Dec 29, 2023

I apologize for not describing it clearly. It seems to be an environment issue, but I'm not sure which package version is causing it. When I switched to a different environment, I was able to load everything successfully.
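
For anyone hitting the same thing, a quick way to compare the two environments is to print the versions of the packages involved (a minimal sketch, assuming peft is installed in both environments):

# Print the versions of the packages involved in LoRA/IP-adapter loading,
# so the working and failing environments can be diffed.
import torch
import diffusers
import peft
import transformers

for mod in (diffusers, peft, transformers, torch):
    print(mod.__name__, mod.__version__)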

@sayakpaul
Member

No problem. Glad that we sorted that one out!

@sayakpaul
Member

Hello, can we close this issue?
