
[Bug]: 1.7x DEV v1.7.0-329-g85bf2eb4 alphas_cumprod Downcast setting got lost and stuck on some models #14610

Open
1 of 6 tasks
ibrainventures opened this issue Jan 10, 2024 · 4 comments
Labels
bug Report of a confirmed bug

Comments

@ibrainventures
Contributor

ibrainventures commented Jan 10, 2024

Checklist

  • The issue exists after disabling all extensions
  • The issue exists on a clean installation of webui
  • The issue is caused by an extension, but I believe it is caused by a bug in the webui
  • The issue exists in the current version of the webui
  • The issue has not been reported before recently
  • The issue has been reported before but has not been fixed yet

What happened?

An image generated without downcast gets downcast after one run of the XYZ Plot (checkpoint change).

Steps to reproduce the problem

  1. Generate an image without downcast (default setting in the dev version)
  2. Save it
  3. Reuse it with the same subseed number
  4. Make an XYZ plot with this image and 2 models (including the origin model)
  5. Run the plot (it renders correctly)
  6. Run the generation without a script
  7. Now the image is generated WITH downcast
  8. Stop the webui service process / start the webui service process (not a UI reload)
  9. Generate the image -- now it is WITHOUT downcast

What should have happened?

The "without downcast" setting should also be respected after running an XYZ script.

What browsers do you use to access the UI ?

Mozilla Firefox

Sysinfo

sysinfo-2024-01-10-17-17.json

Console logs

Total progress: 100%|███████████████████████████████████████████████████████████████████████████████████████████████|

Additional information

No response

@ibrainventures ibrainventures added the bug-report Report of a bug, yet to be confirmed label Jan 10, 2024
@ibrainventures ibrainventures changed the title [Bug]: 1.7x DEV v1.7.0-329-g85bf2eb4 alphas_cumprod Downcast setting got lost after XYZ Plot [Bug]: 1.7x DEV v1.7.0-329-g85bf2eb4 alphas_cumprod Downcast setting got lost after a some checkpoints Jan 10, 2024
@ibrainventures ibrainventures changed the title [Bug]: 1.7x DEV v1.7.0-329-g85bf2eb4 alphas_cumprod Downcast setting got lost after a some checkpoints [Bug]: 1.7x DEV v1.7.0-329-g85bf2eb4 alphas_cumprod Downcast setting got lost and stuck on some models Jan 10, 2024
@ibrainventures
Contributor Author

After some tests, this issue only happens when starting with or switching to about 30% of my checkpoints:

some examples of problematic checkpoints (SD 1.5):

analogmadness_v70
juggernaut_reborn
cyberrealistic_v41

Starting with or switching to those models breaks the downcast function.

Starting with an "unproblematic" checkpoint (e.g. epiccartoon_v1), everything works fine (with or without downcast).
Switching to or starting with one of the "problematic" checkpoints breaks the downcast option. After that, the system is stuck in the "use_downcasted_alpha_bar true" mode. Only a restart of the webui process WITH an unproblematic checkpoint resolves the stuck state.

@ibrainventures
Contributor Author

ibrainventures commented Jan 10, 2024

Addendum:
It also gets stuck / frozen on the unproblematic checkpoints if I add the

"Downcast model alphas_cumprod to fp16 before sampling. For reproducing old seeds."

checkbox to the frontend and run an XYZ script with 2 models (checkpoint change) and the option set to true.

@Cyberbeing
Contributor

Cyberbeing commented Jan 11, 2024

I can confirm this issue with various other models as well.
I just ran into it when using Refiner with a couple of SD1.5-based models, no scripts.

I did some testing, and it seems this issue is caused by models without alphas_cumprod inheriting alphas_cumprod from the next model loaded, after #14145. Since some FP16-converted ckpt/safetensors models have alphas_cumprod in FP16 as well, this triggers the bug.
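The inheritance described above can be illustrated with a small, self-contained sketch (all names here are hypothetical and only simulate the load path with numpy arrays; the real webui keeps alphas_cumprod as a torch buffer on the model):

```python
import numpy as np

def load_checkpoint(state_dict, current_alphas=None):
    """Hypothetical sketch of the buggy load path: if the checkpoint
    has no alphas_cumprod key, the buffer left over from the previously
    loaded model is reused as-is, dtype and all."""
    if "alphas_cumprod" in state_dict:
        return state_dict["alphas_cumprod"]
    # Bug: inherit whatever the last model left behind (possibly fp16)
    return current_alphas

# Model A ships an FP16 alphas_cumprod (common in fp16-converted checkpoints)
model_a = {"alphas_cumprod": np.linspace(1.0, 0.01, 1000).astype(np.float16)}
# Model B ships no alphas_cumprod at all
model_b = {}

alphas = load_checkpoint(model_a)
print(alphas.dtype)  # float16

# Loading B after A: B silently inherits A's fp16 buffer,
# even though the downcast option is disabled
alphas = load_checkpoint(model_b, current_alphas=alphas)
print(alphas.dtype)  # float16
```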

I can reproduce the bug as follows:

  1. Disable Downcast model alphas_cumprod to fp16
  2. Load a model which does not contain alphas_cumprod in the ckpt/safetensors
  3. Exit Webui, Start Webui
  4. Do a generation (save the result for comparisons)
  5. Enable Downcast model alphas_cumprod to fp16
  6. Do a generation (save the result for comparisons)
  7. Disable Downcast model alphas_cumprod to fp16
  8. Do a generation (confirm the result matches 4.)
  9. Enable refiner with a model which contains alphas_cumprod with FP16 dtype
  10. Do a generation with refiner enabled
  11. Disable refiner
  12. Do a generation (the result now looks like 6. when it should look like 4., since the model has inherited FP16 alphas_cumprod from the refiner model)
  13. Enable refiner with a model which contains alphas_cumprod with FP32 dtype
  14. Do a generation with refiner enabled
  15. Disable refiner
  16. Do a generation (the result now looks like 4. since the model has inherited FP32 alphas_cumprod from the refiner model)

Another unrelated observation is that models containing an FP16 alphas_cumprod don't actually end up with an FP32-precision alphas_cumprod when Downcast model alphas_cumprod to fp16 is disabled (when enabled, only tiny details change, if any).

On the other hand, models containing an FP32 alphas_cumprod can show rather significant changes in output between Downcast model alphas_cumprod to fp16 enabled and disabled (i.e. I've seen major characteristics of people and objects change completely while maintaining nearly identical composition). It makes me realize I may need to go back and re-convert all my FP32 models to FP16 while keeping the true FP32-precision alphas_cumprod, to make the Downcast model alphas_cumprod to fp16 switch useful. Though if this bug is fixed, I could instead just delete the FP16 alphas_cumprod from the models. I can only assume LDM generates an FP32 alphas_cumprod automatically if it is missing from the model on load.
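For reference, regenerating alphas_cumprod from the noise schedule is straightforward and naturally produces full precision. A sketch, assuming the standard "scaled_linear" DDPM schedule that SD 1.5 configs use (beta_start=0.00085, beta_end=0.012, 1000 steps); it also shows why the fp16 downcast is lossy:

```python
import numpy as np

# Standard SD 1.5 noise-schedule parameters (from the LDM configs)
T, beta_start, beta_end = 1000, 0.00085, 0.012

# "scaled_linear" schedule: betas are linear in sqrt-space
betas = np.linspace(beta_start ** 0.5, beta_end ** 0.5, T, dtype=np.float64) ** 2
alphas_cumprod = np.cumprod(1.0 - betas).astype(np.float32)

print(alphas_cumprod.dtype)  # float32

# Round-tripping through fp16 (what the downcast option effectively does)
# does not recover the original fp32 values exactly:
roundtrip = alphas_cumprod.astype(np.float16).astype(np.float32)
max_err = float(np.max(np.abs(alphas_cumprod - roundtrip)))
print(max_err > 0)  # True
```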

@catboxanon catboxanon added bug Report of a confirmed bug and removed bug-report Report of a bug, yet to be confirmed labels Jan 12, 2024
@Cyberbeing
Contributor

Cyberbeing commented Jan 13, 2024

I did a bit more testing today, and discovered another way to trigger this issue.

  1. Disable Downcast model alphas_cumprod to fp16
  2. Load a model which does not contain alphas_cumprod in the ckpt/safetensors
  3. Exit Webui, Start Webui
  4. Do a generation (save the result for comparisons)
  5. Enable Downcast model alphas_cumprod to fp16
  6. Do a generation (save the result for comparisons)
  7. Disable Downcast model alphas_cumprod to fp16
  8. Switch to a model which does not contain alphas_cumprod with FP16 dtype
  9. Do a generation (ignore results, though at this point this model is also stuck with fp16 alphas_cumprod)
  10. Switch back to the original model
  11. Do a generation (the results will now match 6., when they should look like 4.)

Edit: It seems this switching issue occurs even with models containing an fp32 alphas_cumprod.

What this tells me is that, in this case, the dtype of alphas_cumprod is being inherited from the alphas_cumprod dtype of your last generation prior to switching models, rather than from the model itself. Expected behavior would be for alphas_cumprod to return to FP32 when Downcast model alphas_cumprod was disabled in step 7, even if you didn't perform a generation prior to switching models. Notably, the issue doesn't occur in this case if I do a generation between steps 7 and 8. It only occurs when the Downcast model alphas_cumprod to fp16 state has been changed but the effect of the change has not been triggered prior to the model switch.

It would appear this issue is caused in part by reuse_model_from_already_loaded() never calling load_model() when shared.opts.sd_checkpoints_limit is exceeded. So if you are unable to reproduce the bug, try setting shared.opts.sd_checkpoints_limit = 1. Without load_model() being called, alphas_cumprod can get stuck as float16 on model switch under certain conditions:

elif len(model_data.loaded_sd_models) > 0:
    sd_model = model_data.loaded_sd_models.pop()
    model_data.sd_model = sd_model

    sd_vae.base_vae = getattr(sd_model, "base_vae", None)
    sd_vae.loaded_vae_file = getattr(sd_model, "loaded_vae_file", None)
    sd_vae.checkpoint_info = sd_model.sd_checkpoint_info

    print(f"Reusing loaded model {sd_model.sd_checkpoint_info.title} to load {checkpoint_info.title}")
    return sd_model
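One possible direction for a fix, sketched below with entirely hypothetical names (this is not the actual webui patch): always derive the sampling buffer from a pristine FP32 copy and re-apply the current use_downcasted_alpha_bar setting whenever a model is reused, so a stale fp16 buffer can never survive an option toggle or a model switch.

```python
import numpy as np

def apply_alphas_cumprod_setting(alphas_fp32, use_downcast):
    """Hypothetical fix sketch: derive the buffer that sampling sees
    from an untouched fp32 master copy on every (re)load, instead of
    trusting whatever dtype the previous generation left behind."""
    target = np.float16 if use_downcast else np.float32
    return alphas_fp32.astype(target)

# fp32 source of truth, kept alongside the model
master = np.linspace(1.0, 0.01, 1000).astype(np.float32)

buf = apply_alphas_cumprod_setting(master, use_downcast=True)
print(buf.dtype)  # float16

# Reuse path: re-apply the current option rather than reusing the old buffer
buf = apply_alphas_cumprod_setting(master, use_downcast=False)
print(buf.dtype)  # float32
```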
