-
Notifications
You must be signed in to change notification settings - Fork 27k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Implement zero terminal SNR noise schedule option #14145
Conversation
From the code it looks like the feature only works for half precision: alphas_cumprod_original is only added for half, and nothing is done if alphas_cumprod_original does not exist. This looks like a good change overall but since it alters generation by default, I don't want to be hasty with merging it. If anyone wants to discuss his, please do here. Some images for comparison can also help. |
I will fix its implementation on the other code paths. Here are a few grids for comparing the zero SNR noise schedule, generated using ZeroDiffusion 0.9. The first two are on Euler A with the default noise schedule. Note how they are overexposed and that there is a black border on some samples on the first grid. Also note that these images were generated without any CFG rescale or dynamic thresholding applied. With the zero SNR noise schedule, these modes of failure are less likely to occur. You can also note that there's no longer a glowing outline around the subject in the second grid. This is one sample of the compatibility mode: This should be compared with the second grid posted above. The changes are extremely minor, I only notice a few patches of the first image and one patch on the third image having any differences, the others appear to be identical. |
With further testing I am getting this error: Give me some time to investigate this before merging since it seems obvious that something is being casted inappropriately (line numbers will be wrong because of my debug prints) edit: It appears that this is not a casting issue, it is an issue with torchsde's brownian interval rounding the start and stop values. I don't know if this has a significant impact, but rounding the sigma schedule to 6 decimal places would match it. It might make more sense for this to be fixed upstream in the k-diffusion repo. |
Current implementation also seems to break Heun, but not Restart for some odd reason. Need to investigate. |
I have identified the root cause for some samplers breaking. It appears that Heun, DDIM, and UniPC use other values that can be derived from |
This is the only place these values are ever referenced outside of training code so this change is very justifiable and more consistent.
I went forward with re-deriving the values in the one place that used those values and now the DDIM sampler works more or less as expected. I have further investigated the problems with Heun and UniPC and have concluded that both of these issues are present in the original ComfyUI implementation of zero terminal SNR sampling that I based this on. Model outputs for Heun start out in a ridiculous range and never seem to reach the expected range of -1 to 1, probably partly because it starts at sigma ~4500 and then goes to a sigma around 60. Under Restart, model outputs still start out in a rather high range but end up leveling out since the second sigma is around 3300. Outputs look normal under Heun if I use CFG rescale with fi=1.0, so I think the issue is lack of said rescale, and I would consider that out of scope of this pull request. As far as I am concerned, this pull request is complete and ready. There's a number of other things that could be altered in order to make zero terminal SNR better supported (I would want to look at the discrepancy in Restart's sigma schedule for instance now that I know how to get reasonable outputs from Heun), but those can be investigated separately later. |
I’ve been seeing much closer prompt adherence especially with regard to fine details after applying this patch. I haven’t used the new noise scheduler yet so is this caused by the alphas_cumprod fix? |
The changes shouldn't really be substantial enough to have a noticeable impact on prompt adherence. If it does then that may indicate something is wrong. Could you provide a side-by-side comparison with and without the patch on the same seed and settings to demonstrate? |
I decided to add the system I was considering previously, and that was recently suggested to me in discord, for automatically enabling backwards compatibility options; it does not completely solve the problem, but alleviates it a bit. And with that the PR is accepted. One thing I'd like to ask you to do is to make another PR where you remove unneeded |
I updated today to the dev version and in fine details / photorealistic the improvement by your PR is seeable / great. only difference between the generation was: (hint for the api / override_setting param user) Img 1: 'use_downcasted_alpha_bar' => true Img 2: 'use_downcasted_alpha_bar' => false / (or no param) (the "new" calculation) Thanks for this PR. |
Description
Draft progress
but I don't know how to get them to apply setting overrides from the PNG info tab. This is one of the reasons why I am making this a draft for the moment.fixed thanks to catboxanonChecklist: