[Community] Move the number "0.18215" from the image2image process to VAE config #726

Closed
wangyu-ustc opened this issue Oct 4, 2022 · 12 comments
Labels: good first issue, hacktoberfest, stale

Comments

@wangyu-ustc

There is a magic number "0.18215" in the repository

In the file src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py, the number "0.18215" appears on lines 220 and 342, which is strange since it does not seem to occur in the original repo. Could someone clarify why it is there and where this number comes from?

@WASasquatch

WASasquatch commented Oct 5, 2022

It's a constant used to scale the latents so they can be decoded back into an image (src)

# scale and decode the image latents with vae
latents = 1 / 0.18215 * latents
image = vae.decode(latents).sample
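
To illustrate the round trip described above, here is a minimal sketch in plain Python. The function names `scale_latents` and `unscale_latents` are hypothetical, and flat lists of floats stand in for the latent tensors; it only shows the arithmetic, not the actual pipeline code.

```python
SCALING_FACTOR = 0.18215  # the constant discussed in this issue


def scale_latents(latents):
    """Multiply raw latents by the constant (hypothetical helper)."""
    return [x * SCALING_FACTOR for x in latents]


def unscale_latents(latents):
    """Undo the scaling before decoding, i.e. 1 / 0.18215 * latents."""
    return [x / SCALING_FACTOR for x in latents]


raw = [5.0, -2.5, 0.0]          # made-up values standing in for latents
scaled = scale_latents(raw)     # what the diffusion process would see
restored = unscale_latents(scaled)  # what would be passed to the decoder
```

Dividing by the same constant that was multiplied in recovers the original values up to floating-point error, which is why the snippet above uses `1 / 0.18215` right before `vae.decode`.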

@guaneec

guaneec commented Oct 5, 2022

I think the constant is defined in the model config file from CompVis/stable-diffusion.

@pcuenca
Member

pcuenca commented Oct 5, 2022

There's more explanation about it in #437.

@patrickvonplaten
Contributor

Maybe we should put it directly in the VAE config then? cc @patil-suraj
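
A rough sketch of what moving the number into a config could look like. `VAEConfig`, its `scaling_factor` field, and `prepare_for_decode` are hypothetical names used for illustration; they are not taken from this thread or claimed to be the library's API.

```python
from dataclasses import dataclass


@dataclass
class VAEConfig:
    """Hypothetical stand-in for a VAE config object."""
    scaling_factor: float = 0.18215  # default matches the current literal


def prepare_for_decode(latents, config):
    """Divide by the configured factor instead of a hard-coded literal."""
    return [x / config.scaling_factor for x in latents]


default_cfg = VAEConfig()
custom_cfg = VAEConfig(scaling_factor=0.5)  # e.g. a differently trained VAE

out_default = prepare_for_decode([0.18215], default_cfg)
out_custom = prepare_for_decode([1.0], custom_cfg)
```

With this shape, pipeline code never mentions the literal, and a VAE trained with a different latent scale only needs a different config value.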

@neverix
Contributor

neverix commented Oct 5, 2022

Maybe this can be a method for a VAE that is overridable? For supporting more complex squashing functions 😉
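
One way to read this suggestion, sketched under assumptions: the scaling lives in a pair of methods that subclasses can override. `BaseVAE`, `TanhVAE`, `squash`, and `unsquash` are hypothetical names, and tanh is only an example of a "more complex squashing function", not something proposed in the thread.

```python
import math


class BaseVAE:
    """Hypothetical base class: plain linear scaling by a constant."""
    scaling_factor = 0.18215

    def squash(self, latents):
        return [x * self.scaling_factor for x in latents]

    def unsquash(self, latents):
        return [x / self.scaling_factor for x in latents]


class TanhVAE(BaseVAE):
    """Hypothetical subclass overriding the squashing with tanh."""

    def squash(self, latents):
        return [math.tanh(x * self.scaling_factor) for x in latents]

    def unsquash(self, latents):
        # atanh is only defined on (-1, 1); very large inputs to squash()
        # can round to exactly 1.0 and would raise ValueError here.
        return [math.atanh(y) / self.scaling_factor for y in latents]


linear_roundtrip = BaseVAE().unsquash(BaseVAE().squash([2.0, -3.0]))
tanh_roundtrip = TanhVAE().unsquash(TanhVAE().squash([2.0, -3.0]))
```

Callers would use `squash` and `unsquash` without knowing which variant is in play.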

@github-actions github-actions bot added the stale Issues that haven't received updates label Nov 4, 2022
@huggingface huggingface deleted a comment from github-actions bot Nov 7, 2022
@patrickvonplaten
Contributor

I think we can make this an overridable config parameter typed as a Union[int, str], with the string describing a more complex squashing function that can be implemented down the road.

Marking this as a community feature for now, since it seems no one has found the time to open a PR here. @neverix, in case you're interested, we'd be more than happy to review a PR :-)
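
A minimal sketch of such a parameter, with assumptions stated up front: the type here is `Union[float, str]` rather than `Union[int, str]` because 0.18215 is fractional, and the `SQUASHERS` registry, the `"tanh"` entry, and `apply_scaling` are hypothetical illustrations rather than anything agreed in this thread.

```python
import math
from typing import Callable, Dict, List, Union

# Hypothetical registry mapping config strings to squashing functions.
SQUASHERS: Dict[str, Callable[[float], float]] = {"tanh": math.tanh}


def apply_scaling(latents: List[float], scaling: Union[float, str]) -> List[float]:
    """Apply a numeric factor, or a named squashing function, to latents."""
    if isinstance(scaling, str):
        try:
            fn = SQUASHERS[scaling]
        except KeyError:
            raise ValueError(f"unknown squashing function: {scaling!r}") from None
        return [fn(x) for x in latents]
    # Numeric case: plain multiplication, as with 0.18215 today.
    return [x * scaling for x in latents]


by_number = apply_scaling([1.0, 2.0], 0.18215)
by_name = apply_scaling([0.0], "tanh")
```

Existing configs with a number keep working unchanged, while a string opts in to a registered function.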

@patrickvonplaten changed the title from "Why is there a number "0.18215" in the image2image process" to "[Community] Move the number "0.18215" from the image2image process to VAE config" Nov 7, 2022
@patrickvonplaten
Contributor

Should be solved by: #1460

@williamberman could you maybe tackle this?

@williamberman
Contributor

williamberman commented Dec 1, 2022

Put up a draft PR here: #1515. I still need to think about a few things before finishing.

@fepegar

fepegar commented Dec 19, 2022

For reference, here's some code to estimate the magic value: #437 (comment).
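
The linked code is not reproduced here. As a rough sketch of one way such a value could be estimated, assuming (this is not stated in the thread) that the factor is chosen so the scaled latents have roughly unit standard deviation, one can take the reciprocal of the latents' standard deviation. The function name and the sample values are made up for illustration; they are not real VAE latents.

```python
import statistics


def estimate_scaling_factor(latents):
    """Return 1 / std of the given latent values (hypothetical helper)."""
    std = statistics.pstdev(latents)
    if std == 0:
        raise ValueError("latents have zero variance")
    return 1.0 / std


sample = [-6.0, -2.0, 2.0, 6.0]   # toy values, not real encoder outputs
factor = estimate_scaling_factor(sample)
scaled = [x * factor for x in sample]  # now has standard deviation ~1
```

In practice the estimate would be computed over latents from many encoded images rather than a handful of numbers.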

@patrickvonplaten
Contributor

Thanks a lot @fepegar !

@hervenivon

> Put up draft PR here: #1515 still need to think about a few things before finishing

For people following this: the new PR is #1860

@patil-suraj
Contributor

#1860 is now merged, closing the issue.

10 participants