This repository has been archived by the owner on Feb 7, 2025. It is now read-only.

Crop and pad in LatentDiffusionInferer #420

Closed
virginiafdez opened this issue Sep 19, 2023 · 0 comments · Fixed by #421

@virginiafdez
Contributor

To maximise the input size that can go through the VAE or VQ-VAE, we sometimes have to pick input shapes whose latent space shapes cannot go through the LDM, because they are not multiples of 2**(num levels on the unet).
One way to overcome this is to pad the VAE latent before it goes through the LDM and, after sampling, crop it back to its original shape before passing it to the VAE or VQ-VAE for decoding. A simple MONAI transform in the inferer would be enough.
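
As a rough illustration of the shape arithmetic behind this proposal, here is a minimal sketch in plain Python. The function names, the example latent shape, and `num_levels = 3` are all hypothetical and not taken from the inferer; the divisor follows the 2**(num levels on the unet) rule stated above. In an actual implementation, MONAI transforms such as `DivisiblePad` and `CenterSpatialCrop` could probably do the padding and cropping on the tensors themselves.

```python
from typing import List, Sequence, Tuple


def divisible_shape(spatial_shape: Sequence[int], num_levels: int) -> List[int]:
    """Round each spatial dimension up to the next multiple of 2**num_levels."""
    k = 2 ** num_levels
    return [((s + k - 1) // k) * k for s in spatial_shape]


def symmetric_padding(
    spatial_shape: Sequence[int], padded_shape: Sequence[int]
) -> List[Tuple[int, int]]:
    """Split the extra voxels per dimension into (before, after) padding."""
    pads = []
    for s, p in zip(spatial_shape, padded_shape):
        total = p - s
        before = total // 2
        pads.append((before, total - before))
    return pads


def crop_slices(
    pads: Sequence[Tuple[int, int]], padded_shape: Sequence[int]
) -> List[slice]:
    """Slices that undo the padding and recover the original latent shape."""
    return [slice(before, p - after) for (before, after), p in zip(pads, padded_shape)]


# Hypothetical values, for illustration only.
latent_shape = [20, 28, 20]  # spatial shape of the autoencoder latent
num_levels = 3               # assumed number of U-Net levels

padded = divisible_shape(latent_shape, num_levels)
pads = symmetric_padding(latent_shape, padded)
slices = crop_slices(pads, padded)

print(padded)  # [24, 32, 24]
print(pads)    # [(2, 2), (2, 2), (2, 2)]
print([s.stop - s.start for s in slices])  # [20, 28, 20]
```

The padded shape is what the LDM would see during sampling; applying the slices to the sampled latent gives back the shape the autoencoder expects.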
