Crop and pad in LatentDiffusionInferer #420

virginiafdez · 2023-09-19T12:15:54Z

To maximise the input size that can go through the VAE or VQ-VAE, we sometimes have to pick image shapes whose latent space shapes cannot go through the LDM, because they are not a multiple of 2**(number of levels in the U-Net).
To overcome this, one solution is to pad the VAE latent before it goes through the LDM and, after sampling, crop it back to its original shape before passing it to the VAE or VQ-VAE. A simple MONAI transform in the inferer would be enough.
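
To make the arithmetic concrete, here is a minimal sketch of the shape bookkeeping involved. It is not the inferer's API: the helper names (`padded_latent_shape`, `symmetric_padding`, `crop_slices`) are hypothetical, symmetric padding is an assumption, and `num_levels` follows the wording above (the exact exponent depends on how many times the U-Net downsamples).

```python
from typing import List, Sequence, Tuple


def padded_latent_shape(latent_shape: Sequence[int], num_levels: int) -> List[int]:
    """Round each spatial size up to the next multiple of 2**num_levels."""
    factor = 2 ** num_levels
    # -(-a // b) is ceiling division for positive integers.
    return [-(-size // factor) * factor for size in latent_shape]


def symmetric_padding(latent_shape: Sequence[int], num_levels: int) -> List[Tuple[int, int]]:
    """Return (before, after) padding per spatial dimension.

    Assumption: padding is split as evenly as possible, with any odd
    remainder going to the 'after' side.
    """
    target = padded_latent_shape(latent_shape, num_levels)
    pads = []
    for size, padded in zip(latent_shape, target):
        total = padded - size
        before = total // 2
        pads.append((before, total - before))
    return pads


def crop_slices(latent_shape: Sequence[int], num_levels: int) -> List[slice]:
    """Slices that recover the original latent region from the padded one."""
    pads = symmetric_padding(latent_shape, num_levels)
    return [slice(before, before + size) for size, (before, _) in zip(latent_shape, pads)]


# Hypothetical example: a 20x28x20 latent and 3 levels (factor 8).
latent_shape = (20, 28, 20)
print(padded_latent_shape(latent_shape, 3))
print(symmetric_padding(latent_shape, 3))
print(crop_slices(latent_shape, 3))
```

In practice the padding would be applied to the latent tensor before the diffusion model sees it, and the slices would be applied to the sampled latent before decoding; existing MONAI pad and crop transforms could take the place of these helpers.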

virginiafdez self-assigned this Sep 19, 2023

virginiafdez linked a pull request Sep 19, 2023 that will close this issue

Add pad and cropping options to the Latent Diffusion Inferer (+ test). #421

Merged

marksgraham closed this as completed in #421 Oct 26, 2023
