Virtual Try-On: Details preservation

We noticed that virtual try-on models (for example, LaDI-VTON) struggle to preserve garment details during reconstruction. This project works on that aspect, taking inspiration from the Structural-Correction Module (SCM) from the paper AutoDIR: Automatic All-in-One Image Restoration with Latent Diffusion.

Dataset

To train an SCM, we need three images for each instance: the image of the product, the image of the model wearing the product, and the image of the model wearing the product as generated by a VTON model. We therefore used the VITON-HD dataset and ran LaDI-VTON inference with its pretrained models to generate the third image.

To build our dataset, we followed the instructions in the LaDI-VTON README and applied them to VITON-HD.
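The three-image requirement above can be sketched as a small indexing helper. This is only an illustration, not the dataset object from this repository: the `TryOnTriplet` and `collect_triplets` names are hypothetical, and the sketch assumes that the three images of one instance live in three separate folders and share the same file stem.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TryOnTriplet:
    """Paths to the three images of one training instance (hypothetical type)."""
    garment: Path    # image of the product
    target: Path     # ground-truth image of the model wearing the product
    generated: Path  # image produced by the VTON model for the same pair


def collect_triplets(garment_dir, target_dir, generated_dir,
                     suffixes=(".jpg", ".png")):
    """Match files by stem across three folders and keep complete triplets only.

    Assumption: files belonging to the same instance share the same stem.
    """
    def index(directory):
        return {
            p.stem: p
            for p in sorted(Path(directory).iterdir())
            if p.is_file() and p.suffix.lower() in suffixes
        }

    garments = index(garment_dir)
    targets = index(target_dir)
    generated = index(generated_dir)
    # Instances missing any of the three images are silently skipped.
    common = sorted(garments.keys() & targets.keys() & generated.keys())
    return [TryOnTriplet(garments[s], targets[s], generated[s]) for s in common]
```

Skipping incomplete instances keeps the training loop simple, at the cost of hiding generation failures; logging the dropped stems would be a reasonable addition.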

Approach

[Figure: approach schema]

  • We implemented the SCM from AutoDIR in the Refiners micro-framework. The SCM is based on the NAFNet architecture.
  • We validated our implementation to make sure we had not introduced errors when porting the model.
  • We tested many training configurations, using different inputs as the context image: the original garment, the warped garment produced by LaDI-VTON, and the warped garment taken from the ground truth and degraded with Gaussian noise.
  • We evaluated our trained models both qualitatively and quantitatively.
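As an illustration of two steps in the list above, the sketch below degrades an image with Gaussian noise and scores a reconstruction against a reference. It is a minimal pure-Python example on 2D grayscale images with values in [0, 1], not the code used in this project. The function names are hypothetical, and PSNR is an assumed metric chosen for the example: the metrics actually used for the quantitative evaluation are not stated here.

```python
import math
import random


def add_gaussian_noise(image, sigma, seed=None):
    """Return a noisy copy of a 2D grayscale image with values in [0, 1].

    Zero-mean Gaussian noise with standard deviation `sigma` is added to
    every pixel, and the result is clipped back to [0, 1].
    """
    rng = random.Random(seed)
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-sized 2D images."""
    n = sum(len(row) for row in reference)
    mse = sum(
        (a - b) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for a, b in zip(ref_row, est_row)
    ) / n
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [[0.5] * 16 for _ in range(16)]  # flat gray stand-in for a garment
    mild = add_gaussian_noise(clean, sigma=0.05, seed=0)
    strong = add_gaussian_noise(clean, sigma=0.2, seed=0)
    print(f"PSNR mild:   {psnr(clean, mild):.2f} dB")
    print(f"PSNR strong: {psnr(clean, strong):.2f} dB")
```

In a real pipeline the same operations would run on image tensors, and a perceptual metric would likely complement PSNR, since pixel-wise scores say little about fine garment details.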

Results

[Figure: reconstruction example with different inputs]

The results are not completely satisfying. We believe virtual try-on is a use case that is difficult to map onto the problem the AutoDIR SCM solves. The AutoDIR SCM takes as input two almost identical images that can be superimposed, which is why we tested the warped garments:

[Figure: AutoDIR]

In virtual try-on tasks, it is very hard to obtain two almost identical images from which to fix structural errors on the garment...

Repository Organisation

The repository is divided into three sections corresponding to the chronological milestones:

  • First, the ramp_up folder. It gathers all the work done while ramping up on deep learning and the Refiners framework. We learned the basics by coding an autoencoder that reconstructs cat images (Example 1, Example 2).

  • Then, the vae folder. We tried to use a variational autoencoder (VAE) to build our dataset with structural errors. At that time, we were not aware that LaDI-VTON had released pretrained models, which is why we tested this alternative approach. We did not manage to train the VAE correctly because of the high computational cost.

  • Finally, the scm folder, where you will find our implementation, training scripts, and dataset object. This is the main contribution of this project.