This repository is the official implementation of TPD
Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On
Xu Yang, Changxing Ding, Zhibin Hong, Junhao Huang, Jin Tao, Xiangmin Xu
- Release inference code
- Release model weights
- Release training code
- Release evaluation code
conda env create -f environment.yml
conda activate TPD
Download the pretrained checkpoint and save it in the checkpoints folder as follows:
checkpoints
|-- release
    |-- TPD_240epochs.ckpt
Download the VITON-HD dataset from here.
Copy the test folder to create the validation folder. The dataset structure should look like:
datasets/VITONHD/
test | train | validation(copied from test)
|-- agnostic-mask
|-- agnostic-v3.2
|-- cloth
|-- cloth_mask
|-- image
|-- image-densepose
|-- image-parse-agnostic-v3.2
|-- image-parse-v3
|-- openpose_img
|-- openpose_json
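The copy step and the layout above can be checked with a short script. This is a minimal sketch, not part of the repository: the helper names `prepare_validation_split` and `missing_subdirs` are hypothetical, and the subfolder list is taken from the structure shown above.

```python
import shutil
from pathlib import Path

# Subfolders listed in the dataset layout above.
EXPECTED_SUBDIRS = [
    "agnostic-mask", "agnostic-v3.2", "cloth", "cloth_mask", "image",
    "image-densepose", "image-parse-agnostic-v3.2", "image-parse-v3",
    "openpose_img", "openpose_json",
]


def prepare_validation_split(root):
    """Copy <root>/test to <root>/validation unless it already exists."""
    root = Path(root)
    src, dst = root / "test", root / "validation"
    if not dst.exists():
        shutil.copytree(src, dst)
    return dst


def missing_subdirs(root, splits=("test", "train", "validation")):
    """Return the 'split/subdir' entries that are absent under root."""
    root = Path(root)
    return [
        f"{split}/{name}"
        for split in splits
        for name in EXPECTED_SUBDIRS
        if not (root / split / name).is_dir()
    ]


if __name__ == "__main__":
    dataset_root = Path("datasets/VITONHD")
    if (dataset_root / "test").is_dir():
        prepare_validation_split(dataset_root)
        print("missing:", missing_subdirs(dataset_root))
    else:
        print("dataset not found at", dataset_root)
```

An empty `missing` list means every split contains all ten subfolders.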
Refer to commands/inference.sh
We initialize from the pretrained Paint-by-Example model and expand its first conv layer from 9 to 18 input channels (the new channels are zero-initialized). Download the pretrained model first and save it in the checkpoints folder. Then run utils/rm_clip_and_add_channels.py to add the input channels to the first conv layer and remove the CLIP module. The final checkpoints folder structure should look like:
checkpoints
|-- original
    |-- model.ckpt
    |-- mode_prepared.ckpt
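The zero-initialized channel expansion described above can be illustrated as follows. This is a sketch of the general technique under stated assumptions, not the contents of utils/rm_clip_and_add_channels.py: the function names are hypothetical, the weight shape in the demo is illustrative, and the key prefix passed to `drop_prefixed_keys` would have to match whatever the real checkpoint uses for its CLIP module.

```python
import torch
import torch.nn.functional as F


def expand_input_channels(weight, new_in_channels=18):
    """Zero-pad a conv weight of shape (out, in, kH, kW) along the input axis."""
    out_c, in_c, k_h, k_w = weight.shape
    if new_in_channels < in_c:
        raise ValueError("new_in_channels must be >= the current input channels")
    expanded = torch.zeros(
        out_c, new_in_channels, k_h, k_w, dtype=weight.dtype, device=weight.device
    )
    expanded[:, :in_c] = weight  # keep the pretrained filters unchanged
    return expanded


def drop_prefixed_keys(state_dict, prefix):
    """Return a copy of state_dict without the keys that start with prefix."""
    return {k: v for k, v in state_dict.items() if not k.startswith(prefix)}


if __name__ == "__main__":
    torch.manual_seed(0)
    old_w = torch.randn(4, 9, 3, 3)        # illustrative shape, not the real layer
    new_w = expand_input_channels(old_w, 18)

    x9 = torch.randn(1, 9, 16, 16)
    x18 = torch.cat([x9, torch.randn(1, 9, 16, 16)], dim=1)

    # The extra channels are multiplied by zeros, so the output is unchanged.
    y_old = F.conv2d(x9, old_w, padding=1)
    y_new = F.conv2d(x18, new_w, padding=1)
    print(tuple(new_w.shape), torch.allclose(y_old, y_new, atol=1e-4))
```

Because the added weights start at zero, the expanded layer initially produces the same output as the pretrained one, so training starts from the pretrained behavior and learns to use the new inputs gradually.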
Refer to commands/train.sh
LPIPS: https://github.com/richzhang/PerceptualSimilarity
FID: https://github.com/mseitzer/pytorch-fid
Run utils/generate_GT.py to generate ground-truth (GT) images at 384*512 resolution.
Refer to calculate_metrics/calculate_metrics.sh
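For orientation, resizing a folder of images to the 384*512 evaluation resolution could look like the sketch below. It is an assumption-laden illustration using Pillow, not the logic of utils/generate_GT.py: the function name `resize_images` is hypothetical, 384*512 is assumed to mean width by height, and bicubic interpolation is an arbitrary choice.

```python
from pathlib import Path

from PIL import Image

TARGET_SIZE = (384, 512)  # (width, height); the orientation is an assumption


def resize_images(src_dir, dst_dir, size=TARGET_SIZE):
    """Resize every JPEG/PNG in src_dir to size and save it under dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src_dir.iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        with Image.open(path) as img:
            resized = img.convert("RGB").resize(size, Image.BICUBIC)
        out_path = dst_dir / path.name
        resized.save(out_path)
        written.append(out_path)
    return written
```

Generated and ground-truth images should share the same resolution before LPIPS or FID is computed, which is why the GT images are resized first.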
Our code borrows heavily from Paint-by-Example. We thank its authors.
@InProceedings{Yang_2024_CVPR,
author = {Yang, Xu and Ding, Changxing and Hong, Zhibin and Huang, Junhao and Tao, Jin and Xu, Xiangmin},
title = {Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
pages = {7017-7026}
}