Implementation code for Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models, accepted at the International Conference on Learning Representations (ICLR) 2024.
You can directly download our test results from Google Drive: (1) PCDMs vs SOTA (2) PCDMs Results.
The PCDMs vs SOTA results compare our method with several state-of-the-art methods, e.g., ADGAN, PISE, GFLA, DPTN, CASD, NTED, and PIDM. Each row contains, in this order, target_pose, source_image, ground_truth, ADGAN, PISE, GFLA, DPTN, CASD, NTED, PIDM, and PCDMs (ours).
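To pull a single method out of a comparison row, the column order listed above can be turned into crop boxes. The following is a minimal sketch, not part of the released code: it assumes each row is one horizontal strip of 11 equally wide panels, which should be checked against the downloaded files. The name `panel_boxes` is hypothetical.

```python
# Column order copied from the description of the comparison rows.
COLUMNS = [
    "target_pose", "source_image", "ground_truth",
    "ADGAN", "PISE", "GFLA", "DPTN", "CASD", "NTED", "PIDM", "PCDMs",
]


def panel_boxes(row_width, row_height, columns=COLUMNS):
    """Return {column: (left, upper, right, lower)} for one comparison row.

    Assumption: panels sit side by side and all have the same width.
    """
    if row_width % len(columns) != 0:
        raise ValueError("row width is not a multiple of the number of columns")
    panel_width = row_width // len(columns)
    return {
        name: (i * panel_width, 0, (i + 1) * panel_width, row_height)
        for i, name in enumerate(columns)
    }
```

The boxes use the (left, upper, right, lower) convention, so they can be passed to an image library's crop function if the layout assumption holds.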
We also provide a simplified version of PCDMs that uses only stage 2 and is trained on data from TikTok and DeepFashion. The weights can be downloaded from Google Drive.
Download the DWPose weights (dw-ll_ucoco_384.pth, yolox_l_8x8_300e_coco_20211126_140236-d3bd2b23.pth) following this.
# install diffusers & pose extractor
pip install diffusers==0.24.0
pip install controlnet-aux==0.0.7
pip install transformers==4.32.1
pip install accelerate==0.24.1
# install DWPose, which depends on MMDetection, MMCV and MMPose
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
# clone code
git clone https://github.com/tencent-ailab/PCDMs.git
# download the models
cd PCDMs
mv {weights} ./PCDMs_ckpt.pt
# then you can use the notebook
{pcdms_demo.ipynb}
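Before opening the notebook, it can help to confirm that the pinned packages were actually installed at the versions given in the commands above. The snippet below is a small sketch using only the Python standard library; `check_pins` is a hypothetical helper and is not shipped with the repository.

```python
from importlib import metadata

# Version pins copied from the pip install commands above.
PINNED = {
    "diffusers": "0.24.0",
    "controlnet-aux": "0.0.7",
    "transformers": "4.32.1",
    "accelerate": "0.24.1",
}


def check_pins(pins=PINNED):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins().items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every pinned package matches. The MMCV, MMDetection and MMPose requirements are lower bounds rather than exact pins, so they are left out of this check.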
This link contains processed and prepared data that is ready for use. The data has been processed in the following ways:
• Images renamed
• Train/test sets split
• Keypoints extracted with OpenPose
The folder structure of the dataset should be as follows:
Deepfashion/
├── all_data_png # including train and test images
│ ├── img1.png
│ ├── ...
│ ├── img52712.png
├── train_lst_256_png # including train images of 256 size
│ ├── img1.png
│ ├── ...
│ ├── img48674.png
├── train_lst_512_png # including train images of 512 size
│ ├── img1.png
│ ├── ...
│ ├── img48674.png
├── test_lst_256_png # including test images of 256 size
│ ├── img1.png
│ ├── ...
│ ├── img4038.png
├── test_lst_512_png # including test images of 512 size
│ ├── img1.png
│ ├── ...
│ ├── img4038.png
├── normalized_pose_txt.zip # including pose coordinate of train and test set
│ ├── pose_coordinate1.txt
│ ├── ...
│ ├── pose_coordinate40160.txt
├── train_data.json
├── test_data.json
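The layout above can be checked programmatically before training. This is a minimal sketch, assuming the dataset root is the `Deepfashion/` folder shown in the tree; `missing_entries` is a hypothetical helper, and it only checks top-level entries, not the number of images in each folder.

```python
from pathlib import Path

# Top-level entries copied from the folder tree above.
EXPECTED_DIRS = [
    "all_data_png",
    "train_lst_256_png",
    "train_lst_512_png",
    "test_lst_256_png",
    "test_lst_512_png",
]
EXPECTED_OTHER = [
    "normalized_pose_txt.zip",  # may be an archive or an extracted folder
    "train_data.json",
    "test_data.json",
]


def missing_entries(root):
    """Return the expected top-level entries that are absent under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_OTHER if not (root / f).exists()]
    return missing
```

For example, `missing_entries("Deepfashion")` returns an empty list when every entry from the tree is present.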
Download img_highres.zip of the DeepFashion dataset from the In-shop Clothes Retrieval Benchmark.
Unzip img_highres.zip. You will need to request the password from the dataset maintainers.
Checkpoints for all three stages are available here.
- train/test stage1-prior
sh run_stage1.sh && sh run_test_stage1.sh
- train/test stage2-inpaint
sh run_stage2.sh && sh run_test_stage2.sh
- train/test stage3-refined
sh run_stage3.sh && sh run_test_stage3.sh
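If you prefer to drive the stages from Python, the train and test scripts can be run one after another so that testing only starts once training has finished. The following is a sketch under that assumption; the script names come from the commands above, while `stage_commands` and `run_stage` are hypothetical helpers that are not part of the repository.

```python
import subprocess

# Script names copied from the commands above: (train script, test script).
STAGES = {
    "stage1-prior": ("run_stage1.sh", "run_test_stage1.sh"),
    "stage2-inpaint": ("run_stage2.sh", "run_test_stage2.sh"),
    "stage3-refined": ("run_stage3.sh", "run_test_stage3.sh"),
}


def stage_commands(stage, train=True, test=True):
    """Build the shell commands for one stage, in execution order."""
    train_script, test_script = STAGES[stage]
    commands = []
    if train:
        commands.append(["sh", train_script])
    if test:
        commands.append(["sh", test_script])
    return commands


def run_stage(stage, train=True, test=True):
    """Run the commands sequentially from the repository root."""
    for command in stage_commands(stage, train=train, test=test):
        # check=True raises CalledProcessError if a script exits non-zero,
        # so the test script is skipped when training fails.
        subprocess.run(command, check=True)
```

Calling `run_stage("stage2-inpaint", train=False)` would run only the test script, which is useful when starting from a downloaded checkpoint.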
If this work is useful to you, please consider citing our paper:
@article{shen2023advancing,
title={Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models},
author={Shen, Fei and Ye, Hu and Zhang, Jun and Wang, Cong and Han, Xiao and Yang, Wei},
journal={arXiv preprint arXiv:2310.06313},
year={2023}
}
If you have any questions, please feel free to contact me at shenfei140721@126.com.