Official code for 'DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models'
- [note] We will release the code within three months. Please wait.
- [2023.8.11] We initialized the repo.
ToDo
- Instance Segmentation (COCO2017)
- Semantic Segmentation (VOC, Cityscapes)
- Depth Estimation
- Open Pose
- DeepFashion Segmentation
- Open Segmentation
- Long-tail Segmentation
To demonstrate the quality of the synthetic data, we visualize samples from two domains: human-centric and urban scenes.
A large language model, GPT-4, is used to increase the diversity of the generated data.
- Hugging Face Demo
- ...
conda create -n DatasetDM python=3.8
Install the matching PyTorch build (torch==1.9.1); refer to the PyTorch website for other CUDA versions. For example:
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
Then install other packages:
python -m pip install -r requirements.txt
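After the installation steps above, a quick check can confirm that the pinned torch version is the one actually installed. The following is a minimal sketch, not part of the repository; the helper names `base_version` and `torch_matches` are hypothetical.

```python
from importlib import metadata

EXPECTED_TORCH = "1.9.1"  # version pinned in the install command above


def base_version(version: str) -> str:
    """Strip a local build tag such as '+cu111' from a version string."""
    return version.split("+", 1)[0]


def torch_matches(expected: str = EXPECTED_TORCH) -> bool:
    """Return True if torch is installed and its base version equals `expected`."""
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return False
    return base_version(installed) == expected


if __name__ == "__main__":
    print("torch OK" if torch_matches() else f"torch {EXPECTED_TORCH} not found")
```

Reading the version through `importlib.metadata` avoids importing torch itself, so the check stays fast and works even when CUDA is not available.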
Download the weights and configuration files of SD 1.4 and place them in the ./dataset/ckpts directory.
Download diffusers:
cd model
git clone https://github.com/huggingface/diffusers.git
- Depth Estimation: Please follow MED to prepare the dataset in ./data
- Segmentation (VOC, Cityscapes, and COCO): Please follow Mask2former to prepare the dataset in ./data
The final dataset should be organized as follows:
data/
    PascalVOC12/
        JPEGImages
        SegmentationClassAug
        splits/
            train_aug.txt
    COCO2017/
        train2017/
            2011_003261.jpg
            ...
        annotations/
            instances_train2017.json
            person_keypoints_train2017.json
    VirtualKITTI2/
        Depth/
            Scene01
            Scene02
            ...
        Image/
            Scene01
            Scene02
            ...
    nyudepthv2/
        sync/
        official_splits/
            test/
        nyu_class_list.json
        train_list.txt
        test_list.txt
    kitti/
        input/
        gt_depth/
        kitti_eigen_train.txt
    deepfashion-mm/
        images/
        segm/
        captions.json
        train_set.txt
        test_set.txt
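Before launching training, it can help to verify that the layout above is in place. The following is a minimal sketch, not part of the repository: the `EXPECTED` table and the function `missing_entries` are hypothetical, and the nesting of each entry is inferred from the listing above, so adjust it if your copy differs.

```python
from pathlib import Path

# Paths relative to the data root, taken from the layout listed above.
# The nesting is an assumption inferred from that listing.
EXPECTED = {
    "PascalVOC12": ["JPEGImages", "SegmentationClassAug", "splits/train_aug.txt"],
    "COCO2017": [
        "train2017",
        "annotations/instances_train2017.json",
        "annotations/person_keypoints_train2017.json",
    ],
    "VirtualKITTI2": ["Depth", "Image"],
    "nyudepthv2": [
        "sync",
        "official_splits/test",
        "nyu_class_list.json",
        "train_list.txt",
        "test_list.txt",
    ],
    "kitti": ["input", "gt_depth", "kitti_eigen_train.txt"],
    "deepfashion-mm": ["images", "segm", "captions.json", "train_set.txt", "test_set.txt"],
}


def missing_entries(root, datasets=None):
    """Return expected paths (relative to `root`) that do not exist."""
    root = Path(root)
    names = datasets if datasets is not None else list(EXPECTED)
    missing = []
    for name in names:
        for rel in EXPECTED[name]:
            if not (root / name / rel).exists():
                missing.append((Path(name) / rel).as_posix())
    return missing


if __name__ == "__main__":
    for entry in missing_entries("./data"):
        print("missing:", entry)
```

Pass a subset such as `missing_entries("./data", ["kitti"])` to check only the datasets needed for one task.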
You also need to organize the prompt .txt files as follows:
dataset/
    Prompts_From_GPT/
        deepfashion_mm/
            general.txt
        coco_pose/
            general.txt
        KITTI/
            general.txt
        NYU/
            general.txt
        coco/
            toothbrush.txt
            hair drier.txt
            book.txt
            ...
        cityscapes/
            bicycle.txt
            motorcycle.txt
            bus.txt
            ...
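To inspect the prompt files in the layout above, a small loader is enough. The following is a minimal sketch, not the repository's own loader: the function `load_prompts` is hypothetical, and it assumes each .txt file holds one prompt per line.

```python
from pathlib import Path
from typing import Dict, List


def load_prompts(prompt_dir) -> Dict[str, List[str]]:
    """Map each .txt file stem in `prompt_dir` to its non-empty lines."""
    prompts = {}
    for txt in sorted(Path(prompt_dir).glob("*.txt")):
        lines = txt.read_text(encoding="utf-8").splitlines()
        # Assumption: one prompt per line; blank lines are skipped.
        prompts[txt.stem] = [line.strip() for line in lines if line.strip()]
    return prompts
```

For example, `load_prompts("dataset/Prompts_From_GPT/coco")` would return a dictionary keyed by class name, such as `"toothbrush"` or `"hair drier"`.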
- Semantic Segmentation
- Instance Segmentation
- Depth Estimation
- Open Pose
- Zero-Shot Semantic Segmentation
- Fashion Segmentation
- Long-tail Segmentation
# For Segmentation Tasks
sh ./script/train_semantic_VOC.sh
# Generate synthetic data for VOC
sh ./script/data_generation_VOC_semantic.sh
# Visualization of generative data
python ./DataDiffusion/vis_VOC.py
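The three VOC commands above (train, generate, visualize) can be chained from one driver so that a failure stops the sequence. The following is a minimal sketch, not part of the repository; `run_steps` is a hypothetical helper, and it defaults to a dry run that only reports the commands.

```python
import subprocess
from typing import List, Sequence

# Commands copied from the VOC steps above, in order.
VOC_STEPS = [
    ["sh", "./script/train_semantic_VOC.sh"],
    ["sh", "./script/data_generation_VOC_semantic.sh"],
    ["python", "./DataDiffusion/vis_VOC.py"],
]


def run_steps(steps: Sequence[Sequence[str]], dry_run: bool = True) -> List[str]:
    """Run commands in order and return them as strings.

    With dry_run=True nothing is executed. Otherwise a non-zero exit
    code raises CalledProcessError and later steps are skipped.
    """
    executed = []
    for cmd in steps:
        executed.append(" ".join(cmd))
        if not dry_run:
            subprocess.run(list(cmd), check=True)
    return executed


if __name__ == "__main__":
    for line in run_steps(VOC_STEPS):
        print("would run:", line)
```

Call `run_steps(VOC_STEPS, dry_run=False)` from the repository root to actually execute the steps. The same pattern applies to the other tasks by swapping in their script paths.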
# For Segmentation Tasks
sh ./script/train_semantic_Cityscapes.sh
# Generate synthetic data for Cityscapes
sh ./script/data_generation_Cityscapes_semantic.sh
Before training an existing segmentation model, apply the augmentation:
sh ./script/augmentation_Cityscapes.sh
# Visualization of generative data
python ./DataDiffusion/vis_Cityscapes.py
# For Segmentation Tasks
sh ./script/train_COCO.sh
# Generate synthetic data for COCO
sh ./script/data_generation_coco_instance.sh
# Visualization of generative data
python ./DataDiffusion/vis_COCO.py
Data Augmentation with image splicing
# Augmentation of generative data
sh ./script/augmentation_coco.sh
Then train Mask2former with the synthetic data. Enjoy!
# Training Depth Estimation Tasks on KITTI
sh ./script/train_depth_KITTI.sh
To train with Virtual KITTI 2 instead, use the script below:
# Training Depth Estimation Tasks on Virtual KITTI 2
sh ./script/train_depth_Virtual_KITTI_2.sh
# Generate synthetic data for KITTI
sh ./script/data_generation_KITTI_depth.sh
Then train any existing depth estimation method with the synthetic data. Enjoy!
In our paper, we use Depthformer to validate the quality of the generated data.
# For Depth Estimation Tasks
sh ./script/train_depth_NYU.sh
# Generate synthetic data for NYU
sh ./script/data_generation_NYU_depth.sh
Data Augmentation with image splicing
# Augmentation of generative data
sh ./script/augmentation_NYU.sh
Then train any existing depth estimation method with the synthetic data. Enjoy!
In our paper, we use Depthformer to validate the quality of the generated data.
# Training Pose Estimation Tasks on COCO2017
sh ./script/train_pose_coco.sh
# Generate synthetic data for Pose on COCO
sh ./script/data_generation_COCO_Pose.sh
Then convert the data to COCO format and train any existing pose estimation method on the resulting dataset. Here, we use SimCC to validate the quality of the generated data.
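The conversion to COCO format can be sketched as follows. This is a minimal illustration, not the repository's converter: the input schema (a list of dicts with `file_name`, `width`, `height`, and 17 `(x, y, v)` keypoint triplets) and the function `to_coco_keypoints` are hypothetical. It also assumes one person per image and derives the bounding box from the labeled keypoints.

```python
from typing import Dict, List

# The 17 keypoint names used by the COCO person category.
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def to_coco_keypoints(samples: List[dict]) -> Dict[str, list]:
    """Build a COCO-style keypoint dataset from per-image samples."""
    images, annotations = [], []
    for idx, sample in enumerate(samples, start=1):
        images.append({
            "id": idx,
            "file_name": sample["file_name"],
            "width": sample["width"],
            "height": sample["height"],
        })
        keypoints = sample["keypoints"]  # 17 (x, y, v) triplets
        labeled = [(x, y) for x, y, v in keypoints if v > 0]
        if labeled:
            xs = [x for x, _ in labeled]
            ys = [y for _, y in labeled]
            # Assumption: tight box around the labeled keypoints.
            bbox = [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]
        else:
            bbox = [0, 0, 0, 0]
        annotations.append({
            "id": idx,
            "image_id": idx,
            "category_id": 1,
            "keypoints": [value for triplet in keypoints for value in triplet],
            "num_keypoints": len(labeled),  # keypoints with v > 0
            "bbox": bbox,  # [x, y, width, height]
            "area": bbox[2] * bbox[3],
            "iscrowd": 0,
        })
    categories = [{"id": 1, "name": "person", "keypoints": COCO_KEYPOINT_NAMES}]
    return {"images": images, "annotations": annotations, "categories": categories}
```

The returned dictionary can be written with `json.dump`. Some training frameworks also expect a `skeleton` entry in the category, which is omitted here.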
Download VOC 2012 and organize the dataset.
# For Zero Shot Segmentation Tasks
sh ./script/train_semantic_VOC_zero_shot.sh
# Generate synthetic data for VOC
sh ./script/data_generation_VOC_semantic.sh
Data Augmentation with image splicing
# Augmentation of generative data
sh ./script/augmentation_VOC.sh
Then train Mask2former with the synthetic data. Enjoy!
Download DeepFashion-MM and organize the dataset.
# Train DeepFashion Segmentation Tasks
sh ./script/train_semantic_DeepFashion_MM.sh
# Generate synthetic data for DeepFashion-MM
python ./script/parallel_generate_Semantic_DeepFashion.py
Then train Mask2former or another segmentation method (for example, from mmsegmentation) with the synthetic data. Enjoy!
# For LongTail semantic segmentation
sh ./script/train_semantic_VOC_LongTail.sh
# Generate synthetic data for VOC
sh ./script/data_generation_VOC_semantic.sh
Data Augmentation with image splicing
# Augmentation of generative data
sh ./script/augmentation_VOC.sh
# For long-tail instance segmentation (LVIS)
sh ./script/train_instance_LVIS.sh
# Generate synthetic data for LVIS
sh ./script/data_generation_LVIS_instance.sh
This work draws on the following codebases as references. We are grateful for these contributions:
@article{wu2023datasetdm,
title={DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models},
author={Wu, Weijia and Zhao, Yuzhong and Chen, Hao and Gu, Yuchao and Zhao, Rui and He, Yefei and Zhou, Hong and Shou, Mike Zheng and Shen, Chunhua},
journal={arXiv preprint arXiv:2308.06160},
year={2023}
}