Rectified-Diffusion: A Python repository from YangLing0818

Rectified Diffusion: Straightness is Not Your Need

by Fu-Yun Wang¹, Ling Yang², Zhaoyang Huang¹, Mengdi Wang³, Hongsheng Li¹

¹CUHK-MMLab ²Peking University ³Princeton University

@article{wang2024rectified,
  title={Rectified Diffusion},
  author={Wang, Fu-Yun and Yang, Ling and Huang, Zhaoyang and Wang, Mengdi and Li, Hongsheng},
  journal={arXiv preprint},
  year={2024}
}

Introduction

TL;DR: Rectified Diffusion shows that straightness is not the essential training target and extends the scope of rectified flow.

[Figure: Training efficiency and efficacy]

Set up

Install environment

conda env create -f environment.yml

Download public model weights

git clone https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5

git clone https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0

git clone https://huggingface.co/madebyollin/sdxl-vae-fp16-fix
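
Before moving on, it can help to confirm that the three clones exist. The sketch below assumes they were cloned into the current working directory, where `git clone` names each directory after the last path component of its URL. The helper name `missing_base_models` is hypothetical, and the training and evaluation scripts may expect a different location, so check them before relying on this.

```python
from pathlib import Path

# Directory names that `git clone` derives from the three URLs above.
BASE_MODEL_DIRS = [
    "stable-diffusion-v1-5",
    "stable-diffusion-xl-base-1.0",
    "sdxl-vae-fp16-fix",
]


def missing_base_models(root="."):
    """Return the expected clone directories that are absent under `root`."""
    return [name for name in BASE_MODEL_DIRS if not (Path(root) / name).is_dir()]


if __name__ == "__main__":
    missing = missing_base_models()
    print("missing:", missing if missing else "none")
```

Hugging Face model repositories store large files with Git LFS, so a clone made without `git lfs install` may contain only small pointer files even though the directory exists.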

Evaluation

# download weights
git clone https://huggingface.co/wangfuyun/Rectified-Diffusion
# the repository includes 4 weights:
#   https://huggingface.co/wangfuyun/Rectified-Diffusion/resolve/main/weights/rd.ckpt
#   https://huggingface.co/wangfuyun/Rectified-Diffusion/resolve/main/weights/cm.ckpt
#   https://huggingface.co/wangfuyun/Rectified-Diffusion/resolve/main/weights/phased.ckpt
#   https://huggingface.co/wangfuyun/Rectified-Diffusion/resolve/main/weights/phasedxl.ckpt
# alternatively, download a single weight file with wget -c, e.g.
wget -c https://huggingface.co/wangfuyun/Rectified-Diffusion/resolve/main/weights/rd.ckpt
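
If you would rather fetch the checkpoints from Python, the following is a minimal sketch using only the standard library. The four file names and the base URL come from the list above. The `weights` destination directory and the helper names are assumptions made for illustration, and unlike `wget -c` this sketch does not resume a partial download.

```python
import urllib.request
from pathlib import Path

BASE_URL = "https://huggingface.co/wangfuyun/Rectified-Diffusion/resolve/main/weights"
CHECKPOINTS = ["rd.ckpt", "cm.ckpt", "phased.ckpt", "phasedxl.ckpt"]


def checkpoint_url(name):
    """Build the download URL for one of the four released checkpoints."""
    if name not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint: {name}")
    return f"{BASE_URL}/{name}"


def download(name, dest_dir="weights"):
    """Download one checkpoint into dest_dir unless the file already exists.

    dest_dir is a hypothetical location; adjust it to wherever gen_5k.sh
    expects the weights.
    """
    dest = Path(dest_dir) / name
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urllib.request.urlretrieve(checkpoint_url(name), str(dest))
    return dest
```

Calling `download("rd.ckpt")` fetches only the Rectified Diffusion checkpoint; looping over `CHECKPOINTS` fetches all four.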

# download coco-2017
wget -c http://images.cocodataset.org/zips/val2017.zip

# generate 5k pairs for evaluation 
bash gen_5k.sh

python -m pytorch_fid coco_5k results/rectifieddiffusion/5k/rd_cfg1.5_1step

python -m pytorch_fid coco_5k results/rectifieddiffusion/5k/rd_cfg1.5_2step

python -m pytorch_fid coco_5k results/rectifieddiffusion/5k/rd_cfg1.5_4step

python -m pytorch_fid coco_5k results/rectifieddiffusion/5k/rd_cfg1.5_8step

python -m pytorch_fid coco_5k results/rectifieddiffusion/5k/rd_cfg1.5_16step

python -m pytorch_fid coco_5k results/rectifieddiffusion/5k/rd_cfg1.5_25step

python -m pytorch_fid coco_5k results/cm/5k/cfg1.0_1step

python -m pytorch_fid coco_5k results/cm/5k/cfg1.0_2step

python -m pytorch_fid coco_5k results/phased/5k/cfg1.5_4step

python -m pytorch_fid coco_5k results/phasedxl/5k/cfg1.5_4step
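
The ten FID commands above follow a regular pattern, so they can also be generated rather than typed by hand. The sketch below rebuilds exactly those command lines from the directory names shown above. The function name `fid_commands` is hypothetical, and the code only prints the commands; pass each list to `subprocess.run(cmd, check=True)` if you want to execute them.

```python
import shlex

# Step counts used for the Rectified Diffusion result directories above.
RD_STEPS = [1, 2, 4, 8, 16, 25]


def fid_commands(reference="coco_5k"):
    """Return the pytorch_fid command (as an argument list) for each result directory."""
    result_dirs = [f"results/rectifieddiffusion/5k/rd_cfg1.5_{n}step" for n in RD_STEPS]
    result_dirs += [f"results/cm/5k/cfg1.0_{n}step" for n in (1, 2)]
    result_dirs += ["results/phased/5k/cfg1.5_4step", "results/phasedxl/5k/cfg1.5_4step"]
    return [["python", "-m", "pytorch_fid", reference, d] for d in result_dirs]


for cmd in fid_commands():
    # shlex.join quotes arguments safely for display in a shell.
    print(shlex.join(cmd))
```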

Comparison

Reproduced FID on COCO-2017 (lower is better).

Configuration	NFE	Reproduced	Reported
`Rectified Diffusion`	1	27.1	27.26
`Rectified Diffusion`	2	22.96	22.98
`Rectified Diffusion`	25	21.34	21.28
`Rectified Diffusion (CM)`	1	22.75	22.83
`Rectified Diffusion (CM)`	2	21.38	21.66
`Rectified Diffusion (Phased)`	4	20.49	20.64
`Rectified Diffusion-XL (Phased)`	4	25.59	25.81
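
To see at a glance how close the reproduced numbers are to the reported ones, the table can be turned into per-row gaps. The values below are copied from the table above; the helper name `fid_gaps` is hypothetical.

```python
# (configuration, NFE, reproduced FID, reported FID), copied from the table above.
RESULTS = [
    ("Rectified Diffusion", 1, 27.1, 27.26),
    ("Rectified Diffusion", 2, 22.96, 22.98),
    ("Rectified Diffusion", 25, 21.34, 21.28),
    ("Rectified Diffusion (CM)", 1, 22.75, 22.83),
    ("Rectified Diffusion (CM)", 2, 21.38, 21.66),
    ("Rectified Diffusion (Phased)", 4, 20.49, 20.64),
    ("Rectified Diffusion-XL (Phased)", 4, 25.59, 25.81),
]


def fid_gaps(rows):
    """Return (configuration, NFE, reproduced - reported) per row, rounded to 2 decimals."""
    return [(name, nfe, round(rep - ref, 2)) for name, nfe, rep, ref in rows]


for name, nfe, gap in fid_gaps(RESULTS):
    # A negative gap means the reproduced FID is lower (better) than the reported one.
    print(f"{name:32s} NFE={nfe:<3d} gap={gap:+.2f}")
```

Every reproduced value is within 0.28 FID of the reported one, and only the 25-step Rectified Diffusion row is higher than reported.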

Train

Train Rectified Diffusion on Stable Diffusion v1-5

bash gen_pairs.sh # generate 1.6M noise-sample (latent) pairs
# Since the authors of InstaFlow did not specify the prompts they used, we chose 1.6M randomly sampled prompts.
# You might find the following links to be useful
# https://huggingface.co/datasets/MuhammadHanif/Laion_aesthetics_5plus_1024_33M
# https://huggingface.co/datasets/laion/laion2B-en-aesthetic

bash run.sh # train the rectified diffusion models
# I use a small batch size and a small learning rate, and train for more iterations. The training hyperparameters were chosen empirically rather than carefully searched, so other training configurations may work better.
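
This README does not describe the training loss, so the following is only a toy sketch of how one pre-collected noise-sample pair could become a training example. It assumes the pretrained model's own forward form x_t = alpha_t * x0 + sigma_t * eps and epsilon prediction are kept, which is our reading of the method rather than something stated here. The schedule values, the 4-dimensional "latents", and the function names are hypothetical; `run.sh` remains the authoritative reference.

```python
import random


def noisy_latent(x0, eps, alpha_t, sigma_t):
    """Diffusion forward form, element-wise: x_t = alpha_t * x0 + sigma_t * eps."""
    return [alpha_t * a + sigma_t * b for a, b in zip(x0, eps)]


def eps_prediction_loss(eps_pred, eps):
    """Mean squared error between the predicted noise and the stored noise."""
    return sum((p - e) ** 2 for p, e in zip(eps_pred, eps)) / len(eps)


# One pre-collected pair: eps is the initial noise, x0 the latent that the
# pretrained model generated from it (toy stand-in values here).
random.seed(0)
eps = [random.gauss(0.0, 1.0) for _ in range(4)]
x0 = [0.5 * e + 0.1 for e in eps]

# Hypothetical schedule values for one timestep (alpha_t^2 + sigma_t^2 = 1).
alpha_t, sigma_t = 0.8, 0.6
x_t = noisy_latent(x0, eps, alpha_t, sigma_t)

# A real run would compute eps_pred = model(x_t, t, prompt); zeros stand in here.
eps_pred = [0.0] * len(eps)
print(eps_prediction_loss(eps_pred, eps))
```

The point of the pairing is that `eps` and `x0` are not sampled independently: the same noise that produced `x0` is the regression target.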

Train Rectified Diffusion (Phased) on Stable Diffusion v1-5

# You should first download a subset of LAION-2B for training. I use a set of 500k images.
bash run_phased.sh

Train Rectified Diffusion (Phased) on Stable Diffusion XL

# You should first download a subset of LAION-2B for training. I use a set of 500k images.
bash run_phasedxl.sh

Contact

If you have any questions, please feel free to contact us: Fu-Yun Wang (fywang@link.cuhk.edu.hk) and Ling Yang (yangling0818@163.com).
