Official implementation of Diffusion Autoencoders
A CVPR 2022 (ORAL) paper (paper, site, 5-min video):
@inproceedings{preechakul2021diffusion,
title={Diffusion Autoencoders: Toward a Meaningful and Decodable Representation},
author={Preechakul, Konpat and Chatthee, Nattanat and Wizadwongsa, Suttisak and Suwajanakorn, Supasorn},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022},
}
Usage
Note: Since we expect a lot of changes to the codebase, please fork the repo before using it.
Prerequisites
See requirements.txt
pip install -r requirements.txt
Quick start
Jupyter notebooks are provided:
For unconditional generation: sample.ipynb
For manipulation: manipulate.ipynb
For interpolation: interpolate.ipynb
For autoencoding: autoencoding.ipynb
Aligning your own images:
- Put images into the imgs directory
- Run align.py (requires pip install dlib requests)
- Result images will be available in the imgs_align directory
Example (images omitted): original in imgs directory | aligned with align.py | manipulated using manipulate.ipynb
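To check the outcome of the alignment step, a minimal sketch is given below. It compares the two directories by file stem and lists inputs that have no aligned counterpart. The function name `unaligned_images`, the extension list, and the idea that align.py keeps each input's file stem are assumptions made for illustration, not documented behavior of this repo.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your inputs use.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def unaligned_images(src="imgs", dst="imgs_align"):
    """List image files in `src` that have no same-stem file in `dst`.

    Assumption: align.py writes each output under the input's stem
    (possibly with a different extension). Verify this for your version.
    """
    src, dst = Path(src), Path(dst)
    # A missing output directory simply means nothing has been aligned yet.
    done = {p.stem for p in dst.iterdir()} if dst.is_dir() else set()
    return sorted(
        p.name
        for p in src.iterdir()
        if p.suffix.lower() in IMAGE_EXTS and p.stem not in done
    )
```

Calling `unaligned_images()` from the repo root after running align.py returns the input file names that still lack an output, which may point to images that were skipped during alignment.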
Checkpoints
We provide checkpoints for the following models:
- DDIM: FFHQ128 (72M, 130M), Bedroom128, Horse128
- DiffAE (autoencoding only): FFHQ256, FFHQ128 (72M, 130M), Bedroom128, Horse128
- DiffAE (with latent DPM, can sample): FFHQ256, FFHQ128, Bedroom128, Horse128
- DiffAE's classifiers (for manipulation): FFHQ256's latent on CelebAHQ, FFHQ128's latent on CelebAHQ
Download the checkpoints and put them into a separate checkpoints directory. It should look like this:
checkpoints/
- bedroom128_autoenc
- last.ckpt # diffae checkpoint
- latent.ckpt # predicted z_sem on the dataset
- bedroom128_autoenc_latent
- last.ckpt # diffae + latent DPM checkpoint
- bedroom128_ddpm
- ...
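Before opening the notebooks, it can help to confirm that the layout above is in place. The sketch below is one way to do that. The `EXPECTED` table and the function name `missing_checkpoints` are hypothetical, and only the folders spelled out in the tree are listed. Extend the table for the other checkpoints you download.

```python
from pathlib import Path

# Hypothetical expectation table: folder name -> files that folder should hold.
# Only the entries written out in the tree above are included.
EXPECTED = {
    "bedroom128_autoenc": ["last.ckpt", "latent.ckpt"],
    "bedroom128_autoenc_latent": ["last.ckpt"],
}


def missing_checkpoints(root="checkpoints", expected=EXPECTED):
    """Return the relative paths of expected checkpoint files that are absent."""
    root = Path(root)
    missing = []
    for folder, files in expected.items():
        for name in files:
            if not (root / folder / name).is_file():
                missing.append((Path(folder) / name).as_posix())
    return missing
```

An empty list from `missing_checkpoints()` means every listed file was found. Otherwise, the returned paths show what still needs to be downloaded.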
LMDB Datasets
We do not own any of the following datasets. We provide ready-to-use LMDB datasets for convenience.
The directory tree should be:
datasets/
- bedroom256.lmdb
- celebahq256.lmdb
- celeba.lmdb
- ffhq256.lmdb
- horse256.lmdb
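A quick way to see which of these datasets are already in place is sketched below. The helper name `dataset_status` is hypothetical. It uses `exists()` rather than a file or directory check because an LMDB store can be either, depending on how it was created.

```python
from pathlib import Path

# Names copied from the tree above.
DATASETS = [
    "bedroom256.lmdb",
    "celebahq256.lmdb",
    "celeba.lmdb",
    "ffhq256.lmdb",
    "horse256.lmdb",
]


def dataset_status(root="datasets", names=DATASETS):
    """Map each expected dataset name to True if it exists under `root`.

    An LMDB store may be a single file or a directory, so both count.
    """
    root = Path(root)
    return {name: (root / name).exists() for name in names}
```

For example, `[n for n, ok in dataset_status().items() if not ok]` gives the datasets that are still missing. This only checks for presence and does not open or validate the LMDB contents.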
You can also download the datasets from their original sources and use the provided scripts to package them as LMDB files. The original sources are as follows:
- FFHQ (https://github.com/NVlabs/ffhq-dataset)
- CelebAHQ (https://github.com/switchablenorms/CelebAMask-HQ)
- CelebA (https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html)
- LSUN (https://github.com/fyu/lsun)
The conversion scripts are:
data_resize_bedroom.py
data_resize_celebhq.py
data_resize_celeba.py
data_resize_ffhq.py
data_resize_horse.py
Google drive: https://drive.google.com/drive/folders/1abNP4QKGbNnymjn8607BF0cwxX2L23jh?usp=sharing
Training
We provide scripts for training and evaluating DDIM and DiffAE (including latent DPM) on the following datasets: FFHQ128, FFHQ256, Bedroom128, Horse128, Celeba64 (D2C's crop).
Evaluation results (FIDs) are usually written to the eval directory.
Note: Most experiments require at least 4x V100s to train the DPM models and 1x 2080Ti to train the accompanying latent DPM.
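To check how many GPUs are visible before launching a run, a small sketch follows. It assumes the NVIDIA driver tools are installed and relies on `nvidia-smi -L`, which prints one "GPU n: ..." line per device. The function names are hypothetical, and the check only counts devices. It does not verify the GPU model or memory.

```python
import shutil
import subprocess


def count_gpus(listing):
    """Count devices in the text printed by `nvidia-smi -L`."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def gpu_listing():
    """Return the output of `nvidia-smi -L`, or "" if the tool is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return ""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    return result.stdout if result.returncode == 0 else ""


def enough_gpus(required, listing=None):
    """True if at least `required` GPUs are visible."""
    if listing is None:
        listing = gpu_listing()
    return count_gpus(listing) >= required
```

For instance, `enough_gpus(4)` can be checked before starting a DPM run and `enough_gpus(1)` before a latent DPM run.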
FFHQ128
# diffae
python run_ffhq128.py
# ddim
python run_ffhq128_ddim.py
A classifier (for manipulation) can be trained using:
python run_ffhq128_cls.py
FFHQ256
We only trained the DiffAE due to the high computation cost. This requires 8x V100s.
sbatch run_ffhq256.py
After the job is done, train the latent DPM (this requires only 1x 2080Ti):
python run_ffhq256_latent.py
A classifier (for manipulation) can be trained using:
python run_ffhq256_cls.py
Bedroom128
# diffae
python run_bedroom128.py
# ddim
python run_bedroom128_ddim.py
Horse128
# diffae
python run_horse128.py
# ddim
python run_horse128_ddim.py
Celeba64
This experiment can be run on 2080Ti's.
# diffae
python run_celeba64.py