Denoising Diffusion Implicit Models (DDIM)

Jiaming Song, Chenlin Meng and Stefano Ermon, Stanford

Implements sampling from an implicit model that is trained with the same procedure as Denoising Diffusion Probabilistic Model, but costs much less time and compute if you want to sample from it (click image below for a video demo):

Integration with 🤗 Diffusers library

DDIM is now also available in 🧨 Diffusers and accesible via the DDIMPipeline. Diffusers allows you to test DDIM in PyTorch in just a couple lines of code.

You can install diffusers as follows:

pip install diffusers torch accelerate

And then try out the model with just a couple lines of code:

from diffusers import DDIMPipeline

model_id = "google/ddpm-cifar10-32"

# load model and scheduler
ddim = DDIMPipeline.from_pretrained(model_id)

# run pipeline in inference (sample random noise and denoise)
image = ddim(num_inference_steps=50).images[0]

# save image
image.save("ddim_generated_image.png")

More DDPM/DDIM models compatible with hte DDIM pipeline can be found directly on the Hub

To better understand the DDIM scheduler, you can check out this introductionary google colab

The DDIM scheduler can also be used with more powerful diffusion models such as Stable Diffusion

You simply need to accept the license on the Hub, login with huggingface-cli login and install transformers:

pip install transformers

Then you can run:

from diffusers import StableDiffusionPipeline, DDIMScheduler

ddim = DDIMScheduler.from_config("runwayml/stable-diffusion-v1-5", subfolder="scheduler")
pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", scheduler=ddim)

image = pipeline("An astronaut riding a horse.").images[0]

image.save("astronaut_riding_a_horse.png")

Running the Experiments

The code has been tested on PyTorch 1.6.

Train a model

Training is exactly the same as DDPM with the following:

python main.py --config {DATASET}.yml --exp {PROJECT_PATH} --doc {MODEL_NAME} --ni

Sampling from the model

Sampling from the generalized model for FID evaluation

python main.py --config {DATASET}.yml --exp {PROJECT_PATH} --doc {MODEL_NAME} --sample --fid --timesteps {STEPS} --eta {ETA} --ni

where

ETA controls the scale of the variance (0 is DDIM, and 1 is one type of DDPM).
STEPS controls how many timesteps used in the process.
MODEL_NAME finds the pre-trained checkpoint according to its inferred path.

If you want to use the DDPM pretrained model:

python main.py --config {DATASET}.yml --exp {PROJECT_PATH} --use_pretrained --sample --fid --timesteps {STEPS} --eta {ETA} --ni

the --use_pretrained option will automatically load the model according to the dataset.

We provide a CelebA 64x64 model here, and use the DDPM version for CIFAR10 and LSUN.

If you want to use the version with the larger variance in DDPM: use the --sample_type ddpm_noisy option.

Sampling from the model for image inpainting

Use --interpolation option instead of --fid.

Sampling from the sequence of images that lead to the sample

Use --sequence option instead.

The above two cases contain some hard-coded lines specific to producing the image, so modify them according to your needs.

References and Acknowledgements

@article{song2020denoising,
  title={Denoising Diffusion Implicit Models},
  author={Song, Jiaming and Meng, Chenlin and Ermon, Stefano},
  journal={arXiv:2010.02502},
  year={2020},
  month={October},
  abbr={Preprint},
  url={https://arxiv.org/abs/2010.02502}
}

This implementation is based on / inspired by:

https://github.com/hojonathanho/diffusion (the DDPM TensorFlow repo),
https://github.com/pesser/pytorch_diffusion (PyTorch helper that loads the DDPM model), and
https://github.com/ermongroup/ncsnv2 (code structure).

ermongroup/ddim