Diffusion models tutorial

The public release of Stable Diffusion made one of the most powerful deep learning models for image generation available to everyone.

This repository contains:

Tutorials on the theory behind diffusion models and on the software frameworks used to implement them.
A collection of scripts and notebooks that can be used to generate images with Stable Diffusion.
A basic guide to prompt engineering.
A list of resources to dig deeper into the world of diffusion models.

1. 🚀 Quick start

📃 Read up

Check these slides for a short introduction to the ideas behind diffusion models and Stable Diffusion.

💻 Play with notebooks

To try out Stable Diffusion, run one of the Colab notebooks below.

Text to image
Image to image

To try out Stable Diffusion 2, you can run one of the Colab notebooks below.

Text to image
Inpainting
Super-resolution
Depth-to-image

⚒ Understand the theory and learn to build pipelines

Understand the theory behind diffusion models and learn how to code a simple diffusion model from scratch in this notebook.
Become familiar with the Stable Diffusion pipeline and the diffusers 🧨 library in this notebook.
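
As a taste of what coding a diffusion model from scratch involves, here is a minimal sketch of the forward (noising) process described in Denoising Diffusion Probabilistic Models [2]. It is not taken from the notebook: the function names, the toy 4-value "image", and the linear schedule endpoints (1e-4 to 0.02 over 1000 steps) are assumptions chosen for illustration. It uses only the Python standard library.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (endpoint values are assumed)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t: how much signal is left at step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng=random):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [1.0, -1.0, 0.5, 0.0]            # toy "image" with 4 values
rng = random.Random(0)                 # seeded for reproducibility
x_early = q_sample(x0, 10, alpha_bars, rng)    # still close to x0
x_late = q_sample(x0, 999, alpha_bars, rng)    # almost pure Gaussian noise
print(x_early)
print(x_late)
```

At an early step the sample stays close to the input, while at the last step alpha_bar is close to zero and almost no signal survives. A trained diffusion model learns to reverse this process step by step.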

To run the notebooks you need to have several libraries installed. You can do that by installing Anaconda (or Miniconda) and then creating the environment from one of the provided env files.

First, try to create the environment using environment.yml:

conda env create -f environment.yml

If that doesn't work, try env_flex.yml, which allows for a more flexible installation:

conda env create -f env_flex.yml

The risk here is that it will install more recent versions of the software packages, and the notebooks might then give some errors. You might also need this more flexible install if you are on Windows.
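
If you prefer to automate the two attempts above, the fallback can be sketched in Python as follows. This is only an illustration, not a script shipped with the repository: the function name create_environment and its run parameter are hypothetical, and it assumes conda is on your PATH (otherwise subprocess.run raises FileNotFoundError).

```python
import subprocess

# Env files in the order suggested above: strict first, flexible second.
ENV_FILES = ["environment.yml", "env_flex.yml"]


def create_environment(env_files=ENV_FILES, run=subprocess.run):
    """Try each env file in order; return the first one conda accepts, else None.

    `run` is injectable so the logic can be exercised without calling conda.
    """
    for env_file in env_files:
        result = run(["conda", "env", "create", "-f", env_file])
        if result.returncode == 0:
            return env_file
    return None


# Example (uncomment to actually call conda):
# chosen = create_environment()
# print("Environment created from:", chosen)
```

If the function returns env_flex.yml, keep in mind the caveat above about newer package versions.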

2. 💡 Prompt engineering guide

Let's say you want to draw an image of a lion. The raw prompt, lion, will usually give you images that are a bit chaotic or of lower quality.

To obtain better results, the prompt should be engineered. A basic recipe is the following:

raw prompt + style + artist + details

Examples of style are: Portrait, Realistic, Oil painting, Pencil drawing, Concept art.
Examples of artist are: Jan van Eyck (when style = Portrait), Vincent van Gogh (when style = Oil painting), Leonardo da Vinci (when style = Pencil drawing), and so on. Note that you can also mix artists to get original results.
Examples of details are: Unreal Engine if you want to add realistic lighting, 8k if you want to add more details, artstation if you want to make your image more artistic, and so on.
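
The recipe can be sketched as a small helper that joins the pieces in order. The function name build_prompt and the choice of a comma as separator are assumptions made for illustration; Stable Diffusion does not require any particular separator.

```python
def build_prompt(raw_prompt, style=None, artists=(), details=()):
    """Compose a prompt following: raw prompt + style + artist + details."""
    parts = [raw_prompt]
    if style:
        parts.append(style)
    parts.extend(artists)    # one or more artists; mixing them is allowed
    parts.extend(details)
    # Skip empty entries and join everything with commas.
    return ", ".join(part.strip() for part in parts if part and part.strip())


prompt = build_prompt(
    "a lion",
    style="Oil painting",
    artists=["Vincent van Gogh"],
    details=["highly detailed", "8k"],
)
print(prompt)
# a lion, Oil painting, Vincent van Gogh, highly detailed, 8k
```

Keeping the pieces separate makes it easy to try the same raw prompt with different styles or artists.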

Examples of elaborated prompts:

"Professional photograph of a lion with a black mane, high quality, highly detailed, award-winning, hd, 8k, awe-inspiring"

"retrofuturistic portrait of a lion in astro suit, space graphics art in background, close up, wlop, dan mumford, artgerm, liam brazier, peter mohrbacher, raw, featured in artstation, octane render, cinematic, elegant, intricate, 8 k"

To see more examples of prompts and get inspiration, check here. To find a prompt for a specific image, you can use this image classifier notebook.

3. 📚 Resources

Repositories

A web interface with tons of advanced features that runs locally - WebUI.
A WebUI extension to generate videos - Deforum WebUI

Colab notebooks (demo)

text2img and img2img with advanced features
Generate video animations (you need to download the weights from here and upload them to your Google Drive)
Find prompts with the interrogator
Stable Diffusion in TensorFlow/Keras
Image2Image pipeline for Stable Diffusion

Colab notebooks (tutorials)

Introduction to diffusers 🧨, the Hugging Face 🤗 library for diffusion models
Introduction to Stable Diffusion with diffusers 🧨
Training a diffusion model with diffusers 🧨
Denoising Diffusion Implicit Models in TensorFlow/Keras

Blogs

What are Diffusion Models? - Introduction to diffusion models and mathematical derivations.
The Annotated Diffusion Model - Step-by-step tutorial for building a diffusion model from scratch in PyTorch.
Generative Modeling by Estimating Gradients of the Data Distribution - Introduction to score-based generative models.

Papers

[1] Rombach, Robin, et al. "High-resolution image synthesis with latent diffusion models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.
[2] Ho, Jonathan, Ajay Jain, and Pieter Abbeel. "Denoising diffusion probabilistic models." Advances in Neural Information Processing Systems, 2020.
[3] Song, Yang, and Stefano Ermon. "Generative modeling by estimating gradients of the data distribution." Advances in Neural Information Processing Systems, 2019.

FilippoMB/Diffusion_models_tutorial