Deep Generative Models course, AIMasters, 2024

Description

The course is devoted to modern generative models (mostly in their application to computer vision).

We will study the following types of generative models:

  • autoregressive models,
  • latent variable models,
  • normalizing flow models,
  • adversarial models,
  • diffusion models.

Special attention is paid to the properties of the various classes of generative models, their interrelationships, their theoretical foundations, and methods of quality assessment.

The aim of the course is to introduce students to widely used advanced deep learning methods.

The course is accompanied by practical assignments that help you understand the principles of the models covered.

Contact the author to join the course or with any other questions :)

Materials

# Date Description Slides
1 February, 7 Lecture 1: Logistics. Generative models overview and motivation. Problem statement. Divergence minimization framework. Autoregressive models (PixelCNN). slides
Seminar 1: Introduction. Maximum likelihood estimation. Histograms. Bayes theorem. slides
2 February, 14 Lecture 2: Normalizing Flow (NF) intuition and definition. Forward and reverse KL divergence for NF. Linear NF. Gaussian autoregressive NF. slides
Seminar 2: PixelCNN. slides
3 February, 21 Lecture 3: Coupling layer (RealNVP). Continuous-in-time NF and neural ODE. Kolmogorov-Fokker-Planck equation for NF log-likelihood. FFJORD and Hutchinson's trace estimator. slides
Seminar 3: Planar and Radial Flows. Forward vs Reverse KL. slides
4 February, 28 Lecture 4: Adjoint method for continuous-in-time NF. Latent Variable Models (LVM). Variational lower bound (ELBO). slides
Seminar 4: RealNVP. slides
5 March, 6 Lecture 5: Variational EM-algorithm. Amortized inference, ELBO gradients, reparametrization trick. Variational Autoencoder (VAE). NF as VAE model. slides
Seminar 5: Gaussian Mixture Model (GMM). GMM and MLE. ELBO and EM-algorithm. GMM via EM-algorithm. Variational EM algorithm for GMM. slides
6 March, 20 Lecture 6: Discrete VAE latent representations. Vector quantization, straight-through gradient estimation (VQ-VAE). Gumbel-softmax trick (DALL-E). ELBO surgery and optimal VAE prior. slides
Seminar 6: VAE: Implementation hints. Vanilla 2D VAE coding. VAE on Binarized MNIST visualization. slides
7 March, 27 Lecture 7: NF-based VAE prior. Likelihood-free learning. GAN optimality theorem. slides
Seminar 7: Posterior collapse. Beta VAE on MNIST. slides
8 April, 3 Lecture 8: Wasserstein distance. Wasserstein GAN (WGAN). WGAN with gradient penalty (WGAN-GP). f-divergence minimization. slides
Seminar 8: KL vs JS divergences. Vanilla GAN in 1D coding. Mode collapse and vanishing gradients. Non-saturating GAN. slides
9 April, 10 Lecture 9: GAN evaluation. FID, MMD, Precision-Recall, truncation trick. Langevin dynamics. Score matching. slides
Seminar 9: WGAN and WGAN-GP on 1D data. slides
10 April, 17 Lecture 10: Denoising score matching. Noise Conditioned Score Network (NCSN). Gaussian diffusion process: forward + reverse. slides
Seminar 10: StyleGAN. slides
11 April, 24 Lecture 11: Gaussian diffusion model as VAE, derivation of ELBO. Reparametrization of the Gaussian diffusion model. slides
Seminar 11: Noise Conditioned Score Network (NCSN). Gaussian diffusion model as VAE. slides
12 May, 8 Lecture 12: Denoising diffusion probabilistic model (DDPM): overview. Denoising diffusion as score-based generative model. Model guidance: classifier guidance, classifier-free guidance. slides
Seminar 12: Denoising diffusion probabilistic model (DDPM). Denoising Diffusion Implicit Models (DDIM). slides
13 May, 15 Lecture 13: SDE basics. Kolmogorov-Fokker-Planck equation. Probability flow ODE. Reverse SDE. Variance Preserving and Variance Exploding SDEs. slides
Seminar 13: Guidance. CLIP, GLIDE, DALL-E 2, Imagen, Latent Diffusion Model. slides

Homeworks

Homework Date Deadline Description Link
1 February, 14 February, 28
  1. Theory (Kernel density estimation, alpha-divergences, curse of dimensionality).
  2. PixelCNN (receptive field, autocomplete) on MNIST.
  3. ImageGPT on MNIST.
Open In Github
Open In Colab
2 February, 28 March, 13
  1. Theory (Sylvester flows, NF expressivity, Neural ODE Pontryagin theorem).
  2. RealNVP on 2D data.
  3. RealNVP on CIFAR10.
Open In Github
Open In Colab
3 March, 13 March, 27
  1. Theory (IWAE theory, MI in ELBO surgery, Gumbel-Max trick).
  2. ResNetVAE on CIFAR10.
  3. VQ-VAE with PixelCNN prior.
Open In Github
Open In Colab
4 March, 27 April, 17
  1. Theory (Least Squares GAN, Conjugate functions, FID for Normal distributions).
  2. WGAN/WGAN-GP on CIFAR10.
  3. Inception Score and FID.
Open In Github
Open In Colab
5 April, 17 May, 8
  1. Theory (Gaussian diffusion, Implicit score matching).
  2. Denoising score matching on 2D data.
  3. NCSN on MNIST.
Open In Github
Open In Colab
6 May, 8 May, 22
  1. Theory (Classifier guidance, spaced diffusion, KFP theorem).
  2. DDPM on 2D data.
  3. DDPM on MNIST.
Open In Github
Open In Colab

Game rules

  • 6 homeworks, 13 points each = 78 points
  • cozy oral exam = 26 points
  • maximum points: 78 + 26 = 104 points

Final grade: floor(relu(#points/8 - 2))
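
For reference, a minimal sketch of how the grading formula above evaluates (the helper name and the example value are purely illustrative, not part of the course materials):

```python
import math

def final_grade(points: float) -> int:
    # final grade = floor(relu(points / 8 - 2)); relu clips negative values to zero
    return math.floor(max(points / 8 - 2, 0))

# Example: a full score of 78 (homeworks) + 26 (exam) = 104 points
print(final_grade(104))  # floor(104 / 8 - 2) = floor(11) = 11
```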

Prerequisites

  • probability theory + statistics
  • machine learning + basics of deep learning
  • python + basics of one of the DL frameworks (pytorch/tensorflow/etc.)

Previous episodes