2022-1 Deep Learning and Applications

This course covers two topics in deep learning: self-supervised learning (SSL) and generative models.

Syllabus

  1. Historical Review (AlexNet, DQN, Attention, Adam, GAN, ResNet, Transformer, Pretrained Model, SSL)
  2. Good Old Fashioned SSL (Jigsaw, BiGAN, RotNet, Auto-Encoding Transform, DeepCluster, Single Image SSL)
  3. Convnet-based SSL (DrLIM, Contrastive Predictive Coding, SimCLR, MoCo, BYOL, SimCLRv2, SwAV, Barlow Twins)
  4. Transformer-based SSL (Transformer, ViT, Swin Transformer, DINO, EsViT)
  5. Language-domain SSL (GPT, GPT-2, BERT, RoBERTa, ALBERT, GPT-3)
  6. Generative Model 1 (NADE, PixelRNN, PixelCNN)
  7. Generative Model 2 (VAE, WAE, GAN, PlanarFlow)
  8. Generative Model 3 (DDPM)
  9. Generative Model 4 (DDIM)
  10. Generative Model 5 (InfoGAN, VQ-VAE, VQ-VAE2)
  11. Generative Model 6 (ADM, CFG, GLIDE, DALL-E2)

Paper Lists

  • Jigsaw: "Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles," 2017
  • BiGAN: "Adversarial Feature Learning," 2017
  • RotNet: "Unsupervised Representation Learning by Predicting Image Rotations," 2018
  • Auto-Encoding Transform: "AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data," 2019
  • DeepCluster: "Deep Clustering for Unsupervised Learning of Visual Features," 2019
  • Single Image SSL: "A Critical Analysis of Self-Supervision, or What We Can Learn from a Single Image," 2020
  • DrLIM: "Dimensionality Reduction by Learning an Invariant Mapping," 2006
  • Contrastive Predictive Coding: "Representation Learning with Contrastive Predictive Coding," 2019
  • SimCLR: "A Simple Framework for Contrastive Learning of Visual Representations," 2020
  • MoCo: "Momentum Contrast for Unsupervised Visual Representation Learning," 2020
  • BYOL: "Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning," 2020
  • SimCLRv2: "Big Self-Supervised Models are Strong Semi-Supervised Learners," 2020
  • SwAV: "Unsupervised Learning of Visual Features by Contrasting Cluster Assignments," 2021
  • Barlow Twins: "Barlow Twins: Self-Supervised Learning via Redundancy Reduction," 2021
  • Transformer: "Attention is All You Need," 2017
  • ViT: "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale," 2021
  • Swin Transformer: "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows," 2021
  • DINO: "Emerging Properties in Self-Supervised Vision Transformers," 2021
  • EsViT: "Efficient Self-supervised Vision Transformers for Representation Learning," 2021
  • GPT: "Improving Language Understanding by Generative Pre-Training," 2018
  • GPT-2: "Language Models are Unsupervised Multitask Learners," 2019
  • BERT: "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," 2019
  • RoBERTa: "RoBERTa: A Robustly Optimized BERT Pretraining Approach," 2019
  • ALBERT: "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations," 2020
  • GPT-3: "Language Models are Few-Shot Learners," 2020
  • NADE: "Neural Autoregressive Distribution Estimation," 2016
  • PixelRNN: "Pixel Recurrent Neural Networks," 2016
  • PixelCNN: "Conditional Image Generation with PixelCNN Decoders," 2016
  • VAE: "Auto-Encoding Variational Bayes," 2013
  • WAE: "Wasserstein Auto-Encoders," 2017
  • GAN: "Generative Adversarial Networks," 2014
  • PlanarFlow: "Variational Inference with Normalizing Flows," 2016
  • DDPM: "Denoising Diffusion Probabilistic Models," 2020
  • InfoGAN: "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets," 2016
  • VQ-VAE: "Neural Discrete Representation Learning," 2018
  • VQ-VAE2: "Generating Diverse High-Fidelity Images with VQ-VAE-2," 2019
  • DDIM: "Denoising Diffusion Implicit Models," 2020
  • IDDPM: "Improved Denoising Diffusion Probabilistic Models," 2021
  • ADM: "Diffusion Models Beat GANs on Image Synthesis," 2021
  • CFG: "Classifier-Free Diffusion Guidance," 2021
  • ImageBART: "ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis," 2021
  • DiffusionGAN: "Tackling the Generative Learning Trilemma with Denoising Diffusion GANs," 2021
  • GLIDE: "GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models," 2022
  • DALL-E2: "Hierarchical Text-Conditional Image Generation with CLIP Latents," 2022

This syllabus is subject to change or revision as needed to best realize the educational goals of the course.