Advanced Deep Learning @ KAIST

Course Information

Instructor: Sung Ju Hwang (sjhwang82@kaist.ac.kr)

Teaching Assistants:
Haebeom Lee (haebeom.lee@kaist.ac.kr)
Moonsu Han (mshan92@kaist.ac.kr)
Sunghyun Park (psh01087@kaist.ac.kr)

Office: E3-1, Room 1427 (Instructor) Room 1435 (TAs)
Office hours: By appointment only.

Grading Policy

Absolute Grading
Paper Presentation: 20%
Attendance and Participation: 20%
Project: 60%

Tentative Schedule

Dates	Topic
3/17	Course Introduction
3/19	Review of Deep Learning Basics
3/24	Review of Deep Learning Basics
3/26	Bayesian Deep Learning (Lecture)
3/31	Bayesian Deep Learning (Lecture)
4/2	Bayesian Deep Learning (Presentation)
4/7	Generative Adversarial Networks (Lecture)
4/9	Generative Adversarial Networks (Presentation)
4/14	Autoregressive and Flow-Based Generative Models (Lecture)
4/16	Autoregressive and Flow-Based Generative Models (Presentation), Project Proposal Due 4/17
4/23	Deep Reinforcement Learning (Lecture)
4/28	Deep Reinforcement Learning (Lecture)
4/30	Deep Reinforcement Learning (Presentation)
5/7	Mid-term Presentation
5/12	Memory- and Computation-Efficient Deep Learning (Lecture), Project Meetings
5/14	Memory- and Computation-Efficient Deep Learning (Presentation), Project Meetings
5/19	Meta-Learning (Lecture)
5/21	Meta-Learning (Presentation)
5/26	Continual Learning (Lecture)
5/28	Continual Learning (Presentation)
6/9	Interpretable Deep Learning (Lecture)
6/11	Interpretable Deep Learning (Presentation)
6/16	Reliable Deep Learning (Lecture), Project Meetings
6/18	Reliable Deep Learning (Presentation), Project Meetings, Final Paper Due 6/19
6/19	Adversarial Deep Learning (Video Lecture)
6/23	Graph Neural Networks (Lecture)
6/25	Graph Neural Networks (Presentation)
6/26	(Online) Workshop

Reading List

Bayesian Deep Learning

[Kingma and Welling 14] Auto-Encoding Variational Bayes, ICLR 2014.
[Kingma et al. 15] Variational Dropout and the Local Reparameterization Trick, NIPS 2015.
[Blundell et al. 15] Weight Uncertainty in Neural Networks, ICML 2015.
[Gal and Ghahramani 16] Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016.
[Gal et al. 17] Concrete Dropout, NIPS 2017.
[Gal et al. 17] Deep Bayesian Active Learning with Image Data, ICML 2017.
[Kendal and Gal 17] What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?, ICML 2017.
[Teye et al. 18] Bayesian Uncertainty Estimation for Batch Normalized Deep Networks, ICML 2018.
[Garnelo et al. 18] Conditional Neural Process, ICML 2018.
[Kim et al. 19] Attentive Neural Processes, ICLR 2019.

[Sun et al. 19] Functional Variational Bayesian Neural Networks, ICLR 2019.
[Louizos et al. 19] The Functional Neural Process, NeurIPS 2019.
[Maddox et al. 19] A Simple Baseline for Bayesian Uncertainty in Deep Learning, NeurIPS 2019.

Generative Adversarial Networks

[Goodfellow et al. 14] Generative Adversarial Nets, NIPS 2014.
[Radford et al. 15] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016.
[Chen et al. 16] InfoGAN: Interpreting Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016.
[Arjovsky et al. 17] Wasserstein Generative Adversarial Networks, ICML 2017.
[Zhu et al. 17] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017.
[Zhang et al. 17] Adversarial Feature Matching for Text Generation, ICML 2017.
[Karras et al. 18] Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018.

[Brock et al. 19] Large Scale GAN Training for High-Fidelity Natural Image Synthesis, ICLR 2019.
[Karras et al. 19] A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR 2019.
[Xu et al. 19] Modeling Tabular Data using Conditional GAN, NeurIPS 2019.

Autoregressive and Flow-Based Generative Models

[Rezende and Mohamed 15] Variational Inference with Normalizing Flows, ICML 2015.
[Germain et al. 15] MADE: Masked Autoencoder for Distribution Estimation, ICML 2015.
[Kingma et al. 16] Improved Variational Inference with Inverse Autoregressive Flow, NIPS 2016.
[Oord et al. 16] Pixel Recurrent Neural Networks, ICML 2016.
[Dinh et al. 17] Density Estimation Using Real NVP, ICLR 2017.
[Papamakarios et al. 17 Masked Autoregressive Flow for Density Estimation, NIPS 2017.
[Huang et al.18] Neural Autoregressive Flows, ICML 2018.
[Kingma and Dhariwal 18] Glow: Generative Flow with Invertible 1x1 Convolutions, NeurIPS 2018.

[Ho et al. 19] Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design, ICML 2019.
[Chen et al. 19] Residual Flows for Invertible Generative Modeling, NeurIPS 2019.
[Kumar et al. 20] VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation, ICLR 2020.

Deep Reinforcement Learning

[Mnih et al. 13] Playing Atari with Deep Reinforcement Learning, NIPS Deep Learning Workshop 2013.
[Silver et al. 14] Deterministic Policy Gradient Algorithms, ICML 2014.
[Schulman et al. 15] Trust Region Policy Optimization, ICML 2015.
[Lillicrap et al. 16] Continuous Control with Deep Reinforcement Learning, ICLR 2016.
[Schaul et al. 16] Prioritized Experience Replay, ICLR 2016.
[Wang et al. 16] Dueling Network Architectures for Deep Reinforcement Learning, ICML 2016.
[Mnih et al. 16] Asynchronous Methods for Deep Reinforcement Learning, ICML 2016.
[Schulman et al. 17] Proximal Policy Optimization Algorithms, arXiv preprint, 2017.
[Nachum et al. 18] Data-Efficient Hierarchical Reinforcement Learning, NeurIPS 2018.
[Ha et al. 18] Recurrent World Models Facilitate Policy Evolution, NeurIPS 2018.

[Burda et al. 19] Large-Scale Study of Curiosity-Driven Learning, ICLR 2019.
[Vinyals et al. 19] Grandmaster level in StarCraft II using multi-agent reinforcement learning, Nature, 2019.
[Bellemare et al. 19] A Geometric Perspective on Optimal Representations for Reinforcement Learning, NeurIPS 2019.
[Janner et al. 19] When to Trust Your Model: Model-Based Policy Optimization, NeurIPS 2019.
[Fellows et al. 19] VIREL: A Variational Inference Framework for Reinforcement Learning, NeurIPS 2019.
[Hafner et al. 20] Dream to Control: Learning Behaviors by Latent Imagination, ICLR 2020.
[Kaiser et al. 20] Model Based Reinforcement Learning for Atari, ICLR 2020.

Memory and Computation-Efficient Deep Learning

[Han et al. 15] Learning both Weights and Connections for Efficient Neural Networks, NIPS 2015.
[Wen et al. 16] Learning Structured Sparsity in Deep Neural Networks, NIPS 2016
[Han et al. 16] Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR 2016
[Molchanov et al. 17] Variational Dropout Sparsifies Deep Neural Networks, ICML 2017
[Luizos et al. 17] Bayesian Compression for Deep Learning, NIPS 2017.
[Luizos et al. 18] Learning Sparse Neural Networks Through L0 Regularization, ICLR 2018.
[Howard et al. 18] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, CVPR 2018.
[Frankle and Carbin 19] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, ICLR 2019.

[Lee et al. 19] SNIP: Single-Shot Network Pruning Based On Connection Sensitivity, ICLR 2019.
[Liu et al. 19] Rethinking the Value of Network Pruning, ICLR 2019.
[Jung et al. 19] Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss, CVPR 2019.
[Morcos et al. 19] One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers, NeurIPS 2019.
[Renda et al. 20] Comparing Rewinding and Fine-tuning in Neural Network Pruning, ICLR 2020.

Meta Learning

[Santoro et al. 16] Meta-Learning with Memory-Augmented Neural Networks, ICML 2016
[Vinyals et al. 16] Matching Networks for One Shot Learning, NIPS 2016
[Edwards and Storkey 17] Towards a Neural Statistician, ICLR 2017
[Finn et al. 17] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017
[Snell et al. 17] Prototypical Networks for Few-shot Learning, NIPS 2017.
[Nichol et al. 18] On First-Order Meta-learning Algorithms, arXiv preprint, 2018.
[Lee and Choi 18] Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace, ICML 2018.
[Liu et al. 19] Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning, ICLR 2019

[Gordon et al. 19] Meta-Learning Probabilistic Inference for Prediction, ICLR 2019
[Ravi and Beatson 19] Amortized Bayesian Meta-Learning, ICLR 2019.
[Rakelly et al. 19] Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, ICML 2019.
[Lee et al. 20] Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks, ICLR 2020

Continual Learning

[Rusu et al. 16] Progressive Neural Networks, arXiv preprint, 2016
[Kirkpatrick et al. 17] Overcoming catastrophic forgetting in neural networks, PNAS 2017
[Lee et al. 17] Overcoming Catastrophic Forgetting by Incremental Moment Matching, NIPS 2017
[Shin et al. 17] Continual Learning with Deep Generative Replay, NIPS 2017.
[Lopez-Paz and Ranzato 17] Gradient Episodic Memory for Continual Learning, NIPS 2017.
[Yoon et al. 18] Lifelong Learning with Dynamically Expandable Networks, ICLR 2018.
[Nguyen et al. 18] Variational Continual Learning, ICLR 2018.
[Schwarz et al. 18] Progress & Compress: A Scalable Framework for Continual Learning, ICML 2018.

[Chaudhry et al. 19] Efficient Lifelong Learning with A-GEM, ICLR 2019.
[Rao et al. 19] Continual Unsupervised Representation Learning, NeurIPS 2019.
[Rolnick et al. 19] Experience Replay for Continual Learning, NeurIPS 2019.
[Yoon et al. 20] Scalable and Order-robust Continual Learning with Additive Parameter Decomposition, ICLR 2020.

Interpretable Deep Learning

[Ribeiro et al. 16] "Why Should I Trust You?" Explaining the Predictions of Any Classifier, KDD 2016
[Kim et al. 16] Examples are not Enough, Learn to Criticize! Criticism for Interpretability, NIPS 2016
[Choi et al. 16] RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism, NIPS 2016
[Koh et al. 17] Understanding Black-box Predictions via Influence Functions, ICML 2017
[Bau et al. 17] Network Dissection: Quantifying Interpretability of Deep Visual Representations, CVPR 2017
[Selvaraju et al. 17] Grad-CAM: Visual Explanation from Deep Networks via Gradient-based Localization, ICCV 2017.
[Kim et al. 18] Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), ICML 2018.
[Heo et al. 18] Uncertainty-Aware Attention for Reliable Interpretation and Prediction, NeurIPS 2018.

[Bau et al. 19] GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019.
[Guan et al. 19] Towards a Deep and Unified Understanding of Deep Neural Models in NLP, ICML 2019.
[Ghorbani et al. 19] Towards Automatic Concept-based Explanations, NeurIPS 2019.
[Chen et al. 19] This Looks Like That: Deep Learning for Interpretable Image Recognition, NeurIPS 2019.

Reliable Deep Learning

[Guo et al. 17] On Calibration of Modern Neural Networks, ICML 2017.
[Lakshminarayanan et al. 17] Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017.
[Liang et al. 18] Enhancing the Reliability of Out-of-distrubition Image Detection in Neural Networks, ICLR 2018.
[Lee et al. 18] Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples, ICLR 2018.
[Kuleshov et al. 18] Accurate Uncertainties for Deep Learning Using Calibrated Regression, ICML 2018.
[Jiang et al. 18] To Trust Or Not To Trust A Classifier, NeurIPS 2018.
[Madras et al. 18] Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer, NeurIPS 2018.

[Kull et al. 19] Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration, NeurIPS 2019.
[Thulasidasan et al. 19] On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks, NeurIPS 2019.
[Ovadia et al. 19] Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, NeurIPS 2019.
[Hendrycks et al. 20] AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty, ICLR 2020.

Deep Adversarial Learning

[Szegedy et al. 14] Intriguing Properties of Neural Networks, ICLR 2014.
[Goodfellow et al. 15] Explaining and Harnessing Adversarial Examples, ICLR 2015.
[Kurakin et al. 17] Adversarial Machine Learning at Scale, ICLR 2017.
[Madry et al. 18] Toward Deep Learning Models Resistant to Adversarial Attacks, ICLR 2018.
[Eykholt et al. 18] Robust Physical-World Attacks on Deep Learning Visual Classification.
[Athalye et al. 18] Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples, ICML 2018.

[Zhang et al. 19] Theoretically Principled Trade-off between Robustness and Accuracy, ICML 2019.
[Carmon et al. 19] Unlabeled Data Improves Adversarial Robustness, NeurIPS 2019.
[Ilyas et al. 19] Adversarial Examples are not Bugs, They Are Features, NeurIPS 2019.
[Li et al. 19] Certified Adversarial Robustness with Additive Noise, NeurIPS 2019.
[Tramèr and Boneh 19] Adversarial Training and Robustness for Multiple Perturbations, NeurIPS 2019.
[Shafahi et al. 19] Adversarial Training for Free!, NeurIPS 2019.
[Wong et al. 20] Fast is Better Than Free: Revisiting Adversarial Training, ICLR 2020.

Graph Neural Networks

[Li et al. 16] Gated Graph Sequence Neural Networks, ICLR 2016.
[Hamilton et al. 17] Inductive Representation Learning on Large Graphs, NIPS 2017.
[Kipf and Welling 17] Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017.
[Velickovic et al. 18] Graph Attention Networks, ICLR 2018.
[Ying et al. 18] Hierarchical Graph Representation Learning with Differentiable Pooling, NeurIPS 2018.

[Yun et al. 19] Graph Transformer Neteworks, NeurIPS 2019.
[Hu et al. 20] Strategies for Pre-training Graph Neural Networks, ICLR 2020.
[Vashishth et al. 20] Composition-based Multi-Relational Graph Convolutional Networks, ICLR 2020.

Neural Architecture Search

[Zoph and Le 17] Neural Architecture Search with Reinforcement Learning, ICLR 2017.
[Baker et al. 17] Designing Neural Network Architectures using Reinforcement Learning, ICLR 2017.
[Real et al. 17] Large-Scale Evolution of Image Classifiers, ICML 2017.
[Liu et al. 18] Hierarchical Representations for Efficient Architecture Search, ICLR 2018.
[Pham et al. 18] Efficient Neural Architecture Search via Parameters Sharing, ICML 2018.
[Luo et al. 18] Neural Architecture Optimization, NeurIPS 2018.
[Liu et al. 19] DARTS: Differentiable Architecture Search, ICLR 2019.
[Tan et al. 19] MnasNet: Platform-Aware Neural Architecture Search for Mobile, CVPR 2019.

[Cai et al. 19] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, ICLR 2019.
[Zhou et al. 19] BayesNAS: A Bayesian Approach for Neural Architecture Search, ICML 2019.
[Tan and Le 19] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019.
[Guo et al. 19] NAT: Neural Architecture Transformer for Accurate and Compact Architectures, NeurIPS 2019.
[Chen et al. 19] DetNAS: Backbone Search for Object Detection, NeurIPS 2019.
[Dong and Yang 20] NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search, ICLR 2020.
[Zela et al. 20] Understanding and Robustifying Differentiable Architecture Search, ICLR 2020.

Federated Learning

[Konečný et al. 16] Federated Optimization: Distributed Machine Learning for On-Device Intelligence, arXiv Preprint, 2016.
[Konečný et al. 16] Federated Learning: Strategies for Improving Communication Efficiency, NIPS Workshop on Private Multi-Party Machine Learning 2016.
[McMahan et al. 17] Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS 2017.
[Smith et al. 17] Federated Multi-Task Learning, NIPS 2017.
[Li et al. 20] Federated Optimization in Heterogeneous Networks, MLSys 2020.

[Mohri et al. 19] Agnostic Federated Learning, ICML 2019.
[Yurochkin et al. 19] Bayesian Nonparametric Federated Learning of Neural Networks, ICML 2019.
[Bonawitz et al. 19] Towards Federated Learning at Scale: System Design, MLSys 2019.
[Wang et al. 20] Federated Learning with Matched Averaging, ICLR 2020.
[Li et al. 20] On the Convergence of FedAvg on Non-IID data, ICLR 2020.

lsy617004926/AdvancedML