Reading list for the Advanced Machine Learning Course

Advanced Deep Learning @ KAIST

Course Information

Instructor: Sung Ju Hwang (sjhwang82@kaist.ac.kr)
TAs: Haebeom Lee (haebeom.lee@kaist.ac.kr), Jinheon Baek (jinheon.baek@kaist.ac.kr), Seul Lee (animecult@kaist.ac.kr), Wonyong Jeong (wyjeong@kaist.ac.kr)

Office: E3-1, Room 1427 (Instructor), Room 1435 (TAs). Note that this is an online course.
Office hours: By appointment only.

Grading Policy

  • Absolute Grading
  • Paper Presentation: 20%
  • Attendance and Participation: 20%
  • Project: 60%

Tentative Schedule

Dates Topic
8/31 Course Introduction
9/2 Review of Deep Learning Basics (Video Lecture)
9/7 Bayesian Deep Learning - VAEs and BNNs (Lecture)
9/9 Bayesian Deep Learning - Bayesian Approximations, Modeling Uncertainty (Lecture) Review Due September 12th
9/14 Bayesian Deep Learning (Presentation)
9/16 Deep Generative Models - Generative Adversarial Networks (Lecture)
9/28 Deep Generative Models - Autoregressive and Flow-Based Models (Lecture) Review Due
9/30 Deep Generative Models (Presentation)
10/5 Deep Reinforcement Learning - Value-based RL (Lecture)
10/7 Deep Reinforcement Learning - Policy and Model-based RL (Lecture) Project Proposal and Review Due October 7th.
10/12 Deep Reinforcement Learning (Presentation)
10/14 Memory- and Computation-Efficient Deep Learning (Lecture) Review Due
10/19 Memory- and Computation-Efficient Deep Learning (Presentation), Project Meetings
10/26 Meta-Learning (Lecture) Review Due
10/28 Meta-Learning (Presentation)
11/2 Meta-Learning (Presentation), Continual Learning (Lecture) Review Due
11/4 Continual Learning (Lecture, Presentation)
11/5 Mid-term Presentation (at 7PM)
11/9 Continual Learning (Presentation)
11/11 Adversarially-Robust Deep Learning (Lecture), Review Due, Project Meetings
11/16 Adversarially-Robust Deep Learning (Presentation), Project Meetings
11/23 Graph Neural Networks (Lecture) Review Due
11/25 Graph Neural Networks (Presentation)
11/30 Self-Supervised Learning (Lecture) Review Due
12/2 Self-Supervised Learning (Presentation)
12/7 Federated Learning (Lecture) Review Due
12/9 Federated Learning (Presentation)
12/14 Neural Architecture Search (Lecture), Review Due, Final Paper Due December 16th
12/18 Final Presentation

Reading List

Bayesian Deep Learning

[Kingma and Welling 14] Auto-Encoding Variational Bayes, ICLR 2014.
[Kingma et al. 15] Variational Dropout and the Local Reparameterization Trick, NIPS 2015.
[Blundell et al. 15] Weight Uncertainty in Neural Networks, ICML 2015.
[Gal and Ghahramani 16] Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016.
[Liu et al. 16] Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, NIPS 2016.
[Mandt et al. 17] Stochastic Gradient Descent as Approximate Bayesian Inference, JMLR 2017.
[Kendall and Gal 17] What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?, NIPS 2017.
[Gal et al. 17] Concrete Dropout, NIPS 2017.
[Gal et al. 17] Deep Bayesian Active Learning with Image Data, ICML 2017.
[Teye et al. 18] Bayesian Uncertainty Estimation for Batch Normalized Deep Networks, ICML 2018.
[Garnelo et al. 18] Conditional Neural Processes, ICML 2018.
[Kim et al. 19] Attentive Neural Processes, ICLR 2019.
[Sun et al. 19] Functional Variational Bayesian Neural Networks, ICLR 2019.
[Louizos et al. 19] The Functional Neural Process, NeurIPS 2019.
[Zhang et al. 20] Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning, ICLR 2020.
[van Amersfoort et al. 20] Uncertainty Estimation Using a Single Deep Deterministic Neural Network, ICML 2020.
[Dusenberry et al. 20] Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors, ICML 2020.
[Wenzel et al. 20] How Good is the Bayes Posterior in Deep Neural Networks Really?, ICML 2020.


[Lee et al. 20] Bootstrapping Neural Processes, NeurIPS 2020.
[Wilson et al. 20] Bayesian Deep Learning and a Probabilistic Perspective of Generalization, NeurIPS 2020.
[Izmailov et al. 21] What Are Bayesian Neural Network Posteriors Really Like?, ICML 2021.
[Daxberger et al. 21] Bayesian Deep Learning via Subnetwork Inference, ICML 2021.

Deep Generative Models

VAEs, Autoregressive and Flow-Based Generative Models

[Rezende and Mohamed 15] Variational Inference with Normalizing Flows, ICML 2015.
[Germain et al. 15] MADE: Masked Autoencoder for Distribution Estimation, ICML 2015.
[Kingma et al. 16] Improved Variational Inference with Inverse Autoregressive Flow, NIPS 2016.
[Oord et al. 16] Pixel Recurrent Neural Networks, ICML 2016.
[Dinh et al. 17] Density Estimation Using Real NVP, ICLR 2017.
[Papamakarios et al. 17] Masked Autoregressive Flow for Density Estimation, NIPS 2017.
[Huang et al. 18] Neural Autoregressive Flows, ICML 2018.
[Kingma and Dhariwal 18] Glow: Generative Flow with Invertible 1x1 Convolutions, NeurIPS 2018.
[Ho et al. 19] Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design, ICML 2019.
[Chen et al. 19] Residual Flows for Invertible Generative Modeling, NeurIPS 2019.
[Tran et al. 19] Discrete Flows: Invertible Generative Models of Discrete Data, NeurIPS 2019.
[Ping et al. 20] WaveFlow: A Compact Flow-based Model for Raw Audio, ICML 2020.


[Vahdat and Kautz 20] NVAE: A Deep Hierarchical Variational Autoencoder, NeurIPS 2020.
[Ho et al. 20] Denoising Diffusion Probabilistic Models, NeurIPS 2020.
[Song et al. 21] Score-Based Generative Modeling through Stochastic Differential Equations, ICLR 2021.
[Kosiorek et al. 21] NeRF-VAE: A Geometry Aware 3D Scene Generative Model, ICML 2021.

Generative Adversarial Networks

[Goodfellow et al. 14] Generative Adversarial Nets, NIPS 2014.
[Radford et al. 16] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016.
[Chen et al. 16] InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016.
[Arjovsky et al. 17] Wasserstein Generative Adversarial Networks, ICML 2017.
[Zhu et al. 17] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017.
[Zhang et al. 17] Adversarial Feature Matching for Text Generation, ICML 2017.
[Karras et al. 18] Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018.
[Choi et al. 18] StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation, CVPR 2018.
[Brock et al. 19] Large Scale GAN Training for High-Fidelity Natural Image Synthesis, ICLR 2019.
[Karras et al. 19] A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR 2019.
[Karras et al. 20] Analyzing and Improving the Image Quality of StyleGAN, CVPR 2020.
[Sinha et al. 20] Small-GAN: Speeding up GAN Training using Core-Sets, ICML 2020.


[Karras et al. 20] Training Generative Adversarial Networks with Limited Data, NeurIPS 2020.
[Liu et al. 21] Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis, ICLR 2021.
[Hudson and Zitnick 21] Generative Adversarial Transformers, ICML 2021.
[Karras et al. 21] Alias-Free GAN, arXiv preprint, 2021.

Deep Reinforcement Learning

[Mnih et al. 13] Playing Atari with Deep Reinforcement Learning, NIPS Deep Learning Workshop 2013.
[Silver et al. 14] Deterministic Policy Gradient Algorithms, ICML 2014.
[Schulman et al. 15] Trust Region Policy Optimization, ICML 2015.
[Lillicrap et al. 16] Continuous Control with Deep Reinforcement Learning, ICLR 2016.
[Schaul et al. 16] Prioritized Experience Replay, ICLR 2016.
[Wang et al. 16] Dueling Network Architectures for Deep Reinforcement Learning, ICML 2016.
[Mnih et al. 16] Asynchronous Methods for Deep Reinforcement Learning, ICML 2016.
[Schulman et al. 17] Proximal Policy Optimization Algorithms, arXiv preprint, 2017.
[Nachum et al. 18] Data-Efficient Hierarchical Reinforcement Learning, NeurIPS 2018.
[Ha and Schmidhuber 18] Recurrent World Models Facilitate Policy Evolution, NeurIPS 2018.
[Burda et al. 19] Large-Scale Study of Curiosity-Driven Learning, ICLR 2019.
[Vinyals et al. 19] Grandmaster level in StarCraft II using multi-agent reinforcement learning, Nature, 2019.
[Bellemare et al. 19] A Geometric Perspective on Optimal Representations for Reinforcement Learning, NeurIPS 2019.
[Janner et al. 19] When to Trust Your Model: Model-Based Policy Optimization, NeurIPS 2019.
[Fellows et al. 19] VIREL: A Variational Inference Framework for Reinforcement Learning, NeurIPS 2019.
[Kumar et al. 19] Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, NeurIPS 2019.
[Kaiser et al. 20] Model-Based Reinforcement Learning for Atari, ICLR 2020.
[Agarwal et al. 20] An Optimistic Perspective on Offline Reinforcement Learning, ICML 2020.
[Lee et al. 20] Batch Reinforcement Learning with Hyperparameter Gradients, ICML 2020.


[Kumar et al. 20] Conservative Q-Learning for Offline Reinforcement Learning, NeurIPS 2020.
[Oh et al. 21] Learning to Sample with Local and Global Contexts in Experience Replay Buffer, ICLR 2021.
[Yarats et al. 21] Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels, ICLR 2021.
[Lee et al. 21] SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning, ICML 2021.

Memory- and Computation-Efficient Deep Learning

[Han et al. 15] Learning both Weights and Connections for Efficient Neural Networks, NIPS 2015.
[Wen et al. 16] Learning Structured Sparsity in Deep Neural Networks, NIPS 2016.
[Han et al. 16] Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR 2016.
[Molchanov et al. 17] Variational Dropout Sparsifies Deep Neural Networks, ICML 2017.
[Louizos et al. 17] Bayesian Compression for Deep Learning, NIPS 2017.
[Louizos et al. 18] Learning Sparse Neural Networks Through L0 Regularization, ICLR 2018.
[Howard et al. 17] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, arXiv preprint, 2017.
[Frankle and Carbin 19] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, ICLR 2019.
[Lee et al. 19] SNIP: Single-Shot Network Pruning Based On Connection Sensitivity, ICLR 2019.
[Liu et al. 19] Rethinking the Value of Network Pruning, ICLR 2019.
[Jung et al. 19] Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss, CVPR 2019.
[Morcos et al. 19] One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers, NeurIPS 2019.
[Renda et al. 20] Comparing Rewinding and Fine-tuning in Neural Network Pruning, ICLR 2020.
[Frankle et al. 20] Linear Mode Connectivity and the Lottery Ticket Hypothesis, ICML 2020.


[Tanaka et al. 20] Pruning Neural Networks without Any Data by Iteratively Conserving Synaptic Flow, NeurIPS 2020.
[van Baalen et al. 20] Bayesian Bits: Unifying Quantization and Pruning, NeurIPS 2020.
[de Jorge et al. 21] Progressive Skeletonization: Trimming more fat from a network at initialization, ICLR 2021.
[Stock et al. 21] Training with Quantization Noise for Extreme Model Compression, ICLR 2021.
[Lee et al. 21] Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization, ICCV 2021.

Meta-Learning

[Santoro et al. 16] Meta-Learning with Memory-Augmented Neural Networks, ICML 2016.
[Vinyals et al. 16] Matching Networks for One Shot Learning, NIPS 2016.
[Edwards and Storkey 17] Towards a Neural Statistician, ICLR 2017.
[Finn et al. 17] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017.
[Snell et al. 17] Prototypical Networks for Few-shot Learning, NIPS 2017.
[Nichol et al. 18] On First-Order Meta-learning Algorithms, arXiv preprint, 2018.
[Lee and Choi 18] Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace, ICML 2018.
[Liu et al. 19] Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning, ICLR 2019.
[Gordon et al. 19] Meta-Learning Probabilistic Inference for Prediction, ICLR 2019.
[Ravi and Beatson 19] Amortized Bayesian Meta-Learning, ICLR 2019.
[Rakelly et al. 19] Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, ICML 2019.
[Shu et al. 19] Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting, NeurIPS 2019.
[Finn et al. 19] Online Meta-Learning, ICML 2019.
[Lee et al. 20] Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks, ICLR 2020.
[Yin et al. 20] Meta-Learning without Memorization, ICLR 2020.
[Raghu et al. 20] Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML, ICLR 2020.
[Iakovleva et al. 20] Meta-Learning with Shared Amortized Variational Inference, ICML 2020.
[Bronskill et al. 20] TaskNorm: Rethinking Batch Normalization for Meta-Learning, ICML 2020.


[Rajendran et al. 20] Meta-Learning Requires Meta-Augmentation, NeurIPS 2020.
[Lee et al. 21] Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning, ICLR 2021.
[Shin et al. 21] Large-Scale Meta-Learning with Continual Trajectory Shifting, ICML 2021.
[Acar et al. 21] Memory Efficient Online Meta Learning, ICML 2021.
[Bai et al. 21] How Important is the Train-Validation Split in Meta-Learning?, ICML 2021.

Continual Learning

[Rusu et al. 16] Progressive Neural Networks, arXiv preprint, 2016.
[Kirkpatrick et al. 17] Overcoming catastrophic forgetting in neural networks, PNAS 2017.
[Lee et al. 17] Overcoming Catastrophic Forgetting by Incremental Moment Matching, NIPS 2017.
[Shin et al. 17] Continual Learning with Deep Generative Replay, NIPS 2017.
[Lopez-Paz and Ranzato 17] Gradient Episodic Memory for Continual Learning, NIPS 2017.
[Yoon et al. 18] Lifelong Learning with Dynamically Expandable Networks, ICLR 2018.
[Nguyen et al. 18] Variational Continual Learning, ICLR 2018.
[Schwarz et al. 18] Progress & Compress: A Scalable Framework for Continual Learning, ICML 2018.
[Chaudhry et al. 19] Efficient Lifelong Learning with A-GEM, ICLR 2019.
[Rao et al. 19] Continual Unsupervised Representation Learning, NeurIPS 2019.
[Rolnick et al. 19] Experience Replay for Continual Learning, NeurIPS 2019.
[Jerfel et al. 19] Reconciling Meta-Learning and Continual Learning with Online Mixtures of Tasks, NeurIPS 2019.
[Yoon et al. 20] Scalable and Order-robust Continual Learning with Additive Parameter Decomposition, ICLR 2020.
[Ramasesh et al. 20] Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics, Continual Learning Workshop, ICML 2020.


[Borsos et al. 20] Coresets via Bilevel Optimization for Continual Learning and Streaming, NeurIPS 2020.
[Mirzadeh et al. 20] Understanding the Role of Training Regimes in Continual Learning, NeurIPS 2020.
[Saha et al. 21] Gradient Projection Memory for Continual Learning, ICLR 2021.
[Veniat et al. 21] Efficient Continual Learning with Modular Networks and Task-Driven Priors, ICLR 2021.
[Kumar et al. 21] Bayesian Structural Adaptation for Continual Learning, ICML 2021.

Interpretable Deep Learning

[Ribeiro et al. 16] "Why Should I Trust You?": Explaining the Predictions of Any Classifier, KDD 2016.
[Kim et al. 16] Examples are not Enough, Learn to Criticize! Criticism for Interpretability, NIPS 2016.
[Choi et al. 16] RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism, NIPS 2016.
[Koh and Liang 17] Understanding Black-box Predictions via Influence Functions, ICML 2017.
[Bau et al. 17] Network Dissection: Quantifying Interpretability of Deep Visual Representations, CVPR 2017.
[Selvaraju et al. 17] Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, ICCV 2017.
[Kim et al. 18] Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), ICML 2018.
[Heo et al. 18] Uncertainty-Aware Attention for Reliable Interpretation and Prediction, NeurIPS 2018.
[Bau et al. 19] GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019.
[Ghorbani et al. 19] Towards Automatic Concept-based Explanations, NeurIPS 2019.


[Coenen et al. 19] Visualizing and Measuring the Geometry of BERT, NeurIPS 2019.
[Heo et al. 20] Cost-Effective Interactive Attention Learning with Neural Attention Processes, ICML 2020.
[Agarwal et al. 20] Neural Additive Models: Interpretable Machine Learning with Neural Nets, arXiv preprint, 2020.

Reliable Deep Learning

[Guo et al. 17] On Calibration of Modern Neural Networks, ICML 2017.
[Lakshminarayanan et al. 17] Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017.
[Liang et al. 18] Enhancing the Reliability of Out-of-distribution Image Detection in Neural Networks, ICLR 2018.
[Lee et al. 18] Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples, ICLR 2018.
[Kuleshov et al. 18] Accurate Uncertainties for Deep Learning Using Calibrated Regression, ICML 2018.
[Jiang et al. 18] To Trust Or Not To Trust A Classifier, NeurIPS 2018.
[Madras et al. 18] Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer, NeurIPS 2018.
[Maddox et al. 19] A Simple Baseline for Bayesian Uncertainty in Deep Learning, NeurIPS 2019.
[Kull et al. 19] Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration, NeurIPS 2019.
[Thulasidasan et al. 19] On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks, NeurIPS 2019.
[Ovadia et al. 19] Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, NeurIPS 2019.


[Hendrycks et al. 20] AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty, ICLR 2020.
[Filos et al. 20] Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?, ICML 2020.

Adversarially-Robust Deep Learning

[Szegedy et al. 14] Intriguing Properties of Neural Networks, ICLR 2014.
[Goodfellow et al. 15] Explaining and Harnessing Adversarial Examples, ICLR 2015.
[Kurakin et al. 17] Adversarial Machine Learning at Scale, ICLR 2017.
[Madry et al. 18] Towards Deep Learning Models Resistant to Adversarial Attacks, ICLR 2018.
[Eykholt et al. 18] Robust Physical-World Attacks on Deep Learning Visual Classification, CVPR 2018.
[Athalye et al. 18] Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples, ICML 2018.
[Zhang et al. 19] Theoretically Principled Trade-off between Robustness and Accuracy, ICML 2019.
[Carmon et al. 19] Unlabeled Data Improves Adversarial Robustness, NeurIPS 2019.
[Ilyas et al. 19] Adversarial Examples are not Bugs, They Are Features, NeurIPS 2019.
[Li et al. 19] Certified Adversarial Robustness with Additive Noise, NeurIPS 2019.
[Tramèr and Boneh 19] Adversarial Training and Robustness for Multiple Perturbations, NeurIPS 2019.
[Shafahi et al. 19] Adversarial Training for Free!, NeurIPS 2019.
[Wong et al. 20] Fast is Better Than Free: Revisiting Adversarial Training, ICLR 2020.
[Madaan et al. 20] Adversarial Neural Pruning with Latent Vulnerability Suppression, ICML 2020.
[Croce and Hein 20] Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks, ICML 2020.


[Maini et al. 20] Adversarial Robustness Against the Union of Multiple Perturbation Models, ICML 2020.
[Kim et al. 20] Adversarial Self-Supervised Contrastive Learning, NeurIPS 2020.
[Wu et al. 20] Adversarial Weight Perturbation Helps Robust Generalization, NeurIPS 2020.
[Laidlaw et al. 21] Perceptual Adversarial Robustness: Defense Against Unseen Threat Models, ICLR 2021.
[Pang et al. 21] Bag of Tricks for Adversarial Training, ICLR 2021.
[Madaan et al. 21] Learning to Generate Noise for Multi-Attack Robustness, ICML 2021.

Graph Neural Networks

[Li et al. 16] Gated Graph Sequence Neural Networks, ICLR 2016.
[Hamilton et al. 17] Inductive Representation Learning on Large Graphs, NIPS 2017.
[Kipf and Welling 17] Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017.
[Velickovic et al. 18] Graph Attention Networks, ICLR 2018.
[Ying et al. 18] Hierarchical Graph Representation Learning with Differentiable Pooling, NeurIPS 2018.
[Xu et al. 19] How Powerful are Graph Neural Networks?, ICLR 2019.
[Maron et al. 19] Provably Powerful Graph Networks, NeurIPS 2019.
[Yun et al. 19] Graph Transformer Networks, NeurIPS 2019.


[Loukas 20] What Graph Neural Networks Cannot Learn: Depth vs Width, ICLR 2020.
[Bianchi et al. 20] Spectral Clustering with Graph Neural Networks for Graph Pooling, ICML 2020.
[Xhonneux et al. 20] Continuous Graph Neural Networks, ICML 2020.
[Garg et al. 20] Generalization and Representational Limits of Graph Neural Networks, ICML 2020.
[Baek et al. 21] Accurate Learning of Graph Representations with Graph Multiset Pooling, ICLR 2021.
[Liu et al. 21] Elastic Graph Neural Networks, ICML 2021.

Self-Supervised Learning

[Dosovitskiy et al. 14] Discriminative Unsupervised Feature Learning with Convolutional Neural Networks, NIPS 2014.
[Pathak et al. 16] Context Encoders: Feature Learning by Inpainting, CVPR 2016.
[Noroozi and Favaro 16] Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, ECCV 2016.
[Gidaris et al. 18] Unsupervised Representation Learning by Predicting Image Rotations, ICLR 2018.
[He et al. 20] Momentum Contrast for Unsupervised Visual Representation Learning, CVPR 2020.
[Chen et al. 20] A Simple Framework for Contrastive Learning of Visual Representations, ICML 2020.
[Mikolov et al. 13] Efficient Estimation of Word Representations in Vector Space, ICLR 2013.
[Devlin et al. 19] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL 2019.
[Clark et al. 20] ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, ICLR 2020.
[Hu et al. 20] Strategies for Pre-training Graph Neural Networks, ICLR 2020.
[Chen et al. 20] Generative Pretraining from Pixels, ICML 2020.
[Laskin et al. 20] CURL: Contrastive Unsupervised Representations for Reinforcement Learning, ICML 2020.
[Grill et al. 20] Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, NeurIPS 2020.
[Chen et al. 20] Big Self-Supervised Models are Strong Semi-Supervised Learners, NeurIPS 2020.


[Chen and He 21] Exploring Simple Siamese Representation Learning, CVPR 2021.
[Tian et al. 21] Understanding Self-Supervised Learning Dynamics without Contrastive Pairs, ICML 2021.
[Caron et al. 21] Emerging Properties in Self-Supervised Vision Transformers, ICCV 2021.

Federated Learning

[Konečný et al. 16] Federated Optimization: Distributed Machine Learning for On-Device Intelligence, arXiv preprint, 2016.
[Konečný et al. 16] Federated Learning: Strategies for Improving Communication Efficiency, NIPS Workshop on Private Multi-Party Machine Learning 2016.
[McMahan et al. 17] Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS 2017.
[Smith et al. 17] Federated Multi-Task Learning, NIPS 2017.
[Li et al. 20] Federated Optimization in Heterogeneous Networks, MLSys 2020.
[Yurochkin et al. 19] Bayesian Nonparametric Federated Learning of Neural Networks, ICML 2019.
[Bonawitz et al. 19] Towards Federated Learning at Scale: System Design, MLSys 2019.
[Wang et al. 20] Federated Learning with Matched Averaging, ICLR 2020.
[Li et al. 20] On the Convergence of FedAvg on Non-IID data, ICLR 2020.
[Karimireddy et al. 20] SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, ICML 2020.
[Hamer et al. 20] FedBoost: Communication-Efficient Algorithms for Federated Learning, ICML 2020.
[Rothchild et al. 20] FetchSGD: Communication-Efficient Federated Learning with Sketching, ICML 2020.


[Pathak and Wainwright 20] FedSplit: An Algorithmic Framework for Fast Federated Optimization, NeurIPS 2020.
[Fallah et al. 20] Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach, NeurIPS 2020.
[Reddi et al. 21] Adaptive Federated Optimization, ICLR 2021.
[Jeong et al. 21] Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning, ICLR 2021.
[Yoon et al. 21] Federated Continual Learning with Weighted Inter-client Transfer, ICML 2021.
[Li et al. 21] Ditto: Fair and Robust Federated Learning Through Personalization, ICML 2021.

Neural Architecture Search

[Zoph and Le 17] Neural Architecture Search with Reinforcement Learning, ICLR 2017.
[Baker et al. 17] Designing Neural Network Architectures using Reinforcement Learning, ICLR 2017.
[Real et al. 17] Large-Scale Evolution of Image Classifiers, ICML 2017.
[Liu et al. 18] Hierarchical Representations for Efficient Architecture Search, ICLR 2018.
[Pham et al. 18] Efficient Neural Architecture Search via Parameters Sharing, ICML 2018.
[Luo et al. 18] Neural Architecture Optimization, NeurIPS 2018.
[Liu et al. 19] DARTS: Differentiable Architecture Search, ICLR 2019.
[Tan et al. 19] MnasNet: Platform-Aware Neural Architecture Search for Mobile, CVPR 2019.
[Cai et al. 19] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, ICLR 2019.
[Zhou et al. 19] BayesNAS: A Bayesian Approach for Neural Architecture Search, ICML 2019.
[Tan and Le 19] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019.
[Guo et al. 19] NAT: Neural Architecture Transformer for Accurate and Compact Architectures, NeurIPS 2019.
[Chen et al. 19] DetNAS: Backbone Search for Object Detection, NeurIPS 2019.
[Dong and Yang 20] NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search, ICLR 2020.
[Zela et al. 20] Understanding and Robustifying Differentiable Architecture Search, ICLR 2020.
[Cai et al. 20] Once-for-All: Train One Network and Specialize it for Efficient Deployment, ICLR 2020.
[Such et al. 20] Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data, ICML 2020.


[Liu et al. 20] Are Labels Necessary for Neural Architecture Search?, ECCV 2020.
[Dudziak et al. 20] BRP-NAS: Prediction-based NAS using GCNs, NeurIPS 2020.
[Li et al. 20] Neural Architecture Search in A Proxy Validation Loss Landscape, ICML 2020.
[Lee et al. 21] Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets, ICLR 2021.
[Mellor et al. 21] Neural Architecture Search without Training, ICML 2021.