/ReadingList

Reading list for ML (List from AI602 by professor Hwang

Reading List for Machine Learning

Reading list for ML (List from AI602. Advanced Deep Learning (KAIST) by professor Sung Ju Hwang (sjhwang82@kaist.ac.kr)

Reading List

Bayesian Deep Learning

[Kingma and Welling 14] Auto-Encoding Variational Bayes, ICLR 2014.
[Kingma et al. 15] Variational Dropout and the Local Reparameterization Trick, NIPS 2015.
[Blundell et al. 15] Weight Uncertainty in Neural Networks, ICML 2015.
[Gal and Ghahramani 16] Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016.
[Liu et al. 16] Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, NIPS 2016.
[Mandt et al. 17] Stochastic Gradient Descent as Approximate Bayesian Inference, JMLR 2017.
[Kendal and Gal 17] What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?, ICML 2017.
[Gal et al. 17] Concrete Dropout, NIPS 2017.
[Gal et al. 17] Deep Bayesian Active Learning with Image Data, ICML 2017.
[Teye et al. 18] Bayesian Uncertainty Estimation for Batch Normalized Deep Networks, ICML 2018.
[Garnelo et al. 18] Conditional Neural Process, ICML 2018.
[Kim et al. 19] Attentive Neural Processes, ICLR 2019.
[Sun et al. 19] Functional Variational Bayesian Neural Networks, ICLR 2019.


[Louizos et al. 19] The Functional Neural Process, NeurIPS 2019.
[Amersfoort et al. 20] Uncertainty Estimation Using a Single Deep Deterministic Neural Network, ICML 2020.
[Dusenberry et al. 20] Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors, ICML 2020.
[Wenzel et al. 20] How Good is the Bayes Posterior in Deep Neural Networks Really?, ICML 2020.
[Lee et al. 20] Bootstrapping Neural Processes, arXiv preprint 2020.

Deep Generative Models

VAEs, Autoregressive and Flow-Based Generative Models

[Rezende and Mohamed 15] Variational Inference with Normalizing Flows, ICML 2015.
[Germain et al. 15] MADE: Masked Autoencoder for Distribution Estimation, ICML 2015.
[Kingma et al. 16] Improved Variational Inference with Inverse Autoregressive Flow, NIPS 2016.
[Oord et al. 16] Pixel Recurrent Neural Networks, ICML 2016.
[Dinh et al. 17] Density Estimation Using Real NVP, ICLR 2017.
[Papamakarios et al. 17 Masked Autoregressive Flow for Density Estimation, NIPS 2017.
[Huang et al.18] Neural Autoregressive Flows, ICML 2018.
[Kingma and Dhariwal 18] Glow: Generative Flow with Invertible 1x1 Convolutions, NeurIPS 2018.
[Ho et al. 19] Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design, ICML 2019.


[Chen et al. 19] Residual Flows for Invertible Generative Modeling, NeurIPS 2019.
[Tran et al. 19] Discrete Flows: Invertible Generative Models of Discrete Data, NeurIPS 2019.
[Ping et al. 20] WaveFlow: A Compact Flow-based Model for Raw Audio, ICML 2020.
[Vahdat and Kautz 20] NVAE: A Deep Hierarchical Variational Autoencoder, arXiv preprint, 2020.

Generative Adversarial Networks

[Goodfellow et al. 14] Generative Adversarial Nets, NIPS 2014.
[Radford et al. 15] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016.
[Chen et al. 16] InfoGAN: Interpreting Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016.
[Arjovsky et al. 17] Wasserstein Generative Adversarial Networks, ICML 2017.
[Zhu et al. 17] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017.
[Zhang et al. 17] Adversarial Feature Matching for Text Generation, ICML 2017.
[Karras et al. 18] Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018.
[Brock et al. 19] Large Scale GAN Training for High-Fidelity Natural Image Synthesis, ICLR 2019.
[Karras et al. 19] A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR 2019.
[Xu et al. 19] Modeling Tabular Data using Conditional GAN, NeurIPS 2019.


[Karras et al. 20] Analyzing and Improving the Image Quality of StyleGAN, CVPR 2020.
[Zhao et al. 20] Feature Quantization Improves GAN Training, ICML 2020.
[Sinha et al. 20] Small-GAN: Speeding up GAN Training using Core-Sets, ICML 2020.

Deep Reinforcement Learning

[Mnih et al. 13] Playing Atari with Deep Reinforcement Learning, NIPS Deep Learning Workshop 2013.
[Silver et al. 14] Deterministic Policy Gradient Algorithms, ICML 2014.
[Schulman et al. 15] Trust Region Policy Optimization, ICML 2015.
[Lillicrap et al. 16] Continuous Control with Deep Reinforcement Learning, ICLR 2016.
[Schaul et al. 16] Prioritized Experience Replay, ICLR 2016.
[Wang et al. 16] Dueling Network Architectures for Deep Reinforcement Learning, ICML 2016.
[Mnih et al. 16] Asynchronous Methods for Deep Reinforcement Learning, ICML 2016.
[Schulman et al. 17] Proximal Policy Optimization Algorithms, arXiv preprint, 2017.
[Nachum et al. 18] Data-Efficient Hierarchical Reinforcement Learning, NeurIPS 2018.
[Ha et al. 18] Recurrent World Models Facilitate Policy Evolution, NeurIPS 2018.
[Burda et al. 19] Large-Scale Study of Curiosity-Driven Learning, ICLR 2019.
[Vinyals et al. 19] Grandmaster level in StarCraft II using multi-agent reinforcement learning, Nature, 2019.
[Bellemare et al. 19] A Geometric Perspective on Optimal Representations for Reinforcement Learning, NeurIPS 2019.
[Janner et al. 19] When to Trust Your Model: Model-Based Policy Optimization, NeurIPS 2019.
[Fellows et al. 19] VIREL: A Variational Inference Framework for Reinforcement Learning, NeurIPS 2019.


[Kaiser et al. 20] Model Based Reinforcement Learning for Atari, ICLR 2020.
[Agarwal et al. 20] An Optimistic Perspective on Offline Reinforcement Learning, ICML 2020.
[Fedus et al. 20] Revisiting Fundamentals of Experience Replay, ICML 2020.
[Lee et al. 20] Batch Reinforcement Learning with Hyperparameter Gradients, ICML 2020.
[Raileanu et al. 20] Automatic Data Augmentation for Generalization in Deep Reinforcement Learning, arXiv preprint, 2020.

Memory and Computation-Efficient Deep Learning

[Han et al. 15] Learning both Weights and Connections for Efficient Neural Networks, NIPS 2015.
[Wen et al. 16] Learning Structured Sparsity in Deep Neural Networks, NIPS 2016
[Han et al. 16] Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR 2016
[Molchanov et al. 17] Variational Dropout Sparsifies Deep Neural Networks, ICML 2017
[Luizos et al. 17] Bayesian Compression for Deep Learning, NIPS 2017.
[Luizos et al. 18] Learning Sparse Neural Networks Through L0 Regularization, ICLR 2018.
[Howard et al. 18] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, CVPR 2018.
[Frankle and Carbin 19] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, ICLR 2019.
[Lee et al. 19] SNIP: Single-Shot Network Pruning Based On Connection Sensitivity, ICLR 2019.
[Liu et al. 19] Rethinking the Value of Network Pruning, ICLR 2019.
[Jung et al. 19] Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss, CVPR 2019.
[Morcos et al. 19] One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers, NeurIPS 2019.


[Renda et al. 20] Comparing Rewinding and Fine-tuning in Neural Network Pruning, ICLR 2020.
[Ye et al. 20] Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection, ICML 2020.
[Frankle et al. 20] Linear Mode Connectivity and the Lottery Ticket Hypothesis, ICML 2020.
[Li et al. 20] Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers, ICML 2020.
[Nagel et al. 20] Up or Down? Adaptive Rounding for Post-Training Quantization, ICML 2020.
[Meng et al. 20] Training Binary Neural Networks using the Bayesian Learning Rule, ICML 2020.

Meta Learning

[Santoro et al. 16] Meta-Learning with Memory-Augmented Neural Networks, ICML 2016
[Vinyals et al. 16] Matching Networks for One Shot Learning, NIPS 2016
[Edwards and Storkey 17] Towards a Neural Statistician, ICLR 2017
[Finn et al. 17] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017
[Snell et al. 17] Prototypical Networks for Few-shot Learning, NIPS 2017.
[Nichol et al. 18] On First-Order Meta-learning Algorithms, arXiv preprint, 2018.
[Lee and Choi 18] Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace, ICML 2018.
[Liu et al. 19] Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning, ICLR 2019.
[Gordon et al. 19] Meta-Learning Probabilistic Inference for Prediction, ICLR 2019.
[Ravi and Beatson 19] Amortized Bayesian Meta-Learning, ICLR 2019.
[Rakelly et al. 19] Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, ICML 2019.
[Shu et al. 19] Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting, NeurIPS 2019.
[Finn et al. 19] Online Meta-Learning, ICML 2019.
[Lee et al. 20] Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks, ICLR 2020.


[Yin et al. 20] Meta-Learning without Memorization, ICLR 2020.
[Iakovleva et al. 20] Meta-Learning with Shared Amortized Variational Inference, ICML 2020.
[Bronskill et al. 20] TaskNorm: Rethinking Batch Normalization for Meta-Learning, ICML 2020.

Continual Learning

[Rusu et al. 16] Progressive Neural Networks, arXiv preprint, 2016
[Kirkpatrick et al. 17] Overcoming catastrophic forgetting in neural networks, PNAS 2017
[Lee et al. 17] Overcoming Catastrophic Forgetting by Incremental Moment Matching, NIPS 2017
[Shin et al. 17] Continual Learning with Deep Generative Replay, NIPS 2017.
[Lopez-Paz and Ranzato 17] Gradient Episodic Memory for Continual Learning, NIPS 2017.
[Yoon et al. 18] Lifelong Learning with Dynamically Expandable Networks, ICLR 2018.
[Nguyen et al. 18] Variational Continual Learning, ICLR 2018.
[Schwarz et al. 18] Progress & Compress: A Scalable Framework for Continual Learning, ICML 2018.
[Chaudhry et al. 19] Efficient Lifelong Learning with A-GEM, ICLR 2019.


[Rao et al. 19] Continual Unsupervised Representation Learning, NeurIPS 2019.
[Rolnick et al. 19] Experience Replay for Continual Learning, NeurIPS 2019.
[Jerfel et al. 20] Reconciling Meta-Learning and Continual Learning with Online Mixtures of Tasks, NeurIPS 2019.
[Yoon et al. 20] Scalable and Order-robust Continual Learning with Additive Parameter Decomposition, ICLR 2020.
[Knoblauch et al. 20] Optimal Continual Learning has Perfect Memory and is NP-HARD, ICML 2020.
[Remasesh et al. 20] Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics, Continual Learning Workshop, ICML 2020.

Interpretable Deep Learning

[Ribeiro et al. 16] "Why Should I Trust You?" Explaining the Predictions of Any Classifier, KDD 2016
[Kim et al. 16] Examples are not Enough, Learn to Criticize! Criticism for Interpretability, NIPS 2016
[Choi et al. 16] RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism, NIPS 2016
[Koh et al. 17] Understanding Black-box Predictions via Influence Functions, ICML 2017
[Bau et al. 17] Network Dissection: Quantifying Interpretability of Deep Visual Representations, CVPR 2017
[Selvaraju et al. 17] Grad-CAM: Visual Explanation from Deep Networks via Gradient-based Localization, ICCV 2017.
[Kim et al. 18] Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), ICML 2018.
[Heo et al. 18] Uncertainty-Aware Attention for Reliable Interpretation and Prediction, NeurIPS 2018.
[Bau et al. 19] GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019.


[Ghorbani et al. 19] Towards Automatic Concept-based Explanations, NeurIPS 2019.
[Coenen et al. 19] Visualizing and Measuring the Geometry of BERT, NeurIPS 2019.
[Heo et al. 20] Cost-Effective Interactive Attention Learning with Neural Attention Processes, ICML 2020.
[Agarwal et al. 20] Neural Additive Models: Interpretable Machine Learning with Neural Nets, arXiv preprint, 2020.

Reliable Deep Learning

[Guo et al. 17] On Calibration of Modern Neural Networks, ICML 2017.
[Lakshminarayanan et al. 17] Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017.
[Liang et al. 18] Enhancing the Reliability of Out-of-distrubition Image Detection in Neural Networks, ICLR 2018.
[Lee et al. 18] Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples, ICLR 2018.
[Kuleshov et al. 18] Accurate Uncertainties for Deep Learning Using Calibrated Regression, ICML 2018.
[Jiang et al. 18] To Trust Or Not To Trust A Classifier, NeurIPS 2018.
[Madras et al. 18] Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer, NeurIPS 2018.
[Maddox et al. 19] A Simple Baseline for Bayesian Uncertainty in Deep Learning, NeurIPS 2019.


[Kull et al. 19] Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration, NeurIPS 2019.
[Thulasidasan et al. 19] On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks, NeurIPS 2019.
[Ovadia et al. 19] Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, NeurIPS 2019.
[Hendrycks et al. 20] AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty, ICLR 2020.
[Filos et al. 20] Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?, ICML 2020.

Deep Adversarial Learning

[Szegedy et al. 14] Intriguing Properties of Neural Networks, ICLR 2014.
[Goodfellow et al. 15] Explaining and Harnessing Adversarial Examples, ICLR 2015.
[Kurakin et al. 17] Adversarial Machine Learning at Scale, ICLR 2017.
[Madry et al. 18] Toward Deep Learning Models Resistant to Adversarial Attacks, ICLR 2018.
[Eykholt et al. 18] Robust Physical-World Attacks on Deep Learning Visual Classification.
[Athalye et al. 18] Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples, ICML 2018.
[Zhang et al. 19] Theoretically Principled Trade-off between Robustness and Accuracy, ICML 2019.
[Carmon et al. 19] Unlabeled Data Improves Adversarial Robustness, NeurIPS 2019.
[Ilyas et al. 19] Adversarial Examples are not Bugs, They Are Features, NeurIPS 2019.
[Li et al. 19] Certified Adversarial Robustness with Additive Noise, NeurIPS 2019.
[Tramèr and Boneh 19] Adversarial Training and Robustness for Multiple Perturbations, NeurIPS 2019.
[Shafahi et al. 19] Adversarial Training for Free!, NeurIPS 2019.


[Wong et al. 20] Fast is Better Than Free: Revisiting Adversarial Training, ICLR 2020.
[Madaan et al. 20] Adversarial Neural Pruning with Latent Vulnerability Suppression, ICML 2020.
[Maini et al. 20] Adversarial Robustness Against the Union of Multiple Perturbation Models, ICML 2020.

Graph Neural Networks

[Li et al. 16] Gated Graph Sequence Neural Networks, ICLR 2016.
[Hamilton et al. 17] Inductive Representation Learning on Large Graphs, NIPS 2017.
[Kipf and Welling 17] Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017.
[Velickovic et al. 18] Graph Attention Networks, ICLR 2018.
[Ying et al. 18] Hierarchical Graph Representation Learning with Differentiable Pooling, NeurIPS 2018.
[Xu et al. 19] How Powerful are Graph Neural Networks?, ICLR 2019.
[Maron et al. 19] Provably Powerful Graph Networks, NeurIPS 2019.
[Yun et al. 19] Graph Transformer Neteworks, NeurIPS 2019.


[Loukas 20] What Graph Neural Networks Cannot Learn: Depth vs Width, ICLR 2020.
[Bianchi et al. 20] Spectral Clustering with Graph Neural Networks for Graph Pooling, ICML 2020.
[Xhonneux et al. 20] Continuous Graph Neural Networks, ICML 2020.
[Garg et al. 20] Generalization and Representational Limits of Graph Neural Networks, ICML 2020.
[Bécigneul et al. 20] Optimal Transport Graph Neural Networks, arXiv preprint 2020.

Neural Architecture Search

[Zoph and Le 17] Neural Architecture Search with Reinforcement Learning, ICLR 2017.
[Baker et al. 17] Designing Neural Network Architectures using Reinforcement Learning, ICLR 2017.
[Real et al. 17] Large-Scale Evolution of Image Classifiers, ICML 2017.
[Liu et al. 18] Hierarchical Representations for Efficient Architecture Search, ICLR 2018.
[Pham et al. 18] Efficient Neural Architecture Search via Parameters Sharing, ICML 2018.
[Luo et al. 18] Neural Architecture Optimization, NeurIPS 2018.
[Liu et al. 19] DARTS: Differentiable Architecture Search, ICLR 2019.
[Tan et al. 19] MnasNet: Platform-Aware Neural Architecture Search for Mobile, CVPR 2019.
[Cai et al. 19] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, ICLR 2019.
[Zhou et al. 19] BayesNAS: A Bayesian Approach for Neural Architecture Search, ICML 2019.
[Tan and Le 19] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019.
[Guo et al. 19] NAT: Neural Architecture Transformer for Accurate and Compact Architectures, NeurIPS 2019.
[Chen et al. 19] DetNAS: Backbone Search for Object Detection, NeurIPS 2019.
[Dong and Yang 20] NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search, ICLR 2020.
[Zela et al. 20] Understanding and Robustifying Differentiable Architecture Search, ICLR 2020.


[Such et al. 20] Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data, ICML 2020.
[Li et al. 20] Neural Architecture Search in A Proxy Validation Loss Landscape, ICML 2020.

Federated Learning

[Konečný et al. 16] Federated Optimization: Distributed Machine Learning for On-Device Intelligence, arXiv Preprint, 2016.
[Konečný et al. 16] Federated Learning: Strategies for Improving Communication Efficiency, NIPS Workshop on Private Multi-Party Machine Learning 2016.
[McMahan et al. 17] Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS 2017.
[Smith et al. 17] Federated Multi-Task Learning, NIPS 2017.
[Li et al. 20] Federated Optimization in Heterogeneous Networks, MLSys 2020.

[Yurochkin et al. 19] Bayesian Nonparametric Federated Learning of Neural Networks, ICML 2019.
[Bonawitz et al. 19] Towards Federated Learning at Scale: System Design, MLSys 2019.
[Wang et al. 20] Federated Learning with Matched Averaging, ICLR 2020.
[Li et al. 20] On the Convergence of FedAvg on Non-IID data, ICLR 2020.


[Karimireddy et al. 20] SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, ICML 2020.
[Yu et al. 20] Federated Learning with Only Positive Labels, ICML 2020.
[Hamer et al. 20] FedBoost: Communication-Efficient Algorithms for Federated Learning, ICML 2020.
[Rothchild et al. 20] FetchSGD: Communication-Efficient Federated Learning with Sketching, ICML 2020.
[Pathak and Wainwright 20] FedSplit: An Algorithminc Framework for Fast Federated Optimization, NeurIPS 2020.

Self-Supervised Learning

[Dosovitskiy et al. 14] Discriminative Unsupervised Feature Learning with Convolutional Neural Networks, NIPS 2014.
[Pathak et al. 16] Context Encoders: Feature Learning by Inpainting, CVPR 2016.
[Norrozi and Favaro et al. 16] Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, ECCV 2016.
[Gidaris et al. 18] Unsupervised Representation Learning by Predicting Image Rotations, ICLR 2018.
[He et al. 20] Momentum Contrast for Unsupervised Visual Representation Learning, CVPR 2020.
[Chen et al. 20] A Simple Framework for Contrastive Learning of Visual Representations, ICML 2020.
[Mikolov et al. 13] Efficient Estimation of Word Representations in Vector Space, ICLR 2013.
[Devlin et al. 19] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL 2019.
[Clark et al. 20] ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, ICLR 2020.
[Hu et al. 20] Strategies for Pre-training Graph Neural Networks, ICLR 2020.


[Chen et al. 20] Generative Pretraining from Pixels, ICML 2020.
[Grill et al. 20] Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, arXiv preprint, 2020.
[Chen et al. 20] Big Self-Supervised Models are Strong Semi-Supervised Learners, arXiv preprint, 2020.
[Laskin et al. 20] CURL: Contrastive Unsupervised Representations for Reinforcement Learning, ICML 2020.