Inspired by Adrian Colyer and Denny Britz.

This contains my notes for research papers that I've read. Papers are arranged into three broad categories and then rated on a (1) to (5) scale, where a (1) means I have only barely skimmed the paper and a (5) means I feel confident that I understand almost everything about it. Within a single year, papers should be ordered by publication date. The links here go to my paper summaries where I have them; papers without a summary are on my TODO list.

Reinforcement Learning and Imitation Learning

2018

Meta Learning Shared Hierarchies, ... (3)
Parameterized Hierarchical Procedures for Neural Programming, ... (2)

2017

NIPS, CoRL, IROS, etc.

One-Shot Imitation Learning, NIPS 2017 (3)
#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, NIPS 2017 (4)
Robust Imitation of Diverse Behaviors, NIPS 2017 (3)
Bridging the Gap Between Value and Policy Based Reinforcement Learning, NIPS 2017 (2)
Inferring The Latent Structure of Human Decision-Making from Raw Visual Inputs, NIPS 2017 (5)
Distral: Robust Multitask Reinforcement Learning, NIPS 2017 (1)
Imagination-Augmented Agents for Deep Reinforcement Learning, NIPS 2017 (1)
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, NIPS 2017 (1)
Hindsight Experience Replay, NIPS 2017 (1)
DART: Noise Injection for Robust Imitation Learning, CoRL 2017 (3)
Learning Deep Policies for Robot Bin Picking by Simulating Robust Grasping Sequences, CoRL 2017 (3)
DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations, CoRL 2017 (4)
End-to-End Learning of Semantic Grasping, CoRL 2017 (1)
One-Shot Visual Imitation Learning via Meta-Learning, CoRL 2017 (1)
Visual Semantic Planning using Deep Successor Representations, ICCV 2017 (4)
Proximal Policy Optimization Algorithms, arXiv (4)
Learning Human Behaviors From Motion Capture by Adversarial Imitation, arXiv (3)
The Uncertainty Bellman Equation and Exploration, arXiv (1)
Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets, arXiv (1)
UCB and InfoGain Exploration via Q-Ensembles, arXiv (1)
Equivalence Between Policy Gradients and Soft Q-Learning, arXiv (1)
Automatic Goal Generation for Reinforcement Learning Agents, arXiv (2)

ICML, UAI, IROS, etc.

Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World, IROS 2017 (4)
Virtual to Real Reinforcement Learning for Autonomous Driving, BMVC 2017 (3)
ReasoNet: Learning to Stop Reading in Machine Comprehension, KDD 2017 (3)
Inverse Reinforcement Learning via Deep Gaussian Process, UAI 2017 (2)
Reinforcement Learning with Deep Energy-Based Policies, ICML 2017 (2)
FeUdal Networks for Hierarchical Reinforcement Learning, ICML 2017 (1)
A Distributional Perspective on Reinforcement Learning, ICML 2017 (2)
Robust Adversarial Reinforcement Learning, ICML 2017 (5)
Modular Multitask Reinforcement Learning with Policy Sketches, ICML 2017 (2)
End-to-End Differentiable Adversarial Imitation Learning, ICML 2017 (4)
Constrained Policy Optimization, ICML 2017 (2)
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017 (4)
Curiosity-Driven Exploration by Self-Supervised Prediction, ICML 2017 (3)
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access, ACL 2017 (3)
Loss is its own Reward: Self-Supervision for Reinforcement Learning, arXiv (2)
Evolution Strategies as a Scalable Alternative to Reinforcement Learning, arXiv (5)

ICRA, ICLR, etc.

Imitating Driver Behavior with Generative Adversarial Networks, Intelligent Vehicles (IV) 2017 (4)
Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, ICLR 2017 (1)
Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening, ICLR 2017 (1)
Reinforcement Learning with Unsupervised Auxiliary Tasks, ICLR 2017 (3)
Learning to Act by Predicting the Future, ICLR 2017 (4)
Learning Visual Servoing with Deep Features and Fitted Q-Iteration, ICLR 2017 (2)
Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic, ICLR 2017 (2)
Stochastic Neural Networks for Hierarchical Reinforcement Learning, ICLR 2017 (4)
Third-Person Imitation Learning, ICLR 2017 (3)
Target-Driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning, ICRA 2017 (5)
Supervision via Competition: Robot Adversaries for Learning Tasks, ICRA 2017 (4)
Deep Visual Foresight for Planning Robot Motion, ICRA 2017 (3)
Multilateral Surgical Pattern Cutting in 2D Orthotropic Gauze with Deep Reinforcement Learning Policies for Tensioning, ICRA 2017 (5)
Comparing Human-Centric and Robot-Centric Sampling for Robot Deep Learning from Demonstrations, ICRA 2017 (4)
RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning, arXiv (3)
Learning to Predict Where to Look in Interactive Environments Using Deep Recurrent Q-Learning, arXiv (3)

2016

Value Iteration Networks, NIPS 2016 (4)
Generative Adversarial Imitation Learning, NIPS 2016 (3)
VIME: Variational Information Maximizing Exploration, NIPS 2016 (3)
Unsupervised Learning for Physical Interaction through Video Prediction, NIPS 2016 (1)
Deep Exploration via Bootstrapped DQN, NIPS 2016 (3)
Unifying Count-Based Exploration and Intrinsic Motivation, NIPS 2016 (1)
Principled Option Learning in Markov Decision Processes, EWRL 2016 (4)
Taming the Noise in Reinforcement Learning via Soft Updates, UAI 2016 (4)
Deep Successor Reinforcement Learning, arXiv 2016 (4)
Asynchronous Methods for Deep Reinforcement Learning, ICML 2016 (4)
Benchmarking Deep Reinforcement Learning for Continuous Control, ICML 2016 (4)
Model-Free Imitation Learning with Policy Optimization, ICML 2016 (4)
Graying the Black Box: Understanding DQNs, ICML 2016 (4)
Control of Memory, Active Perception, and Action in Minecraft, ICML 2016 (2)
Dueling Network Architectures for Deep Reinforcement Learning, ICML 2016 (4)
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, ICML 2016 (2)
Learning Deep Neural Network Policies with Continuous Memory States, ICRA 2016 (2)
Prioritized Experience Replay, ICLR 2016 (4)
High-Dimensional Continuous Control Using Generalized Advantage Estimation, ICLR 2016 (4)
Continuous Control with Deep Reinforcement Learning, ICLR 2016 (4)
Deep Spatial Autoencoders for Visuomotor Learning, ICRA 2016 (3)
End-to-End Training of Deep Visuomotor Policies, JMLR 2016 (2)
Learning the Variance of the Reward-To-Go, JMLR 2016 (3)
Deep Reinforcement Learning with Double Q-learning, AAAI 2016 (3)
Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature 2016 (1)

2015

Gradient Estimation Using Stochastic Computation Graphs, NIPS 2015 (1)
Learning Continuous Control Policies by Stochastic Value Gradients, NIPS 2015 (1)
Deep Attention Recurrent Q-Network, NIPS Workshop 2015 (3)
Deep Recurrent Q-Learning for Partially Observable MDPs, AAAI-SDMIA 2015 (5)
Trust Region Policy Optimization, ICML 2015 (4)
Probabilistic Inference for Determining Options in Reinforcement Learning, ICML Workshop 2015 (3)
Massively Parallel Methods for Deep Reinforcement Learning, ICML Workshop 2015 (2)
Human-Level Control Through Deep Reinforcement Learning, Nature 2015 (5)

2014

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, NIPS 2014 (3)
Learning Neural Network Policies with Guided Policy Search Under Unknown Dynamics, NIPS 2014 (1)
Deterministic Policy Gradient Algorithms, ICML 2014 (2)

2013

(More) Efficient Reinforcement Learning via Posterior Sampling, NIPS 2013 (1)
Playing Atari with Deep Reinforcement Learning, NIPS Workshop 2013 (5)
A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning, Foundations and Trends in Machine Learning 2013 (4)

2001 to 2012

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, AISTATS 2011 (3)
Maximum Entropy Inverse Reinforcement Learning, AAAI 2008 (4)
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, JMLR 2004 (1)

2000 and Earlier

Improving Generalisation for Temporal Difference Learning: The Successor Representation, Neural Computation 1993 (2)
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, Machine Learning 1992 (2)
Active Perception and Reinforcement Learning, Neural Computation 1990 (3)

Deep Learning

(Not counting deep RL and deep IL.)

2018

TBD...

2017

Dynamic Routing Between Capsules, NIPS 2017 (1)
Tensor Regression Networks, arXiv 2017 (2)
Improved Training of Wasserstein GANs, arXiv 2017 (1)
Wasserstein GAN, ICML 2017 (3)
PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications, ICLR 2017 (1)
FractalNet: Ultra-Deep Neural Networks Without Residuals, ICLR 2017 (1)
Making Neural Programming Architectures Generalize via Recursion, ICLR 2017 (1)
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, ICLR 2017 (1)
Categorical Reparameterization with Gumbel-Softmax, ICLR 2017 (1)
Energy-Based Generative Adversarial Networks, ICLR 2017 (1)
Towards Principled Methods for Training Generative Adversarial Networks, ICLR 2017 (1)
Unrolled Generative Adversarial Networks, ICLR 2017 (3)
Understanding Deep Learning Requires Rethinking Generalization, ICLR 2017 (5)
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, ICLR 2017 (2)
Least Squares Generative Adversarial Networks, arXiv (1)

2016

NIPS 2016 Tutorial: Generative Adversarial Networks, arXiv (4)
Improving Variational Autoencoders with Inverse Autoregressive Flow, NIPS 2016 (1)
Conditional Image Generation with PixelCNN Decoders, NIPS 2016 (2)
Using Fast Weights to Attend to the Recent Past, NIPS 2016 (2)
Improved Techniques for Training GANs, NIPS 2016 (3)
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016 (2)
WaveNet: A Generative Model for Raw Audio, arXiv (1)
Tutorial on Variational Autoencoders, arXiv (3)
Deep Residual Learning for Image Recognition, CVPR 2016 (1)
Pixel Recurrent Neural Networks, ICML 2016 (2)
Visualizing Deep Convolutional Neural Networks Using Natural Pre-Images, IJCV 2016 (1)
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016 (2)
A Note on the Evaluation of Generative Models, ICLR 2016 (1)
Neural Programmer-Interpreters, ICLR 2016 (1)
Visualizing and Understanding Recurrent Networks, ICLR Workshop 2016 (1)
Attention and Augmented Recurrent Neural Networks, Distill (3)

2015

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, ICCV 2015 (2)
Spatial Transformer Networks, NIPS 2015 (4)
Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks, NIPS 2015 (1)
Training Very Deep Networks, NIPS 2015 (2)
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, ICML 2015 (4)
DRAW: A Recurrent Neural Network For Image Generation, ICML 2015 (2)
Going Deeper with Convolutions, CVPR 2015 (1)
The Loss Surfaces of Multilayer Networks, AISTATS 2015 (3)
Very Deep Convolutional Networks for Large-Scale Image Recognition, ICLR 2015 (1)
Adam: A Method for Stochastic Optimization, ICLR 2015 (2)
Explaining and Harnessing Adversarial Examples, ICLR 2015 (2)

2014

Conditional Generative Adversarial Nets, arXiv 2014 (5)
Recurrent Neural Network Regularization, arXiv 2014 (1)
Generative Adversarial Nets, NIPS 2014 (5)
Recurrent Models of Visual Attention, NIPS 2014 (4)
Deep Learning in Neural Networks: An Overview, arXiv (1)
Visualizing and Understanding Convolutional Networks, ECCV 2014 (3)
Revisiting Natural Gradient for Deep Networks, ICLR 2014 (1)
Auto-Encoding Variational Bayes, ICLR 2014 (3)

2013

On the Difficulty of Training Recurrent Neural Networks, ICML 2013 (1)
On the Importance of Initialization and Momentum in Deep Learning, ICML 2013 (2)
Better Mixing via Deep Representations, ICML 2013 (1)
Maxout Networks, ICML 2013 (1)

2012

ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012 (5)
Large Scale Distributed Deep Networks, NIPS 2012 (1)
Training Deep and Recurrent Networks With Hessian-Free Optimization, Neural Networks: Tricks of the Trade, 2012 (1)

2011 and Earlier

Deep Learning via Hessian-Free Optimization, ICML 2010 (2)
Curriculum Learning, ICML 2009 (2)
A Fast Learning Algorithm for Deep Belief Nets, Neural Computation 2006 (1)

Miscellaneous

(Mostly about MCMC, Machine Learning, and/or Robotics.)

2018

2017

A Vision-Guided Multi-Robot Cooperation Framework for Learning-by-Demonstration and Task Reproduction, IROS 2017 (1)
Mini-batch Tempered MCMC, arXiv 2017 (3)
Using dVRK Teleoperation to Facilitate Deep Learning of Automation Tasks for an Industrial Robot, CASE 2017 (4)
Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics, RSS 2017 (3)
In-Datacenter Performance Analysis of a Tensor Processing Unit, ISCA 2017 (1)
Autonomous Suturing Via Surgical Robot: An Algorithm for Optimal Selection of Needle Diameter, Shape, and Path, ICRA 2017 (1)
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, IJRR (5)
On Markov Chain Monte Carlo Methods for Tall Data, JMLR 2017 (3)
A Conceptual Introduction to Hamiltonian Monte Carlo, arXiv (1)

2016

SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards, WAFR 2016 (2)
Minimum-Information LQG Control Part I: Memoryless Controllers, CDC 2016 (2)
Minimum-Information LQG Control Part II: Retentive Controllers, CDC 2016 (1)
Bayesian Optimization with Robust Bayesian Neural Networks, NIPS 2016 (2)
Tumor Localization using Automated Palpation with Gaussian Process Adaptive Sampling, CASE 2016 (3)
Robot Grasping in Clutter: Using a Hierarchy of Supervisors for Learning from Demonstrations, CASE 2016 (4)
Gradient Descent Converges to Minimizers, COLT 2016 (3)
Scalable Discrete Sampling as a Multi-Armed Bandit Problem, ICML 2016 (1)
Dex-Net 1.0: A Cloud-Based Network of 3D Objects for Robust Grasp Planning Using a Multi-Armed Bandit Model with Correlated Rewards, ICRA 2016 (4)
TSC-DL: Unsupervised Trajectory Segmentation of Multi-Modal Surgical Demonstrations with Deep Learning, ICRA 2016 (3)
Automating Multi-Throw Multilateral Surgical Suturing with a Mechanical Needle Guide and Sequential Convex Optimization, ICRA 2016 (4)

2015

A Complete Recipe for Stochastic Gradient MCMC, NIPS 2015 (2)
Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC, KDD 2015 (1)
The Fundamental Incompatibility of Scalable Hamiltonian Monte Carlo and Naive Data Subsampling, ICML 2015 (2)
Learning by Observation for Surgical Subtasks: Multilateral Cutting of 3D Viscoelastic and 2D Orthotropic Tissue Phantoms, ICRA 2015 (4)

2014

Learning Accurate Kinematic Control of Cable-Driven Surgical Robots Using Data Cleaning and Gaussian Process Regression, CASE 2014 (5)
Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget, ICML 2014 (4)
Stochastic Gradient Hamiltonian Monte Carlo, ICML 2014 (3)
Towards Scaling up Markov Chain Monte Carlo: An Adaptive Subsampling Approach, ICML 2014 (4)
Autonomous Multilateral Debridement with the Raven Surgical Robot, ICRA 2014 (5)
RRE: A Game-Theoretic Intrusion Response and Recovery Engine, IEEE Transactions on Parallel and Distributed Systems 2014 (4)

2013

A Case Study of Trajectory Transfer Through Non-Rigid Registration for a Simplified Suturing Scenario, IROS 2013 (2)
Finding Locally Optimal, Collision-Free Trajectories with Sequential Convex Optimization, RSS 2013 (3)
Learning Task Error Models for Manipulation, ICRA 2013 (4)

2012 and Earlier

Bayesian Learning via Stochastic Gradient Langevin Dynamics, ICML 2011 (4)
MCMC Using Hamiltonian Dynamics, Handbook of Markov Chain Monte Carlo 2010 (2)
Active Perception: Interactive Manipulation for Improving Object Detection, Technical Report 2010 (3)
Superhuman Performance of Surgical Tasks by Robots using Iterative Learning from Human-Guided Demonstrations, ICRA 2010 (2)
An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, Technical Report, 1994 (3)
Active Perception, Proceedings of the IEEE 1988 (2)
