Inspired by Adrian Colyer and Denny Britz.
This contains my notes for research papers that I've read. Papers are arranged into three broad categories, and each is rated on a scale from (1) to (5): a (1) means I have only barely skimmed the paper, while a (5) means I feel confident that I understand almost everything about it. Within a single year, papers should be organized according to publication date. The links here go to my paper summaries if I have them; otherwise, those papers are on my TODO list.
- Meta Learning Shared Hierarchies, ... (3)
- Parameterized Hierarchical Procedures for Neural Programming, ... (2)
NIPS, CoRL, IROS, etc.
- One-Shot Imitation Learning, NIPS 2017 (3)
- #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, NIPS 2017 (4)
- Robust Imitation of Diverse Behaviors, NIPS 2017 (3)
- Bridging the Gap Between Value and Policy Based Reinforcement Learning, NIPS 2017 (2)
- Inferring The Latent Structure of Human Decision-Making from Raw Visual Inputs, NIPS 2017 (5)
- Distral: Robust Multitask Reinforcement Learning, NIPS 2017 (1)
- Imagination-Augmented Agents for Deep Reinforcement Learning, NIPS 2017 (1)
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, NIPS 2017 (1)
- Hindsight Experience Replay, NIPS 2017 (1)
- DART: Noise Injection for Robust Imitation Learning, CoRL 2017 (3)
- Learning Deep Policies for Robot Bin Picking by Simulating Robust Grasping Sequences, CoRL 2017 (3)
- DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations, CoRL 2017 (4)
- End-to-End Learning of Semantic Grasping, CoRL 2017 (1)
- One-Shot Visual Imitation Learning via Meta-Learning, CoRL 2017 (1)
- Visual Semantic Planning using Deep Successor Representations, ICCV 2017 (4)
- Proximal Policy Optimization Algorithms, arXiv (4)
- Learning Human Behaviors From Motion Capture by Adversarial Imitation, arXiv (3)
- The Uncertainty Bellman Equation and Exploration, arXiv (1)
- Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets, arXiv (1)
- UCB and InfoGain Exploration via Q-Ensembles, arXiv (1)
- Equivalence Between Policy Gradients and Soft Q-Learning, arXiv (1)
- Automatic Goal Generation for Reinforcement Learning Agents, arXiv (2)
ICML, UAI, IROS, etc.
- Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World, IROS 2017 (4)
- Virtual to Real Reinforcement Learning for Autonomous Driving, BMVC 2017 (3)
- ReasoNet: Learning to Stop Reading in Machine Comprehension, KDD 2017 (3)
- Inverse Reinforcement Learning via Deep Gaussian Process, UAI 2017 (2)
- Reinforcement Learning with Deep Energy-Based Policies, ICML 2017 (2)
- FeUdal Networks for Hierarchical Reinforcement Learning, ICML 2017 (1)
- A Distributional Perspective on Reinforcement Learning, ICML 2017 (2)
- Robust Adversarial Reinforcement Learning, ICML 2017 (5)
- Modular Multitask Reinforcement Learning with Policy Sketches, ICML 2017 (2)
- End-to-End Differentiable Adversarial Imitation Learning, ICML 2017 (4)
- Constrained Policy Optimization, ICML 2017 (2)
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017 (4)
- Curiosity-Driven Exploration by Self-Supervised Prediction, ICML 2017 (3)
- Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access, ACL 2017 (3)
- Loss is its own Reward: Self-Supervision for Reinforcement Learning, arXiv (2)
- Evolution Strategies as a Scalable Alternative to Reinforcement Learning, arXiv (5)
ICRA, ICLR, etc.
- Imitating Driver Behavior with Generative Adversarial Networks, Intelligent Vehicles (IV), 2017 (4)
- Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, ICLR 2017 (1)
- Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening, ICLR 2017 (1)
- Reinforcement Learning with Unsupervised Auxiliary Tasks, ICLR 2017 (3)
- Learning to Act by Predicting the Future, ICLR 2017 (4)
- Learning Visual Servoing with Deep Features and Fitted Q-Iteration, ICLR 2017 (2)
- Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic, ICLR 2017 (2)
- Stochastic Neural Networks for Hierarchical Reinforcement Learning, ICLR 2017 (4)
- Third-Person Imitation Learning, ICLR 2017 (3)
- Target-Driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning, ICRA 2017 (5)
- Supervision via Competition: Robot Adversaries for Learning Tasks, ICRA 2017 (4)
- Deep Visual Foresight for Planning Robot Motion, ICRA 2017 (3)
- Multilateral Surgical Pattern Cutting in 2D Orthotropic Gauze with Deep Reinforcement Learning Policies for Tensioning, ICRA 2017 (5)
- Comparing Human-Centric and Robot-Centric Sampling for Robot Deep Learning from Demonstrations, ICRA 2017 (4)
- RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning, arXiv (3)
- Learning to Predict Where to Look in Interactive Environments Using Deep Recurrent Q-Learning, arXiv (3)
- Value Iteration Networks, NIPS 2016 (4)
- Generative Adversarial Imitation Learning, NIPS 2016 (3)
- VIME: Variational Information Maximizing Exploration, NIPS 2016 (3)
- Unsupervised Learning for Physical Interaction through Video Prediction, NIPS 2016 (1)
- Deep Exploration via Bootstrapped DQN, NIPS 2016 (3)
- Unifying Count-Based Exploration and Intrinsic Motivation, NIPS 2016 (1)
- Principled Option Learning in Markov Decision Processes, EWRL 2016 (4)
- Taming the Noise in Reinforcement Learning via Soft Updates, UAI 2016 (4)
- Deep Successor Reinforcement Learning, arXiv 2016 (4)
- Asynchronous Methods for Deep Reinforcement Learning, ICML 2016 (4)
- Benchmarking Deep Reinforcement Learning for Continuous Control, ICML 2016 (4)
- Model-Free Imitation Learning with Policy Optimization, ICML 2016 (4)
- Graying the Black Box: Understanding DQNs, ICML 2016 (4)
- Control of Memory, Active Perception, and Action in Minecraft, ICML 2016 (2)
- Dueling Network Architectures for Deep Reinforcement Learning, ICML 2016 (4)
- Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, ICML 2016 (2)
- Learning Deep Neural Network Policies with Continuous Memory States, ICRA 2016 (2)
- Prioritized Experience Replay, ICLR 2016 (4)
- High-Dimensional Continuous Control Using Generalized Advantage Estimation, ICLR 2016 (4)
- Continuous Control with Deep Reinforcement Learning, ICLR 2016 (4)
- Deep Spatial Autoencoders for Visuomotor Learning, ICRA 2016 (3)
- End-to-End Training of Deep Visuomotor Policies, JMLR 2016 (2)
- Learning the Variance of the Reward-To-Go, JMLR 2016 (3)
- Deep Reinforcement Learning with Double Q-learning, AAAI 2016 (3)
- Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature 2016 (1)
- Gradient Estimation Using Stochastic Computation Graphs, NIPS 2015 (1)
- Learning Continuous Control Policies by Stochastic Value Gradients, NIPS 2015 (1)
- Deep Attention Recurrent Q-Network, NIPS Workshop 2015 (3)
- Deep Recurrent Q-Learning for Partially Observable MDPs, AAAI-SDMIA 2015 (5)
- Trust Region Policy Optimization, ICML 2015 (4)
- Probabilistic Inference for Determining Options in Reinforcement Learning, ICML Workshop 2015 (3)
- Massively Parallel Methods for Deep Reinforcement Learning, ICML Workshop 2015 (2)
- Human-Level Control Through Deep Reinforcement Learning, Nature 2015 (5)
- Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, NIPS 2014 (3)
- Learning Neural Network Policies with Guided Policy Search Under Unknown Dynamics, NIPS 2014 (1)
- Deterministic Policy Gradient Algorithms, ICML 2014 (2)
- (More) Efficient Reinforcement Learning via Posterior Sampling, NIPS 2013 (1)
- Playing Atari with Deep Reinforcement Learning, NIPS Workshop 2013 (5)
- A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning, Foundations and Trends in Machine Learning 2013 (4)
- A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, AISTATS 2011 (3)
- Maximum Entropy Inverse Reinforcement Learning, AAAI 2008 (4)
- Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, JMLR 2004 (1)
- Improving Generalisation for Temporal Difference Learning: The Successor Representation, Neural Computation 1993 (2)
- Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, Machine Learning 1992 (2)
- Active Perception and Reinforcement Learning, Neural Computation 1990 (3)
(Not counting deep RL and deep IL.)
TBD...
- Dynamic Routing Between Capsules, NIPS 2017 (1)
- Tensor Regression Networks, arXiv 2017 (2)
- Improved Training of Wasserstein GANs, arXiv 2017 (1)
- Wasserstein GAN, ICML 2017 (3)
- PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications, ICLR 2017 (1)
- FractalNet: Ultra-Deep Neural Networks Without Residuals, ICLR 2017 (1)
- Making Neural Programming Architectures Generalize via Recursion, ICLR 2017 (1)
- The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, ICLR 2017 (1)
- Categorical Reparameterization with Gumbel-Softmax, ICLR 2017 (1)
- Energy-Based Generative Adversarial Networks, ICLR 2017 (1)
- Towards Principled Methods for Training Generative Adversarial Networks, ICLR 2017 (1)
- Unrolled Generative Adversarial Networks, ICLR 2017 (3)
- Understanding Deep Learning Requires Rethinking Generalization, ICLR 2017 (5)
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, ICLR 2017 (2)
- Least Squares Generative Adversarial Networks, arXiv (1)
- NIPS 2016 Tutorial: Generative Adversarial Networks, arXiv (4)
- Improving Variational Autoencoders with Inverse Autoregressive Flow, NIPS 2016 (1)
- Conditional Image Generation with PixelCNN Decoders, NIPS 2016 (2)
- Using Fast Weights to Attend to the Recent Past, NIPS 2016 (2)
- Improved Techniques for Training GANs, NIPS 2016 (3)
- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016 (2)
- WaveNet: A Generative Model for Raw Audio, arXiv (1)
- Tutorial on Variational Autoencoders, arXiv (3)
- Deep Residual Learning for Image Recognition, CVPR 2016 (1)
- Pixel Recurrent Neural Networks, ICML 2016 (2)
- Visualizing Deep Convolutional Neural Networks Using Natural Pre-Images, IJCV 2016 (1)
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016 (2)
- A Note on the Evaluation of Generative Models, ICLR 2016 (1)
- Neural Programmer-Interpreters, ICLR 2016 (1)
- Visualizing and Understanding Recurrent Networks, ICLR Workshop 2016 (1)
- Attention and Augmented Recurrent Neural Networks, Distill (3)
- Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, ICCV 2015 (2)
- Spatial Transformer Networks, NIPS 2015 (4)
- Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks, NIPS 2015 (1)
- Training Very Deep Networks, NIPS 2015 (2)
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, ICML 2015 (4)
- DRAW: A Recurrent Neural Network For Image Generation, ICML 2015 (2)
- Going Deeper with Convolutions, CVPR 2015 (1)
- The Loss Surfaces of Multilayer Networks, AISTATS 2015 (3)
- Very Deep Convolutional Networks for Large-Scale Image Recognition, ICLR 2015 (1)
- Adam: A Method for Stochastic Optimization, ICLR 2015 (2)
- Explaining and Harnessing Adversarial Examples, ICLR 2015 (2)
- Conditional Generative Adversarial Nets, arXiv 2014 (5)
- Recurrent Neural Network Regularization, arXiv 2014 (1)
- Generative Adversarial Nets, NIPS 2014 (5)
- Recurrent Models of Visual Attention, NIPS 2014 (4)
- Deep Learning in Neural Networks: An Overview, arXiv (1)
- Visualizing and Understanding Convolutional Networks, ECCV 2014 (3)
- Revisiting Natural Gradient for Deep Networks, ICLR 2014 (1)
- Auto-Encoding Variational Bayes, ICLR 2014 (3)
- On the Difficulty of Training Recurrent Neural Networks, ICML 2013 (1)
- On the Importance of Initialization and Momentum in Deep Learning, ICML 2013 (2)
- Better Mixing via Deep Representations, ICML 2013 (1)
- Maxout Networks, ICML 2013 (1)
- ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012 (5)
- Large Scale Distributed Deep Networks, NIPS 2012 (1)
- Training Deep and Recurrent Networks With Hessian-Free Optimization, Neural Networks: Tricks of the Trade, 2012 (1)
- Deep Learning via Hessian-Free Optimization, ICML 2010 (2)
- Curriculum Learning, ICML 2009 (2)
- A Fast Learning Algorithm for Deep Belief Nets, Neural Computation 2006 (1)
(Mostly about MCMC, Machine Learning, and/or Robotics.)
- Learning Robust Bed Making using Deep Imitation Learning with DART, ..., (4)
- Derivative-Free Failure Avoidance Control for Manipulation using Learned Support Constraints, ..., (1)
- A Vision-Guided Multi-Robot Cooperation Framework for Learning-by-Demonstration and Task Reproduction, IROS 2017 (1)
- Mini-batch Tempered MCMC, arXiv 2017 (3)
- Using dVRK Teleoperation to Facilitate Deep Learning of Automation Tasks for an Industrial Robot, CASE 2017 (4)
- Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics, RSS 2017 (3)
- In-Datacenter Performance Analysis of a Tensor Processing Unit, ISCA 2017 (1)
- Autonomous Suturing Via Surgical Robot: An Algorithm for Optimal Selection of Needle Diameter, Shape, and Path, ICRA 2017 (1)
- Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, IJRR (5)
- On Markov Chain Monte Carlo Methods for Tall Data, JMLR 2017 (3)
- A Conceptual Introduction to Hamiltonian Monte Carlo, arXiv (1)
- SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards, WAFR 2016 (2)
- Minimum-Information LQG Control Part I: Memoryless Controllers, CDC 2016 (2)
- Minimum-Information LQG Control Part II: Retentive Controllers, CDC 2016 (1)
- Bayesian Optimization with Robust Bayesian Neural Networks, NIPS 2016 (2)
- Tumor Localization using Automated Palpation with Gaussian Process Adaptive Sampling, CASE 2016 (3)
- Robot Grasping in Clutter: Using a Hierarchy of Supervisors for Learning from Demonstrations, CASE 2016 (4)
- Gradient Descent Converges to Minimizers, COLT 2016 (3)
- Scalable Discrete Sampling as a Multi-Armed Bandit Problem, ICML 2016 (1)
- Dex-Net 1.0: A Cloud-Based Network of 3D Objects for Robust Grasp Planning Using a Multi-Armed Bandit Model with Correlated Rewards, ICRA 2016 (4)
- TSC-DL: Unsupervised Trajectory Segmentation of Multi-Modal Surgical Demonstrations with Deep Learning, ICRA 2016 (3)
- Automating Multi-Throw Multilateral Surgical Suturing with a Mechanical Needle Guide and Sequential Convex Optimization, ICRA 2016 (4)
- A Complete Recipe for Stochastic Gradient MCMC, NIPS 2015 (2)
- Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC, KDD 2015 (1)
- The Fundamental Incompatibility of Scalable Hamiltonian Monte Carlo and Naive Data Subsampling, ICML 2015 (2)
- Learning by Observation for Surgical Subtasks: Multilateral Cutting of 3D Viscoelastic and 2D Orthotropic Tissue Phantoms, ICRA 2015 (4)
- Learning Accurate Kinematic Control of Cable-Driven Surgical Robots Using Data Cleaning and Gaussian Process Regression, CASE 2014 (5)
- Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget, ICML 2014 (4)
- Stochastic Gradient Hamiltonian Monte Carlo, ICML 2014 (3)
- Towards Scaling up Markov Chain Monte Carlo: An Adaptive Subsampling Approach, ICML 2014 (4)
- Autonomous Multilateral Debridement with the Raven Surgical Robot, ICRA 2014 (5)
- RRE: A Game-Theoretic Intrusion Response and Recovery Engine, IEEE Transactions on Parallel and Distributed Systems 2014 (4)
- A Case Study of Trajectory Transfer Through Non-Rigid Registration for a Simplified Suturing Scenario, IROS 2013 (2)
- Finding Locally Optimal, Collision-Free Trajectories with Sequential Convex Optimization, RSS 2013 (3)
- Learning Task Error Models for Manipulation, ICRA 2013 (4)
- Bayesian Learning via Stochastic Gradient Langevin Dynamics, ICML 2011 (4)
- MCMC Using Hamiltonian Dynamics, Handbook of Markov Chain Monte Carlo 2010 (2)
- Active Perception: Interactive Manipulation for Improving Object Detection, Technical Report 2010 (3)
- Superhuman Performance of Surgical Tasks by Robots using Iterative Learning from Human-Guided Demonstrations, ICRA 2010 (2)
- An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, Technical Report, 1994 (3)
- Active Perception, Proceedings of the IEEE 1988 (2)