Paper Notes

  • Enjoy yourself :D

RL Related

  • Tags: RL, IL, meta-learning, HRL, policy-based, value-based, model-based, model-free, on-policy, off-policy, etc.
Name | Conf | Arxiv | Tags
--- | --- | --- | ---
Trust Region Policy Optimization | ICML2015 | 1502.05477 | policy-based
The Option-Critic Architecture | AAAI2017 | 1609.05140 | HRL, option-critic
Learning to Act by Predicting the Future | ICLR2017 | 1611.01779 | VizDoom
Meta Networks | ICML2017 | 1703.00837 | meta-learning, MetaNet, few-shot classification
FeUdal Networks for Hierarchical Reinforcement Learning | ICML2017 | 1703.01161 | FeUdalNet, HRL
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks | ICML2017 | 1703.03400 | meta-learning, MAML
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World | IROS2017 | 1703.06907 | sim to real, domain randomization
One-Shot Imitation Learning | NIPS2017 | 1703.07326 | imitation, demonstration
Multi-Level Discovery of Deep Options | - | 1703.08294 | DDO, HRL
DART - Noise Injection for Robust Imitation Learning | CoRL2017 | 1703.09327 | imitation learning, add noise -> more robust
Stochastic Neural Networks for Hierarchical Reinforcement Learning | ICLR2017 | 1704.03012 | HRL, StochasticNN
Deep Q-learning from Demonstrations | AAAI2018 | 1704.03732 | DQfD : imitation + RL, discrete
Parameter Space Noise for Exploration | ICLR2018 | 1706.01905 | OpenAI NoisyNet
Noisy Networks for Exploration | ICLR2018 | 1706.10295 | DeepMind NoisyNet, part of Rainbow
Deep Reinforcement Learning from Human Preferences | NIPS2017 | 1706.03741 | RL + human feedback (easier than demonstration)
Hindsight Experience Replay | NIPS2017 | 1707.01495 | HER, goal-based env, sparse reward, learn from failure
Emergence of Locomotion Behaviours in Rich Environments | - | 1707.02286 | PPO
Robust Imitation of Diverse Behaviors | NIPS2017 | 1707.02747 | imitation learning : VAE (behavioral cloning) + GAIL
Imitation from Observation - Learning to Imitate Behaviors from Raw Video via Context Translation | ICRA2018 | 1707.03374 | imitation learning from obs, context translation
Reverse Curriculum Generation for Reinforcement Learning | CoRL2017 | 1707.05300 | reverse curriculum
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards | - | 1707.08817 | DDPGfD : DDPG + DQfD, off-policy imitation, continuous goal-based env
When Waiting is not an Option - Learning Options with a Deliberation Cost | AAAI2018 | 1709.04571 | HRL, A2OC : A3C + OC + deliberation cost
Autonomous Extracting a Hierarchical Structure of Tasks in Reinforcement Learning and Multi-task Reinforcement Learning | - | 1709.04579 | HRL, association rule
One-Shot Visual Imitation Learning via Meta-Learning | CoRL2017 | 1709.04905 | MIL : meta learning (MAML) + imitation learning (BC)
Overcoming Exploration in Reinforcement Learning with Demonstrations | ICRA2018 | 1709.10089 | Similar to DDPGfD, imitation + DDPG + HER

Speech


To read

  • [ICML 2017] 1703.02702 - Robust Adversarial Reinforcement Learning
  • [ICML 2017] 1706.05064 - Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
  • 1708.05866 - A Brief Survey of Deep Reinforcement Learning
  • [NIPS 2017] 1710.03592 - Meta Inverse Reinforcement Learning via Maximum Reward Sharing for Human Motion Analysis
  • 1802.03596 - Deep Meta-Learning: Learning to Learn in the Concept Space
  • [ICLR 2018] 1802.09081 - Temporal Difference Models: Model-Free Deep RL for Model-Based Control
  • 1802.10567 - Learning by Playing - Solving Sparse Reward Tasks from Scratch
  • [ICLR 2018] 1803.00933 - Distributed Prioritized Experience Replay
  • [ICLR 2018] Extending Robust Adversarial Reinforcement Learning Considering Adaptation and Diversity
  • [ICLR 2018] Learning to Teach
  • [ICLR 2018] Learning an Embedding Space for Transferable Robot Skills

Currently no notes

  • 1706.09529 - Learning to Learn: Meta-Critic Networks for Sample Efficient Learning
  • 1710.03463 - Learning to Generalize: Meta-Learning for Domain Generalization
  • 1712.00948 - Hierarchical Actor-Critic

Skipped

  • [NIPS 2017] 1712.08266 - Federated Control with Hierarchical Multi-Agent Deep Reinforcement Learning
  • [ICLR 2018] 1801.08930 - Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
  • 1802.07245 - Meta-Reinforcement Learning of Structured Exploration Strategies
  • 1802.09564 - Reinforcement and Imitation Learning for Diverse Visuomotor Skills
  • [ICLR 2018] Zero-Shot Visual Imitation

DL, ML, CV, etc.