Awesome Deep Reinforcement Learning project
updated Landscape of DRL
This project is for people who are learning about and researching the latest deep reinforcement learning methods.
Illustrations:
Recommendations and suggestions are welcome.
General guidance
General Benchmark Testing Frameworks
- S-RL Toolbox: Environments, Datasets and Evaluation Metrics for State Representation Learning 25 Sept 2018
- dopamine
- StarCraft II
- tfrl
- chainerrl
Value based methods
- TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning 8 Mar 2018
- Distributed Prioritized Experience Replay 2 Mar 2018
- Rainbow: Combining Improvements in Deep Reinforcement Learning 6 Oct 2017
- Learning from Demonstrations for Real World Reinforcement Learning 12 Apr 2017
- Dueling Network Architecture
- Double DQN
- Prioritized Experience Replay
- Deep Q-Networks
Policy gradient methods
- Clipped Action Policy Gradient 22 June 2018
- Expected Policy Gradients for Reinforcement Learning 10 Jan 2018
- Proximal Policy Optimization Algorithms 20 July 2017
- Emergence of Locomotion Behaviours in Rich Environments 7 July 2017
- Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning 1 Jun 2017
- Equivalence Between Policy Gradients and Soft Q-Learning
- Trust Region Policy Optimization
- Reinforcement Learning with Deep Energy-Based Policies
- Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic
Explorations in DRL
- The Uncertainty Bellman Equation and Exploration 15 Sep 2017
- Noisy Networks for Exploration 30 Jun 2017 implementation
- Count-Based Exploration in Feature Space for Reinforcement Learning 25 Jun 2017
- Count-Based Exploration with Neural Density Models 14 Jun 2017
- UCB and InfoGain Exploration via Q-Ensembles 11 Jun 2017
- Minimax Regret Bounds for Reinforcement Learning 16 Mar 2017
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models
- EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
Actor-Critic methods
- The Reactor: A Sample-Efficient Actor-Critic Architecture 15 Apr 2017
- Sample Efficient Actor-Critic with Experience Replay
- Reinforcement Learning with Unsupervised Auxiliary Tasks
- Continuous control with deep reinforcement learning
Model-based methods
- Model-Based Stabilisation of Deep Reinforcement Learning 6 Sep 2018
- Learning model-based planning from scratch 19 July 2017
Model-free + Model-based
Option
- Variational Option Discovery Algorithms 26 July 2018
- A Laplacian Framework for Option Discovery in Reinforcement Learning 16 Jun 2017
Connection with other methods
- Robust Imitation of Diverse Behaviors
- Learning human behaviors from motion capture by adversarial imitation
- Connecting Generative Adversarial Networks and Actor-Critic Methods
Connecting value and policy methods
- Bridging the Gap Between Value and Policy Based Reinforcement Learning
- Policy gradient and Q-learning
Reward design
Unifying
Faster DRL
Apply RL to other domains
Multiagent Settings
- Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning 4 Nov 2018
- Intrinsic Social Motivation via Causal Influence in Multi-Agent RL 19 Oct 2018
- Modeling Others using Oneself in Multi-Agent Reinforcement Learning 26 Feb 2018
- The Mechanics of n-Player Differentiable Games 15 Feb 2018
- Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments 10 Oct 2017
- Learning with Opponent-Learning Awareness 13 Sep 2017
- Counterfactual Multi-Agent Policy Gradients
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments 7 Jun 2017
- Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games 29 Mar 2017
New design
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures 9 Feb 2018
- Reverse Curriculum Generation for Reinforcement Learning
- Trial without Error: Towards Safe Reinforcement Learning via Human Intervention
- Learning to Design Games: Strategic Environments in Deep Reinforcement Learning 5 July 2017
Multitask
- Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning 7 Nov 2017
- Distral: Robust Multitask Reinforcement Learning 13 July 2017
Observational Learning
- Observational Learning by Reinforcement Learning 20 Jun 2017
Meta Learning
Distributional
- GAN Q-learning 20 July 2018
- Implicit Quantile Networks for Distributional Reinforcement Learning 14 Jun 2018
- Nonlinear Distributional Gradient Temporal-Difference Learning 20 May 2018
- Distributed Distributional Deterministic Policy Gradients 23 Apr 2018
- An Analysis of Categorical Distributional Reinforcement Learning 22 Feb 2018
- Distributional Reinforcement Learning with Quantile Regression 27 Oct 2017
- A Distributional Perspective on Reinforcement Learning 21 July 2017