awesome-reinforcement-learning

Learning resources and links for reinforcement learning

Web and tutorial resources

Tutorial sites

Book

Video Course

Blogs

Domain experts

Awesome

Algorithm Repos

Hands-on reinforcement learning resources

Implementation of Algorithms

Project

Papers

DQN-arxiv (Deep Q-Network): Mnih et al., 2013
- DQN-nature (Deep Q-Network): Mnih et al., 2015
- Double DQN (Double Q-Network): van Hasselt et al., 2015
- Dueling DQN (Dueling Q-Network): Wang et al., 2015
- QR-DQN (Quantile Regression DQN): Dabney et al., 2017
AlphaGo (Mastering the Game of Go with Deep Neural Networks and Tree Search): Silver et al., 2016
- AlphaZero-arxiv (Mastering Chess and Shogi by Self-Play): Silver et al., 2017
- AlphaGo Zero-nature (Mastering the Game of Go without Human Knowledge): Silver et al., 2017
SAC (Off-Policy Maximum Entropy): Haarnoja et al., 2018
- SAC (Algorithms and Applications): Haarnoja et al., 2018
A2C / A3C (Asynchronous Advantage Actor-Critic): Mnih et al., 2016
PPO (Proximal Policy Optimization): Schulman et al., 2017
TRPO (Trust Region Policy Optimization): Schulman et al., 2015
DPG (Deterministic Policy Gradient): Silver et al., 2014
DDPG (Deep Deterministic Policy Gradient): Lillicrap et al., 2015
TD3 (Twin Delayed DDPG): Fujimoto et al., 2018
NAF (Normalized Advantage Functions): Gu et al., 2016
C51 (Categorical 51-Atom DQN): Bellemare et al., 2017
HER (Hindsight Experience Replay): Andrychowicz et al., 2017
World Models: Ha and Schmidhuber, 2018
I2A (Imagination-Augmented Agents): Weber et al., 2017
MBMF (Model-Based RL with Model-Free Fine-Tuning): Nagabandi et al., 2017
MBVE (Model-Based Value Expansion): Feinberg et al., 2018
PathNet (Evolution Channels Gradient Descent): Fernando et al., 2017
PlaNet (Learning Latent Dynamics): Hafner et al., 2018
TCN (Time-Contrastive Networks): Sermanet et al., 2017
Reinforcement and Imitation Learning: Zhu et al., 2018
Prioritized Experience Replay: Schaul et al., 2015
Policy Distillation: Rusu et al., 2015
Unifying Count-Based Exploration and Intrinsic Motivation: Bellemare et al., 2016
Incentivizing Exploration in Reinforcement Learning with Deep Predictive Models: Stadie et al., 2015
Action-Conditional Video Prediction Using Deep Networks in Atari Games: Oh et al., 2015
Control of Memory, Active Perception, and Action in Minecraft: Oh et al., 2016

swjtuwy/awesome-reinforcement-learning
