Curling Robot @ TongYe
- Policy gradient
- DDPG
- A3C
- PPO
- Q-learning
- DQN
[1] https://blog.csdn.net/kenneth_yu/article/details/78478356 DDPG
[2] https://www.ibm.com/developerworks/cn/analytics/library/ba-lo-deep-introduce-policy-gradient/index.html Policy Gradient (PG)
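[3] https://medium.com/@jonathan_hui/rl-policy-gradients-explained-9b13b688b146 Jonathan Hui, Medium blog (the best blog of these)
[4] https://www.bilibili.com/video/av35757082/?p=28 Hung-yi Lee (李宏毅), deep learning video lectures
[5] https://blog.csdn.net/LagrangeSK/article/details/81010195 Reinforcement learning tutorial blog; very detailed, and draws on an expert's slides
PS: Hung-yi Lee's lectures are excellent; need to go through his lecture notes once.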