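These are the classic study materials, notes, and code I collected while learning reinforcement learning, shared here for everyone.
To view formulas directly in Markdown files on GitHub, I recommend installing the Chrome extension MathJax Plugin for Github.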
Study notes on Sutton's Reinforcement Learning: An Introduction
- 1. Introduction
- 2. Multi-armed Bandits
- 3. Finite Markov Decision Processes
- 4. Dynamic Programming
- 5. Monte Carlo Methods
- 6. Temporal-Difference Learning
- 7. n-step Bootstrapping
- 8. Planning and Learning with Tabular Methods
- 9. On-policy Prediction with Approximation
- 10. On-policy Control with Approximation
- 11. Off-policy Methods with Approximation
- 12. Eligibility Traces
- 13. Policy Gradient Methods
- 14. Psychology
- 15. Neuroscience
- 16. Applications and Case Studies
- 17. Frontiers
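All experiment source code is in the lib directory and comes from dennybritz. On top of the original code, I added detailed descriptions of each experiment's background and a side-by-side mapping between the code and the formulas.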
- Gridworld: Dynamic Programming on an MDP.
- Blackjack: model-free Monte Carlo prediction and control.
- Windy Gridworld: model-free Temporal-Difference on-policy control (SARSA).
- Cliff Walking: model-free Temporal-Difference off-policy control (Q-learning).
- Mountain Car: Q-learning with linear function approximation, for the case where the Q-table is too large to handle (the state space is continuous).
- Atari: Deep Q-Learning.
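
The on-policy versus off-policy distinction between SARSA and Q-learning in the list above comes down to which value the update bootstraps from. The following is a minimal sketch in plain Python, not the code from the lib directory; the function names, the toy Q-table, and the default values of `gamma` and `alpha` are all assumptions made for illustration.

```python
def sarsa_target(Q, reward, next_state, next_action, gamma=1.0):
    """On-policy TD target: bootstrap from the action actually taken next."""
    return reward + gamma * Q[next_state][next_action]


def q_learning_target(Q, reward, next_state, gamma=1.0):
    """Off-policy TD target: bootstrap from the greedy action in the next state."""
    return reward + gamma * max(Q[next_state])


def td_update(Q, state, action, target, alpha=0.5):
    """Move Q[state][action] a fraction alpha toward the TD target (in place)."""
    Q[state][action] += alpha * (target - Q[state][action])
    return Q


# Hypothetical toy table: 2 states x 2 actions, values made up for illustration.
Q = [[0.0, 0.0],
     [1.0, 3.0]]

# Same transition (reward -1, landing in state 1), two different targets:
print(sarsa_target(Q, -1.0, next_state=1, next_action=0))  # uses Q[1][0]
print(q_learning_target(Q, -1.0, next_state=1))            # uses max(Q[1])
```

With an exploratory behavior policy, the two targets differ whenever the next action taken is not the greedy one. That difference is what the Windy Gridworld and Cliff Walking experiments contrast.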