The course is based on David Silver's RL Course. Over the 8 weeks, you will learn the basic theory of RL and how to implement its algorithms. In-class quizzes will help you practice what you have learned. Through multiple hands-on assignments and two final course projects, you will pick up the programming details and practical engineering tricks needed for training in DRL.
Video: [YouTube HD], [YouTube with CC subtitles], and [Bilibili with Chinese subtitles]
P24 An example of a Markov process (MP). Code: ipynb file
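As a rough illustration of what a Markov process looks like in code, the sketch below samples episodes from a small transition table. It is not the notebook's code: the state names, the probabilities, and the helper `sample_episode` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical 3-state Markov process (illustrative names and probabilities).
# Each row maps a state to its successor states and their probabilities.
TRANSITIONS = {
    "class": {"class": 0.5, "pub": 0.2, "sleep": 0.3},
    "pub": {"class": 0.6, "pub": 0.4},
    "sleep": {"sleep": 1.0},  # absorbing (terminal) state
}


def sample_episode(transitions, start, terminal, max_steps=50, seed=None):
    """Sample one trajectory; the next state depends only on the current one."""
    rng = random.Random(seed)
    state = start
    episode = [state]
    while state != terminal and len(episode) < max_steps:
        successors = list(transitions[state])
        weights = [transitions[state][s] for s in successors]
        state = rng.choices(successors, weights=weights, k=1)[0]
        episode.append(state)
    return episode


episode = sample_episode(TRANSITIONS, start="class", terminal="sleep", seed=0)
print(" -> ".join(episode))
```

Fixing `seed` makes a run reproducible; different seeds give different episodes drawn from the same process.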
P50 The gridworld example will be implemented in lecture 3.
- direct solution by matrix calculation.
- The iterative method.
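The two approaches listed above can be sketched for a Markov reward process, whose value function satisfies the Bellman equation $v = R + \gamma P v$. This is a minimal sketch and not the course's code: the 3-state transition matrix `P`, the rewards `R`, and the discount `gamma` are made-up numbers, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical 3-state MRP (illustrative numbers only).
P = np.array([
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
    [0.0, 0.0, 1.0],  # absorbing state with zero reward
])
R = np.array([-1.0, -2.0, 0.0])
gamma = 0.9


def solve_direct(P, R, gamma):
    """Direct solution: solve the linear system (I - gamma * P) v = R."""
    n = len(R)
    return np.linalg.solve(np.eye(n) - gamma * P, R)


def solve_iterative(P, R, gamma, tol=1e-10, max_iters=10_000):
    """Iterative solution: repeat v <- R + gamma * P v until v stops changing."""
    v = np.zeros_like(R)
    for _ in range(max_iters):
        v_new = R + gamma * P @ v
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v


print("direct:   ", solve_direct(P, R, gamma))
print("iterative:", solve_iterative(P, R, gamma))
```

The direct solve costs roughly cubic time in the number of states, so it only suits small problems; the iterative update is cheaper per step and scales to larger state spaces.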
Code: Gridworld-ipynb
- policy evaluation for random policy
- policy iteration
- in-place policy iteration
- value iteration
- in-place value iteration
- Comparison of the convergence rates of the above methods
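The methods in this list can be sketched compactly as follows. This is not the Gridworld notebook itself: it assumes a 4x4 grid with terminal states in two opposite corners, a reward of -1 per move, and no discounting, and every function name here is hypothetical. Synchronous updates read from a frozen copy of the value table, while in-place updates read values already refreshed in the current sweep.

```python
import numpy as np

# Assumed setup: 4x4 grid, terminal corners, reward -1 per move, gamma = 1.
N = 4
TERMINALS = {0, N * N - 1}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA, REWARD = 1.0, -1.0


def step(state, action):
    """Deterministic move; walking into a wall leaves the state unchanged."""
    row, col = divmod(state, N)
    row = min(max(row + action[0], 0), N - 1)
    col = min(max(col + action[1], 0), N - 1)
    return row * N + col


def q_values(state, v):
    """One-step lookahead value of every action from `state`."""
    return np.array([REWARD + GAMMA * v[step(state, a)] for a in ACTIONS])


def sweep_until_converged(backup, in_place, tol=1e-8, max_sweeps=10_000):
    """Apply `backup(state, values)` in full sweeps until values stop changing."""
    v = np.zeros(N * N)
    for sweeps in range(1, max_sweeps + 1):
        old = v.copy()
        source = v if in_place else old  # in-place sees this sweep's updates
        for s in range(N * N):
            if s not in TERMINALS:
                v[s] = backup(s, source)
        if np.max(np.abs(v - old)) < tol:
            break
    return v, sweeps


def policy_evaluation(policy, in_place=False):
    """Bellman expectation backup; policy[s] is a distribution over ACTIONS."""
    return sweep_until_converged(lambda s, v: policy[s] @ q_values(s, v), in_place)


def value_iteration(in_place=False):
    """Bellman optimality backup."""
    return sweep_until_converged(lambda s, v: q_values(s, v).max(), in_place)


def greedy_policy(v):
    """Deterministic policy that picks the best one-step lookahead action."""
    policy = np.zeros((N * N, len(ACTIONS)))
    for s in range(N * N):
        policy[s, np.argmax(q_values(s, v))] = 1.0
    return policy


def policy_iteration(in_place=False, max_rounds=100):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = np.full((N * N, len(ACTIONS)), 1.0 / len(ACTIONS))
    for _ in range(max_rounds):
        v, _sweeps = policy_evaluation(policy, in_place)
        improved = greedy_policy(v)
        if np.array_equal(improved, policy):
            break
        policy = improved
    return v, policy


random_policy = np.full((N * N, len(ACTIONS)), 1.0 / len(ACTIONS))
solvers = [
    ("policy evaluation (random policy)", lambda ip: policy_evaluation(random_policy, ip)),
    ("value iteration", value_iteration),
]
for name, solver in solvers:
    for in_place in (False, True):
        _, sweeps = solver(in_place)
        print(f"{name}, in_place={in_place}: {sweeps} sweeps")

v_star, _ = policy_iteration()
print(v_star.reshape(N, N))
```

The printed sweep counts give a crude convergence comparison between the synchronous and in-place variants; the exact numbers depend on the tolerance and on the order in which states are swept.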
TO BE CONTINUED.