RL-Course

The course is based on David Silver's RL Course. Over the 8 weeks, you will learn the fundamentals of RL and how to implement its core algorithms. In-class quizzes help you practice what you have learned. Through multiple hands-on assignments and two final course projects, you will pick up the programming details and practical engineering tricks needed for training in deep RL (DRL).

Video: [YouTube HD], [YouTube with CC subtitles] and [Bilibili with Chinese subtitles]

Week 1: Introduction to Reinforcement Learning

P24 An example of a Markov process (MP). Code: ipynb file
P50 The gridworld example will be implemented in lecture 3.
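
For readers who want a feel for the Markov process example before opening the notebook, here is a minimal sketch of sampling an episode from a Markov chain. The three states and the transition matrix `P` are hypothetical placeholders chosen for illustration; they are not taken from the slide or the notebook.

```python
import numpy as np

# Hypothetical chain for illustration only (not the slide's example).
STATES = ["Sunny", "Rainy", "End"]
P = np.array([
    [0.7, 0.2, 0.1],  # from Sunny
    [0.4, 0.4, 0.2],  # from Rainy
    [0.0, 0.0, 1.0],  # End is absorbing
])


def sample_episode(start, rng, max_steps=100):
    """Follow the chain from `start` until "End" or `max_steps` transitions."""
    s = STATES.index(start)
    episode = [STATES[s]]
    for _ in range(max_steps):
        if STATES[s] == "End":
            break
        # The next state depends only on the current state (Markov property).
        s = rng.choice(len(STATES), p=P[s])
        episode.append(STATES[s])
    return episode


rng = np.random.default_rng(0)
episode = sample_episode("Sunny", rng)
print(" -> ".join(episode))
```

Each row of `P` must sum to 1, and the next state is drawn from the row of the current state only.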

Week 2: Markov Decision Processes

Code: ipynb or md

Lecture 2 Slide P19 Example: State-Value Function for Student MRP (2), gamma = 0.9

Direct solution by matrix calculation.
Iterative method.
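
The two approaches listed above can be sketched as follows. The direct method solves the Bellman equation v = R + gamma * P v as a linear system, and the iterative method applies the same equation repeatedly as an update. The transition matrix and rewards below are transcribed from the Student MRP as I recall it from the slides, so treat them as an assumption and check them against the slide before relying on the numbers.

```python
import numpy as np

# Student MRP, transcribed from memory of the lecture slides (assumption).
STATES = ["C1", "C2", "C3", "Pass", "Pub", "FB", "Sleep"]
GAMMA = 0.9
P = np.array([
    [0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0],  # C1
    [0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.2],  # C2
    [0.0, 0.0, 0.0, 0.6, 0.4, 0.0, 0.0],  # C3
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # Pass
    [0.2, 0.4, 0.4, 0.0, 0.0, 0.0, 0.0],  # Pub
    [0.1, 0.0, 0.0, 0.0, 0.0, 0.9, 0.0],  # FB
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # Sleep (absorbing)
])
R = np.array([-2.0, -2.0, -2.0, 10.0, 1.0, -1.0, 0.0])

# Direct solution: (I - gamma * P) v = R
v_direct = np.linalg.solve(np.eye(len(STATES)) - GAMMA * P, R)

# Iterative solution: repeat v <- R + gamma * P v until the update is tiny.
v_iter = np.zeros(len(STATES))
for sweep in range(1, 10_001):
    v_next = R + GAMMA * (P @ v_iter)
    delta = np.max(np.abs(v_next - v_iter))
    v_iter = v_next
    if delta < 1e-10:
        break

for name, vd, vi in zip(STATES, v_direct, v_iter):
    print(f"{name:>5}: direct={vd:6.2f}  iterative={vi:6.2f}")
```

The direct solve costs O(n^3) in the number of states, which is fine for seven states but is why iterative methods are preferred for larger problems.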

Lecture 2 Slide P32 Example: State-Value Function for Student MDP

Direct solution by matrix calculation.
Iterative method.
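
For the MDP case, one way to reuse both methods is to average the MDP over a fixed policy, which yields an MRP. The sketch below assumes a uniform random policy and gamma = 1, and the `MDP` dictionary (actions, rewards, transitions) is transcribed from memory of the slides; all of these are assumptions to verify against the slide. Because gamma = 1, the terminal state Sleep is left out of the linear system and its value is fixed at 0; otherwise the matrix would be singular.

```python
import numpy as np

# Non-terminal states only; the terminal state "Sleep" has value 0.
STATES = ["FB", "C1", "C2", "C3"]
GAMMA = 1.0

# MDP[state][action] = (reward, {next_state: probability})
# Transcribed from memory of the Student MDP slide (assumption).
MDP = {
    "FB": {"facebook": (-1.0, {"FB": 1.0}), "quit": (0.0, {"C1": 1.0})},
    "C1": {"facebook": (-1.0, {"FB": 1.0}), "study": (-2.0, {"C2": 1.0})},
    "C2": {"sleep": (0.0, {"Sleep": 1.0}), "study": (-2.0, {"C3": 1.0})},
    "C3": {"study": (10.0, {"Sleep": 1.0}),
           "pub": (1.0, {"C1": 0.2, "C2": 0.4, "C3": 0.4})},
}

# Average over the uniform random policy to get the induced MRP (P_pi, R_pi).
n = len(STATES)
index = {s: i for i, s in enumerate(STATES)}
P_pi = np.zeros((n, n))
R_pi = np.zeros(n)
for s, actions in MDP.items():
    pi = 1.0 / len(actions)  # uniform probability of each action
    for reward, transitions in actions.values():
        R_pi[index[s]] += pi * reward
        for s_next, prob in transitions.items():
            if s_next in index:  # transitions into Sleep contribute value 0
                P_pi[index[s], index[s_next]] += pi * prob

# Direct solution on the non-terminal states.
v_direct = np.linalg.solve(np.eye(n) - GAMMA * P_pi, R_pi)

# Iterative solution with the same Bellman expectation backup.
v_iter = np.zeros(n)
for sweep in range(1, 10_001):
    v_next = R_pi + GAMMA * (P_pi @ v_iter)
    delta = np.max(np.abs(v_next - v_iter))
    v_iter = v_next
    if delta < 1e-10:
        break

print(dict(zip(STATES, np.round(v_direct, 1))))
```

Under these assumptions, both methods agree to within the stopping tolerance.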

Week 3: Planning by Dynamic Programming

Code: Gridworld-ipynb

policy evaluation for random policy
policy iteration
in-place policy iteration
value iteration
in-place value iteration
Comparison of the convergence rates of the above methods
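
As a rough illustration of the items above, the sketch below runs policy evaluation for the random policy and value iteration on a small gridworld, each in a synchronous and an in-place variant, and reports how many sweeps each takes. It is not the notebook's code. The setup (4x4 grid, terminal states in two opposite corners, reward -1 per step, gamma = 1) is an assumption based on the standard small gridworld, and the helper names are hypothetical. Policy iteration is omitted to keep the sketch short.

```python
import numpy as np

N = 4                                          # assumed 4x4 grid
TERMINALS = {0, N * N - 1}                     # assumed: two opposite corners
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
REWARD, GAMMA = -1.0, 1.0                      # assumed reward and discount


def step(s, action):
    """Deterministic move; stepping off the grid leaves the state unchanged."""
    r, c = divmod(s, N)
    nr, nc = r + action[0], c + action[1]
    if not (0 <= nr < N and 0 <= nc < N):
        nr, nc = r, c
    return nr * N + nc


def sweep_until_converged(backup, in_place, tol=1e-6, max_sweeps=10_000):
    """Repeat full sweeps over the states; `backup` combines the action values.

    backup=np.mean evaluates the uniform random policy;
    backup=max performs value iteration.
    """
    v = np.zeros(N * N)
    for sweeps in range(1, max_sweeps + 1):
        old = v.copy()
        # In-place sweeps read values already updated in this sweep;
        # synchronous sweeps read only the previous sweep's values.
        source = v if in_place else old
        for s in range(N * N):
            if s in TERMINALS:
                continue
            v[s] = backup([REWARD + GAMMA * source[step(s, a)] for a in ACTIONS])
        if np.max(np.abs(v - old)) < tol:
            return v, sweeps
    return v, max_sweeps


results = {
    "policy evaluation (sync)": sweep_until_converged(np.mean, in_place=False),
    "policy evaluation (in-place)": sweep_until_converged(np.mean, in_place=True),
    "value iteration (sync)": sweep_until_converged(max, in_place=False),
    "value iteration (in-place)": sweep_until_converged(max, in_place=True),
}
for name, (v, sweeps) in results.items():
    print(f"{name}: {sweeps} sweeps")
    print(np.round(v.reshape(N, N), 1))
```

Counting sweeps until the largest update falls below `tol` is only one way to compare convergence; the result depends on the tolerance and on the order in which states are visited.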