/RL_Course-slides_hw

Python implementation of the examples in David Silver's RL Course


RL-Course

This course is based on David Silver's RL Course. Over the 8 weeks, you will learn the fundamentals of RL and how to implement its core algorithms. In-class quizzes help you practice what you have learned. Through multiple hands-on assignments and two final course projects, you will pick up the programming details and practical engineering tricks needed for training in deep RL (DRL).

Video: [YouTube HD], [YouTube with CC subtitles] and [Bilibili with Chinese subtitles]

Week 1: Introduction to Reinforcement Learning

P24 An example of a Markov process (MP). Code: ipynb file
P50 The gridworld example will be implemented in Lecture 3.

Week 2: Markov Decision Processes

Code: ipynb or md

Lecture 2 Slide P19 Example: State-Value Function for Student MRP (2), gamma = 0.9

  1. Direct solution by matrix calculation.
  2. Iterative method.
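The two approaches can be sketched as follows. This is a minimal illustration, not the notebook's code: the transition matrix `P` and reward vector `R` are transcribed from the Student MRP slide and should be checked against it, and the helper names `solve_direct` and `solve_iterative` are made up for this example. The direct method solves the Bellman equation v = R + gamma * P v as the linear system (I - gamma * P) v = R; the iterative method applies the same equation as a repeated update.

```python
import numpy as np

# Student MRP as transcribed from the lecture slide (an assumption; verify
# against the slide). State order: C1, C2, C3, Pass, Pub, FB, Sleep.
states = ["C1", "C2", "C3", "Pass", "Pub", "FB", "Sleep"]
P = np.array([
    [0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0],  # C1
    [0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.2],  # C2
    [0.0, 0.0, 0.0, 0.6, 0.4, 0.0, 0.0],  # C3
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # Pass
    [0.2, 0.4, 0.4, 0.0, 0.0, 0.0, 0.0],  # Pub
    [0.1, 0.0, 0.0, 0.0, 0.0, 0.9, 0.0],  # FB
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # Sleep (absorbing, reward 0)
])
R = np.array([-2.0, -2.0, -2.0, 10.0, 1.0, -1.0, 0.0])
gamma = 0.9


def solve_direct(P, R, gamma):
    """Solve (I - gamma * P) v = R as a linear system."""
    return np.linalg.solve(np.eye(len(R)) - gamma * P, R)


def solve_iterative(P, R, gamma, tol=1e-10, max_iters=10_000):
    """Repeat the backup v <- R + gamma * P v until the change is below tol."""
    v = np.zeros_like(R)
    for _ in range(max_iters):
        v_new = R + gamma * P @ v
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v


v_direct = solve_direct(P, R, gamma)
v_iter = solve_iterative(P, R, gamma)
for name, a, b in zip(states, v_direct, v_iter):
    print(f"{name:>5}: direct={a:6.2f}  iterative={b:6.2f}")
```

The direct solve costs O(n^3) in the number of states, which is fine for seven states but is the reason iterative methods are preferred for large problems.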

Lecture 2 Slide P32 Example: State-Value Function for Student MDP

  1. Direct solution by matrix calculation.
  2. Iterative method.
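For an MDP, both methods first need a fixed policy, which turns the MDP into an induced MRP. The sketch below assumes the setting shown on the slide (uniform random policy with probability 0.5 per action, gamma = 1) and a Student MDP transcribed from the slide; the dictionary layout `mdp` and the helper `induced_mrp` are hypothetical choices for this example and may differ from the notebook. Because gamma = 1, the terminal Sleep state is left out of the linear system and its value is fixed at 0.

```python
import numpy as np

# Non-terminal states only; the terminal Sleep state has value 0.
states = ["FB", "C1", "C2", "C3"]
index = {s: i for i, s in enumerate(states)}

# mdp[state] = list of (reward, {next_state: probability}), one entry per action.
# Transcribed from the lecture slide (an assumption; verify against the slide).
mdp = {
    "FB": [(-1.0, {"FB": 1.0}), (0.0, {"C1": 1.0})],       # Facebook, Quit
    "C1": [(-1.0, {"FB": 1.0}), (-2.0, {"C2": 1.0})],      # Facebook, Study
    "C2": [(0.0, {"Sleep": 1.0}), (-2.0, {"C3": 1.0})],    # Sleep, Study
    "C3": [(10.0, {"Sleep": 1.0}),                         # Study
           (1.0, {"C1": 0.2, "C2": 0.4, "C3": 0.4})],      # Pub
}
gamma = 1.0


def induced_mrp(mdp, index):
    """Average rewards and transitions over a uniform random policy."""
    n = len(index)
    P, R = np.zeros((n, n)), np.zeros(n)
    for s, actions in mdp.items():
        prob_a = 1.0 / len(actions)
        for reward, transitions in actions:
            R[index[s]] += prob_a * reward
            for s_next, p in transitions.items():
                if s_next in index:  # transitions into Sleep contribute value 0
                    P[index[s], index[s_next]] += prob_a * p
    return P, R


P_pi, R_pi = induced_mrp(mdp, index)

# 1. Direct solution: (I - gamma * P_pi) v = R_pi
v_direct = np.linalg.solve(np.eye(len(states)) - gamma * P_pi, R_pi)

# 2. Iterative method: repeat v <- R_pi + gamma * P_pi v
v_iter = np.zeros(len(states))
for _ in range(10_000):
    v_next = R_pi + gamma * P_pi @ v_iter
    converged = np.max(np.abs(v_next - v_iter)) < 1e-10
    v_iter = v_next
    if converged:
        break

print(dict(zip(states, np.round(v_direct, 1))))
```

Dropping the terminal state keeps the matrix I - P_pi invertible even without discounting, since every remaining state eventually leaks probability into Sleep.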

Week 3: Planning by Dynamic Programming

Code: Gridworld-ipynb

  • policy evaluation for random policy
  • policy iteration
  • in-place policy iteration
  • value iteration
  • in-place value iteration
  • Comparison of the convergence rates of the above methods
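As a rough illustration of the synchronous and in-place variants listed above, the sketch below assumes the 4x4 gridworld from the Lecture 3 slides (terminal states in two opposite corners, reward -1 per step, gamma = 1); the notebook's actual setup may differ. The function `solve` and its `backup` argument are hypothetical names: passing `np.mean` evaluates the uniform random policy, passing `max` gives value iteration. Policy iteration is left out to keep the example short.

```python
import numpy as np

N = 4
TERMINALS = {(0, 0), (N - 1, N - 1)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 1.0
STEP_REWARD = -1.0


def step(state, action):
    """Deterministic move; stepping off the grid leaves the state unchanged."""
    r, c = state[0] + action[0], state[1] + action[1]
    return (r, c) if 0 <= r < N and 0 <= c < N else state


def solve(backup, in_place=False, tol=1e-6):
    """Sweep the grid until the largest change is below tol.

    backup combines the four action values: np.mean evaluates the uniform
    random policy, max performs value iteration.
    Returns (value table, number of sweeps).
    """
    v = np.zeros((N, N))
    sweeps = 0
    while True:
        # In-place reads the table being written; synchronous reads a snapshot.
        src = v if in_place else v.copy()
        delta = 0.0
        for r in range(N):
            for c in range(N):
                if (r, c) in TERMINALS:
                    continue
                q = [STEP_REWARD + GAMMA * src[step((r, c), a)] for a in ACTIONS]
                new = backup(q)
                delta = max(delta, abs(new - v[r, c]))
                v[r, c] = new
        sweeps += 1
        if delta < tol:
            return v, sweeps


v_pe, n_pe = solve(np.mean)
v_pe_ip, n_pe_ip = solve(np.mean, in_place=True)
v_vi, n_vi = solve(max)
v_vi_ip, n_vi_ip = solve(max, in_place=True)

print("policy evaluation sweeps:", n_pe, "in-place:", n_pe_ip)
print("value iteration sweeps:  ", n_vi, "in-place:", n_vi_ip)
print(np.round(v_pe, 1))
```

Counting sweeps until the same tolerance is reached is one simple way to compare convergence; the in-place variants reuse freshly updated values within a sweep, which typically reduces the number of sweeps needed.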

TO BE CONTINUED.