Reinforcement Learning: An Introduction

Python implementation for Sutton & Barto's Reinforcement Learning: An Introduction (2nd Edition)

Note: Most of the code is adapted from ShangtongZhang's implementation and rewritten to be easier to understand. Besides the code for the figures, I have also completed some of the exercises in the book.

Contents

Chapter 2 Multi-armed Bandits

  1. Figure 2.1: An example bandit problem from the 10-armed testbed.
  2. Figure 2.2: Average performance of epsilon-greedy action-value methods on the 10-armed testbed.
  3. Figure 2.3: The effect of optimistic initial action-value estimates on the 10-armed testbed.
  4. Figure 2.4: Average performance of UCB action selection on the 10-armed testbed.
  5. Figure 2.5: Average performance of the gradient bandit algorithm.
  6. Figure 2.6: A parameter study of the various bandit algorithms.
  7. Exercise 2.5
  8. Exercise 2.11
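
For readers new to the 10-armed testbed behind the figures above, here is a minimal sketch of a single epsilon-greedy run with sample-average estimates. It is an illustration under stated assumptions, not the code in this repository: the function name `run_epsilon_greedy` and its parameters are hypothetical, and the setup (true action values and rewards both drawn from unit-variance normal distributions) follows the testbed as described in the book.

```python
import numpy as np


def run_epsilon_greedy(epsilon, n_arms=10, n_steps=1000, seed=0):
    """One run of sample-average epsilon-greedy on one random bandit problem.

    Hypothetical helper for illustration; returns (rewards, counts).
    """
    rng = np.random.default_rng(seed)
    q_true = rng.normal(0.0, 1.0, n_arms)  # true action values q*(a)
    q_est = np.zeros(n_arms)               # current estimates Q(a)
    counts = np.zeros(n_arms, dtype=int)   # times each arm was pulled, N(a)
    rewards = np.empty(n_steps)

    for t in range(n_steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            action = int(rng.integers(n_arms))
        else:
            # Exploit: pick a greedy arm, breaking ties at random.
            best = np.flatnonzero(q_est == q_est.max())
            action = int(rng.choice(best))

        # Reward is noisy around the chosen arm's true value.
        reward = rng.normal(q_true[action], 1.0)

        # Incremental sample-average update of the estimate.
        counts[action] += 1
        q_est[action] += (reward - q_est[action]) / counts[action]
        rewards[t] = reward

    return rewards, counts


if __name__ == "__main__":
    rewards, counts = run_epsilon_greedy(epsilon=0.1)
    print("mean reward over the run:", rewards.mean())
    print("pulls per arm:", counts)
```

A figure such as Figure 2.2 averages many such runs, each on a freshly sampled bandit, so a single run like this one will look much noisier than the published curves.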

Chapter 3 Finite Markov Decision Processes

  1. Figure 3.2: Gridworld example.
  2. Figure 3.5: Optimal solutions to the gridworld example.
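
The two gridworld figures can be approximated with a short iterative sketch. This is an assumption-laden illustration, not the repository's code: the names `step` and `evaluate` are hypothetical, and the layout (a 5x5 grid, state A at row 0, column 1 jumping to A' with reward +10, state B at row 0, column 3 jumping to B' with reward +5, a reward of -1 for moving off the grid, and a discount of 0.9) is taken from the book's gridworld example as I understand it.

```python
import numpy as np

SIZE = 5
GAMMA = 0.9
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
A, A_PRIME = (0, 1), (4, 1)  # assumed positions of the special states
B, B_PRIME = (0, 3), (2, 3)


def step(state, action):
    """Return (next_state, reward) for a deterministic move."""
    if state == A:
        return A_PRIME, 10.0
    if state == B:
        return B_PRIME, 5.0
    row, col = state[0] + action[0], state[1] + action[1]
    if not (0 <= row < SIZE and 0 <= col < SIZE):
        return state, -1.0  # bumping into the wall: stay put, reward -1
    return (row, col), 0.0


def evaluate(optimal, tol=1e-6):
    """Iterate Bellman backups until the values stop changing.

    optimal=False: equiprobable random policy (as in Figure 3.2).
    optimal=True:  max over actions, i.e. value iteration (as in Figure 3.5).
    """
    v = np.zeros((SIZE, SIZE))
    while True:
        new_v = np.zeros_like(v)
        for r in range(SIZE):
            for c in range(SIZE):
                backups = []
                for a in ACTIONS:
                    (nr, nc), reward = step((r, c), a)
                    backups.append(reward + GAMMA * v[nr, nc])
                new_v[r, c] = max(backups) if optimal else np.mean(backups)
        if np.abs(new_v - v).max() < tol:
            return new_v
        v = new_v


if __name__ == "__main__":
    print(np.round(evaluate(optimal=False), 1))
    print(np.round(evaluate(optimal=True), 1))
```

The same backup loop serves both figures; only the way the four action backups are combined (mean versus max) changes.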

Environment

Reference

Feel free to contact me if you have any questions! Homepage: http://guohai.tech Email: xuguohai7@163.com