Example code implementing the examples in Richard S. Sutton and Andrew G. Barto's book, Reinforcement Learning: An Introduction.
-
1TenArmedBandits.py: code script for the k-armed bandit problem in Chapter 2.
- Bandit class
- Agent class
- epsilon-greedy algorithm
- optimistic initial values
- upper-confidence-bound (UCB) action selection
- gradient bandit algorithm
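
As an illustration of the epsilon-greedy item above, here is a minimal sketch of epsilon-greedy action selection with incremental sample-average updates. It is not taken from 1TenArmedBandits.py: the function names (`epsilon_greedy_action`, `update_estimate`, `run_bandit`) and the unit-variance Gaussian rewards are assumptions made for this example only.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng):
    """Pick a random arm with probability epsilon, otherwise a greedy arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties among the greedy arms at random.
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def update_estimate(q_values, counts, action, reward):
    """Incremental sample-average update: Q <- Q + (R - Q) / N."""
    counts[action] += 1
    q_values[action] += (reward - q_values[action]) / counts[action]


def run_bandit(true_means, steps=1000, epsilon=0.1, seed=0):
    """Run one bandit episode; rewards are assumed Gaussian with unit variance."""
    rng = random.Random(seed)
    q_values = [0.0] * len(true_means)
    counts = [0] * len(true_means)
    for _ in range(steps):
        action = epsilon_greedy_action(q_values, epsilon, rng)
        reward = rng.gauss(true_means[action], 1.0)
        update_estimate(q_values, counts, action, reward)
    return q_values, counts
```

Calling `run_bandit([0.2, 1.0, -0.5])` returns the value estimates and pull counts per arm. Setting `epsilon=0` gives a purely greedy agent, which is one way to compare against the optimistic-initial-values variant.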
-
2GridWorld_Ch3.py: code script for the grid world example in Chapter 3.
- GridWorld
- state values estimated with the Bellman equation
- state values estimated with the Bellman optimality equation
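
To make the Bellman-equation item concrete, the sketch below repeatedly applies the Bellman expectation backup for an equiprobable random policy until the values stop changing. It is not the code in 2GridWorld_Ch3.py. The layout is an assumption based on the book's Example 3.5: a 5x5 grid, discount 0.9, a reward of -1 for moves that would leave the grid, and two special states that teleport the agent with rewards +10 and +5. All names (`GRID`, `SPECIAL`, `step`, `evaluate_random_policy`) are hypothetical.

```python
GRID = 5
GAMMA = 0.9
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
# Special states: every action jumps to a fixed cell with a fixed reward.
SPECIAL = {(0, 1): ((4, 1), 10.0), (0, 3): ((2, 3), 5.0)}


def step(state, action):
    """Return (next_state, reward) for a deterministic move."""
    if state in SPECIAL:
        return SPECIAL[state]
    row, col = state[0] + action[0], state[1] + action[1]
    if 0 <= row < GRID and 0 <= col < GRID:
        return (row, col), 0.0
    return state, -1.0  # bumped into the wall: stay put, reward -1


def evaluate_random_policy(tol=1e-6):
    """Iterate v(s) <- sum_a 0.25 * (r + gamma * v(s')) until convergence."""
    values = [[0.0] * GRID for _ in range(GRID)]
    while True:
        new_values = [[0.0] * GRID for _ in range(GRID)]
        for r in range(GRID):
            for c in range(GRID):
                for action in ACTIONS:
                    (nr, nc), reward = step((r, c), action)
                    new_values[r][c] += 0.25 * (reward + GAMMA * values[nr][nc])
        delta = max(
            abs(new_values[r][c] - values[r][c])
            for r in range(GRID)
            for c in range(GRID)
        )
        values = new_values
        if delta < tol:
            return values
```

For the Bellman optimality equation, the same loop could take the maximum over actions instead of the 0.25-weighted average.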
-
3Carrental_Ch4.py: code script for the car rental problem in Chapter 4. Note that a bug remains, possibly in the get_expected_return function.
- JackRentalCompany, class simulating the car rental company
- policy_evaluate in Agent class
- policy_improve in Agent class
- TODO: debugging still needed
-
6CliffWalk_Ch6.py: code script for the cliff walking problem in Chapter 6, solved with SARSA and Q-learning.
- CliffWalk, class simulating the cliff walking game
- Q_net, class to store q-values
- train_sarsa, function to estimate q values by SARSA
- train_q_learning, function to estimate q values by Q-learning
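
The two training functions above differ mainly in their update target. The sketch below shows that difference with tabular updates. It is not the repository's train_sarsa or train_q_learning: the function names, the `defaultdict` table keyed by `(state, action)`, and the default `alpha` and `gamma` are assumptions for illustration.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=1.0):
    """On-policy: bootstrap from the action the agent will actually take next."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=1.0):
    """Off-policy: bootstrap from the best action available in the next state."""
    target = reward + gamma * max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (target - q[(state, action)])


# Usage: a table whose entries default to 0.0. Terminal states are never
# updated, so their values stay at 0.
q_table = defaultdict(float)
```

Because SARSA's target includes the exploratory action it actually takes next, the two methods can learn different paths along the cliff.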
-
13CliffWalking_Ch13.py: code script for the cliff walking problem, solved with the policy gradient methods of Chapter 13.
- CliffWalk, class simulating the cliff walking game
- REINFORCE
- REINFORCE with Baseline
- Actor-Critic without eligibility traces
- Car Rental in Chapter 4
- Various methods for solving Blackjack in Chapter 5
- Q-learning for Cliff Walking in Chapter 6
- Policy Gradient Methods for Cliff Walking problem in Chapter 13