ShangtongZhang/reinforcement-learning-an-introduction
Python Implementation of Reinforcement Learning: An Introduction
Python, MIT
Issues
Release date - 2nd Edition
#7 opened by andrewcz - 1
break ties in Gambler's Problem
#83 opened by hansweytjens - 5
Chapter 6: Random Walk --> Infinite loop
#72 opened by xenomeno - 5
Chapter 5: Monte Carlo ES initial policy
#77 opened by jerome-white - 2
Chapter 1, TicTacToe.py: Purpose of reshape function?
#79 opened by pk97 - 2
Chapter 3: GridWorld
#78 opened by ychong - 3
chapter13
#66 opened by liiiiiiiiil - 1
Chapter4 - Suggestion
#74 opened by JustinNie - 1
np.argmax may lead to unexpected behavior
#51 opened by ShangtongZhang - 1
Policy evaluation for GridWorld
#67 opened by cbrom - 4
Chapter 4, Gambler's problem incorrect output
#62 opened by sharmavedic - 1
Is there any boundary for policies with off-policy Q-learning using the tree backup algorithm?
#57 opened - 1
The link to the book is broken
#58 opened - 1
One question posted on SOF
#56 opened by cinqs - 0
Make the value function invariant under rotation and mirror of the board for Tic-Tac-Toe
#4 opened by ShangtongZhang - 2
questions on Q-learning applied for racetrack
#47 opened by xubo92 - 1
About the action selection in Double Q-Learning
#46 opened by ewanlee - 3
ImportError: No module named utils.utils
#37 opened by SJTUGuofei - 2
exercise 4.8
#40 opened by persistforever - 4
gambler's problem
#39 opened by datahaki - 2
should there be some changes about the code in chapter 12 "Eligibility Traces", RandomWalk.py, class "OffLineLambdaReturn" function "nStepReturnFromTime"
#35 opened by xiaogengyaokeyan - 2
Ch02 TenArmedTestbed sampleAverage
#34 opened by wugh - 1
Some explanation of tictactoe is required
#33 opened by mohanr - 4
Policy Iteration in Chapter4 for RentalCar
#28 opened by loopinvariant4 - 1
debug errors for missing modules
#27 opened by huiwenzhang - 4
Maybe Error in Chapter03/GridWorld.py
#26 opened by ZiJianZhao - 1
Licensing: MIT?
#20 opened by aoboturov - 1
Observation & Offer To Help
#10 opened by atki4564 - 7
Add a link to learning resource
#2 opened by dmulitsa - 1
Ch2, line 62: What does the operator '+ \' do?
#8 opened by atki4564