mpatacchiola/dissecting-reinforcement-learning

Python code, PDFs and resources for the series of posts on Reinforcement Learning which I published on my personal blog

Python · MIT

Issues

Part 3, TD(lambda): trace_matrix should be reset to zeroes at the beginning of each epoch
#22 opened 6 months ago by johanwiden
0
Part.1 Modified Policy Iteration with Simplified Bellman Equation and Linear Algebra Policy Evaluation Infinite Loop
#20 opened 2 years ago by CesarAndresRojas
1
mdp linear algebra approach cannot stop
#6 opened 6 years ago by zdarktknight
4
Missing brackets
#18 opened 4 years ago by DoDzilla-ai
0
Print statement causing issue in Python 3.x
#15 opened 4 years ago by DoDzilla-ai
0
Two undefined variables
#16 opened 4 years ago by DoDzilla-ai
1
The clean robot example on chapter 1 ?
#14 opened 5 years ago by ngthanhtin
2
11X11 grid
#13 opened 5 years ago by Andlibmehndi
2
about greedy agent in multi-armed bandit
#12 opened 5 years ago by ZichaoHuang
2
Looking forward to post #8
#5 opened 5 years ago by BKJackson
1
adding optimal policy calculation in the value iteration algorithm
#3 opened 6 years ago by ivan-v-kush
2
Problem in executing: "Montecarlo_control.py"
#7 opened 6 years ago
1
typo in policy iteration algorithm on the site
#2 opened 7 years ago by ivan-v-kush
1
Alternative to Numpy
#1 opened 7 years ago by abencomo
1