Richard S. Sutton and Andrew G. Barto 2018 RLBook Exercises

Completion status:

Status	Chapter	Done
✅	1: Introduction	5/5
	2: Multi-armed Bandits	7/10
	3: Finite Markov Decision Processes	14/29
	4: Dynamic Programming	8/10
	5: Monte Carlo Methods	8/14
	6: Temporal-Difference Learning	1/14
	7: n-step Bootstrapping	0/10
	8: Planning and Learning with Tabular Methods	4/8
	9: On-policy Prediction with Approximation	1/8

New solutions should be submitted through pull requests. Base file templates for Markdown and Jupyter notebooks are available in the stubs folder.

Files should be placed in the corresponding chapter folder, following the naming scheme for Markdown and Jupyter notebooks:

Ex_X.XX.md

Ex_X.XX.ipynb
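
Before opening a pull request, a filename can be checked against the naming scheme with a few lines of Python. This is only a minimal sketch and not part of the repository: the helper `parse_exercise_filename` and the regular expression are hypothetical, and they assume that the first `X` is the chapter number and `XX` is the exercise number. Whether the exercise number must be zero-padded is not stated here, so the pattern accepts any number of digits.

```python
import re

# Hypothetical pattern: "Ex_" + chapter number + "." + exercise number + extension.
# Zero-padding of the exercise number is an open assumption, so any digit count is accepted.
EXERCISE_FILE = re.compile(r"Ex_(\d+)\.(\d+)\.(md|ipynb)")


def parse_exercise_filename(name):
    """Return (chapter, exercise, extension), or None if the name does not match."""
    match = EXERCISE_FILE.fullmatch(name)
    if match is None:
        return None
    chapter, exercise, extension = match.groups()
    return int(chapter), int(exercise), extension
```

For example, `parse_exercise_filename("Ex_2.05.md")` yields the chapter, exercise number, and extension, while a name that does not follow the scheme yields `None`.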

Fillipedem/RLBookExercises