TD-learning-in-random-walk-environment

This project aims to replicate Figures 3, 4, and 5 from Richard Sutton’s 1988 paper “Learning to Predict by the Methods of Temporal Differences.”
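For orientation, here is a minimal sketch of the TD(λ) update on the paper's bounded random walk (non-terminal states B to F between terminal states A and G, with outcome 0 at A and 1 at G). It is not taken from this repository's notebook: the names `generate_walk`, `td_lambda_update`, `run`, and `rmse`, the initial weights of 0.5, and the choice to apply the weight change after each sequence are all assumptions made for illustration.

```python
import numpy as np

N_STATES = 5  # non-terminal states B..F, indexed 0..4
# Ideal predictions for B..F: 1/6, 2/6, 3/6, 4/6, 5/6
TRUE_VALUES = np.arange(1, N_STATES + 1) / (N_STATES + 1)


def generate_walk(rng):
    """Simulate one walk from the centre state D.

    Returns the list of visited non-terminal state indices and the
    outcome z (0.0 if the walk ends on the left, 1.0 on the right).
    """
    state = N_STATES // 2
    states = [state]
    while True:
        state += 1 if rng.random() < 0.5 else -1
        if state < 0:
            return states, 0.0
        if state >= N_STATES:
            return states, 1.0
        states.append(state)


def td_lambda_update(w, states, z, lam, alpha):
    """Accumulate the TD(lambda) weight change over one sequence.

    States are one-hot vectors, so the prediction for state s is w[s]
    and its gradient is the unit vector for s. w is held fixed while
    the change is accumulated (an assumption of this sketch).
    """
    delta_w = np.zeros_like(w)
    trace = np.zeros_like(w)  # sum over k <= t of lam^(t-k) * x_k
    for t, s in enumerate(states):
        trace = lam * trace
        trace[s] += 1.0
        p_t = w[s]
        # The final prediction is compared against the actual outcome z.
        p_next = w[states[t + 1]] if t + 1 < len(states) else z
        delta_w += alpha * (p_next - p_t) * trace
    return delta_w


def rmse(w):
    """Root-mean-squared error against the ideal predictions."""
    return float(np.sqrt(np.mean((w - TRUE_VALUES) ** 2)))


def run(lam, alpha, n_sequences=10, seed=0):
    """Train on n_sequences walks, updating w after each sequence."""
    rng = np.random.default_rng(seed)
    w = np.full(N_STATES, 0.5)  # assumed initial weights
    for _ in range(n_sequences):
        states, z = generate_walk(rng)
        w = w + td_lambda_update(w, states, z, lam, alpha)
    return w
```

Calling `rmse(run(lam=0.3, alpha=0.1))` gives the error for one training run; sweeping `lam` and `alpha` and averaging over many seeds is one way to approach curves of the kind the figures show.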

Primary Language: Jupyter Notebook
