
Primary language: Python. License: MIT.

Value Iteration: Dynamic Programming

Based on Example 4.3 (Gambler's Problem) from Chapter 4 of the textbook Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (2nd edition). This repo contains code for the example, the value iteration algorithm, and Exercise 4.9.
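For orientation, here is a minimal sketch of value iteration on the Gambler's Problem. It is not taken from this repo's code: the function name `value_iteration` and the parameters `p_h`, `goal`, and `theta` are assumptions chosen for illustration. The sketch also assumes the common shortcut of fixing the value of the goal state at 1 instead of modelling a +1 reward on the winning transition, and it updates the values in place during each sweep.

```python
def value_iteration(p_h=0.4, goal=100, theta=1e-9):
    """Illustrative value iteration for the Gambler's Problem.

    p_h   : probability that the coin comes up heads (the gambler wins the stake)
    goal  : capital at which the gambler wins
    theta : stop once the largest update in a sweep falls below this threshold
    """
    # V[s] is the estimated probability of reaching the goal from capital s.
    # Assumption: terminal values are held fixed at V[0] = 0 and V[goal] = 1.
    V = [0.0] * (goal + 1)
    V[goal] = 1.0

    while True:
        delta = 0.0
        for s in range(1, goal):
            # Legal stakes are 1 .. min(s, goal - s); keep the best expected value.
            best = max(
                p_h * V[s + a] + (1 - p_h) * V[s - a]
                for a in range(1, min(s, goal - s) + 1)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update, reused later in the same sweep
        if delta < theta:
            return V


if __name__ == "__main__":
    values = value_iteration(p_h=0.4)
    print(values[25], values[50], values[75])
```

With `p_h=0.4`, the value at a capital of 50 should come out close to 0.4, since staking everything wins immediately with probability 0.4. Other values of `p_h`, such as those asked for in Exercise 4.9, can be passed the same way, though convergence may take more sweeps.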

The code has been tested with Python 3. It may also run under Python 2 with minor tweaks.