averysi224/blackjack

An implementation of various solutions for a simplified version of blackjack, as described in 'Reinforcement Learning: An Introduction (Richard S. Sutton, Andrew G. Barto)'

Python

blackjack

Solutions implemented:

Monte Carlo with ES (Exploring Starts)
On-policy first-visit Monte Carlo control (for epsilon-soft policies)
Off-policy Monte Carlo control
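
As an illustration of the first method in the list above, here is a minimal sketch of Monte Carlo ES on the book's simplified blackjack. It is not this repository's code: the names `draw_card`, `add_card`, `dealer_total`, `play_episode`, and `mc_es` are hypothetical, and the sketch assumes an infinite deck, a dealer who sticks on 17 or more, rewards of +1/0/-1, no discounting, and no special handling of naturals. Because a blackjack state cannot repeat within an episode, first-visit and every-visit averaging coincide here.

```python
import random
from collections import defaultdict

HIT, STICK = 1, 0  # hypothetical action encoding

def draw_card(rng):
    # Infinite deck: ace is 1, face cards count as 10.
    return min(rng.randint(1, 13), 10)

def add_card(total, usable_ace, card):
    # Count an ace as 11 when that does not bust the hand.
    if card == 1 and total + 11 <= 21:
        total, usable_ace = total + 11, True
    else:
        total += card
    # Downgrade a usable ace from 11 to 1 to avoid busting.
    if total > 21 and usable_ace:
        total, usable_ace = total - 10, False
    return total, usable_ace

def dealer_total(showing, rng):
    # Assumed dealer rule: hit until the total is 17 or more.
    total, usable = add_card(0, False, showing)
    while total < 17:
        total, usable = add_card(total, usable, draw_card(rng))
    return total

def play_episode(policy, rng):
    # Exploring starts: random initial state and random first action.
    player, dealer = rng.randint(12, 21), rng.randint(1, 10)
    usable = rng.random() < 0.5
    action = rng.choice((HIT, STICK))
    visited = []
    while True:
        visited.append(((player, dealer, usable), action))
        if action == STICK:
            d = dealer_total(dealer, rng)
            if d > 21 or player > d:
                return visited, 1
            return visited, 0 if player == d else -1
        player, usable = add_card(player, usable, draw_card(rng))
        if player > 21:
            return visited, -1
        action = policy[(player, dealer, usable)]

def mc_es(num_episodes, seed=0):
    rng = random.Random(seed)
    states = [(p, d, u) for p in range(12, 22)
              for d in range(1, 11) for u in (False, True)]
    # Initial policy: stick only on 20 or 21.
    policy = {s: (STICK if s[0] >= 20 else HIT) for s in states}
    q = defaultdict(float)   # action-value estimates
    n = defaultdict(int)     # visit counts per (state, action)
    for _ in range(num_episodes):
        visited, reward = play_episode(policy, rng)
        for state, action in visited:
            # Undiscounted, terminal-only reward: the return equals reward.
            n[state, action] += 1
            q[state, action] += (reward - q[state, action]) / n[state, action]
            # Greedy policy improvement at the visited state.
            policy[state] = max((HIT, STICK), key=lambda a: q[state, a])
    return policy, q
```

Calling `mc_es(500000)` returns a `(policy, q)` pair; `policy[(player_sum, dealer_card, usable_ace)]` is then the greedy action for that state. How many episodes are needed for the estimates to settle is not something this sketch establishes.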

All three solutions converge to the optimal policy shown below.

Examples