Playing Blackjack using Monte Carlo learning with an epsilon-greedy strategy.
Primary language: Jupyter Notebook
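The approach named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's actual code: it assumes a simplified game (infinite deck, hit or stick only, no special payout for a natural, dealer draws until reaching 17), and every name in it (`hand_value`, `play_episode`, `mc_control`, `greedy_policy`) is hypothetical. It estimates action values by averaging episode returns and picks actions epsilon-greedily with respect to those estimates.

```python
import random
from collections import defaultdict

# Assumption: infinite deck, ace counted as 1 (optionally 11), face cards as 10.
CARDS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]
STICK, HIT = 0, 1


def hand_value(cards):
    """Return (total, usable_ace) for a list of card values."""
    total = sum(cards)
    if 1 in cards and total + 10 <= 21:
        return total + 10, True  # count one ace as 11
    return total, False


def play_episode(Q, epsilon, rng):
    """Play one hand epsilon-greedily; return visited (state, action) pairs and the reward."""
    player = [rng.choice(CARDS), rng.choice(CARDS)]
    dealer = [rng.choice(CARDS), rng.choice(CARDS)]
    trajectory = []
    while True:
        total, usable = hand_value(player)
        if total > 21:
            return trajectory, -1  # player busts
        state = (total, dealer[0], usable)
        if rng.random() < epsilon:
            action = rng.choice([STICK, HIT])  # explore
        else:
            action = max((STICK, HIT), key=lambda a: Q[(state, a)])  # exploit
        trajectory.append((state, action))
        if action == STICK:
            break
        player.append(rng.choice(CARDS))
    # Assumed dealer rule: draw until the total is at least 17.
    while hand_value(dealer)[0] < 17:
        dealer.append(rng.choice(CARDS))
    p, d = hand_value(player)[0], hand_value(dealer)[0]
    if d > 21 or p > d:
        return trajectory, 1
    return trajectory, 0 if p == d else -1


def mc_control(episodes=50_000, epsilon=0.1, seed=0):
    """Monte Carlo control: Q is the running average of observed returns."""
    rng = random.Random(seed)
    Q = defaultdict(float)  # action-value estimates
    N = defaultdict(int)    # visit counts
    for _ in range(episodes):
        trajectory, reward = play_episode(Q, epsilon, rng)
        # The only reward arrives at the end and is undiscounted, so the
        # return following every visited pair equals the final reward.
        for state, action in trajectory:
            key = (state, action)
            N[key] += 1
            Q[key] += (reward - Q[key]) / N[key]  # incremental mean
    return Q


def greedy_policy(Q):
    """Extract the greedy action for every state seen during training."""
    states = {state for (state, _action) in Q}
    return {s: max((STICK, HIT), key=lambda a: Q[(s, a)]) for s in states}
```

Calling `greedy_policy(mc_control())` returns a dictionary that maps each visited state `(player_total, dealer_card, usable_ace)` to `STICK` or `HIT`. Keeping `epsilon` above zero keeps every action in every visited state being tried occasionally, which the averaging step relies on.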
Optimal policy compared with a random policy:
Optimal policy reached by the Monte Carlo method: