/Reinforcement-Learning

An implementation of the WoLF-PHC (Win or Learn Fast, Policy Hill-Climbing) algorithm for learning policies in stochastic games.
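For readers new to the algorithm, here is a minimal sketch of a tabular WoLF-PHC learner. It is not taken from this repository: the class name `WoLFPHCAgent`, its method names, and all hyperparameter values (`alpha`, `gamma`, `delta_win`, `delta_lose`, `epsilon`) are assumptions chosen for illustration. The idea is that the agent runs ordinary Q-learning, tracks its own average policy, and moves its current policy toward the greedy action with a small step when it is "winning" (current policy beats the average policy under the current Q-values) and a larger step when it is losing.

```python
import random


class WoLFPHCAgent:
    """Illustrative tabular WoLF-PHC learner (hypothetical names and defaults)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04, epsilon=0.1, seed=None):
        assert n_actions >= 2, "the policy step divides by (n_actions - 1)"
        self.n_actions = n_actions
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        uniform = 1.0 / n_actions
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.pi = [[uniform] * n_actions for _ in range(n_states)]
        self.avg_pi = [[uniform] * n_actions for _ in range(n_states)]
        self.visits = [0] * n_states

    def act(self, state):
        # Assumption: simple epsilon exploration on top of the mixed policy.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.rng.choices(range(self.n_actions), weights=self.pi[state])[0]

    def update(self, state, action, reward, next_state):
        n = self.n_actions
        q, pi, avg = self.q[state], self.pi[state], self.avg_pi[state]

        # 1. Standard Q-learning update.
        target = reward + self.gamma * max(self.q[next_state])
        q[action] = (1 - self.alpha) * q[action] + self.alpha * target

        # 2. Incrementally update the average policy for this state.
        self.visits[state] += 1
        for a in range(n):
            avg[a] += (pi[a] - avg[a]) / self.visits[state]

        # 3. "Winning" if the current policy's expected value exceeds
        #    that of the average policy; learn slowly when winning.
        current_value = sum(p * v for p, v in zip(pi, q))
        average_value = sum(p * v for p, v in zip(avg, q))
        delta = self.delta_win if current_value > average_value else self.delta_lose

        # 4. Hill-climb: shift probability mass toward the greedy action,
        #    never taking more from an action than it currently holds.
        best = max(range(n), key=q.__getitem__)
        for a in range(n):
            if a != best:
                step = min(pi[a], delta / (n - 1))
                pi[a] -= step
                pi[best] += step


if __name__ == "__main__":
    # Toy single-state game (made up for this sketch): action 1 pays 1, action 0 pays 0.
    agent = WoLFPHCAgent(n_states=1, n_actions=2, seed=0)
    for _ in range(2000):
        a = agent.act(0)
        agent.update(0, a, float(a), 0)
    print(agent.pi[0])
```

The clipped step in part 4 keeps each state's policy a valid probability distribution without a separate projection. In a real multi-agent setting, each player would hold its own agent and `reward` would depend on the joint action.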

Primary language: Python. License: MIT.
