/MCTS-variants

Demo of how "MCTS as regularized policy optimization" works
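
As a rough illustration of the idea named in the description, the sketch below computes a regularized policy of the form `pi_bar(a) = lam * prior(a) / (alpha - q(a))`, as formulated in the paper "Monte-Carlo Tree Search as Regularized Policy Optimization". It is not taken from this repository: the function name `regularized_policy`, its parameters, and the bisection loop are assumptions made for illustration, and the prior is assumed to be strictly positive.

```python
import numpy as np


def regularized_policy(q, prior, lam, iters=100):
    """Hypothetical helper: solve for pi_bar(a) = lam * prior(a) / (alpha - q(a)).

    alpha is the normalizing constant that makes pi_bar sum to 1; it is found
    here by bisection. Assumes lam > 0 and prior(a) > 0 for every action.
    """
    q = np.asarray(q, dtype=float)
    prior = np.asarray(prior, dtype=float)

    # Bracket alpha: at `lo` the unnormalized sum is >= 1, at `hi` it is <= 1.
    lo = np.max(q + lam * prior)
    hi = np.max(q) + lam

    for _ in range(iters):
        alpha = 0.5 * (lo + hi)
        total = np.sum(lam * prior / (alpha - q))
        if total > 1.0:
            lo = alpha  # alpha too small, policy mass exceeds 1
        else:
            hi = alpha  # alpha large enough, tighten the upper bound

    pi = lam * prior / (hi - q)
    return pi / pi.sum()  # renormalize to absorb bisection tolerance


if __name__ == "__main__":
    q_values = [1.0, 0.0, 0.5]
    prior = [1 / 3, 1 / 3, 1 / 3]
    # A large lam keeps the result near the prior; a small lam favors high q.
    print(regularized_policy(q_values, prior, lam=10.0))
    print(regularized_policy(q_values, prior, lam=0.01))
```

With equal Q-values the result reduces to the prior, and as `lam` shrinks the policy concentrates on the action with the highest Q-value, which is the trade-off the regularized view is meant to expose.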

Primary Language: Python
