/mcts_python

MCTS (actually PUCT) in Python
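
For readers unfamiliar with the distinction, here is a minimal sketch of PUCT child selection as commonly described for AlphaZero-style search. Unlike plain UCT, the exploration term is weighted by a prior probability from the policy network. The function names, the c_puct default and the dictionary layout are assumptions made for illustration; the formula used in this repository may differ in detail.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Return Q + U, with U = c_puct * P * sqrt(N_parent) / (1 + N_child)."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration


def select_action(children, c_puct=1.5):
    """Pick the action with the highest PUCT score.

    children maps action -> (q_value, prior, visit_count); this layout is
    hypothetical and chosen only to keep the example short.
    """
    parent_visits = sum(visits for _, _, visits in children.values())

    def score(action):
        q, prior, visits = children[action]
        return puct_score(q, prior, parent_visits, visits, c_puct)

    return max(children, key=score)
```

An unvisited child with a high prior gets a large exploration bonus, so it is tried before a well-explored child with a moderate value estimate.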

Demo Games

Supports

  • Custom environment with clear APIs
    • Examples are in /games
  • Arbitrary number of agents with per-agent rewards
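
To illustrate what an environment with per-agent rewards might look like, here is a toy game written against a hypothetical interface. The class name, the method names (reset, legal_actions, step) and the return shapes are assumptions for this sketch, not the actual API of this repository; see the examples in /games for the real one.

```python
from typing import List, Tuple


class TakeStones:
    """Toy N-agent game: agents take turns removing 1 or 2 stones.

    Whoever takes the last stone receives +1; every other agent receives -1.
    """

    def __init__(self, n_agents: int = 3, n_stones: int = 7):
        self.n_agents = n_agents
        self.n_stones = n_stones
        self.reset()

    def reset(self) -> int:
        """Restore the initial state and return the observation."""
        self.stones = self.n_stones
        self.current_agent = 0
        return self.stones

    def legal_actions(self) -> List[int]:
        """Actions available to the current agent."""
        return [a for a in (1, 2) if a <= self.stones]

    def step(self, action: int) -> Tuple[int, List[float], bool]:
        """Apply an action; return (observation, per-agent rewards, done)."""
        if action not in self.legal_actions():
            raise ValueError(f"illegal action: {action}")
        self.stones -= action
        done = self.stones == 0
        if done:
            rewards = [1.0 if i == self.current_agent else -1.0
                       for i in range(self.n_agents)]
        else:
            rewards = [0.0] * self.n_agents
            self.current_agent = (self.current_agent + 1) % self.n_agents
        return self.stones, rewards, done
```

Returning one reward per agent, rather than a single scalar, is what lets a search back up a separate value for each player.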

Instructions

  • Run run_mcts.py to start.
  • Edit config.py to change the game or its configuration.

Key parameters of MCTS(PUCT)

  • n_iters: the larger it is, the stronger the neural network becomes; training time increases linearly.
  • n_eps: the larger it is, the more robust training becomes; training time increases linearly.
  • n_mcts: the larger it is, the more samples the brute-force search draws; training time and testing time increase polynomially.
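
As a rough illustration of how these three parameters combine, the sketch below counts search simulations under an assumed AlphaZero-style layout: n_iters outer training iterations, n_eps self-play episodes per iteration, and n_mcts simulations per move. That nesting and the moves_per_episode argument are assumptions inferred from the descriptions above, not taken from the code. The sketch counts simulations only and does not model the cost of each one, so it says nothing about how wall-clock time grows with n_mcts.

```python
def count_search_simulations(n_iters, n_eps, n_mcts, moves_per_episode):
    """Count MCTS simulations in an assumed AlphaZero-style training run."""
    total = 0
    for _ in range(n_iters):                    # training iterations
        for _ in range(n_eps):                  # self-play episodes
            for _ in range(moves_per_episode):  # moves within one episode
                total += n_mcts                 # simulations before each move
    return total
```

Doubling n_iters or n_eps doubles the count, which matches the linear growth described for those two parameters.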

Possible Improvements

  • add a Q head
  • add α for Dirichlet noise
  • cyclic learning rate
  • PPO for policy
  • episodic memory for value
  • population based training
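
As one example from the list above, Dirichlet noise is usually mixed into the root priors so that self-play explores moves the policy currently ranks low. The sketch below uses only the standard library, drawing a symmetric Dirichlet sample by normalising Gamma variates. The function names and the defaults alpha=0.3 and epsilon=0.25 are illustrative assumptions, not values from this repository.

```python
import random


def sample_dirichlet(alpha, k, rng=random):
    """Sample from a symmetric Dirichlet(alpha) over k outcomes."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow
        return [1.0 / k] * k
    return [d / total for d in draws]


def add_root_noise(priors, alpha=0.3, epsilon=0.25, rng=random):
    """Mix root priors with noise: (1 - epsilon) * p + epsilon * noise."""
    noise = sample_dirichlet(alpha, len(priors), rng)
    return [(1 - epsilon) * p + epsilon * n for p, n in zip(priors, noise)]
```

A smaller alpha concentrates the noise on fewer moves, while a larger alpha spreads it more evenly; the mixed priors still sum to 1 when the input priors do.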

Interesting Literature:

Requirements

  • Python 3.8+
  • See requirement.txt for the remaining dependencies