Chinese Chess Xboard engine using MCTS and DNN
AlphaGo has achieved a high winning rate against other Go programs and defeated Lee Sedol, a top professional player from South Korea. This inspired us to design BetaElephant, a Chinese Chess AI, to test whether the AlphaGo framework can be applied to other domains.
BetaElephant is mainly a combination of Monte Carlo Tree Search (MCTS) and several Deep Neural Networks (DNNs). MCTS finds the move with the highest winning rate by expanding a search tree, while the policy and value DNNs provide MCTS with the prior probability of each move and a valuation of the board position. In each MCTS iteration, the search selects a path using the prior probabilities and the results of previous searches, adds a new leaf node, and updates the statistics along that path using the board valuation and the result of a playout. The DNNs are trained by a novel combination of supervised learning and reinforcement learning, taking data from human expert games and self-play results.
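The search loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the BetaElephant implementation (which lives in C++): the `Nim` toy game, the uniform `uniform_policy`, the constant `zero_value`, the PUCT-style selection rule, and the `c_puct` and `mix` parameters are all assumptions made for the example. It also expands every child of a leaf at once, which is a simplification of adding a single new leaf.

```python
import math
import random


class Node:
    """Search-tree node; statistics are from the view of the player who moved into it."""

    def __init__(self, prior):
        self.prior = prior      # prior probability from the policy function
        self.visits = 0         # number of times this node was on a searched path
        self.value_sum = 0.0    # sum of backed-up values
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node, c_puct):
    """Pick the (move, child) maximising Q + U; U favours high-prior, rarely visited moves."""
    sqrt_total = math.sqrt(node.visits + 1)

    def score(item):
        child = item[1]
        return child.q() + c_puct * child.prior * sqrt_total / (1 + child.visits)

    return max(node.children.items(), key=score)


def rollout(state, game, rng):
    """Random playout; result is from the view of the player to move at `state`."""
    sign = 1.0
    while not game.is_terminal(state):
        state = game.play(state, rng.choice(game.legal_moves(state)))
        sign = -sign
    return sign * game.terminal_value(state)


def search(root_state, game, policy_fn, value_fn, n_iter=200, c_puct=1.5, mix=0.5, seed=0):
    """Run n_iter MCTS iterations and return the most visited root move."""
    rng = random.Random(seed)
    root = Node(prior=1.0)
    for _ in range(n_iter):
        node, state, path = root, root_state, [root]
        # Selection: follow priors and previous search results down to a leaf.
        while node.children:
            move, node = select_child(node, c_puct)
            state = game.play(state, move)
            path.append(node)
        # Expansion and evaluation of the leaf (value is for the player to move there).
        if game.is_terminal(state):
            leaf_value = game.terminal_value(state)
        else:
            for move, p in policy_fn(state).items():
                node.children[move] = Node(prior=p)
            leaf_value = (1 - mix) * value_fn(state) + mix * rollout(state, game, rng)
        # Backup: flip the sign at every ply on the way back to the root.
        value = -leaf_value
        for visited in reversed(path):
            visited.visits += 1
            visited.value_sum += value
            value = -value
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]


class Nim:
    """Toy stand-in game: take 1 to 3 stones; whoever takes the last stone wins."""

    def legal_moves(self, stones):
        return [m for m in (1, 2, 3) if m <= stones]

    def play(self, stones, move):
        return stones - move

    def is_terminal(self, stones):
        return stones == 0

    def terminal_value(self, stones):
        return -1.0  # the player to move has no stones left to take and has lost


GAME = Nim()


def uniform_policy(state):
    """Stand-in for the policy network: equal prior for every legal move."""
    moves = GAME.legal_moves(state)
    return {m: 1.0 / len(moves) for m in moves}


def zero_value(state):
    """Stand-in for the value network: no opinion about the position."""
    return 0.0


if __name__ == "__main__":
    print(search(5, GAME, uniform_policy, zero_value, n_iter=1000))
```

In a real engine, `policy_fn` and `value_fn` would wrap the trained networks and `game` would implement the Chinese Chess rules. The toy game is only there so the sketch can run on its own.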
All of the code is still under development ...
- mcts-xboard/ contains the C++ code of the main program. To compile BetaElephant, run
  ./compile
- util/ contains commonly used helpers, such as dataset utilities and TensorFlow models.
- policy_experiments/ records all the trials we conducted to find the optimal policy network.
- train_policy/ contains the optimized policy network. Run
  python3 model.py && tensorboard --logdir .
  to start the TensorBoard HTTP server, where you can view the architecture of our policy network.
- rl_train/ contains the unfinished reinforcement learning framework.
- export_nets/ provides tools to export trained models, which are loaded by the main program through the TensorFlow C++ API.
- chess_rule/ contains the Python package that takes a FEN string as input and returns the legal moves.
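
To illustrate the kind of work a FEN-based rule package performs, here is a minimal sketch that parses a Chinese Chess FEN string into a 10 x 9 board and lists the pseudo-legal moves of a single chariot (rook). The function names `parse_fen` and `chariot_moves`, the piece letters in `START_FEN`, and the `(row, col)` move format are assumptions for this example and are not the actual chess_rule API. Other pieces and check detection are left out.

```python
# Commonly used starting position; upper case is Red, lower case is Black.
# Some tools use different letters for the elephant and horse (assumption here: b and n).
START_FEN = "rnbakabnr/9/1c5c1/p1p1p1p1p/9/9/P1P1P1P1P/1C5C1/9/RNBAKABNR w - - 0 1"


def parse_fen(fen):
    """Return (board, side). board has 10 rows of 9 points; '.' marks an empty point."""
    fields = fen.split()
    side = fields[1] if len(fields) > 1 else "w"
    board = []
    for rank in fields[0].split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))  # a digit encodes that many empty points
            else:
                row.append(ch)
        if len(row) != 9:
            raise ValueError("each rank must describe exactly 9 points: %r" % rank)
        board.append(row)
    if len(board) != 10:
        raise ValueError("a Chinese Chess board has exactly 10 ranks")
    return board, side


def chariot_moves(board, row, col):
    """Pseudo-legal target points for the chariot at (row, col); ignores check."""
    piece = board[row][col]
    moves = []
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r, c = row + dr, col + dc
        while 0 <= r < 10 and 0 <= c < 9:
            target = board[r][c]
            if target == ".":
                moves.append((r, c))
            else:
                if target.isupper() != piece.isupper():
                    moves.append((r, c))  # capture an enemy piece, then stop
                break
            r, c = r + dr, c + dc
    return moves


if __name__ == "__main__":
    board, side = parse_fen(START_FEN)
    print(side, chariot_moves(board, 9, 0))
```

Row 0 is the top rank of the FEN string (Black's back rank) and row 9 is Red's back rank, so `(9, 0)` is Red's left chariot in the starting position.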