AzGomoku

The Alpha (Go) (Zero) algorithm applied to the game of Gomoku (Five in a Row). The code is largely based on the pseudo-code provided in the Alpha Zero paper.

References: Alpha Go Zero paper, Alpha Zero paper.

Play against the engine

Use hmplGUI.py (Python) or hmplGUIC.py (C++, fastest), or go to https://yuanshengzhao.github.io/AlphaZero_Gomoku/ (JavaScript, similar in speed to the Python version).

Network

Reference: Lczero.

Board: 15x15x2. maps: [stones-self, stones-opponent]. No need for color or history.
Init: Conv2D, 3x3, 64 filters.
Body: 10-block SE-ResNet tower, 3x3, 64 filters, 32 SE_channels.
Body followed by:
- P-Head: Conv2D(3,64) -- Conv2D(3,1)
- V-Head: Conv2D(3,32) -- Dense(128) -- Dense(1)
Other values of nblock or nfilter are also supported.
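
The two-plane board input described above can be sketched as follows. This is a minimal illustration, not the repository's code: the names `encode_board`, `EMPTY`, `BLACK`, and `WHITE` are hypothetical, and it assumes that "stones-self" means the stones of the side to move, which would explain why no color plane is needed.

```python
SIZE = 15
EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical cell codes, not taken from the repository


def encode_board(board, to_move):
    """Encode a SIZE x SIZE board as a SIZE x SIZE x 2 nested list.

    Channel 0 marks the stones of the side to move ("stones-self"),
    channel 1 marks the opponent's stones ("stones-opponent").
    """
    opponent = WHITE if to_move == BLACK else BLACK
    return [
        [[float(cell == to_move), float(cell == opponent)] for cell in row]
        for row in board
    ]


# Usage: a single black stone in the centre, seen from each side.
board = [[EMPTY] * SIZE for _ in range(SIZE)]
board[7][7] = BLACK
as_black = encode_board(board, BLACK)  # centre cell is [1.0, 0.0]
as_white = encode_board(board, WHITE)  # centre cell is [0.0, 1.0]
```

Because the planes are relative to the side to move, the same position yields swapped channels for the two players.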

Parameters

WDL encoding: 1/0.5/0 (not the +1/0/-1 stated in the paper)
cpuct_base: 19652
cpuct: 1.25
FPU reduction: 0.3 (0.1 for training)
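
To show how these parameters could fit together, here is a hedged sketch of a PUCT selection score in the style of the Alpha Zero pseudo-code. The function names are hypothetical, and the FPU rule (an unvisited child is valued at the parent's value minus a constant reduction, clamped at 0) is an assumption for illustration; the engine's actual rule may differ.

```python
import math

CPUCT_BASE = 19652
CPUCT_INIT = 1.25
FPU_REDUCTION = 0.3  # assumed to be the non-training value


def to_unit_value(z):
    """Map a +1/0/-1 outcome to the 1/0.5/0 encoding."""
    return (z + 1) / 2


def exploration_rate(parent_visits):
    """Exploration constant that grows slowly with the parent's visit count."""
    return math.log((parent_visits + CPUCT_BASE + 1) / CPUCT_BASE) + CPUCT_INIT


def puct_score(parent_visits, parent_q, child_visits, child_value_sum, prior):
    """Score of one child; values lie in [0, 1] from the mover's perspective."""
    if child_visits > 0:
        q = child_value_sum / child_visits
    else:
        # Assumed first-play urgency: parent value minus a fixed reduction.
        q = max(0.0, parent_q - FPU_REDUCTION)
    u = exploration_rate(parent_visits) * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u
```

During search, the child with the highest `puct_score` would be selected at each node.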

Training

Set the appropriate parameters (hash size, n_prec, etc.) first, then run autotrc.py.

num_simul: 800
dirichlet noise: alpha 0.05 (~10/225), fraction 0.25
temperature: 1 for 30 ply, 0 from 16th ply
opening book: none, 1 random move, or 26 canonical openings (transformed)
batch size: 1024
optimizer: Adam
learning rate: 1e-3 to 1e-5
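
The root noise and temperature settings above can be illustrated with a short sketch using only the standard library. The function names are hypothetical and this is not the code in autotrc.py; it assumes the usual Alpha Zero scheme, where Dirichlet noise is mixed into the root priors and the move is sampled from visit counts raised to 1/temperature.

```python
import random


def dirichlet(alpha, n, rng=random):
    """Sample a symmetric Dirichlet(alpha) vector via normalised gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def add_root_noise(priors, alpha=0.05, fraction=0.25, rng=random):
    """Mix Dirichlet noise into the root priors: (1 - fraction) * p + fraction * noise."""
    noise = dirichlet(alpha, len(priors), rng)
    return [(1.0 - fraction) * p + fraction * n for p, n in zip(priors, noise)]


def select_move(visit_counts, temperature, rng=random):
    """Pick a move index: greedy at temperature 0, otherwise sampled."""
    if temperature == 0:
        return max(range(len(visit_counts)), key=visit_counts.__getitem__)
    weights = [count ** (1.0 / temperature) for count in visit_counts]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Usage: uniform priors over the 225 points of an empty board.
rng = random.Random(0)
noisy = add_root_noise([1.0 / 225] * 225, rng=rng)
greedy = select_move([10, 700, 90], temperature=0, rng=rng)  # index of the largest count
```

With alpha this small, most of the noise mass lands on a handful of points, which nudges the search toward a few otherwise unexplored moves per game.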

For pretrained weights, go to releases.
