AlphaGOZero (Python TensorFlow implementation)
This is a trial implementation of DeepMind's October 19th, 2017 publication: Mastering the Game of Go without Human Knowledge.
DeepMind has released AlphaZero Teaching Go. It's a lot of fun!
From Paper
Pure RL has outperformed the supervised learning + RL agent
SL evaluation
Download trained model
- https://drive.google.com/drive/folders/1Xs8Ly3wjMmXjH2agrz25Zv2e5-yqQKaP?usp=sharing
- Place under ./savedmodels/large20/
Set up
Install requirements
Python 3.6 with tensorflow or tensorflow-gpu
pip install -r requirement.txt
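After installing, a quick check of the environment can save a failed training run. The snippet below is a minimal sketch, not part of this repo: `check_environment` is a hypothetical helper, and treating Python 3.6 as a minimum (rather than an exact version) is an assumption.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 6), packages=("tensorflow",)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            "Python %d.%d or newer is required, found %d.%d"
            % (min_python + tuple(sys.version_info[:2]))
        )
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        # Both tensorflow and tensorflow-gpu install the "tensorflow" package.
        if importlib.util.find_spec(name) is None:
            problems.append("Package %r is not installed" % name)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

If nothing is printed, the interpreter and TensorFlow were found.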
Download Dataset (KGS 4 dan)
From the repo's root directory:
cd data/download
chmod +x download.sh
./download.sh
Preprocess Data
This is only an example; feel free to point it at your local dataset directory.
python preprocess.py preprocess ./data/SGFs/kgs-*
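To give a feel for what the SGF files contain, here is a minimal sketch that pulls the move sequence out of one game record. It is not the code in preprocess.py; `parse_moves` and `MOVE_PATTERN` are hypothetical names, and the regular expression is a simplification that ignores variations, setup stones, and move-like text inside comments.

```python
import re

# Matches move properties such as ";B[pd]" or ";W[dp]".
# An empty "[]" is a pass; some older 19x19 files write a pass as "[tt]".
MOVE_PATTERN = re.compile(r";\s*([BW])\[([a-t]{2})?\]")


def parse_moves(sgf_text):
    """Return a list of (color, (column, row)) tuples; a pass has point None."""
    moves = []
    for color, coord in MOVE_PATTERN.findall(sgf_text):
        if coord and coord != "tt":
            # SGF letters map 'a'..'s' onto 0..18, column first, then row.
            # Off-board coordinates are not validated in this sketch.
            point = (ord(coord[0]) - ord("a"), ord(coord[1]) - ord("a"))
        else:
            point = None
        moves.append((color, point))
    return moves


if __name__ == "__main__":
    sample = "(;GM[1]SZ[19]KM[6.5];B[pd];W[dp];B[];W[qq])"
    for color, point in parse_moves(sample):
        print(color, point)
```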
Train A Model
python main.py --mode=train
Play Against An A.I.
python main.py --mode=gtp --gtp_policy=greedypolicy --model_path='./savedmodels/your_model.ckpt'
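In GTP mode a controller such as Sabaki sends text commands (for example `genmove b` or `play w D4`) and reads replies that start with `=` on success or `?` on failure and end with a blank line. The snippet below is a minimal sketch of that framing based on the GTP version 2 conventions; it is not this repo's engine code, and `parse_command` and `format_response` are hypothetical names.

```python
def parse_command(line):
    """Split a GTP command line into (id or None, name, argument list)."""
    # Everything after '#' is a comment in GTP.
    tokens = line.split("#", 1)[0].split()
    if not tokens:
        return None
    command_id = None
    if tokens[0].isdigit():
        # An optional numeric id is echoed back in the response.
        command_id = int(tokens[0])
        tokens = tokens[1:]
    if not tokens:
        return None
    return command_id, tokens[0], tokens[1:]


def format_response(command_id, result="", success=True):
    """Build a GTP reply: '=' or '?', optional id, result, then a blank line."""
    prefix = "=" if success else "?"
    id_text = "" if command_id is None else str(command_id)
    return ("%s%s %s" % (prefix, id_text, result)).rstrip() + "\n\n"


if __name__ == "__main__":
    print(parse_command("1 genmove b"))
    print(repr(format_response(1, "D4")))
```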
Play in Sabaki
- In console:
which python
Add the result to the first line of main.py with a #! prefix.
- Add the path of main.py to Sabaki's Manage Engines dialog, with the argument --mode=gtp
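The shebang step can also be scripted. The following is a minimal sketch, assuming `sys.executable` of the active environment is the interpreter that `which python` would report; `add_shebang` is a hypothetical helper, and setting the executable bit is an extra step not mentioned above.

```python
import os
import stat
import sys


def add_shebang(script_path, interpreter=sys.executable):
    """Ensure script_path starts with '#!<interpreter>' and is executable."""
    with open(script_path, "r", encoding="utf-8") as handle:
        lines = handle.readlines()
    shebang = "#!" + interpreter + "\n"
    if lines and lines[0].startswith("#!"):
        lines[0] = shebang  # replace an existing shebang instead of stacking
    else:
        lines.insert(0, shebang)
    with open(script_path, "w", encoding="utf-8") as handle:
        handle.writelines(lines)
    # Assumption: the GUI launches the script directly, so add owner-execute.
    mode = os.stat(script_path).st_mode
    os.chmod(script_path, mode | stat.S_IXUSR)


if __name__ == "__main__" and len(sys.argv) == 2:
    add_shebang(sys.argv[1])
```

Saved under any file name, it is run with the Python of the environment that has TensorFlow installed, passing the path to main.py as the only argument.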
TODO:
- AlphaGo Zero Architecture
- Supervised Training
- Self Play pipeline
- Go Text Protocol
- Sabaki Engine enabled
- Tabula rasa (failed)
- Distributed learning
Credits (in no particular order):
- Brain Lee
- Ritchie Ng
- Samuel Graván
- 森下 健
- yuanfengpang