richemslie/galvanise_zero

Stuck training in server.py

zhaoqxu-eth opened this issue · 3 comments

I have installed all of the prerequisites. However, when I run python server.py hex.conf to train models, it gets stuck after the following two lines:

2019-05-18 20:47:53,750592 [INFO    ]  checking if generation data available
2019-05-18 20:47:53,750715 [INFO    ]  Not such file for generation: [Errno 2] No such file or directory: u'/root/ggpzero/galvanise_zero/data/hex/d2/gendata_hex_0.json.gz'

Could you please give me any guidance?

That's OK. It just means no self-play data exists yet for that generation. Once training has run for a while, the file will be saved in that location. If you restart the server, it will pick up from the last point it was saved.
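To see whether any generation data has been written yet, a quick check along these lines may help. This is only a minimal sketch: the filename pattern gendata_<game>_<n>.json.gz is inferred from the path in the log message above, and list_generation_files is a hypothetical helper, not part of galvanise_zero.

```python
import glob
import os
import re


def list_generation_files(data_dir, game):
    """Return sorted (generation, path) pairs for gendata files in data_dir.

    Assumes files are named gendata_<game>_<n>.json.gz, as in the log above.
    """
    pattern = os.path.join(data_dir, "gendata_%s_*.json.gz" % game)
    name_re = re.compile(r"gendata_%s_(\d+)\.json\.gz$" % re.escape(game))
    found = []
    for path in glob.glob(pattern):
        match = name_re.search(os.path.basename(path))
        if match:
            # Sort numerically so that generation 10 comes after generation 2.
            found.append((int(match.group(1)), path))
    return sorted(found)
```

Calling it with the directory from the log (the one ending in data/hex/d2) and the game name hex should return an empty list until the first generation has been saved.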

I don't know if something is wrong with my installation, but no error is displayed and it just keeps printing the same line every ten minutes. It seems that nothing is training and no self-play data is being stored.

2019-05-19 09:11:47,561940 [VERBOSE ]  entering checkpoint with 0 sample accumulated
2019-05-19 09:21:47,661234 [VERBOSE ]  entering checkpoint with 0 sample accumulated
2019-05-19 09:31:47,760546 [VERBOSE ]  entering checkpoint with 0 sample accumulated
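For anyone who wants to check a longer log for the same symptom, here is a rough sketch that pulls the checkpoint lines out and reports whether every one of them shows 0 samples. The regular expression is based only on the three lines quoted above, and parse_checkpoints and is_stalled are hypothetical names, not anything provided by the server.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2019-05-19 09:11:47,561940 [VERBOSE ]  entering checkpoint with 0 sample accumulated
CHECKPOINT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ \[\w+\s*\]\s+"
    r"entering checkpoint with (\d+) sample"
)


def parse_checkpoints(lines):
    """Return (timestamp, sample_count) for each checkpoint line found."""
    checkpoints = []
    for line in lines:
        match = CHECKPOINT_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            checkpoints.append((stamp, int(match.group(2))))
    return checkpoints


def is_stalled(lines):
    """True if at least one checkpoint was logged and all of them had 0 samples."""
    checkpoints = parse_checkpoints(lines)
    return bool(checkpoints) and all(count == 0 for _, count in checkpoints)
```

This only confirms that no samples are arriving; it does not say why.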

Hi. Bear with me, I am trying to write some docs to get you going.

In the meantime I pushed a revamped test in src/test/player/test_player.py.

The first thing to do after installing should be to run these tests. (And I can use the below for docs!)

First you'll need to check out the gzero_data repo. The test uses breakthroughSmall, so copy its rulesheet into ggplib/data/rulesheets, then check that it loads by running
python perftest.py breakthroughSmall in ggplib/src/ggplib/scripts

Then copy the breakthroughSmall directory from gzero_data into the data directory of the galvanise_zero repo.
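The two copy steps can also be scripted. The following is a sketch under assumptions: install_game_data is a hypothetical helper, and all four paths are supplied by the caller because the thread does not say where the rulesheet lives inside gzero_data or where the repos are checked out.

```python
import os
import shutil


def install_game_data(rulesheet_path, game_data_dir, ggplib_root, gzero_root):
    """Copy a rulesheet into ggplib and a game's data directory into galvanise_zero.

    Returns the two destination paths. All arguments are caller-supplied paths.
    """
    # Step 1: rulesheet goes into <ggplib>/data/rulesheets.
    rulesheets_dir = os.path.join(ggplib_root, "data", "rulesheets")
    if not os.path.isdir(rulesheets_dir):
        os.makedirs(rulesheets_dir)
    dest_rulesheet = os.path.join(rulesheets_dir, os.path.basename(rulesheet_path))
    shutil.copy(rulesheet_path, dest_rulesheet)

    # Step 2: game directory goes into <galvanise_zero>/data/<game>.
    game_name = os.path.basename(os.path.normpath(game_data_dir))
    dest_data = os.path.join(gzero_root, "data", game_name)
    if not os.path.isdir(dest_data):
        # copytree creates the destination (and missing parents) itself.
        shutil.copytree(game_data_dir, dest_data)
    return dest_rulesheet, dest_data
```

An existing destination data directory is left untouched rather than overwritten, so re-running the helper should be harmless.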

Then you can run the 3 tests in src/test/player/test_player.py:

$ py.test test_player.py -s -k test_random
This will test a random neural network against a simplemcts player. It will lose!

$ py.test test_player.py -s -k test_trained
This will test a trained neural network against a simplemcts player. The model is very strong. It will win easily.

$ py.test test_player.py -s -k test_puct_v2

This will test two gzero players against each other. Both use a reasonably strong network, but one uses the puct1 player and the other the puct2 player. puct1 is what is used for self-play, while puct2 is used for match mode and has many more (experimental) features, not least batching on the GPU, so it is much faster.

Note, I aim to merge puct1/puct2 at some point.

Let me know how that goes please, and I will work on some decent instructions for training.
Feel free to play around with options etc. :)
PS a number of unit tests have rotted over time. I will also aim to fix those.