shindavid/AlphaZeroArcade

Modularize the Python code

Closed this issue · 2 comments

Mainly, the AlphaZeroManager, which has many responsibiliies (logging, training, cleaning, initializing state, etc)
Split those responsibilities accordingly to make code reuse and improvment easier.

My suggestions:

  • Make a separate util/tee_printing.py file, that has a TeePrinter class, and a free function init_tee_printing(filename: str). Calling that function will make it so that print() statements print both to stdout and to the passed-in filename. Then replace the init_logging() call in the run() method with a call to init_tee_printing().
  • Separate the file/directory organizer role (class AlphaZeroOrganizer?) from the role of managing the self-play cmds and the role of doing the model training. Change scripts like py/alphazero/compute_ratings.py that only need the organizer role to construct the organizer, not the manager.
  • Make separate classes for the training role and the self-play-process management role. The sensitive part is making a clear interface between these classes and the organizer.

All this has been completed. The alphazero loop logic is now well-separated into different servers/modules.