shindavid/AlphaZeroArcade

Virtual loss parameter exploration

In some implementations, the default virtual loss is larger than the +1 value that represents a game win. For example, the AlphaGo paper (Silver et al., 2016) used a virtual loss of 3. A larger virtual loss gives other threads a stronger incentive to explore different regions of the tree. This affects both evaluation throughput (threads should bottleneck less often at the same tree node) and search quality (for better or worse).
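
To make the effect concrete, here is a minimal sketch of how the virtual loss size can change which child a second thread picks. It is not this repo's code: `EdgeStats`, `effective_q`, and `select` are hypothetical names, values are assumed to lie in [-1, +1], and selection is greedy on Q alone (the prior and exploration terms of PUCT are left out to isolate the virtual loss effect). Each in-flight thread is counted as `virtual_loss` extra visits that all ended in a loss.

```python
from dataclasses import dataclass


@dataclass
class EdgeStats:
    """Hypothetical per-edge search statistics (not the repo's actual types)."""
    visits: int = 0            # completed visits N
    total_value: float = 0.0   # sum of backed-up values W, each in [-1, +1]
    in_flight: int = 0         # threads currently searching below this edge


def effective_q(edge: EdgeStats, virtual_loss: int) -> float:
    """Mean value, treating each in-flight thread as `virtual_loss` lost visits."""
    pending = virtual_loss * edge.in_flight
    n = edge.visits + pending
    if n == 0:
        return 0.0
    return (edge.total_value - pending) / n


def select(edges: list[EdgeStats], virtual_loss: int) -> int:
    """Index of the edge with the highest effective Q (exploration term omitted)."""
    return max(range(len(edges)), key=lambda i: effective_q(edges[i], virtual_loss))


# Edge a looks better (Q = 0.6) than edge b (Q = 0.4), but one thread is
# already searching below a.
a = EdgeStats(visits=10, total_value=6.0, in_flight=1)
b = EdgeStats(visits=10, total_value=4.0, in_flight=0)

choice_vl1 = select([a, b], virtual_loss=1)  # a: (6 - 1) / 11, still above 0.4
choice_vl3 = select([a, b], virtual_loss=3)  # a: (6 - 3) / 13, now below 0.4
```

In this toy setup a virtual loss of 1 still sends the second thread down edge `a`, while a virtual loss of 3 diverts it to edge `b`. That is the throughput versus search quality tradeoff in miniature.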

We should:

  1. See what other implementations use (KataGo, Leela).
  2. Experiment with different choices.