shindavid/AlphaZeroArcade

Near-Uniform Random Openings


Implement the following idea from Appendix D of the KataGo paper:

In 5% of games, the game is branched after the first r turns where r is drawn from an
exponential distribution with mean 0.025 ∗ b^2. Between 3 and 10 moves are chosen uniformly
at random, each given a single neural net evaluation, and the best one is played. Komi is
adjusted to be fair. The game is then played to completion as normal. This ensures that
there is always a small percentage of games with highly unusual openings.
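
A minimal Python sketch of the quoted procedure, to make the moving parts concrete. It is not KataGo's code or anything that exists in this repo. The names `should_branch`, `sample_branch_turn`, `pick_branch_move`, and the `evaluate` callback are hypothetical. It assumes b is the board width, that r is rounded down to an integer, and that the number of candidate moves is itself uniform on 3..10 (the paper's wording leaves that open). Komi adjustment is deliberately left out.

```python
import random

BRANCH_PROB = 0.05  # "In 5% of games" from the quoted passage


def should_branch(rng):
    """Decide whether this game gets a random-opening branch."""
    return rng.random() < BRANCH_PROB


def sample_branch_turn(board_size, rng):
    """Draw r from an exponential distribution with mean 0.025 * b^2.

    Assumption: b is the board width and r is rounded down to an integer.
    """
    mean = 0.025 * board_size ** 2
    # random.expovariate takes the rate (1 / mean), not the mean.
    return int(rng.expovariate(1.0 / mean))


def pick_branch_move(legal_moves, evaluate, rng, min_candidates=3, max_candidates=10):
    """Sample between 3 and 10 distinct legal moves uniformly, return the best one.

    `evaluate(move)` is a hypothetical callback standing in for a single
    neural net evaluation of the position after `move`; higher is better
    for the player to move.
    """
    moves = list(legal_moves)
    if not moves:
        raise ValueError("no legal moves to branch on")
    # Assumption: the candidate count is uniform on [3, 10], capped by
    # the number of legal moves available.
    k = min(rng.randint(min_candidates, max_candidates), len(moves))
    candidates = rng.sample(moves, k)
    return max(candidates, key=evaluate)
```

In a self-play loop this would be used roughly as: if `should_branch(rng)`, play the first `sample_branch_turn(b, rng)` turns normally, play the move returned by `pick_branch_move`, then continue the game as usual. For b = 19 the mean of the exponential is 0.025 * 361 = 9.025 turns.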

Some thought is needed on how to generalize this to games other than Go. The komi adjustment in particular has no clear analog in other games, and there may be no good way to generalize it.

Note: it is likely that this comment from the KataGo paper applies to this idea:

Except for introducing a minimum necessary amount of entropy, the above settings very likely have
only a limited effect on overall learning efficiency and strength. They were used primarily so that
KataGo would have experience with alternate rules, komi values, handicap openings, and positions
where both sides have played highly suboptimally in ways that would never normally occur in
high-level play, making it more effective as a tool for human amateur game analysis.