shindavid/AlphaZeroArcade

LCB method

Closed this issue · 1 comment

In an email to me, David Wu wrote:

...you already can do much better than just selecting purely based on visits by incorporating Q in a conservative way, such as via Leela Zero's LCB method (which KataGo also uses). Also LCB helps a lot at low visits too.

I briefly researched this LCB method and found some leela-zero discussions:

leela-zero/leela-zero#860
leela-zero/leela-zero#883

Read through those links and implement the LCB method.
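
For reference, a minimal sketch of the general idea as I understand it from those threads: instead of picking the most-visited child, pick the child with the highest lower confidence bound (LCB) on Q, considering only children with enough visits. This is not the code from this repo or from Leela Zero. The names `ChildStats`, `lower_confidence_bound`, and `select_move_lcb` are hypothetical, and the fixed `z` multiplier and the `min_visit_ratio` threshold are illustrative assumptions (a Student's t quantile that depends on the visit count could be used instead of a constant `z`).

```python
import math
from dataclasses import dataclass


@dataclass
class ChildStats:
    """Per-child search statistics (hypothetical container for illustration)."""
    move: str
    visits: int
    q_mean: float  # mean backed-up value from the mover's perspective
    q_var: float   # sample variance of the backed-up values


def lower_confidence_bound(child: ChildStats, z: float = 1.96) -> float:
    """Return q_mean minus z standard errors; -inf if variance is undefined."""
    if child.visits < 2:
        return -math.inf
    std_err = math.sqrt(child.q_var / child.visits)
    return child.q_mean - z * std_err


def select_move_lcb(children, z: float = 1.96, min_visit_ratio: float = 0.1):
    """Pick the child with the highest LCB among sufficiently visited children.

    Falls back to the most-visited child when no child is eligible.
    """
    if not children:
        raise ValueError("no children to select from")
    max_visits = max(c.visits for c in children)
    eligible = [
        c for c in children
        if c.visits >= 2 and c.visits >= min_visit_ratio * max_visits
    ]
    if not eligible:
        return max(children, key=lambda c: c.visits)
    # Break LCB ties in favor of the more-visited child.
    return max(eligible, key=lambda c: (lower_confidence_bound(c, z), c.visits))


if __name__ == "__main__":
    children = [
        ChildStats("a", visits=100, q_mean=0.55, q_var=0.04),
        ChildStats("b", visits=80, q_mean=0.60, q_var=0.04),
        ChildStats("c", visits=5, q_mean=0.90, q_var=0.05),
    ]
    # Visit-based selection would pick "a"; "c" has too few visits to be trusted.
    print(select_move_lcb(children).move)
```

In this made-up example, "b" has fewer visits than "a" but a higher Q with a similar uncertainty, so its LCB is higher and it gets selected. "c" has the highest Q but falls below the visit threshold, which is the conservative part of the method.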

Implemented: a317c7d

Noticeably improves the overall progress graph for both c4 and othello.