lichess-org/fishnet

Don't clear the hash after every move

dav1312 opened this issue · 9 comments

fishnet/src/stockfish.rs

Lines 244 to 245 in 05dd4b7

// Clear hash.
stdin.write_all(b"ucinewgame\n").await?;

  • Much better analysis, since the engine won't be surprised by a move (e.g. https://lichess.org/p09l31JG#61)
  • Stockfish is deterministic even without clearing the hash, so it would still be possible to check whether an analysis is correct if necessary (#233)

Stockfish is deterministic even without clearing the hash

Not with fishnet's concurrency model, where all cores are used to analyse positions of a single game in parallel.

[screenshot from the Stockfish 12 announcement blog post]

-- https://lichess.org/blog/X9uXyxUAANCqN1OF/stockfish-12-on-lichess

It does leave a lot of strength on the table, so doing something more advanced could be a nice improvement. Perhaps games can be sliced into groups of positions based on some heuristic.

all cores are used to analyse positions of a single game in parallel.

Per game that is much faster, yeah 🤔

Perhaps the increase in strength could compensate for the longer wait times by using fewer nodes per move?
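For reference, the per-move node budget is just a parameter of the UCI go command, so trading strength for speed would be a one-line change. A minimal sketch in the style of the stdin writes in stockfish.rs; the concrete number is a placeholder, not fishnet's actual budget:

// Hypothetical example: lower the per-position node budget so the
// slower (but stronger) hash-reusing analysis finishes in comparable
// wall-clock time. 1_500_000 is a placeholder value.
stdin.write_all(b"go nodes 1500000\n").await?;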

Using lichess-org/lila#8862 would also increase the analysis quality without sacrificing analysis speed (it may in fact increase the analysis speed, since only-moves can be skipped entirely). Maybe store both the analysis eval and the consistent eval so it's possible to check in the future?

Not with fishnet's concurrency model, where all cores are used to analyse positions of a single game in parallel.

There is only one upside to this kind of parallel analysis, and that's to deliver a full analysis more quickly after the "request analysis" button is clicked or one of the automatic analysis triggers fires (titled players, etc.).

At equal analysis quality, it wastes resources (because with pure backward analysis reusing the hash, you get better quality for the same effort). That is, more games could be analysed at the same quality with the same resources.

At equal resources spent per game, it loses analysis quality.

So I guess the key question is: how much time would a non-parallel analysis take compared to the current model (at the same depth for higher quality, or at a lower depth for the same quality), and what are acceptable wait times for an analysis?

This might not be possible if the results need to be deterministic (although it seems the old v1 fishnet wasn't), but otherwise threads=1 jobs are not the only possibility. SMP scaling from 1 thread to 2 threads is excellent (>1.9x effective speed-up), so running 2 threads on 1 job speeds up how quickly a single game is fully analysed.
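The thread count is a standard UCI option, so a 2-threads-per-job variant would only change the engine setup. A minimal sketch in the style of the stdin writes in stockfish.rs; whether fishnet would expose it exactly like this is an assumption:

// Hypothetical sketch: run two search threads per job instead of one,
// accepting non-determinism in exchange for ~1.9x effective speed on
// a single game. "Threads" is the standard UCI option name.
stdin.write_all(b"setoption name Threads value 2\n").await?;
stdin.write_all(b"isready\n").await?;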

Using lichess-org/lila#8862 would also increase the analysis quality without sacrificing analysis speed (it may in fact increase the analysis speed, since only-moves can be skipped entirely). Maybe store both the analysis eval and the consistent eval so it's possible to check in the future?

Yes, in theory it should benefit both analysis quality (more consistent evals, and less engine time spent searching lines that cannot change the final analysis) and analysis speed (by skipping only-moves entirely).
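A minimal sketch of the only-move skip, assuming the shakmaty crate and a side-to-move (negamax) sign convention; skip_only_move and its backward-analysis caller are hypothetical, not lila's or fishnet's actual API:

use shakmaty::{Chess, Position};

/// Hypothetical helper for backward analysis: if the position has
/// exactly one legal move, the search cannot tell us anything new, so
/// return the already-computed eval of the position after that forced
/// move (negated for the side-to-move convention) instead of running
/// the engine at all.
fn skip_only_move(pos: &Chess, next_ply_eval: i32) -> Option<i32> {
    if pos.legal_moves().len() == 1 {
        Some(-next_ply_eval)
    } else {
        None // several legal moves: run the normal engine search
    }
}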

Some of the analysis quality gain could be traded for more analysis speed, either through a lower depth in general or through a more complex "selective lower depth". For the latter, the idea is simple but requires additional code.

Say the target search depth is N. The position is searched while excluding the move played in the game. If at depth N-1/N-2/N-3 the evaluation of the best non-excluded move is worse than that of the played move by threshold t1/t2/t3, abort the search and (using the consistency rules) carry over the evaluation from the next ply. This would be a dramatic search speed-up when there is a forced tactical move (usually a recapture) that has correctly been played in the game. Even in low-quality games this happens several times per game, so this should be a substantial speed-up at almost no quality cost.

Optimised t1/t2/t3 would require analysing a large dataset comparing evaluations at depth N-X with evaluations at depth N (how often each threshold value would lead to mistakes, and how sizable they would be). However, naïve conservative thresholds would be very safe and would still speed up all the usual "knight must be recaptured" moves.
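A minimal sketch of that abort rule; the function and the centipawn thresholds are illustrative placeholders, not tuned or implemented values:

/// Hypothetical early-abort check for the selective-lower-depth idea:
/// `target` is N, `depth` the depth just completed, `played_eval` the
/// known eval of the game move and `best_excluded_eval` the eval of
/// the best alternative found so far. Returns true when the game move
/// is ahead by at least the threshold for this depth, so the search
/// can stop and the eval be carried over from the next ply via the
/// consistency rules.
fn should_abort(depth: u32, target: u32, played_eval: i32, best_excluded_eval: i32) -> bool {
    let threshold = match target.saturating_sub(depth) {
        1 => 50,  // t1 (placeholder)
        2 => 100, // t2 (placeholder)
        3 => 150, // t3 (placeholder)
        _ => return false, // rule only applies at depths N-1..N-3
    };
    played_eval - best_excluded_eval >= threshold
}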

As an example of why not clearing the hash is better, consider the recent WCC game: https://lichess.org/broadcast/fide-world-chess-championship-2023/round-6/Teae9PdP/DYYUwnnc

The analysis misses key mistakes by the players that would be highlighted if the hash wasn't cleared (which also affects the ACPL and accuracy).

These analyses (studies/broadcasts) don't need to be fast; they need to be accurate.

This issue was mentioned in the responses to a Lichess blog post by Chess-Network, where he notes that Stockfish doesn't see Kasparov's 27.h5 in this game: https://lichess.org/forum/community-blog-discussions/ublog-w8sVLwqU. It seems this hash-clearing approach is the cause of the bad server analysis; running Stockfish in the browser analysis for the study finds the correct move instantly, as several commenters noted.

To make an informed decision, I ran some experiments: https://backscattering.de/chess/backwards-analysis/.

The tl;dr of it is that

  • fishnet 2.7.0 leaves a lot of quality on the table.
    [...]
  • At equal resources used, chunking, even with overlap, can close the gap between fishnet 2.7.0 and sequential backwards-analysis [without sacrificing determinism].
bftjoe commented

Why was determinism suddenly a requirement when this was ported to Rust? It would probably be better to just go back to the Python version, with its better analysis quality, if no one is going to fix the Rust version. I have never heard of anyone using a chess engine by clearing the hash between every move.

Since there was some discussion about this in Discord: in playing games, clearing the hash (ucinewgame, actually) between moves has a very large effect on strength:

Score of master vs newgame: 55 - 0 - 40  [0.789] 95
Elo difference: 229.6 +/- 52.6, LOS: 100.0 %, DrawRatio: 42.1 %

Which is easily the equivalent of 6x time odds or more.
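For reference, the Elo figure follows from the score via the usual logistic model (assuming the standard cutechess-cli reporting): the score is s = (55 + 40/2)/95 ≈ 0.7895, so

Elo = -400 * log10(1/s - 1) = -400 * log10(0.2667) ≈ 229.6

which matches the reported difference.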

fishnet v2.8.1 now analyses positions in overlapping chunks. It keeps determinism and parallelism, at the cost of some nodes being wasted on the overlap. Possible parallelism is of course limited for short games, but those are short anyway.
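A minimal sketch of what the overlapping chunking might look like; this is an illustration under assumed semantics, not the actual fishnet v2.8.1 code, and chunk size and overlap are placeholders:

/// Hypothetical chunking: split `num_positions` game positions into
/// ranges of `chunk_size`, each overlapping the previous one by
/// `overlap` positions. Each chunk is analysed backwards with a warm
/// hash; the overlapped positions are searched only to seed the hash,
/// so every reported eval stays deterministic while the chunks still
/// run in parallel.
fn overlapping_chunks(num_positions: usize, chunk_size: usize, overlap: usize) -> Vec<std::ops::Range<usize>> {
    assert!(chunk_size > overlap);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < num_positions {
        let end = (start + chunk_size).min(num_positions);
        chunks.push(start..end);
        if end == num_positions {
            break;
        }
        start += chunk_size - overlap;
    }
    chunks
}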