guidopetri/chess-pipeline

Win probabilities CSV has improper numbers

Closed this issue · 2 comments

The CSV for win probabilities uses improper numbers. E.g.:

-47.59999999997319,1.2584390499731036e-06,0.0

Postgres can't understand scientific notation, and it's probably also best to truncate the first number (which is the stockfish eval value) to 2 decimal places. Wherever scientific notation was used, Postgres is currently importing a NaN value (which also doesn't seem to be captured by is NULL). Obviously this breaks any possible analyses. Fixing this should be pretty easy: just change the .ipynb that generates the file to output it correctly.

It seems like psql did import the numbers correctly, but somehow they get selected improperly via Redash/psycopg2. The hotfix is:

update win_probabilities set eval = round(eval::numeric, 2);
update win_probabilities set probability_lr = round(probability_lr::numeric, 6);
update win_probabilities set probability_bayes = round(probability_bayes::numeric, 6);

Evals are rounded to 2 decimal places since that was the original list of values I used, probabilities were rounded to 6 decimal places so that they can be expressed as percentages with 4 decimal places.

Fixed in 13bb16b.