guidopetri/chess-pipeline

Add position FEN per move

Closed this issue · 3 comments

I just realized I'm not storing position FENs whenever a game is pulled. This should probably go in a column in the game_moves table. Is this easily doable without having to re-run the entire pipeline for all time periods?

What I'm thinking is re-running the pipeline with a new table - something like game_positions, holding every position as a FEN - and turning the game_evals table into a position_evals-style table. This will involve some work moving data around, but it should be doable, and that way we'll (hopefully...) stop re-evaluating identical positions. It also avoids re-running the pipeline just to add a position column to game_evals - we just have to be careful with the SQL.

After that, game_evals can be recreated as a view that is just a join of game_positions and position_evals.
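As a sketch of how the two tables and the view fit together, here's the schema from this plan run against an in-memory sqlite3 database (a stand-in - the real pipeline presumably targets Postgres, and the view's column list is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Columns for game_positions and position_evals are taken from the plan below;
# the game_evals view joins them on the FEN so evals are shared across games.
conn.executescript("""
CREATE TABLE game_positions (
    id        INTEGER PRIMARY KEY,
    game_link TEXT,
    half_move INTEGER,
    fen       TEXT
);
CREATE TABLE position_evals (
    id         INTEGER PRIMARY KEY,
    fen        TEXT UNIQUE,
    evaluation REAL,
    eval_depth INTEGER
);
CREATE VIEW game_evals AS
SELECT gp.game_link, gp.half_move, pe.evaluation, pe.eval_depth
FROM game_positions gp
JOIN position_evals pe ON pe.fen = gp.fen;
""")

# Two games reaching the same (placeholder) position share one evaluation row.
conn.execute("INSERT INTO game_positions (game_link, half_move, fen) VALUES ('game1', 1, 'FEN_A')")
conn.execute("INSERT INTO game_positions (game_link, half_move, fen) VALUES ('game2', 5, 'FEN_A')")
conn.execute("INSERT INTO position_evals (fen, evaluation, eval_depth) VALUES ('FEN_A', 0.3, 20)")

rows = conn.execute(
    "SELECT game_link, half_move, evaluation FROM game_evals ORDER BY game_link"
).fetchall()
# Both games pick up the single shared evaluation through the view.
```

The UNIQUE constraint on position_evals.fen is what enforces "one eval per position" and makes the dedupe pay off.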

Here's the game plan:

  • write a Task that writes to a new game_positions(id, game_link, half_move, fen) table
  • write a .sql file to create the above table
  • rename the game_evals table to something else temporarily
  • rerun the pipeline going all the way back, for several players, just to get the FENs
  • write a .sql to get the positions and game evals by game_link and half_move
  • write a .sql to create position_evals(id, fen, evaluation, eval_depth)
  • copy these positions/evals to position_evals
  • modify the StockfishVisitor and EvalsVisitor visit methods so that they work in conjunction with PositionVisitor to write to position_evals
  • write a Task that gets the new positions, gets the positions in position_evals, and runs Stockfish over the new positions, then copies them over to position_evals
  • write a .sql file to create a view for game_evals (with win probability?)
  • edit the setup .sql to create the right tables
  • edit the DAG to require the right tasks

I believe this is done. What a confusing issue.