How to parse from PGN into .json for analysis of chess games
Closed this issue · 8 comments
Hello ebemunk,
I'm trying to build a visualization from your work for AI, but got stuck at the file wrc.json. I store the moves of each chess game in its own row in a database, so every row holds recorded moves like these:
1. d4 Nf6 2. c4 e6 3. Nf3 d5 4. g3 dxc4 5. Qa4+ Nc6 6. Bg2 Be7 7. O-O O-O 8. Qxc4 Bd6 9. Nc3 Bd7 10. Qb3 h6 11. Qxb7 Nb4 12. Ne5 Rb8 13. Qxa7 Nbd5 14. Nxd7 Qxd7 15. Qa6 c5 16. dxc5 Bxc5 17. Nxd5 Nxd5 18. e3 Nb4 19. Qe2 Nd3 20. Rd1 Rfd8 21. Bf3 e5 22. a3 e4 23. Bg4 Qa4 24. Rb1 Ne5 25. b3 Rxd1+ 26. Qxd1 Qa8 27. b4 Rd8 28. Qe2 Bb6 29. Bb2 Nd3 30. Bh3 Bc7 31. Rd1 Be5 32. Bxe5 Nxe5 33. Rxd8+ Qxd8 34. Bg2 Qd5 35. Qc2 Nf3+ 36. Bxf3 exf3 37. Qb1 Qb5 38. h3 h5 39. Qd1 Qe2 40. Qc1 h4 41. g4 Qa6 42. b5 Qa5 43. Qb1 Qb6 44. a4 g6 45. Qd1 Qa5 46. Qxf3 Qxa4 47. b6 Qd7 48. Qe4 Kh7 49. b7 Qc7 50. Qd5 f5 51. gxf5 Kh6 52. Qe4 Kg7 53. Qd4+ Kg8 54. Qd5+ Kg7 55. fxg6 Kh6 56. Qf7 Qc1+ 57. Kh2 Qb2 58. g7 Qxg7 59. Qf4+ Kh7 60. Qxh4+ Qh6 61. Qxh6+ Kxh6 62. b8=Q Kg6 63. Qf8 Kh7 64. Qf6 Kg8 65. h4 Kh7 66. h5 Kg8 67. h6 Kh7 68. Qg7#
How can I get the .json file for the visualisation by parsing from the database? And can you clarify how you get this wrc.json file from PGN? For me it is totally unclear.
The structure of wrc.json is that of a tree: specifically, it contains all the move variations seen in multiple games, so building this structure from 1 game doesn't make a lot of sense.
This is also called a hierarchy in d3 terms. The structure is:
{
san: 'the SAN notation of the current position',
count: 'number of times it appears in your games database',
children: [
{san: '...', count: '...', children: []}
]
}
for example a really shallow tree could be
{
san: 'start',
count: 10,
children: [
{san: 'e4', count: 5, children: [
{san: 'c5', count: 3, children: []},
{san: 'e5', count: 2, children: []},
]},
{san: 'd4', count: 5, children: [
{san: 'd5', count: 3, children: []},
{san: 'b6', count: 2, children: []},
]},
]
}
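To make the shape of this hierarchy concrete, here is a minimal sketch that walks a tree of this form and lists every line from the root down to a leaf, together with the leaf's count. The function name `leafLines` and the sample `tree` are made up for illustration; they are not part of chess-dataviz or pgnstats.

```javascript
// Hypothetical helper: collect every root-to-leaf line of a
// {san, count, children} tree. The root's own `san` is never added,
// because a move is appended only when descending into a child.
function leafLines(node, path = []) {
  if (!node.children || node.children.length === 0) {
    return [{ line: path.join(' '), count: node.count }];
  }
  return node.children.flatMap((child) =>
    leafLines(child, [...path, child.san])
  );
}

// Invented sample data in the same shape as the example above.
const tree = {
  san: 'start',
  count: 4,
  children: [
    { san: 'e4', count: 3, children: [
      { san: 'c5', count: 2, children: [] },
      { san: 'e5', count: 1, children: [] },
    ] },
    { san: 'd4', count: 1, children: [] },
  ],
};

console.log(leafLines(tree));
```

Each entry in the result is one opening line, so the counts of the children of a node should never add up to more than the count of the node itself.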
you might be able to use pgnstats to generate statistics similar to this. unfortunately i haven't focused on this project or pgnstats in a while so i wouldn't be able to support you, but the general idea is:
- parse the SAN notation for all games in your database
- construct an array for each game, that looks like
['d4', 'd4 Nf6', 'd4 Nf6 c4', 'd4 Nf6 c4 e6', ...]
- combine these into the hierarchy structure mentioned above
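The steps above can be sketched roughly as follows, assuming each database row holds plain movetext like the game quoted earlier. This is not how pgnstats does it: the sketch skips the intermediate prefix arrays and walks each game's moves directly, which should give the same tree. The names `sanMoves`, `buildTree` and the `maxDepth` cut-off are assumptions made for this example, and the tokenizer ignores variations in parentheses and NAGs.

```javascript
// Turn movetext such as "1. d4 Nf6 2. c4 e6 1-0" into ['d4', 'Nf6', 'c4', 'e6'].
function sanMoves(movetext) {
  return movetext
    .replace(/\{[^}]*\}/g, ' ')                // drop {comments}
    .split(/\s+/)
    .map((tok) => tok.replace(/^\d+\.+/, ''))  // "1." -> "", "1.d4" -> "d4"
    .filter((tok) => tok !== '' && !/^(1-0|0-1|1\/2-1\/2|\*)$/.test(tok));
}

// Merge many games into one {san, count, children} hierarchy.
// maxDepth (in plies) is a hypothetical limit to keep the tree small.
function buildTree(games, maxDepth = 10) {
  const root = { san: 'start', count: 0, children: [] };
  for (const movetext of games) {
    root.count += 1;
    let node = root;
    for (const san of sanMoves(movetext).slice(0, maxDepth)) {
      let child = node.children.find((c) => c.san === san);
      if (!child) {
        child = { san, count: 0, children: [] };
        node.children.push(child);
      }
      child.count += 1; // one more game passed through this move
      node = child;
    }
  }
  return root;
}

// Invented sample rows, standing in for the database contents.
const games = [
  '1. d4 Nf6 2. c4 e6',
  '1. d4 d5 2. c4 e6 1/2-1/2',
  '1. e4 c5 1-0',
];
const openings = buildTree(games, 3);
console.log(JSON.stringify({ openings }, null, 2));
```

Looking children up with `find` is linear in the number of siblings, which is fine for opening trees; for a very large database a `Map` keyed by SAN per node would be the obvious change.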
So the pgnstats.go script can build that tree in JSON, but the result will be similar to wrc.json rather than identical, and I will have to make some adjustments anyway? Is there a simple way to create that file from several .PGN files, or from the SAN notation in a database? I started learning Go to make the task simpler, but the task has become even harder :)
pgnstats will create almost exactly what you want (and more), but it will contain the key title instead of san. You can change title to san programmatically or manually with a text editor.
i ran an example one for you:
$ which go
/usr/local/bin/go
$ go version
go version go1.15.5 darwin/amd64
$ curl -O https://www.pgnmentor.com/events/WijkaanZee2020.pgn
# curl output
$ git clone git@github.com:ebemunk/pgnstats.git
# git output
$ cd pgnstats
$ go run . -f path/to/WijkaanZee2020.pgn -o stats.json
2020/12/23 08:52:38 starting
2020/12/23 08:52:38 analyzed 91 games
2020/12/23 08:52:38 done!
in stats.json you will see this under the Openings key. pgnstats generates a lot more stats, but if you look at Openings you will see it has the exact structure, except that the key is called title instead of san, which you can change either via a text editor, or programmatically by text replacement.
So in stats.json after replacement we have
"openings":{"size":91,"san":"","children":[{"size":14,"san":"c4","children":[{"size":6,"san":"Nf6","children":[{"size":6,"san":"Nc3","children":[{"size":2,"san":"e5","children":[{"size":2,"san":"e3","children":[{"size":2,"san":"Nc6","children":[{"size":2,"san":"Qb3","children":[{"size":2,"san":"g6","children":[{"size":1,"san":"Nf3","children":[]}]}]}]}]}]},{"size":4,"san":"e6","children":[{"size":4,"san":"e4","children":[{"size":2,"san":"c5","children":[{"size":2,"san":"e5","children"...
At the same time in wrc.json we have
{"openings":{"san":"start","children":[{"san":"c4","count":97,"children":[{"san":"Nf6","count":22,"children":[{"san":"Nc3","count":12,"children":[{"san":"e5","count":5,"children":[{"san":"Nf3","count":4,"children":[{"san":"Nc6","count":4,"children":[{"san":"e4","count":1,"children":[]},{"san":"e3","count":1,"children":[]},{"san":"a3","count":2,"children":[]}]}]},{"san":"g3","count":1,"children":[{"san":"Bb4","count":1,"children":[{"san":"Nf3","count":1,"children":[]}]}]}]},{"san":"g6","count":5,"children":[{"san":"e4","count":3,"children":...
I'm trying to highlight the difference between these two outputs.
Can we fix pgnstats.go so that the result fits the sunburst visualisation?
The zip contains 2 files (stats.json and wrc.json):
chess.zip
pgnstats.go has done almost all of the work we need :)
as I mentioned in the other thread, pgnstats isn't meant to generate perfect outputs for the library; they have separate concerns.
as I mentioned earlier, you will need to postprocess the output of pgnstats to fit the above structure that d3.hierarchy accepts. the "hardest" part is to put it into the tree structure, which pgnstats does. you will have to change the names of the keys (from size to count and from title to san).
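A minimal sketch of that postprocessing step, assuming the tree in stats.json sits under a top-level Openings key and each node has title, size and children, as described in this thread. The function name `toHierarchy` and the sample `stats` object are made up for illustration; the root `san` is set to 'start' only to match the wrc.json excerpt above.

```javascript
// Recursively rename pgnstats-style keys (title, size) to the
// keys the visualisation expects (san, count).
function toHierarchy(node) {
  return {
    san: node.title,
    count: node.size,
    children: (node.children || []).map(toHierarchy),
  };
}

// Invented stand-in for the parsed contents of stats.json.
const stats = {
  Openings: {
    size: 2,
    title: '',
    children: [
      { size: 2, title: 'c4', children: [
        { size: 1, title: 'Nf6', children: [] },
        { size: 1, title: 'e5', children: [] },
      ] },
    ],
  },
};

const wrcLike = { openings: toHierarchy(stats.Openings) };
wrcLike.openings.san = 'start'; // wrc.json labels the root 'start'
console.log(JSON.stringify(wrcLike));
```

With a real file, `stats` would come from something like `JSON.parse(fs.readFileSync('stats.json', 'utf8'))`, and the result could be written back out with `fs.writeFileSync`. Renaming on the parsed object avoids the risk of a plain text replacement touching other parts of stats.json that happen to contain the same words.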
you can also modify your local copy of pgnstats to create these keys by default by changing lines 5 and 6 in https://github.com/ebemunk/pgnstats/blob/master/openingmove.go#L5-L6
you might also want to look at https://github.com/ebemunk/chess-dataviz/tree/master/scripts which at some point generated output meant for this library, but I haven't used it in ages so can't really verify if they work anymore.
Thank you very much. We also have to change "Openings" to "openings". I've made the corrections inside the pgnstats files as you recommended. But I couldn't understand how those scripts work. As I understand it: 1) we need to use pgn-extract.exe, which transforms our .pgn file into a .pgn with a definite structure? 2) we need to run node stats.js test.pgn, but here I got the following error:
internal/modules/cjs/loader.js:883
throw err;
^
Error: Cannot find module 'debug'
Require stack:
- C:\Users\Shams\Desktop\ChessProject\chess-dataviz-1.0.0\scripts\stats.js
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:880:15)
at Function.Module._load (internal/modules/cjs/loader.js:725:27)
at Module.require (internal/modules/cjs/loader.js:952:19)
at require (internal/modules/cjs/helpers.js:88:18)
at Object.<anonymous> (C:\Users\Shams\Desktop\ChessProject\chess-dataviz-1.0.0\scripts\stats.js:6:15)
at Module._compile (internal/modules/cjs/loader.js:1063:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1092:10)
at Module.load (internal/modules/cjs/loader.js:928:32)
at Function.Module._load (internal/modules/cjs/loader.js:769:14)
at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12) {
code: 'MODULE_NOT_FOUND',
requireStack: [
'C:\\Users\\Shams\\Desktop\\ChessProject\\chess-dataviz-1.0.0\\scripts\\stats.js'
]
}
might need to npm i debug as the error mentions
i'm assuming this issue is fixed now, closing. let me know if you have further questions.