r/place dataset parser
Parses the r/place datasets from 2017 and 2022 into a SQLite database.
Obtaining the datasets
2017
You can find the original reddit post here.
The full dataset is located here.
2022
You can find the original reddit post here.
The full dataset is located here.
Compiling
Compiling should be as easy as running:
cargo build --release
You can then run the executable like this:
./target/release/place
Running
The program can be invoked with the -d flag to specify the database to write to. When no database is specified, the program runs in dry mode: it parses the entries but does not store them.
You can pipe data to the program, or specify files to read as arguments.
To parse the first 100k entries from the 2017 dataset:
head -n 100000 ./place_tiles.csv | ./target/release/place -d test.sqlite
To parse the full datasets from 2017 and 2022:
./target/release/place ./place_tiles.csv ./2022_place_canvas_history.csv -d test.sqlite
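For readers curious how the invocation rules above fit together (an optional -d, input from files or from a pipe, dry mode when no database is given), here is a minimal sketch using only the Rust standard library. It is not the project's actual code: `Config`, `parse_args` and `count_lines` are hypothetical names, and the sketch only counts lines instead of parsing entries or writing to SQLite.

```rust
use std::env;
use std::fs::File;
use std::io::{self, BufRead, BufReader};

/// Hypothetical settings struct; the names are illustrative, not the project's.
#[derive(Debug, PartialEq)]
struct Config {
    /// Path given with -d; None means dry mode.
    database: Option<String>,
    /// Input files; empty means read from standard input.
    inputs: Vec<String>,
}

/// Splits arguments into the -d value and the list of input files.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Config, String> {
    let mut database = None;
    let mut inputs = Vec::new();
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        if arg == "-d" {
            // The argument right after -d is taken as the database path.
            let path = iter.next().ok_or_else(|| "-d requires a path".to_string())?;
            database = Some(path);
        } else {
            inputs.push(arg);
        }
    }
    Ok(Config { database, inputs })
}

/// Stand-in for real parsing: counts the lines in one input.
fn count_lines<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut n = 0;
    for line in reader.lines() {
        line?; // propagate read errors
        n += 1;
    }
    Ok(n)
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = parse_args(env::args().skip(1))?;
    let mut total = 0;
    if config.inputs.is_empty() {
        // No files given: read whatever was piped in.
        total += count_lines(io::stdin().lock())?;
    } else {
        for path in &config.inputs {
            total += count_lines(BufReader::new(File::open(path)?))?;
        }
    }
    match &config.database {
        Some(db) => println!("read {total} lines; a real run would write them to {db}"),
        None => println!("dry mode: read {total} lines, nothing written"),
    }
    Ok(())
}
```

In this sketch -d may appear before or after the file names, matching the example above where it follows the input files.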