r/place dataset parser

Parses the r/place datasets from 2017 and 2022 into a SQLite database.

Obtaining the datasets

2017

You can find the original reddit post here.

The full dataset is located here.

2022

You can find the original reddit post here.

The full dataset is located here.

Compiling

Compiling should be as easy as running:

cargo build --release

You can then run the executable like this:

./target/release/place

Running

Use the -d flag to specify the SQLite database to write to. When no database is specified, the program performs a dry run: it parses the entries but writes nothing.

You can pipe data to the program on standard input, or pass the files to read as arguments.
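As a rough illustration of the rules above, a command line of this shape could be interpreted as follows. This is a minimal sketch and not the project's actual code: the `Options` struct and `parse_args` function are hypothetical names, and it assumes that -d may appear anywhere among the file arguments.

```rust
/// Hypothetical representation of the parsed command line.
/// These names are illustrative and not taken from the project.
#[derive(Debug, PartialEq)]
struct Options {
    /// Database path given with -d; `None` means a dry run.
    database: Option<String>,
    /// Input files; an empty list means "read from standard input".
    inputs: Vec<String>,
}

/// Splits the arguments (without the program name) into the -d value
/// and the list of input files.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Options, String> {
    let mut database = None;
    let mut inputs = Vec::new();
    let mut iter = args.into_iter();

    while let Some(arg) = iter.next() {
        if arg == "-d" {
            // The flag must be followed by the database path.
            let path = iter
                .next()
                .ok_or_else(|| "-d requires a database path".to_string())?;
            database = Some(path);
        } else {
            // Anything else is treated as an input file.
            inputs.push(arg);
        }
    }

    Ok(Options { database, inputs })
}
```

In a `main` function this could be called as `parse_args(std::env::args().skip(1))`. The caller would then fall back to standard input when `inputs` is empty and skip all database writes when `database` is `None`.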

To parse the first 100k entries from the 2017 dataset:

head -n 100000 ./place_tiles.csv | ./target/release/place -d test.sqlite

To parse the full datasets from 2017 and 2022:

./target/release/place ./place_tiles.csv ./2022_place_canvas_history.csv -d test.sqlite
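Both invocations above feed the same kind of line-oriented input to the program, whether it arrives through a pipe or from files. One way to picture this is to treat every source as a buffered reader and consume the sources in order. The sketch below is an assumption about how such a loop might look, not the project's implementation. The `count_entries` function is a hypothetical name, and it only counts non-empty lines instead of parsing them, roughly as a dry run would touch every line without storing anything.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: reads every source in order and counts the
/// non-empty lines it sees. A real parser would decode each line here.
fn count_entries(readers: Vec<Box<dyn BufRead>>) -> io::Result<usize> {
    let mut total = 0;
    for reader in readers {
        for line in reader.lines() {
            // Propagate I/O errors, skip blank lines.
            if !line?.trim().is_empty() {
                total += 1;
            }
        }
    }
    Ok(total)
}
```

A caller could build the list from `std::io::stdin().lock()` when no files are given, or from one `BufReader::new(File::open(path)?)` per file argument otherwise. Note that this count would include any header line a file carries.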