anp/lolbench

Investigate best storage method for benchmark data

shssoichiro opened this issue · 1 comment

Presently, all of the benchmark data is stored in the https://github.com/anp/lolbench-data repository. New measurements are added as new JSON files. This repository is currently over 1 GB and, as such, takes a long time to clone. I also expect this storage format to slow down website generation as more metrics and more benchmarks get added.

I wonder if it would be worthwhile to revise how the data is stored. For instance, a NoSQL database like Cassandra (or a true document store such as MongoDB) would allow us to keep something close to the current JSON data structure, or, if we want to invest in more rework, we could convert the data to be relational and store it in Postgres.
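To make the relational option a bit more concrete, here is a rough sketch of what flattening one JSON measurement into tables could look like. Everything in it is an assumption for illustration: the JSON field names (`benchmark`, `toolchain`, `metrics`), the table and column names, and the helper functions are hypothetical and not taken from lolbench-data. It also uses SQLite from the Python standard library as a stand-in for Postgres so that it runs without a server.

```python
import json
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not the actual layout of the lolbench-data JSON files.
SCHEMA = """
CREATE TABLE IF NOT EXISTS benchmarks (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS measurements (
    benchmark_id INTEGER NOT NULL REFERENCES benchmarks(id),
    toolchain    TEXT NOT NULL,
    metric       TEXT NOT NULL,
    value        REAL NOT NULL,
    PRIMARY KEY (benchmark_id, toolchain, metric)
);
"""


def open_db(path=":memory:"):
    """Open a database (in-memory by default) and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def insert_measurement(conn, doc):
    """Flatten one JSON-style measurement document into relational rows."""
    # Register the benchmark once; later inserts reuse the same row.
    conn.execute(
        "INSERT OR IGNORE INTO benchmarks (name) VALUES (?)",
        (doc["benchmark"],),
    )
    (bench_id,) = conn.execute(
        "SELECT id FROM benchmarks WHERE name = ?", (doc["benchmark"],)
    ).fetchone()
    # One row per metric, keyed by benchmark, toolchain, and metric name.
    rows = [
        (bench_id, doc["toolchain"], metric, float(value))
        for metric, value in doc["metrics"].items()
    ]
    conn.executemany(
        "INSERT OR REPLACE INTO measurements VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


def metric_history(conn, benchmark, metric):
    """Return (toolchain, value) pairs for one metric, ordered by toolchain."""
    return conn.execute(
        """
        SELECT m.toolchain, m.value
        FROM measurements AS m
        JOIN benchmarks AS b ON b.id = m.benchmark_id
        WHERE b.name = ? AND m.metric = ?
        ORDER BY m.toolchain
        """,
        (benchmark, metric),
    ).fetchall()


if __name__ == "__main__":
    conn = open_db()
    # Made-up sample document, standing in for one JSON file.
    raw = (
        '{"benchmark": "example::bench", "toolchain": "nightly-2018-01-01",'
        ' "metrics": {"nanoseconds": 1200.0, "instructions": 3400}}'
    )
    insert_measurement(conn, json.loads(raw))
    print(metric_history(conn, "example::bench", "nanoseconds"))
```

The point of a layout like this is that the website generator could query one metric for one benchmark directly instead of reading every JSON file. Whether that is worth giving up the plain-file format is exactly the open question.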

Such a rework would also require rethinking how developers access the data for testing. There would need to be some way to clone an archive of the database. A Docker image containing the database, updated on a regular basis (weekly?) by CI, could be very easy to use.

anp commented

I very much agree about the current storage format's shortcomings. Personally, I'd prefer to keep using a version-control-friendly format of some kind; it's cheap, easy, and comes with free hosting :). I'll write a few more thoughts when I have some more time.