A collection of scripts to prepare and process data published by Backblaze, in order to generate failure rate curves over the age of a disk. The scripts needed for the final generation of plots are kept in the failure-analysis repo.
The steps described below assume the presence of directories 2013, 2014, etc. within the data directory, containing Hard Drive Test Data from Backblaze in csv format (i.e. in extracted form).
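Before running the steps, it may help to confirm that the expected layout is in place. The following is a minimal sketch, not part of this repo: the function name count_csv_by_year is hypothetical, and it assumes that the year directories have four-digit names and that the csv files may sit in nested subdirectories.

```python
from pathlib import Path


def count_csv_by_year(data_root):
    """Return {year_dir_name: number_of_csv_files} for year-named subdirectories."""
    root = Path(data_root)
    counts = {}
    for child in sorted(root.iterdir()):
        # Assumption: year directories have four-digit names such as 2013.
        if child.is_dir() and child.name.isdigit() and len(child.name) == 4:
            # rglob also finds csv files in nested subdirectories.
            counts[child.name] = sum(1 for _ in child.rglob("*.csv"))
    return counts


if __name__ == "__main__":
    data_dir = Path("data")
    if not data_dir.is_dir():
        print("no data directory found")
    else:
        for year, n in count_csv_by_year(data_dir).items():
            status = "ok" if n else "no csv files (archive not extracted?)"
            print(f"{year}: {n} csv files, {status}")
```

A year that reports zero csv files most likely still holds the downloaded archive rather than its extracted contents.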
- make db-init
  This will create a Postgres database named backblaze, process the csv files nested in the data directory and load them into the database.
- make plot-all
  Dependency chain: plot-all -> plot-metadata -> popular-models
  - popular-models queries the database for the 20 most popular disk models and creates a file that lists them.
  - plot-metadata processes data for each model listed in the popular-models file and generates the csv files required for plotting, as well as a plot-metadata file that lists all of these files.
  - plot-all uses the plot-metadata file to actually generate the plots.
- make Results.md
  Uses the plot-metadata file to create a markdown file that embeds all generated plots.
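The dependency chain given for plot-all determines the order in which the targets run. As an illustration only, the sketch below resolves that order with a depth-first walk. The DEPENDENCIES table and the build_order function are hypothetical names that mirror the chain stated above; they are not taken from the repo's Makefile, and the sketch ignores the fact that make also skips targets that are already up to date.

```python
# Dependency chain as stated above; each target maps to its prerequisites.
DEPENDENCIES = {
    "plot-all": ["plot-metadata"],
    "plot-metadata": ["popular-models"],
    "popular-models": [],
}


def build_order(target, deps=DEPENDENCIES):
    """Return targets in the order they must run, prerequisites first."""
    order = []
    visiting = set()

    def visit(name):
        if name in order:
            return  # already scheduled
        if name in visiting:
            raise ValueError(f"dependency cycle at {name}")
        visiting.add(name)
        for prerequisite in deps.get(name, []):
            visit(prerequisite)
        visiting.discard(name)
        order.append(name)

    visit(target)
    return order


if __name__ == "__main__":
    # prints "popular-models -> plot-metadata -> plot-all"
    print(" -> ".join(build_order("plot-all")))
```

Read this way, a single make plot-all is enough to run popular-models and plot-metadata first, so those two targets do not need to be invoked by hand.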