Measure many replays in a row
From firebuild/firebuild#763:
Let's make it possible to recompile many times in a row and see how the build time and the cache size change.
I'd rather not build up something brand new for this, nor maintain two similar perftest things in parallel. This feature should be integrated into the current perftest.
The output CSV's current schema isn't suitable, as it has three hardwired sets of columns (vanilla, first, and second build times). We used to have three separate rows for these entries, with one column containing 0, 1 or 2 for the vanilla, first firebuild and second firebuild entries, respectively. We should revert to this old schema and allow this number to go higher. This would also allow skipping the vanilla build while still measuring the times of the firebuild builds. (Also, the .csv would be easier for humans to read.)
Let's also add another column that is a unique identifier allowing the grouping of these rows. It could be e.g. a random number, or the timestamp when the whole corresponding set of compilations began.
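To make the proposed layout concrete, here is a rough sketch of writing one row per build, with a sequence number and a shared run identifier. The column names (`run_id`, `project`, `build_seq`, `real_time_s`, `cache_size_bytes`) and the helper functions are made up for illustration; they are not the current perftest schema or code.

```python
import csv
import io
import time

# Hypothetical column names; the real perftest CSV may use different ones.
FIELDS = ["run_id", "project", "build_seq", "real_time_s", "cache_size_bytes"]


def make_rows(run_id, project, measurements):
    """Build one long-format row per measured build.

    measurements is a list of (build_seq, real_time_s, cache_size_bytes)
    tuples. build_seq follows the convention from this issue: 0 = vanilla,
    1 = first firebuild build, 2 = second firebuild build, and so on.
    The vanilla entry can simply be left out.
    """
    return [
        {
            "run_id": run_id,
            "project": project,
            "build_seq": seq,
            "real_time_s": real_time_s,
            "cache_size_bytes": cache_size,
        }
        for seq, real_time_s, cache_size in measurements
    ]


def write_csv(rows, out):
    """Write the rows, preceded by a header line, to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Use the start time of the whole set of compilations as the group id.
    run_id = int(time.time())
    # Made-up numbers: no vanilla build, three firebuild builds in a row.
    rows = make_rows(run_id, "example", [(1, 12.5, 1000), (2, 3.1, 1000), (3, 3.0, 1000)])
    buf = io.StringIO()
    write_csv(rows, buf)
    print(buf.getvalue(), end="")
```

With this shape, adding a 4th or 10th rebuild only adds rows, and a run without a vanilla build is just a group that has no `build_seq` 0 row.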
Not sure how much work it is to update the dashboard for the new schema. Updating the generated Grafana dashboard config is a true pain. Instead, we could do some SQL JOIN magic that converts the new buildtimes.csv to the current schema (or define such a VIEW), skipping entries where the rebuild sequence number is 3 or more, or where the vanilla build's row is missing.
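The JOIN idea could look roughly like the sketch below, tried out here against an in-memory SQLite database. The table and column names (`buildtimes`, `run_id`, `build_seq`, `real_time_s`, and the `*_time_s` output columns) are assumptions for illustration, and the SQL dialect used by the actual dashboard data source may need adjustments.

```python
import sqlite3

# Self-join the long-format table to rebuild one row per run with the
# vanilla, first and second build times side by side. The inner joins drop
# runs that lack a vanilla (or first/second) row, and rows with
# build_seq >= 3 are never selected.
VIEW_SQL = """
CREATE VIEW buildtimes_wide AS
SELECT v.run_id,
       v.project,
       v.real_time_s AS vanilla_time_s,
       f.real_time_s AS first_time_s,
       s.real_time_s AS second_time_s
FROM buildtimes AS v
JOIN buildtimes AS f
  ON f.run_id = v.run_id AND f.project = v.project AND f.build_seq = 1
JOIN buildtimes AS s
  ON s.run_id = v.run_id AND s.project = v.project AND s.build_seq = 2
WHERE v.build_seq = 0
"""


def build_demo_db(rows):
    """Create an in-memory database with a hypothetical long-format table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE buildtimes ("
        "run_id INTEGER, project TEXT, build_seq INTEGER, real_time_s REAL)"
    )
    conn.executemany("INSERT INTO buildtimes VALUES (?, ?, ?, ?)", rows)
    conn.execute(VIEW_SQL)
    return conn


if __name__ == "__main__":
    # Made-up data: run 1 has a vanilla build and three firebuild builds,
    # run 2 skipped the vanilla build and should not appear in the view.
    conn = build_demo_db([
        (1, "demo", 0, 10.0), (1, "demo", 1, 12.0),
        (1, "demo", 2, 3.0), (1, "demo", 3, 2.9),
        (2, "demo", 1, 11.0), (2, "demo", 2, 3.2),
    ])
    for row in conn.execute("SELECT * FROM buildtimes_wide"):
        print(row)
```

If the existing dashboard reads from such a view instead of the raw table, its queries could in principle stay as they are while the CSV moves to the row-per-build schema.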
And we'd need a new dashboard that graphs the data with this new dimension as the X axis (possibly working from another SQL VIEW if that's easier to implement).
I believe that adding new cache entries in the 2nd build is almost always due to a bug, for example when firebuild missed some input or stored an entry with random or time-based input. As a consequence, we need to understand why the cache grows in the 2nd build, but I don't think that checking the 3rd build brings much to the table.