The cmp_benchmarks.sh
provides a wrapper around google/benchmark/tools/compare.py
. It can be used to compare benchmarks between versions and to filter out significant differences in runtime.
The program is invoked like:
$ ./cmp_benchmarks.sh -m <mode> -n <name_tag> -b <baseline> -c <contender> \
[-s </source/code/path> [-r relative/benchm/suite/path] [-t threshold] [-o /results/path] [-g compiler]
- The arguments
<baseline>
and<contender>
are<mode>
dependend. <name_tag>
should be set to an identifier for the comparison, e.g.3.0.0-vs-3.0.1
.
There are three modes of operation:
- Compare benchmarks for two versions given their git tags: mode
build
. - Compare benchmarks given benchmark executables: mode
execs
. - Compare benchmarks based on stored benchmark results: mode
results
.
In build mode, <baseline>
and <contender>
should be the tags to use in git checkout <tag>
.
In this mode, setting the option -s
is required and should be set to /full/path/to/source/code
(git repository).
Build mode uses two additional, optional arguments: -g
and -r
.
-g
can be used to set the/path/to/executable/compiler
. Defaults to g++-7.-r
can be used to set therelative/path/from/-s/to/benchmark/suite
. Defaults to./test/performance
.
The respective tags are checked out (submodules are updated), build using Cmake
and make
, and run to obtain the results.
In execs mode, <baseline>
and <contender>
should be full/paths/to/built/benchmark/suites
(containing the executables).
Hence, the directories where the respective source_code/test/performance
were built.
The executables are run to obtain the results.
In results mode, <baseline>
and <contender>
should be full/paths/to/benchmark/results
.
These results should be JSON output files. The results will simply be loaded from the output file.
NOTE: This mode was not yet tested.
-f
can be used to set a threshold[0, 1]
for filtering significant results, e.g. 0.2 . Defaults to 0.05 (5% difference).-o
can be used to set the/full/path/to/save/results
. Defaults to the current directory.
compare.py
compares all identically named benchmarks in both runs. If the names differ between runs, the benchmark is omitted
from the diff. The diff values in Time and CPU are calculated as ( contender - baseline) / |baseline|
.
cmp_benchmark.sh
filters and stores all these diffs. Four types of files are created and all are
stored both as ;
-delimited .csv
file, and formatted as a readable table .txt
file:
- all_diffs, containing the diffs for all common benchmarks.
- significant_diffs, containing the benchmarks with a significant time diff.
all_diffs is filtered based on
-t <threshold>
which defaults to 0.05 (5% difference), see section Optional Arguments. - significant_decr, containing the benchmarks for which there was a significant decrease in time.
- significant_incr, containing the benchmarks for which there was a significant increase in time.
Example all_diffs.txt
output:
Benchmarkfile Benchmark Time CPU
bit_manipulation_benchmark is_power_of_two_popcount<unsigned> -0.0658 -0.0657
bit_manipulation_benchmark is_power_of_two_popcount<unsigned long> +0.0554 +0.0543
bit_manipulation_benchmark is_power_of_two_popcount<unsigned long long> +0.0047 +0.0040
bit_manipulation_benchmark is_power_of_two_arithmetic +0.0099 +0.0090
bit_manipulation_benchmark is_power_of_two_seqan3 -0.0186 -0.0186
bit_manipulation_benchmark next_power_of_two_seqan3 +0.0514 +0.0520
charconv_from_chars_benchmark from_char<int8_t> -0.0043 -0.0043
charconv_from_chars_benchmark from_char<uint8_t> -0.0135 -0.0130
...
The results are stored in the folder /result/path/<name_tag>
.
Additionally, for each benchmark file the full output of compare.py
is stored in the folder /result/path/<name_tag>/indiv_benchmrks
.
The cmp_benchmarks.sh
requires that:
- benchmarks are uniquely defined by their name
- benchmark executables do not have an extension
- benchmark names do not contain the character
;
. - python3 is in
PATH
as python
The compare.py
utility relies on the scipy package which can be installed using these instructions.
The cmp_benchmarks.sh
was tested on Ubuntu 18.04.3 LTS.