BenchExec

A Framework for Reliable Benchmarking and Resource Measurement

News:

An extended version of our paper on BenchExec and its background will be published in STTT. You can now read the preprint of Reliable Benchmarking: Requirements and Solutions. Compared to the previous version from SPIN'15, it also describes the container mode and how to present result data.
BenchExec 1.9 adds a container mode that isolates each run from the host system and from other runs (disabled by default for now, will become the default in BenchExec 2.0).

BenchExec provides three major features:

execution of arbitrary commands with precise and reliable measurement and limitation of resource usage (e.g., CPU time and memory), and isolation against other running processes
an easy way to define benchmarks with specific tool configurations and resource limits, and to execute them automatically on large sets of input files
generation of interactive tables and plots for the results

Unlike other benchmarking frameworks, it can reliably measure and limit the resource usage of the benchmarked tool even if the tool spawns subprocesses. To achieve this, it uses the cgroups feature of the Linux kernel to correctly handle groups of processes. For proper isolation of the benchmarks, it uses (if available) Linux user namespaces and an overlay filesystem to create a container that restricts interference of the executed tool with the benchmarking host.

BenchExec is intended for benchmarking non-interactive tools on Linux systems. It measures CPU time, wall time, and memory usage of a tool, and allows specifying limits for these resources. It also allows limiting the CPU cores and (on NUMA systems) memory regions, and the container mode allows restricting filesystem and network access. In addition to measuring resource usage, BenchExec can verify that the result of the tool was as expected, and extract further statistical data from the output. Results from multiple runs can be combined into CSV and interactive HTML tables, of which the latter provide scatter and quantile plots (have a look at our demo table).
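To illustrate why subprocess handling matters, here is a minimal sketch of a naive measurement that uses only the Python standard library. It is not BenchExec code and does not use cgroups; the function name `measure_command` and the result keys are hypothetical names chosen for this example. The CPU time it reports covers only child processes that were waited for, so a tool that detaches helper processes can escape the measurement. This is the kind of gap that a cgroups-based approach is meant to close.

```python
import resource
import subprocess
import sys
import time


def measure_command(args):
    """Run a command and return its exit code, wall time, and CPU time.

    Naive illustration only (hypothetical helper, not part of BenchExec):
    RUSAGE_CHILDREN only accounts for children that have terminated and
    been waited for, so detached descendants are not counted.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    completed = subprocess.run(
        args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    wall_time = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU time is user time plus system time consumed by waited-for children.
    cpu_time = (after.ru_utime - before.ru_utime) + (
        after.ru_stime - before.ru_stime
    )
    return {
        "returncode": completed.returncode,
        "wall_time_s": wall_time,
        "cpu_time_s": cpu_time,
    }


if __name__ == "__main__":
    # Measure a trivial command: the current Python interpreter doing nothing.
    print(measure_command([sys.executable, "-c", "pass"]))
```

This sketch also enforces no limits and provides no isolation; it only shows which quantities are being measured.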

BenchExec works only on Linux and needs a one-time setup of cgroups by the machine's administrator. The actual benchmarking can be done by any user and does not need root access.

BenchExec was originally developed for use with the software verification framework CPAchecker and is now developed as an independent project at the Software Systems Lab at the Ludwig-Maximilians-Universität München (LMU).

Authors

Maintainer: Philipp Wendler

Contributors:

Dirk Beyer
Montgomery Carter
Andreas Donig
Karlheinz Friedberger
Peter Häring
George Karpenkov
Mike Kazantsev
Thomas Lemberger
Sebastian Ott
Stefan Löwe
Stephan Lukasczyk
Alexander von Rhein
Alexander Schremmer
Andreas Stahlbauer
Thomas Stieglmaier
and many more people who integrated tools into BenchExec

Users of BenchExec

BenchExec was successfully used for benchmarking in all six instances of the International Competition on Software Verification (2012-2017) with a wide variety of benchmarked tools and hundreds of thousands of benchmark runs.

The developers of the following tools use BenchExec:

CPAchecker, also for regression testing
SMACK

If you would like to be listed here, contact us.
