OCaml - Continuous benchmarks
Prototype for running predictable, IO-bound benchmarks in an ocurrent pipeline. This is work in progress. If you want to be on the allowlist for running benchmarks for your repository, please contact @gs0510, or you can open an issue.
Enroll your project
If you want to enroll your repository or setup this benchmark repository for your repository, we make the following assumptions.
- There is a
make bench
target which can run the benchmarks. - The benchmarks result is a JSON object of the following format:
{
"name": <optional-name-of-the-benchmark>,
"config": <optional-config-object>,
"results": [
{
"name": <name-of-the-test>,
"metrics": {
"<metric-1>": <numeric-value>,
"<metric-2>": [<numeric-value>, ...],
...
},
...
}
]
}
Here's an example from index with regards to what the format looks like.
The metadata about repo
, branch
and commit
is added by the pipeline.
Multiple benchmarks per project
Multiple concatenated JSON objects can be produced and will be interpreted as different benchmarks. The name of the benchmark is optional when there is only one output, but must be present if multiple result objects are produced.
Data dependencies in your project
If you have a data dependency, then currently we add the dependency to the docker volume called current-bench-data
.
The dependency lives in <org_name>/<repo_name>
folder so you can assume the depdency to live in current-bench-data/<org_name>/<repo_name>
folder.
Tuning the environment
See general instructions in ocaml-bench-scripts for configuring the benchmarking hardware. In particular, you need an isolated CPU to run the benchmarks on.
Use the —docker-cpu
parameter to pin the benchmark to a single CPU. This will pass the —cpuset-cpus
parameter to Docker behind the scenes to run the container on a single core.
The main difference from the scripts hosted in ocaml-bench-scripts and this ocurrent pipeline is that the tasks will be executed inside docker containers. This requires a few more adjustments to how the containers are launched. Most of this is handled automatically by the pipeline by passing parameters to Docker. Some additional details are documented below.
IO performance
The results of IO bound benchmarks can vary greatly between different device/storage types and how they are configured. For this prototype we’re aiming for predictable results so we are using an in-memory tmpfs
partition in /dev/shm
for all storage.
The —docker-shm-size
parameter can be passed to the pipeline to adjust the size of the tmpfs
partition. The default is 4G.
tmpfs
partitions are similar to ramfs
partitions in that the content will be stored entirely in internal kernel cache, but they have a size limitation and may trigger swapping. It is therefore important to make sure that the system is configured in such a way that swapping doesn’t occur while the benchmark is running. For more details about tmpfs
/ramfs
see https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt.
NUMA considerations
If running on a system with NUMA enabled the tmpfs
file system should be allocated in a memory area that is local to the core running the benchmark. Otherwise the kernel could allocate this in different areas over time and affect the IO performance results. To avoid this issue, the tmpfs
volume can be created with a specific memory allocation policy.
The pipeline provides a —docker-numa-node
command line parameter that forces the tmpfs
volume in /dev/shm
to be allocated from a specific NUMA node. lscpu
shows which NUMA nodes are local to each core.
NOTE: Although it should be possible to get good results on a NUMA enabled system, we do not plan to use this in production and have limited experience with it. The main reason is that the system wide optimisations required would likely reduce performance for general tasks, while the benchmark itself only runs on a single core. This makes it more suitable to run on a dedicated, smaller server, which typically has less memory and doesn’t require NUMA.
ASLR
ASLR affects performance as the memory layout is changed each time the benchmark is loaded. The ocurrent pipeline disables ASLR inside the container automatically by wrapping the benchmark command in a call to setarch [...] --addr-no-randomize
. This is normally blocked by the default Docker seccomp profile, so we have modified the profile to allow personality(2)
to be invoked with the ADDR_NO_RANDOMIZE
flag.