/lock_benchmarking

Benchmarks for locking algorithms as well as implementations of locking algorithms.

Primary LanguageC++

Locks and Benchmarks

This repository contains lock implementations, delegation lock implementations and a set of benchmarks. The code in this repository is mainly intended for evaluation of different locking algorithms. If you are looking for delegation locking implementations to use in your application, we recommend our locking libraries for C and C++. The code in this repository was originally written for the micro-benchmark parts of the evaluation of locking algorithms in the paper Queue Delegation Locking. The code needed to reproduce all the results from the Queue Delegation Locking paper is located in another repository.

Lock Implementations

QD Lock

Main file: src/datastructures_bench/synch_algorithms/qdlock.h

Lock Description: See queue delegation locking paper

HQD Lock

Main file: src/datastructures_bench/synch_algorithms/hqdlock.h

Lock Description: See queue delegation locking paper

Flat Combining

Main file: src/datastructures_bench/synch_algorithms/flat_comb.h

Lock Description:: Flat combining is described in "Flat combining and the synchronization-parallelism tradeoff". The flat combining implementation is based on the C++ source code provided by the authors of flat combining.

CC-Synch and H-Synch

Main files: src/datastructures_bench/synch_algorithms/hsynch.h and src/datastructures_bench/synch_algorithms/ccsynch.h

Lock Description: CC-Synch and H-Synch is described in "Revisiting the combining synchronization technique". The CC-Synch and H-Synch implementations are taken from the source code provided by the authors of CC-Synch and H-Synch.

Cohort Lock

Main files: src/lock/cohort_lock.{c, h }

Lock Description: The cohort lock is described in "Lock cohorting: a general technique for designing NUMA locks".

MR-QD and MR-HQD locks

Main files: src/lock/agnostic_rdx_lock.{c, h}, src/lock/rhqd_lock.{c, h}

Lock Description: Queue delegation locking paper

Write-Preference Reader-Writer locks

Main files: src/lock/wprw_lock.{c, h}

Lock Description:: Write-preference reader writer locks (DR-MCS and Cohort based) are both described in "NUMA-aware reader-writer locks".

Other Locks

Other locks that can be found in the repository are TATAS, MCS, CLH, ticket-lock, partitioned ticket lock, etc.

Requirements

  • GCC for compilation of the code (tested with gcc version 4.7.2)
  • SCons for the build system (tested with scons version 2.1.0)

The code has so far been tested only on x86-64 systems running Linux. It may not work on other platforms without modification.

  • python and matplotlib for producing graphs from the gathered data

How to Build

scons

Builds standard lock object files and benchmarks.

scons --use_pinning

Builds RWBench with thread pinning to hardware threads. The tests for cohort lock (test_cohort), cohort based write-preference RW (test_wprwcohort_rgnzi) and hierarchical multi-reader QD lock (test_rhqdlock_rgnzi) should not be used when --use_pinning is enabled. The reason for this is that when pinning is enabled, the NUMA aware locks rely on the threads staying on the same node all the time, but pinning is not enabled in the tests. To test these locks, first compile without --use_pinning.

scons --use_queue_stats

With this option set, the data structure benchmark will produce statistics about how many operations have been helped during every help session.

scons --use_print_thread_queue_stats

This option together with the --use_queue_stats option will cause the data structure benchmark to print per thread statistics. Using this is not recommended for measurements, but it can be helpful for debugging.

Tests

For example ./bin/test_tatasdx will run tests for the QD lock.

Benchmarks

Data Structure Benchmark

The data structure benchmark is described in the queue delegation locking paper. It measures the throughput for a concurrent priority queue implemented by protecting a pairing heap with QD locking, CC-Synch, H-Synch, flat combining etc. The benchmark is implemented in src/benchmark/datastructures_bench.c. It has several parameters that can be used to change, for example, the amount of thread local work. Note that the benchmark pins threads to hardware threads and if the pinning does not work it might give unpredictable results, especially for the NUMA-aware locks.

An example:

$ ./bin/pairing_heap_bench_qdlock  8 0.5 10 1 0 0
=> Benchmark 8 threads
8 10014853 26388210
|| 10014853 microseconds, 26388210 operations, (8 threads)

Run ./bin/pairing_heap_bench_qdlock for a description of the parameters.

RWBench

The RWBench benchmark is described in "NUMA-aware reader-writer locks" paper and in the queue delegation locking paper. It measure the throughput of read and write critical sections performed by a number of threads. The benchmark is implemented in the file src/datastructures_bench/rw_bench_clone.c. Note that the benchmark pins threads to hardware threads when the compile option --use_pinning has been specified, and if the pinning does not work it might give unpredictable results, especially for the NUMA-aware locks.

An example:

$ ./bin/rw_bench_clone_wprwcohort_rgnzi  8 0.8 10 4 4 32
=> Benchmark 8 threads
8 10000104 64207442
|| 10000104 microseconds, 64207442 operations (8 threads)

Run ./bin/rw_bench_clone_wprwcohort_rgnzi for a description of the parameters.

Run Benchmarks

The script bin/benchmark_lock.py can be used to run the benchmark with different parameters. See the content of the file bin/run_benchmarks_on_intel_i7.py for an explanation of the parameters. To run the pairing heap benchmarks on an 8-thread Intel i7 CPU, you can run the command:

./bin/run_benchmarks_on_intel_i7.py

The file bin/benchmark_lock_XNonCW.py is similar to bin/benchmark_lock.py but is used to generate data for the graphs where the x-axis shows the amount of thread-local work between the operations.

Produce Benchmark Graphs

The script bin/compare_benchmarks.py can be used to produce graphs. For example if you have run bin/run_benchmarks_on_intel_i7.py without any modifications the following can be used to produce graphs for the benchmark:

cd bench_results
../bin/compare_benchmarks.py gen_graphs.py $(find . -mindepth 1 -type d)

This will produce a couple a graphs in PNG format and a file called gen_graphs.py that can be changed to change the appearance of the graphs.