Note: All commands are written under the expectation that they are run
from the toplevel directory (where this README is).

$> sudo apt install build-essential fakeroot libncurses5-dev libssl-dev ccache flex bison libelf-dev
$> sudo apt install dwarves busybox qemu-system-x86
# Ubuntu 22+
$> sudo apt install cmake libdouble-conversion-dev libfmt-dev libglog-dev libunwind-dev libboost-all-dev
# Ubuntu 18 & Ubuntu 20
$> sudo apt install cmake libdouble-conversion-dev libfmt-dev libgoogle-glog-dev libunwind-dev libboost-all-dev
$> sudo apt install patchelf libevent-dev python3-pip gcc-9 g++-9
$> pip3 install psutil
$> sudo pip3 install psutil # Needed for both user and root
After this point NOTHING should require sudo.

Append the following lines to your .bashrc file.
if [ ! -z "${BASHRC_TO_RUN}" ]
then
    echo "Running: ${BASHRC_TO_RUN}"
    $BASHRC_TO_RUN
    unset BASHRC_TO_RUN
    exit
fi
This is used to make scripting benchmarks easier: a shell launched with
BASHRC_TO_RUN set will run the given command once on startup and then exit.
None (assuming all other dependencies have been installed)
- Get Linux
  $> git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-dev/src
- Checkout 5.18-rc4
  $> (cd linux-dev/src; git checkout v5.18-rc4; git checkout -b autolock-base)
- Apply autolock patches
  $> (cd linux-dev/src; git am ../patches/*)
- Config Linux
  # Note: there is an example config in `linux-dev/config/example-config`.
  # Some differences are expected based on compiler version / exact system
  # info, but any config difference related to virtualization, serial ports,
  # or scheduling will be an issue.
  # See: https://ops.tips/notes/booting-linux-on-qemu/ and
  # https://www.linux-kvm.org/page/Virtio if any issues.
  $> (cd linux-dev/src; make x86_64_defconfig)
- Build Linux
  $> (cd linux-dev/src; make)
- Copy qemu script to Linux source directory
  $> cp linux-dev/scripts/qemu-script.sh linux-dev/src
  This is annoying, but the script must be run from the Linux source tree.
- Init all submodules
  # Note: At the moment there are no submodules but we may want to
  # test against `abseil`/`folly` implementations.
  $> git submodule update --init --recursive
- Config userland support
  $> cmake -DLINUX_DIR=../linux-dev/src -DBUILD_TESTING=OFF -DLANG=CXX -DCOMPILER=g++ -DWITH_MATH=1 -DWITH_THREAD=1 -DWITH_VDSO=1 -S user-dev/ -B user-dev/build
- Build userland support
  # Generally the best workflow is to keep a separate shell around for
  # rebuilding / installing
  $> (cd user-dev/build; make)
- Install userland support
  # Note: We install this into the kernel source folder so we don't
  # need to change directories after entering the virtual machine.
  # This installs to `linux-dev/src/user-bin`
  $> (cd user-dev/build; make install)
- Build and install benchmarks
  # This may take 5-10 min
  $> (cd benchmarks; ./scripts/setup-benchmarks.py)
- Get glibc
  $> git clone https://sourceware.org/git/glibc.git glibc-dev/src
- Checkout 2.35
  $> (cd glibc-dev/src; git checkout glibc-2.35; git checkout -b glibc-2-3-5-cond-var-plt)
- Apply cond var patches
  $> (cd glibc-dev/src; git am ../patches/*)
- Configure glibc
  $> mkdir -p glibc-dev/build/glibc; (cd glibc-dev/build/glibc; unset LD_LIBRARY_PATH; ../../src/configure --prefix=$(realpath ../../../linux-dev/src/glibc-install))
- Build glibc
  $> (cd glibc-dev/build/glibc; unset LD_LIBRARY_PATH; make --silent && make install)
- Enter Linux source directory
  $> cd linux-dev/src
- Run qemu
  # If this fails try re-configuring/building and ensure the following are set:
  #   CONFIG_VIRTIO_PCI=y
  #   CONFIG_VIRTIO_BALLOON=y
  #   CONFIG_VIRTIO_BLK=y
  #   CONFIG_VIRTIO_NET=y
  #   CONFIG_VIRTIO=y
  #   CONFIG_VIRTIO_RING=y
  # Use `make menuconfig`.
  # Don't run `make menuconfig` from an emacs shell. Use bash.
  # You can search by typing '/' then 'VIRTIO_<config_postfix>'
  $> ./qemu-script.sh
- Run tests
  $> ./user-bin/driver --test --all
- Run benchmarks
  $> ./user-bin/driver --bench --all
- Kill qemu
  # If no errors you can usually do Ctrl-D
  $> kill -9 $(pidof qemu-system-x86_64)
Application benchmarks expect all setup steps to have been
completed. The way we do the benchmarks is to interpose our own locks
in front of the glibc ones. Specifically, we interpose any lock
implementation (that is ABI compatible) over `pthread_mutex_attr`.

This requires a slightly customized glibc to handle `pthread_cond_wait`
and `patchelf` to modify the ELF file (as opposed to messing with the
source of the applications).
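As a rough illustration of the interposition mechanism, here is a
minimal sketch (not the project's actual interpose library; the file
name and build line are assumptions):

```cpp
#include <dlfcn.h>
#include <pthread.h>

/* Minimal LD_PRELOAD interposer sketch: resolve the real glibc
 * pthread_mutex_lock once, then forward to it. A profiling version
 * would wrap the forwarded call with timing / stats collection.
 * Hypothetical build: g++ -shared -fPIC -o libshim.so shim.cc -ldl */
extern "C" int
pthread_mutex_lock(pthread_mutex_t * m) {
    static int (*real_lock)(pthread_mutex_t *) =
        reinterpret_cast<int (*)(pthread_mutex_t *)>(
            dlsym(RTLD_NEXT, "pthread_mutex_lock"));
    /* ... timing / stat collection would go here ... */
    return real_lock(m);
}
```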
All instructions are run from the kernel directory, i.e. `cd linux-dev/src`.
- Use custom GLIBC
  # We need to add back the libevent dependency
  $> ./../../glibc-dev/scripts/use-this-glibc.py glibc-install/ bench-install/bin/memcached /lib/x86_64-linux-gnu/libevent-2.1.so.7
- Run memcached
  # pthread-mutex-profile is just a wrapper for pthread-mutex
  # that collects timing data on the locks
  $> LD_PRELOAD=./interpose-libs/libpthread-mutex-profile.so ./bench-install/bin/memcached -u noah -t 8 -m 4096 -n 2048 &
- Run memaslap
  $> ./bench-install/bin/memaslap -s 127.0.0.1:11211 -S 5s -B -T 8 -c 32
`memaslap` will output data as it runs. To see the lock data collected
in `memcached`, send `SIGINT` to the process (or use some other means
of allowing it to gracefully shut down).
To browse the kernel code, get to step 3 in the Linux Setup. The
autolock commits start just after the Linux 5.18-rc4 merge:
commit af2d861d4cd2a4da5137f795ee3509e6f944a25b (tag: v5.18-rc4)
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sun Apr 24 14:51:22 2022 -0700
Linux 5.18-rc4
So anything since then is part of autolock development.
The files of interest, ordered as follows:
- Autolock core
  - Core autolock implementation: `kernel/auto-lock.c`
  - Autolock kernel-level API: `include/linux/auto-lock.h`
  - Autolock user-level API: `include/uapi/linux/auto-lock.h`
- Autolock scheduler integration
  - Autolock integration into CFS: `kernel/sched/fair.c`
  - Autolock integration into context switch: `kernel/sched/core.c`
- Autolock task integration
  - Integration into `task_struct`: `include/linux/sched.h`
  - Integration into task teardown: `kernel/fork.c`
- Autolock syscalls
  - `arch/x86/entry/syscalls/syscall_64.tbl`
  - `include/linux/syscalls.h`
  - `include/uapi/asm-generic/unistd.h`
  - `kernel/sys_ni.c`
- Other stuff for logging / building
  - `kernel/sched/deadline.c`
  - `kernel/sched/idle.c`
  - `kernel/sched/rt.c`
  - `include/linux/auto-lock-verbose.h`
  - `kernel/Makefile`
99% of the code in this project is boilerplate I carry with me. As a
general note, any symbol with the prefix `I_*` is explicitly not part
of any external API and is meant for internal use only. Any symbol
without that prefix is meant to be part of the exported API.
Finally, note that the two targets `test-driver` and `bench-driver`
are only for testing / benchmarking internal functionality of the
suite and are entirely unrelated to the autolock project.
The two directories that have interesting stuff are `src/autolock-impls`
and `src/locks`.
- All locks are exported as a C++ class with the same API. An example
  can be seen with `pthread_mutex`.
- The API includes (a sketch of a conforming class follows this list):
  - `static __typeof__(this) init(void *)` /* `__typeof__(this)` is a pointer to the class type. */
  - `void destroy()`
  - `int try_lock()`
  - `void lock()`
  - `void unlock()`
- The API is used by the templated benchmark code in `src/locks/lock-bench.h`.
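For illustration, a minimal sketch of a conforming class wrapping
`pthread_mutex` (the real implementations live in `src/locks/`; this
exact class is hypothetical):

```cpp
#include <pthread.h>

/* Hypothetical example of the lock-class API described above. */
struct pthread_mutex_example {
    pthread_mutex_t m;

    /* init() constructs a lock in the memory handed to it and returns a
     * pointer to the new lock (the __typeof__(this) in the API above). */
    static pthread_mutex_example *
    init(void * mem) {
        pthread_mutex_example * lock =
            static_cast<pthread_mutex_example *>(mem);
        pthread_mutex_init(&(lock->m), NULL);
        return lock;
    }
    void destroy() { pthread_mutex_destroy(&m); }
    int  try_lock() { return pthread_mutex_trylock(&m); }
    void lock() { pthread_mutex_lock(&m); }
    void unlock() { pthread_mutex_unlock(&m); }
};
```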
Found in directory: `src/autolock-impls/`
- Autolock ABI for interacting with kernel shared memory
  - `src/autolock-impls/autolock-kernel-abi.h` defines the memory layout.
  - `src/autolock-impls/kernel-autolock.c` defines the TLS storage for
    the userland pointer to the shared mapping.
  - Neither of these should be included directly.
- Autolock API for interacting with the kernel
  - `src/autolock-impls/autolock-kernel-api.h` includes setup/teardown
    and the `extern` def for the TLS storage.
  - This is included by all autolock implementations.
- Autolock exported API
  - `src/autolock-impls/autolock-export.h` defines all the user-level
    locks that use the kernel autolock feature. It is included by the
    general purpose lock declaration code.
  - The class structure is just the general API that all tested locks
    will adhere to.
  - The exact list of exported implementations is set at
    `#define AUTOLOCK_IMPLS`.
- Actual user-level autolocks
  - There is a baseline implementation that defines all the glue in
    `src/autolock-impls/internal/autolock-common-user-api.h`. This
    defines the API and implementation for:
    - `typedef user_autolock_t`
    - `void autolock_init(user_autolock_t * lock)`
    - `void autolock_destroy(user_autolock_t * lock)`
    - `int autolock_trylock(user_autolock_t * lock)`
    - `void autolock_unlock(user_autolock_t * lock)`
  - All user-level locks built on the kernel autolock functionality can
    alias `<lock_name>_{init|destroy|trylock|unlock}` to the respective
    `user_{init|destroy|trylock|unlock}` functions. As well, they can
    just `typedef user_autolock_t <lock_name>_autolock_t` (see the
    sketch after this list).
  - `src/autolock-impls/simple-autolock.h`: just a simple spinlock.
  - `src/autolock-impls/backoff-autolock.h`: spinlock with backoff.
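As a sketch of that convention, a hypothetical `example` lock (the
macro-based aliasing is an assumption about how a new lock would wire
itself up to the common glue):

```cpp
/* Hypothetical new user-level autolock built on the common glue. */
#include "autolock-impls/internal/autolock-common-user-api.h"

typedef user_autolock_t example_autolock_t;

/* Alias the <lock_name>_* entry points to the common implementations. */
#define example_autolock_init    autolock_init
#define example_autolock_destroy autolock_destroy
#define example_autolock_trylock autolock_trylock
#define example_autolock_unlock  autolock_unlock
```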
- All lock declarations
  - All lock implementations are declared in `src/locks/locks-decls.c`.
    They are declared as a `decl_list_t`, which is essentially a
    statically sized array of function pointers + names (see the sketch
    after this item).
  - All implementations are defined in the list `#define LOCK_IMPLS`.
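Roughly, such a declaration list can be pictured as follows (the field,
type, and function names here are assumptions, not the project's actual
definitions):

```cpp
#include <stddef.h>

typedef void * (*thread_func_t)(void *);

/* Hypothetical sketch in the spirit of decl_list_t: a statically sized
 * array of { name, function pointer } entries. */
typedef struct {
    const char *  name;
    thread_func_t func;
} func_decl_example_t;

/* Placeholder runners standing in for the per-lock instantiations. */
static void * mutex_runner_example(void * arg) { (void)arg; return NULL; }
static void * spin_runner_example(void * arg) { (void)arg; return NULL; }

static const func_decl_example_t example_lock_impls[] = {
    { "pthread_mutex",    &mutex_runner_example },
    { "pthread_spinlock", &spin_runner_example  },
};
```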
- Current lock implementations (aside from autolocks)
  - `pthread_mutex`
  - `pthread_spinlock`
  - A variety of configs built on top of `src/locks/lock-base.h`. These
    are just references to help see what features matter for performance.
- Low level bench/test functions
  - The internal bench/test code that is templated for each lock
    implementation is in `src/locks/lock-bench.h`.
  - The API used for the thread function is `void * bench_runner(void *)`
    (see the sketch after this item).
  - The low level benchmark function is `I_bench_runner_kernel()`.
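Since `void * bench_runner(void *)` matches pthread's thread-entry
signature, a driver can hand it straight to `pthread_create`. A minimal
sketch (the stand-in runner and helper below are hypothetical):

```cpp
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the per-lock runner instantiated by lock-bench.h; the
 * real one receives its parameters through the void * argument. */
static void *
bench_runner(void * arg) {
    (void)arg; /* ... run the benchmark kernel ... */
    return NULL;
}

/* Spawn nthreads benchmark threads and wait for them all to finish. */
static void
spawn_bench_threads(void * shared_arg, unsigned nthreads) {
    pthread_t tids[64];
    if (nthreads > 64) { nthreads = 64; }
    for (unsigned i = 0; i < nthreads; ++i) {
        pthread_create(&tids[i], NULL, &bench_runner, shared_arg);
    }
    for (unsigned i = 0; i < nthreads; ++i) {
        pthread_join(tids[i], NULL);
    }
}
```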
- High level bench/test driver
  - The driver code for calling `bench_runner` is in
    `src/locks/lock-runner.c`. This will always test that the result is
    correct. The difference between benchmark/test is that benchmarking
    will collect and export stats.
  - The API is simply `run(func_decl_t const *, run_params_t *, stats_result_t *)`.
- The main function is found in `src/driver.c`.
- The general usage is:
  $> ./driver --bench --threads <num_threads> --trials <num_trials> <rest=locks to test/bench>
- See `./driver -h` for all options.
- To run all locks you can use `--all`, and to list the available locks
  you can use `--list`.
- Searching for lock names essentially uses `fnmatch` syntax, so
  something like:
  $> ./driver --test spin*
  will test all lock implementations whose names are prefixed with 'spin'.