Xinjing Zhou, Xiangyao Yu, Goetz Graefe, Michael Stonebraker

Lotus: Scalable Multi-Partition Transactions on Single-Threaded Partitioned Databases

Proc. of the VLDB Endowment (PVLDB), Volume 15, Sydney, Australia, 2022.

This repository contains source code for Lotus. The code is based on the star framework from Yi Lu.

Dependencies

sudo apt-get update
sudo apt-get install -y zip make cmake g++ libjemalloc-dev libboost-dev libgoogle-glog-dev

Download

git clone https://github.com/DBOS-project/lotus.git

Build

./compile.sh

Reproducing Experiments

Note that this tutorial only works on Google Cloud Compute Engine.

Make sure you have placed the source code folder under ~.

Make sure every node in the cluster has installed all the software dependencies.

Make sure the benchmark has been compiled on every node using compile.sh.

We assume the log files for transactions are placed under /mnt/disks/nvme/.

The sample scripts provided run on 6 nodes.
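
The prerequisites above can be checked on each node before starting. The following is a minimal Python sketch and is not part of the repository: the function name preflight is hypothetical, and the folder name ~/lotus used in the note below is an assumption based on the directory that git clone creates by default. It only checks that the source folder contains compile.sh and that the log directory exists and is writable; it does not verify that the dependencies are installed or that the build succeeded.

```python
import os
from pathlib import Path


def preflight(source_dir, log_dir):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper, not shipped with Lotus.
    """
    problems = []
    source_dir = Path(source_dir).expanduser()
    log_dir = Path(log_dir).expanduser()

    # The source folder should exist and contain the build script.
    if not source_dir.is_dir():
        problems.append(f"source folder not found: {source_dir}")
    elif not (source_dir / "compile.sh").is_file():
        problems.append(f"compile.sh not found in: {source_dir}")

    # The transaction log directory should exist and be writable.
    if not log_dir.is_dir():
        problems.append(f"log directory not found: {log_dir}")
    elif not os.access(log_dir, os.W_OK):
        problems.append(f"log directory not writable: {log_dir}")

    return problems
```

Under these assumptions, calling preflight("~/lotus", "/mnt/disks/nvme") on a node returns an empty list when both checks pass, and a list of messages otherwise.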

Figure 9(a): Comparison with Non-Deterministic Systems

Fill in scripts/ips.txt with the IP addresses of the nodes you want to run the experiments on.
Fill in scripts/instance_names.txt with the corresponding Google Cloud Compute Engine instance names of the nodes listed in scripts/ips.txt.
Run the following commands on the first node of the cluster to distribute the benchmarking scripts to the other nodes.

cd scripts
#                           port to run the experiments on
#                             |
python distribute_script.py 1234 gc_2pl_mp_ycsb.py  us-central1-a # ----- Google Cloud zone name
#                                     |
#                                 baseline-specific distribution script
python distribute_script.py 1234 gc_sundial_mp_ycsb.py us-central1-a
python distribute_script.py 1234 gc_hstore_mp_ycsb.py us-central1-a
python distribute_script.py 1234 gc_lotus_mp_ycsb.py us-central1-a
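
Before running the distribution commands above, it may help to confirm that scripts/ips.txt and scripts/instance_names.txt line up. The sketch below is not part of the repository. It assumes both files hold one entry per line (the README does not specify the file format), and the function names read_entries and check_node_files are hypothetical.

```python
import ipaddress
from pathlib import Path


def read_entries(path):
    """Read non-empty, stripped lines (assumed format: one entry per line)."""
    text = Path(path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_node_files(ips_path, names_path):
    """Return a list of problems; an empty list means the files look consistent."""
    ips = read_entries(ips_path)
    names = read_entries(names_path)
    problems = []

    # Every entry in the IP file should parse as an IPv4 or IPv6 address.
    for ip in ips:
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            problems.append(f"not a valid IP address: {ip!r}")

    # The same node should not be listed twice.
    if len(set(ips)) != len(ips):
        problems.append("duplicate IP addresses found")

    # Each IP needs exactly one matching instance name.
    if len(ips) != len(names):
        problems.append(f"{len(ips)} IP addresses but {len(names)} instance names")

    return problems
```

From the scripts directory, check_node_files("ips.txt", "instance_names.txt") should return an empty list before the scripts are distributed.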

Run the following commands on the first node of the cluster to start the benchmark:

sh run_2pl_mp_ycsb.sh
sh run_sundial_mp_ycsb.sh
sh run_hstore_mp_ycsb.sh
sh run_lotus_mp_ycsb_sync.sh

Results are placed under ~/exp_results on the first node of the cluster.

Figure 10(a): Comparison with Deterministic Systems

Fill in scripts/ips.txt with the IP addresses of the nodes you want to run the experiments on.
Fill in scripts/instance_names.txt with the corresponding Google Cloud Compute Engine instance names of the nodes listed in scripts/ips.txt.
Fill in scripts/ips_half.txt with the first half of the nodes from scripts/ips.txt. This is for the Calvin and Aria baselines, which only need to evaluate the performance of one replica (3 nodes).
Fill in scripts/instance_names_half.txt with the first half of the nodes from scripts/instance_names.txt.
Run the following commands on the first node of the cluster to distribute the benchmarking scripts to the other nodes.

cd scripts
python distribute_script_half.py 1234 gc_aria_mp_ycsb.py us-central1-a
python distribute_script_half.py 1234 gc_calvin_mp_ycsb.py us-central1-a
python distribute_script.py 1234 gc_lotus_mp_ycsb.py us-central1-a
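
The two *_half.txt files described above can be derived from the full lists rather than typed by hand. This is a minimal sketch, not part of the repository: the function name write_first_half is hypothetical, and it assumes each file holds one entry per line, which the README does not state explicitly. With the 6-node sample setup it would keep the first 3 entries.

```python
from pathlib import Path


def write_first_half(src, dst):
    """Copy the first half of the non-empty lines of src into dst.

    Assumes one entry per line. Returns the entries that were written.
    For an odd number of entries, the middle one is left out.
    """
    lines = [l.strip() for l in Path(src).read_text().splitlines() if l.strip()]
    half = lines[: len(lines) // 2]
    # Write one entry per line, each terminated by a newline.
    Path(dst).write_text("".join(f"{entry}\n" for entry in half))
    return half
```

From the scripts directory, write_first_half("ips.txt", "ips_half.txt") followed by write_first_half("instance_names.txt", "instance_names_half.txt") keeps the two half files aligned, since both take the same leading entries.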

Run the following commands on the first node of the cluster to start the benchmark:

sh run_aria_mp_ycsb.sh
sh run_calvin_mp_ycsb.sh
sh run_lotus_mp_ycsb.sh