Marcel Taube's website This virtual machine contains the artifact for PLDI 2018 paper: "Modularity for Decidability: Implementing and Semi-Automatically Verifying Distributed Systems".
The virtual machine was created using VirtualBox version 5.2.6.
The VM should be logged on with the user name ivyuser
and password ivy
.
Most of the evaluation of the artifact is done from the terminal, which can be started by clicking on the terminal icon in the launcher (top left of the screen). To read (and modify) source files, any text editor can be used. The VM comes equipped with an Emacs mode for Ivy code (Emacs can be launched from the terminal or the launcher).
The rest of this file explains the artifact and how to exercise it.
~/z3/
contains a clone of Z3 from its master branch on github.
~/ivy/
contains a clone of Ivy from its master branch on github.
~/pldi18-artifact/
contains the artifact
~/pldi18-artifact/examples/raft/
contains Ivy source files for the
verified Raft protocol implementation.
~/pldi18-artifact/examples/multi-paxos/
contains Ivy source files for
the verified Multi-Paxos protocol implementation.
~/pldi18-artifact/evaluation/
contains unverified C++ and Python
code that uses one of the verified protocols to create a distributed
fault tolerant key-value store, as well as a client that exercises the
server code. This directory also contains various scripts used for
performance evaluation and explained later.
~/pldi18-artifact/ec2/
contains scripts that enable performance evaluation
of the key-value store on Amazon AWS.
Ivy files contain both code and proofs (e.g., pre and post
conditions,inductive invariants), and are checked with the ivy_check
command, which internally uses Z3 to discharge verification
conditions. The Raft protocol implementation (and proof) are contained
in the raft.ivy
file. It can be verified with ith ivy_check
by
running the following commands in a terminal:
ivyuser@ivy-vm:~$ cd ~/pldi18-artifact/examples/raft/
ivyuser@ivy-vm:~/pldi18-artifact/examples/raft$ time ivy_check raft.ivy # should take several minutes
You should expect this to take a few minutes (2-3 on my laptop). It
will also print messages detailing the various verification conditions
being checked (the verification is split into several isolates, which
correspond to modules from the paper, and each isolate results in
multiple verification conditions). Once ivy_check
terminates and
prints OK
, you know that the file is verified.
You may wish to edit raft.ivy
, for example by commenting out one of
the conjectures that make up the inductive invariant, and see that
ivy_check
fails. If you run ivy_check diagnose=true raft.ivy
, you
can also obtain a graphical interface that allows you to examine a
counterexample to induction (note that this interface is not part of
the contribution of the PLDI 2018 paper).
The source code of the Multi-Paxos verified protocol is split into two
files: multi_paxos_protocol.ivy
contains the abstract protocol, and
multi_paxos_system.ivy
contains the concrete implementation (and
uses the abstract protocol as ghost code for the correctness
proof). To check the Multi-Paxos implementation with ivy_check
, run the following
commands in a terminal (this also checks the abstract protocol, which is used by the implementation):
ivyuser@ivy-vm:~$ cd ~/pldi18-artifact/examples/multi-paxos/
ivyuser@ivy-vm:~/pldi18-artifact/examples/multi-paxos$ time ivy_check multi_paxos_system.ivy # should take several minutes
...
As before, the above commands will print details about the
verification conditions being checked. ivy_check
should take a few
minutes (7 on my laptop). As before, once ivy_check
prints OK
you
know that the protocol is verified. You can also edit the Ivy files to
break the verification, and run ivy_check
to see it report the
errors, or run ivy_check diagnose=true
to graphically examine
counterexamples to induction.
Ivy files can be compiled to C++, as explained in the paper. This is
done using the ivy_to_cpp
command.
To extract a verified implementation of the Raft protocol, run the following commands:
ivyuser@ivy-vm:~$ cd pldi18-artifact/examples/raft/
ivyuser@ivy-vm:~/pldi18-artifact/examples/raft$ ivy_to_cpp isolate=test target=class raft.ivy
This should take a few seconds, and generate the files raft.cpp
and
raft.h
, which contain a class raft
with a verified implementation of the Raft protocol.
To extract a verified implementation of the Multi-Paxos protocol, run the following commands:
ivyuser@ivy-vm:~$ cd pldi18-artifact/examples/multi-paxos/
ivyuser@ivy-vm:~/pldi18-artifact/examples/multi-paxos$ ivy_to_cpp isolate=iso_impl target=class multi_paxos_system.ivy
This should take a few seconds, and generate the files
multi_paxos_system.cpp
and multi_paxos_system.h
, which contain a
class multi_paxos_system
with a verified implementation of the
Multi-Paxos protocol.
The process described above results in a a C++ class that implements either Raft or Multi-Paxos. For evaluation purposes, we created a server that can use any of these classes to implement a fault tolerant distributed key-value store. This section explains how to build this server and exercise it locally using a Python client.
To build the server using the verified Raft protocol, run:
ivyuser@ivy-vm:~$ cd ~/pldi18-artifact/evaluation/
ivyuser@ivy-vm:~/pldi18-artifact/evaluation$ make raft
To build it using the verified Multi-Paxos protocol instead, run:
ivyuser@ivy-vm:~/pldi18-artifact/evaluation$ make multipaxos
Boh the make raft
command and the make multipaxos
command create a
binary file called build/server
. This allows the rest of the
evaluation (and the scripts it uses) to be uniform, since it simply
invokes build/server
. Thus, the rest of the evaluation should be
executed twice: once after running make raft
to evaluate a key-value
store that uses the verified Raft protocol, and once after running
make multipaxos
to verify a key-value store that uses the verified
Multi-Paxos protocol.
In the rest of this section, all paths are relative to ~/pldi18-artifact/evaluation
.
To run the distributed system locally, first build ./build/server
using either make raft
or make multipaxos
as described
above. Then, start the cluster of servers by running several instances
of ./build/server
with appropriate command line arguments. For your
convenience, we have provided a script, ./scripts/start-tmux.sh
,
that will start a tmux
session with multiple windows: one window for
each node running a server, and one extra window for interacting with
the cluster as a client (as explained next). The
./scripts/start-tmux.sh
script takes a command line argument with
the number of servers in the cluster (with a default of 3).
If you want to set up the cluster manually, you can run several
instances the ./build/server
executable as in the following example
(a complete list of command line arguments is given by ./build/server --help
):
ivyuser@ivy-vm:~/pldi18-artifact/evaluation$ ./build/server --log --node-id 0 --client-port 8000 --cluster localhost:4990,localhost:4991,localhost:4992
ivyuser@ivy-vm:~/pldi18-artifact/evaluation$ ./build/server --log --node-id 1 --client-port 8001 --cluster localhost:4990,localhost:4991,localhost:4992
ivyuser@ivy-vm:~/pldi18-artifact/evaluation$ ./build/server --log --node-id 2 --client-port 8002 --cluster localhost:4990,localhost:4991,localhost:4992
These are exactly the commands run by ./scripts/start-tmux.sh
. Each
server process listens for client connections on the given client
port, and talks to the servers at the given cluster addresses and
ports. The --node-id
option tells a server which element of cluster
list contains its own address and port. The command
./scripts/start-tmux.sh N
starts N
servers, where
server i
will use port 8000 + i
to communicate with clients, and
port 4990 + i
to communicate with the other servers.
If you start a tmux session with ./scripts/start-tmux.sh
, you should
make sure to properly close it before attempting to start a new
one. To do this, hit CTRL+D
in the client window to close the shell,
and then CTRL+C
once per server window. Continue until the entire
tmux session is terminated. This will be indicated by an [exited]
message, back in your original shell where you ran
./scripts/start-tmux.sh
(which will now become visible again).
After you launch a cluster of servers, you can interact with the
cluster by launching a client and sending put
and get
commands to
the key-value store. For example, you can make simple queries by
running python
or ipython
in the bench/
directory and using the
provided Python API (note that keys and values are expected to be
strings):
from vard import Client
cluster = [("localhost", 8000), ("localhost", 8001), ("localhost", 8002)]
leader_host, leader_port = Client.find_leader(cluster)
c = Client(leader_host, leader_port)
c.put("x", "7")
c.put("y", "10")
c.get("x") # returns "7"
c.get("y") # returns "10"
c.get("z") # returns None
Make sure to change the cluster
variable above to match with the
server cluster.
If you use IPython, you can use !kill <pid>
to kill some of the
servers, in order to exercise the fault tolerance of the system. Each
server window starts with a line that includes its process id. For
example, you can try the following:
ivyuser@ivy-vm:~/pldi18-artifact/evaluation$ ./scripts/start-tmux.sh 5
# Inside the client window:
ivyuser@ivy-vm:~/pldi18-artifact/evaluation$ cd bench/
ivyuser@ivy-vm:~/pldi18-artifact/evaluation/bench$ ipython
...
In [1]: from vard import Client
In [2]: cluster = [("localhost", 8000), ("localhost", 8001), ("localhost", 8002), ("localhost", 8003), ("localhost", 8004)]
In [3]: leader_host, leader_port = Client.find_leader(cluster)
In [4]: c = Client(leader_host, leader_port)
In [5]: c.put("x", "7")
In [6]: c.put("y", "10")
In [7]: c.get("x") # returns "7"
In [8]: c.get("y") # returns "10"
In [9]: c.get("z") # returns None
In [10]: !kill 6757 # change 6757 to a process id from the window on the right
In [11]: c.get("x") # still returns '7'
In [12]: !kill 6762 # change 6762 to a process id from the window on the right
In [13]: c.get("x") # still returns '7'
In [14]: !kill 6748 # change 6748 to a process id from the window on the right
In [15]: c.get("x") # hangs, since too many servers have failed (no active quorum)
^C
When killing servers, you may note that if you kill the leader, the
next client request will result in an ReceiveError
exception. To
recover, simply create a new client session by repeating the commands:
leader_host, leader_port = Client.find_leader(cluster)
c = Client(leader_host, leader_port)
The Client
object represents a connection to a specific leader (the
client communicates only with the leader, and the leader uses Raft or
Multi-Paxos to communicate requests to other nodes and reach
agreement). In a realistic application, this sort of communication
failure would be caught and handled by automatically running the
find_leader
method again, and establishing a new client connection.
Note that if you try to find a new leader while the leader-election protocol has not terminated, you will receive an error. Simply try again a few seconds later. Also note that restarting a failed server is not supported.
You can also perform a local benchmarking run using the script
./bench/bench.py
(note that you must start the servers before
running this script). If you first ran some simple queries, either
close the existing python session or open a new terminal and run the
following command. It may also be interesting to interact with the
cluster manually via the python API during or after benchmarks are
running.
If you started a cluster of 3 nodes, you can start ./bench/bench.py
with the following arguments:
ivyuser@ivy-vm:~/pldi18-artifact/evaluation$ python ./bench/bench.py --cluster localhost:8000,localhost:8001,localhost:8002 --requests 10 --threads 1 --keys 10 --put-percentage 50
The --cluster
option specifies the addresses and client ports of
the servers in the cluster. The rest of the options specify a
benchmark consisting of 10 randomly generated requests, running at
most one request in parallel, over a key space with 10 keys, where on
average 50% of requests are puts. You may wish to play with the number
of requests, threads, keys, and precentage of put requests (given by
command line arguments). Our paper reports measurements taken with 500
keys, 50 threads, and 10000 requests.
Each run of ./bench/bench.py
reports the average latency, as well as
a detailed log of all requests. This allows to examine whether the
latency increases with the size of the log (it should not). You can
also try to run ./bench/bench.py
several times, possibly killing
nodes between runs to exercise fault tolerance. You can also kill
nodes during a sufficiently long ./bench/bench.py
run, but notice
that if you kill the leader during a run, it would not recover. As
explained above, a real application would be able to recover by
starting a new session. To simulate this, you can kill the leader
between multiple ./bench/bench.py
runs.
Finally, note that the Multi-Paxos implementation has a known issues: changing leader after a sufficiently long benchmark causes nodes to try to send messages that do not fit in a UDP frame, resulting in those nodes being terminated.
If you have access to an Amazon AWS account, you can reproduce the
paper's evaluation by running our key-value store, as well as vard
and etcd
, on EC2. Spinning up on-demand instances, cleaning them up,
and running the experiments is all automated, but some configuration
is necessary.
You'll need to do a couple of things in the AWS EC2 console:
- Ensure that port 22 (SSH) is open in your default security group (you can do this from "Security Groups" under "Network & Security")
- Create a key pair called "bench" (you can do this from "Key Pairs"
under "Network & Security"). Download the resulting private key and
put it in the VM as
/home/ivyuser/.ssh/bench.pem
and runchmod 600 /home/ivyuser/.ssh/bench.pem
.
Once you've done this, you can configure the VM to access your EC2 account by running (in the VM):
aws configure
and entering your Access Key ID, Secret Access Key, and default region
as appropriate (if you don't have a region preference, enter us-west-2
).
The scripts to run on EC2 are in ~/pldi18-artifact/ec2/
. The main
entry points are experiment.py
and use_cluster.py
. In what follows, we assume that you
have run ivy_to_cpp
as explained above, so the C++ verified
implementations are already created. To run on EC2, experiment.py
copies C++ source files to EC2 instances and compiles them there. Note
that the EC2 instances themselves do not have Ivy (or Z3) installed,
since they are only used to run the resulting application.
To run experiments, you can start a cluster by running experiment.py <yaml-script> <experiment-label>
. For example, to start the Raft-based
key-value store experiment from the paper, run:
ivyuser@ivy-vm:~/pldi18-artifact/ec2$ python2 experiment.py -f experiments/ivyd-raft.yml raft-aec --no-experiment --no-cleanup --dump-instance-yaml raft-instances.yml
This will spin up 4 m4.xlarge
EC2 instances, provision them as
necessary, and start the Raft-based key-value store on three of them.
You can expect this to take 5-10 minutes.
Then, you can use the cluster to perform benchmarks by running
ivyuser@ivy-vm:~/pldi18-artifact/ec2$ python2 use_cluster.py --config experiments/ivyd-raft.yml --config experiments/closed-loop.yml --instance raft-instances.yml --time 10 --iters 5 --use-processes --threads 1 --threads 2 --threads 4 --threads 8 --threads 16 --threads 32 --threads 64
This will run a benchmarking script from the fourth server to test the
performance of the cluster under various loads. Each test runs the
benchmark script for 10 seconds with a certain number of client
processes and measures throughput and latency. Each
10-second experiment is repeated 5 times (specified by --iters 5
).
The entire benchmark (5 iterations of ~10 seconds each for each number of threads)
should take about 10 minutes.
When the benchmark script finishes, you can terminate all the nodes by running
ivyuser@ivy-vm:~/pldi18-artifact/ec2$ python2 use_cluster.py --config experiments/ivyd-raft.yml --config experiments/closed-loop.yml --instance vard-instances.yml --terminate
The logs from every process
(the provisioning script, the servers, and the client) are written to
a new directory in experiment_logs
called
raft-aec-<timestamp>
. You can also examine these files while the
experiment is running. The benchmark script's output will be in the
experiment-client-<timestamp>.log
files in that directory. The end of the file
should contain the total time, throughput, average latency. The file
also contains a list of individual latencies for every request.
Each .yml
file in the experiments
directory corresponds to an
experiment from the paper:
ivyd-raft.yml
: The Raft-based key-value storeivyd-multipaxos.yml
: The Multipaxos-based key-value storeetcd.yml
: The etcd key-value storevard.yml
: Thevard
key-value store
The provisioning scripts for vard
and etcd
download sources from
their respective repositories. All provision scripts download the
benchmarking scripts from the vard
repository, since we used the
vard
benchmarks.
To try the other experiments, you can use the following commands:
python2 experiment.py -f experiments/ivyd-raft.yml raft-aec --no-experiment --no-cleanup --dump-instance-yaml raft-instances.yml
python2 experiment.py -f experiments/ivyd-multipaxos.yml multipaxos-aec --no-experiment --no-cleanup --dump-instance-yaml multipaxos-instances.yml
python2 experiment.py -f experiments/etcd.yml etcd-aec --no-experiment --no-cleanup --dump-instance-yaml etcd-instances.yml
python2 experiment.py -f experiments/vard.yml vard-aec --no-experiment --no-cleanup --dump-instance-yaml vard-instances.yml
And then run the use_cluster.py
commands as above.
Note that the vard
experiment takes significantly longer to set up the cluster since it
installs Verdi on the EC2 instances. Also note that if you see messages like SSH connection to ... refused; retrying
, you can wait patiently and let the script
recover the connection and continue running. Should you become
impatient and interrupt the script, the instances will also be
terminated shortly. Overall, one should be patient with the
experiment.py
script, since it performs some operations that can
take minutes to complete (e.g., cause a cloud instance to install
software from a remote repository).