SeBS is a serverless benchmark suite designed to evaluate and model the performance and cost of FaaS platforms. The artifact has four main components:
- our tool with benchmarks and dependencies,
- results obtained by us for the paper,
- analysis of results,
- scripts helping to reproduce most of the results in the paper.
First, we describe the source code, data, and tooling provided with the artifact. Then, we describe the necessary environment and software packages, and how to install them automatically or use them with Docker. Please read these instructions and follow them to properly initialize the environment. Finally, we describe the results obtained in the paper and present the scripts used to reproduce them.
The entire artifact has been made publicly available under the DOI: 10.5281/zenodo.5209001
With our artifact we provide the following components:
- `serverless-benchmarks` - source code of the benchmark suite
- `data` - benchmarking results obtained for the paper
- `analysis` - Python plotting and analysis scripts used for data analysis
- `experiments` - scripts helping to reproduce the experiments
- `docker` - compressed Docker images which were used for these experiments
Our data has been obtained in January 2020, July and August 2020, and November 2020.
All experiments analyze the performance of three commercial serverless offerings: AWS Lambda, Azure Functions, and Google Cloud Functions.
To conduct the experiments, it is necessary to have an account on each platform. Our documentation helps to create the credentials needed to use the tool (details in `serverless-benchmarks/docs/platforms.md`).
- Python 3.7 to install and use the benchmarking suite.
- Docker 19+ to build and deploy functions.
- `libcurl` and its headers must be available on your system to install `pycurl`.
- Standard Linux tools.
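A quick way to verify these prerequisites before installation is a short shell check such as the sketch below; it assumes a Debian/Ubuntu-style system where the libcurl development headers provide the `curl-config` binary.

```bash
# Check the prerequisites before running the installer.
python3 --version        # expect Python 3.7
docker --version         # expect Docker 19+
docker info > /dev/null  # fails if the Docker daemon is not running or not accessible
curl-config --version    # present only when the libcurl development headers are installed
```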
First, clone the repository with the SeBS submodule:
git clone --recursive https://github.com/spcl/serverless-benchmarks-artifact.git
To install the benchmarks with support for all platforms, run the following command in the `serverless-benchmarks` directory:
./install.py --aws --azure --gcp --local
It will create a virtual environment in `python-virtualenv` and install the necessary Python and third-party dependencies there.
To use SeBS and deploy experiments, you must first activate the new Python virtual environment:
. python-virtualenv/bin/activate
The environment must always be active when working with SeBS.
The Docker daemon must be running, and the user needs sufficient permissions to use it. Otherwise, you might experience many "Connection refused" and "Permission denied" errors when using SeBS.
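If such errors appear, the usual remedies on Linux are starting the daemon and adding the current user to the `docker` group; the commands below are a minimal sketch assuming a systemd-based distribution.

```bash
# Start the Docker daemon (systemd-based distributions).
sudo systemctl start docker

# Allow the current user to access the Docker socket without sudo;
# log out and back in (or run `newgrp docker`) for the change to take effect.
sudo usermod -aG docker "$USER"

# Verify that the daemon is reachable.
docker info
```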
The software has been tested and evaluated on Linux (Debian, Ubuntu). We cannot guarantee that it will operate correctly on systems such as WSL.
We use Docker images to build the packages of serverless functions and to manage the Azure CLI. The Docker images can be pulled from our Docker Hub repository `spcleth/serverless-benchmarks`, or they can be rebuilt manually by running `tools/build_docker_images.py` in the SeBS main directory. The most important images are also provided as compressed archives in the `docker` subdirectory; use `docker/load.sh` to load them.
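For example, the images can be obtained in any of the three ways below; this is a sketch run from the artifact root, and the Docker Hub tag is a placeholder to be replaced with one of the tags listed in the repository.

```bash
# Option 1: load the compressed images shipped with the artifact.
./docker/load.sh

# Option 2: rebuild the images locally from the SeBS main directory.
cd serverless-benchmarks
python3 tools/build_docker_images.py
cd ..

# Option 3: pull from Docker Hub; replace <tag> with an image tag listed in the repository.
docker pull spcleth/serverless-benchmarks:<tag>
```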
The provided data is processed with a set of Python scripts, Jupyter notebooks, and R scripts. The processing scripts are separated by section and research question. Each subdirectory contains a `README.md` file describing the generation process and the datasets used from the main `data` directory; example invocations are sketched after the lists below.
- Q1 - `performance/q1_perf/plot_time.py` generates Figure 3 in the paper.
- Q2 - `performance/q2_cold/plot_cold_startups.py` generates Figure 4 in the paper.
- Q3 - `performance/q3_portability/run.sh` generates the memory data cited in the section.
- Q4 - `performance/q4_iaas/` contains the scripts to regenerate the contents of Table 5. Please follow the included `README.md` for details.
- Q1, Q2 - the notebook `cost/q1_cost/plots.ipynb` generates Figures 5a and 5b in the paper.
- Q4 - `cost/q3_iaas_breakpoint/` contains the scripts to regenerate the contents of Table 6. Please follow the included `README.md` for details.
- Q1 - the R script `invoc_overhead/parse_inv_overhead.R` generates Figure 6 in the paper.
- Q1 - Python and R scripts wrapped under `container_eviction/run.sh` generate Figure 7 in the paper.
- After installing the benchmark suite and activating the virtual environment, create and configure cloud accounts according to the instructions provided in `serverless-benchmarks/docs/platforms.md`.
- For all platforms, define the environment variables storing cloud credentials (a sketch is shown after this list).
- Then, repeat the experiments according to the instructions provided for each benchmark.
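For example, one might export credentials as follows before running the experiments. The AWS and Google Cloud variable names below are the standard ones for those SDKs, while the Azure names are placeholders; in all cases, `serverless-benchmarks/docs/platforms.md` lists the exact variables that SeBS expects.

```bash
# AWS: standard credential variables.
export AWS_ACCESS_KEY_ID=<access-key-id>
export AWS_SECRET_ACCESS_KEY=<secret-access-key>

# Google Cloud: path to the service-account key file.
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Azure: service-principal credentials (placeholder variable names;
# see docs/platforms.md for the names expected by SeBS).
export AZURE_APPLICATION_ID=<application-id>
export AZURE_TENANT_ID=<tenant-id>
export AZURE_PASSWORD=<service-principal-password>
```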
This experiment generates data for the main results from Sections 6.2 and 6.3. There are three directories `experiments/perf_cost/{provider}`, with `provider` being `aws`, `azure`, and `gcp`.
- For each platform, we repeat the generation of results for each benchmark used in the paper and with multiple memory sizes (except for Azure).
- For each benchmark, we measure warm and cold invocations on AWS and GCP, and warm and burst invocations on Azure.
- For each execution we need 200 data points, and we perform 250 repetitions because the cloud billing system is not always reliable and we are not guaranteed to obtain exact billing data for each invocation.
- For cold experiments we need to enforce container eviction between each batch of invocations; thus, the process can take several minutes.
- On Azure we no longer use the `thumbnailer` Node.js benchmark, as it is no longer possible to create a Linux function app with Functions runtime 2.0 and Node.js.
Steps needed to reproduce the results:
- Make sure that the credentials are configured and the `python-virtualenv` from the SeBS installation is activated.
- Execute the `run.sh` script, which will run the SeBS experiment for each benchmark.
- In each subdirectory, the `out.log` file will contain multiple invocation results such as the following, with one entry for each memory configuration:
```
12:40:05,907 INFO AWS-8915: Published new function code
12:40:05,907 INFO Experiment.PerfCost-2b82: Begin cold experiments
12:40:17,684 INFO Experiment.PerfCost-2b82: Processed 0 samples out of 50,0 errors
12:40:17,684 INFO Experiment.PerfCost-2b82: Processed 0 warm-up samples, ignore results.
12:40:34,509 INFO Experiment.PerfCost-2b82: Processed 10 samples out of 50,0 errors
12:40:51,782 INFO Experiment.PerfCost-2b82: Processed 20 samples out of 50,0 errors
12:41:08,634 INFO Experiment.PerfCost-2b82: Processed 30 samples out of 50,0 errors
12:41:25,366 INFO Experiment.PerfCost-2b82: Processed 40 samples out of 50,0 errors
12:41:42,509 INFO Experiment.PerfCost-2b82: Processed 50 samples out of 50,0 errors
12:41:47,515 INFO Experiment.PerfCost-2b82: Mean 1538.35586, median 1475.9135, std 130.46677776369125, CV 8.480923117729812
12:41:47,517 INFO Experiment.PerfCost-2b82: Parametric CI (Student's t-distribution) 0.95 from 1500.9011734974968 to 1575.810546502503, within 2.4347218661424113% of mean
12:41:47,517 INFO Experiment.PerfCost-2b82: Non-parametric CI 0.95 from 1464.246 to 1511.239, within 1.591997091970496% of median
12:41:47,519 INFO Experiment.PerfCost-2b82: Parametric CI (Student's t-distribution) 0.99 from 1488.4066173484066 to 1588.3051026515932, within 3.246923806796777% of mean
12:41:47,519 INFO Experiment.PerfCost-2b82: Non-parametric CI 0.99 from 1459.516 to 1578.257, within 4.0226273423205345% of median
12:41:47,532 INFO Experiment.PerfCost-2b82: Begin warm experiments
12:41:53,636 INFO Experiment.PerfCost-2b82: Processed 0 samples out of 50,0 errors
12:41:53,636 INFO Experiment.PerfCost-2b82: Processed 0 warm-up samples, ignore results.
12:42:04,584 INFO Experiment.PerfCost-2b82: Processed 10 samples out of 50,0 errors
12:42:15,446 INFO Experiment.PerfCost-2b82: Processed 20 samples out of 50,0 errors
12:42:26,351 INFO Experiment.PerfCost-2b82: Processed 30 samples out of 50,0 errors
12:42:37,383 INFO Experiment.PerfCost-2b82: Processed 40 samples out of 50,0 errors
12:42:48,319 INFO Experiment.PerfCost-2b82: Processed 50 samples out of 50,0 errors
12:42:53,322 INFO Experiment.PerfCost-2b82: Mean 874.2798799999999, median 893.336, std 58.710030168835715, CV 6.7152443413012906
12:42:53,324 INFO Experiment.PerfCost-2b82: Parametric CI (Student's t-distribution) 0.95 from 857.4252767652275 to 891.1344832347723, within 1.9278269602604161% of mean
12:42:53,324 INFO Experiment.PerfCost-2b82: Non-parametric CI 0.95 from 885.337 to 918.449, within 1.853278049916267% of median
12:42:53,325 INFO Experiment.PerfCost-2b82: Parametric CI (Student's t-distribution) 0.99 from 851.802728396723 to 896.7570316032769, within 2.5709331894126377% of mean
12:42:53,325 INFO Experiment.PerfCost-2b82: Non-parametric CI 0.99 from 821.185 to 920.821, within 5.576625144402558% of median
```
- Inside each benchmark subdirectory, there will be a `perf-cost` subdirectory with JSON results for each experiment.
- After a few minutes, run `process.sh` to download the cloud billing results and generate the output. The waiting period is important because not every cloud provider publishes billing logs immediately after the invocation.
- Inside each benchmark subdirectory, there will be a `perf-cost` subdirectory with a `result.csv` file. This file contains a summary of the performance and cost data which we use for plotting and generating tables.