🛠️ Realistic benchmark for key value stores

FMKe

FMKe is an extensible, real-world benchmark for distributed key-value stores.
This repository contains code for the application server and a set of scripts for orchestrating deployment and local execution of micro-benchmarks.

Why?

Here is how FMKe compares with the existing benchmark specifications that we analyzed:

| Benchmark | Target Systems         | Workload type  |
|-----------|------------------------|----------------|
| TPC-C     | SQL-Based databases ❌ | **realistic ✔️ |
| TPC-E     | SQL-Based databases ❌ | **realistic ✔️ |
| YCSB      | Key-value stores ✔️    | synthetic ❌   |
| FMKe      | Key-value stores ✔️    | **realistic ✔️ |

** Emulates real application patterns

Backing the realistic claims

FMKe was one of the final contributions of the SyncFree European research project. It was designed to benchmark the project's reference platform, AntidoteDB, by closely emulating a realistic application. One of the industrial partners of the project, Trifork, provided statistical data about Fælles Medicinkort (FMK), a subsystem related to the Danish National Joint Medicine Card. The real system is backed by a distributed key-value store to ensure high availability, which validates the decision to use it as the basis of a benchmark (originally) for AntidoteDB.

System description

Both the real-world FMK system and FMKe are designed to store patient health data, mostly revolving around medical prescriptions. Here is the ER diagram:

[Figure: ER diagram of the FMKe data model]

There are 4 core entities: treatment facilities, patients, pharmacies, and medical staff. Other records appear as relations between these entities, and the workload focuses heavily on prescription records. More information about the system operations and data model can be found in this document.

Architecture

[Figure: FMKe architecture diagram]

Consider FMKe as a general application server that contains the logic mimicking the real FMK system. We decided not to release FMKe as a single monolithic application, since there are multiple benefits in separating the benchmark into 3 components: the workload generator, the application server, and the database.
First, keeping the application server separate from the workload generation component means we do not have to reinvent the wheel, since many good workload generation tools already exist. Second, making the application logic independent of the database allows for collaboration in supporting a broader set of data stores.
We have a generic interface for key-value stores (implemented as an Erlang behaviour) that is well specified, which makes supporting a new database as simple as writing a driver for it. Furthermore, pull requests with new drivers or optimizations for existing ones are accepted and welcomed.
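
To illustrate the idea of a driver behaviour, here is a minimal sketch. It is not FMKe's actual interface: the module name `kv_driver_sketch`, the callbacks `open/1`, `write/3` and `read/2`, and the helper `store_patient/4` are all hypothetical, and the in-memory map only stands in for a real data store. In a real project the behaviour and each driver would normally live in separate modules, with the driver declaring `-behaviour(...)`.

```erlang
%% Hypothetical sketch: none of these names come from the FMKe code base.
-module(kv_driver_sketch).

-export([open/1, write/3, read/2, store_patient/4, demo/0]).

%% Callbacks that a store-specific driver would have to implement.
-callback open(Opts :: list()) -> {ok, State :: term()} | {error, term()}.
-callback write(Key :: binary(), Value :: term(), State :: term()) ->
    {ok, NewState :: term()} | {error, term()}.
-callback read(Key :: binary(), State :: term()) ->
    {ok, Value :: term()} | {error, not_found}.

%% Example "driver": an in-memory map standing in for a real database.
open(_Opts) ->
    {ok, #{}}.

write(Key, Value, Store) ->
    {ok, maps:put(Key, Value, Store)}.

read(Key, Store) ->
    case maps:find(Key, Store) of
        {ok, Value} -> {ok, Value};
        error -> {error, not_found}
    end.

%% Application logic only calls the callbacks through the Driver module,
%% so it does not depend on any concrete data store.
store_patient(Driver, Id, Name, State) ->
    Driver:write(<<"patient_", Id/binary>>, #{name => Name}, State).

%% Stores one patient through the sketch driver and reads it back.
demo() ->
    Driver = ?MODULE,
    {ok, S0} = Driver:open([]),
    {ok, S1} = store_patient(Driver, <<"1">>, <<"Alice">>, S0),
    Driver:read(<<"patient_1">>, S1).
```

Because `store_patient/4` receives the driver module as an argument, swapping the data store in this sketch means passing a different module that exports the same three callbacks.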

Supported data stores

  • AntidoteDB (the SQL-like interface offered by AQL is also supported)
  • Cassandra
  • Redis
  • Riak KV

Note about AQL schema:

When running the benchmark to evaluate the performance of AQL, you have two options regarding the database schema. The file priv/build_schema.aql creates the tables without foreign keys, so the referential integrity mechanism of AQL is not used. To use the referential integrity mechanism, use the file priv/build_schema_fk.aql instead, which creates the tables with foreign keys.

How the benchmark is deployed

By default FMKe keeps a connection pool to a single database node, and the workload generation is performed by Lasp Bench.
To benchmark clustered databases with n nodes, n FMKe instances can be deployed, or alternatively one FMKe node can connect to multiple nodes (the exact number is dependent on the connection pool size).
To avoid network and CPU bottlenecks that could impact the result of the benchmark, we advise running each component on a separate server. That said, a number of scripts are available for development that enable local execution of micro-benchmarks.

Use case: AntidoteDB evaluation

FMKe was used in January 2017 to evaluate the performance of AntidoteDB. The evaluation took place on Amazon Web Services using m3.xlarge instances, which have 4 vCPUs, 15 GB RAM and 2x40 GB SSD storage.
The biggest test case used 36 AntidoteDB instances spread across 3 data centers (Germany, Ireland and United States), 9 instances of FMKe and 18 instances of Lasp Bench (formerly Basho Bench) that simulated 1024 concurrent clients performing operations as quickly as possible.
Before the benchmark, AntidoteDB was populated with over 1 million patient keys, 50 hospitals, 10,000 doctors and 300 pharmacies.

Testing out FMKe locally

FMKe requires Erlang/OTP and rebar3. You need at least Erlang 20; FMKe will not compile with earlier versions.
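
One way to check the version requirement from within Erlang is sketched below. The module name `otp_check` and its functions are hypothetical and not part of FMKe; the sketch relies only on the standard call `erlang:system_info(otp_release)`, which returns the release as a string, and assumes that pre-17 release names such as "R16B03" should simply count as unsupported.

```erlang
%% Hypothetical helper, not part of FMKe.
-module(otp_check).

-export([supported/0, supported/1]).

%% Checks the release of the runtime that is executing this code.
supported() ->
    supported(erlang:system_info(otp_release)).

%% Returns true when the release string starts with a major version >= 20.
%% Older names such as "R16B03" do not start with an integer, so
%% string:to_integer/1 returns {error, no_integer} and we report false.
supported(Release) ->
    case string:to_integer(Release) of
        {Major, _Rest} when is_integer(Major) -> Major >= 20;
        {error, _Reason} -> false
    end.
```

After compiling the module, calling `otp_check:supported().` in the Erlang shell reports whether the running release meets the minimum stated above.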

Please check the wiki for detailed instructions on how to run FMKe with a particular database.