PlumDB

[[TOC]]

PlumDB is a globally replicated database using eventual consistency. It uses Epidemic Broadcast Trees and lasp-lang’s Partisan, an alternative runtime system for improved scalability and reduced latency in distributed Erlang applications.

It is an offspring of Helium's Plumtree – a descendant of Riak Core's Metadata Store – and Partisan.

The original Plumtree project was the result of extracting the Metadata Store from Riak Core and replacing the cluster membership state by an ORSWOT CRDT.

PlumDB builds on top of Plumtree but changes its architecture offering additional features.

Feature	PlumDB	Plumtree
Cluster membership state	Partisan's membership state which uses an AWSET	ORSWOT (riak_dt)
Data model	Riak Core Metadata (dvvset)	Riak Core Metadata (dvvset)
Persistence	leveldb. A key is sharded across `N` instances of a store. Stores can be in-memory (`ets`), on disk (Rocksdb) or both. `N` is configurable at deployment time.	Each prefix has its own ets and dets table.
API	A simplification of the Riak Core Metadata API. A single function to iterate over the whole database i.e. across one or all shards and across a single or many prefixes.	Riak Core Metadata API (`plumtree_metadata_manager`) is used to iterate over prefixes whereas `plumtree_metadata` is used to iterate over keys within each prefix. The API is confusing and is the result of having a store (ets + dets) per prefix.
Active anti-entropy	Based on Riak Core Metadata AAE, uses a separate instance of leveldb to store a merkle tree on disk. Updated to use the new API and gen_statem	Based on Riak Core Metadata AAE, uses a separate instance of leveldb to store a merkle tree on disk.
Pubsub	Based on a combination of gen_event and gproc, allowing to register a Callback module or function to be executed when an event is generated. gproc dependency allows to pattern match events using a match spec	Based on gen_event, allowing to register a Callback module or function to be executed when an event is generated

Installation

You will use PlumDB as a dependency in your Erlang application.

Requirements

Erlang 24 or later
Rebar3 3.17.0 or later
Snappy
lz4

Configuration

PlumDB is configured using the standard Erlang sys.config.

The following is an example configuration:

{plum_db, [
    {aae_enabled, true},
    {store_open_retries_delay, 2000},
    {store_open_retry_Limit, 30},
    {data_exchange_timeout, 60000},
    {hashtree_timer, 10000},
    {data_dir, "data"},
    {partitions, 8},
    {prefixes, [
        {foo, ram},
        {bar, ram_disk},
        {<<"baz">>, disk}
    ]}
]}

partitions (integer) – the number of shards.
prefixes – a list of {Prefix, prefix_type()}
- Prefix is a user defined atom or binary
- prefix_type() is one of ram, ram_disk and disk.
aae_enabled (boolean) – whether the Active Anti-Entropy mechanism is enabled.
store_open_retries_delay (milliseconds) – controls thre underlying disk store (leveldb) delay between open retries.
store_open_retry_Limit (integer) – controls thre underlying disk store (leveldb) open retry limit
data_exchange_timeout (milliseconds) – the timeout for the AAE workers
hashtree_timer (seconds) –

At the moment additional configuration is required for Partisan and Plumtree dependencies:

{partisan, [
    {peer_port, 18086}, % port for inter-node communication
    {parallelism, 4} % number of tcp connections
]}

{plumtree, [
    {broadcast_exchange_timer, 60000} % Perform AAE exchange every 1 min.
]}

Usage

Learn more by reading the source code Documentation.

Standalone testing

We have three rebar3 release profiles that you can use for testing PlumDB itself.

Running a 3-node cluster

To run a three node cluster do the following in three separate shells.

In shell #1:

$ rebar3 as dev1 run

In shell #2:

$ rebar3 as dev2 run

In shell #3:

$ rebar3 as dev3 run

Make node 2 and 3 join node 1

In node #2:

> {ok, Peer} = partisan:node_spec('plum_db1@127.0.0.1', {{127,0,0,1}, 18086}).
> partisan_peer_service:join(Peer).

In node #3:

> {ok, Peer} = partisan:node_spec('plum_db1@127.0.0.1', {{127,0,0,1}, 18086}).
> partisan_peer_service:join(Peer).

Check that the other two nodes are visible in each node

In node #1:

> partisan_peer_service:members().
{ok,['plum_db3@127.0.0.1','plum_db2@127.0.0.1']}

In node #2:

> partisan_peer_service:members().
{ok,['plum_db3@127.0.0.1','plum_db1@127.0.0.1']}

In node #3:

> partisan_peer_service:members().
{ok,['plum_db2@127.0.0.1','plum_db1@127.0.0.1']}

In node #1:

> [plum_db:put({foo, a}, x, 1).
ok

In node #2:

> plum_db:put({foo, a}, y, 2).
ok

In node #3:

> plum_db:put({foo, a}, z, 3).
ok

Do the following on each node to check they now all have the three elements:

> plum_db:fold(fun(Tuple, Acc) -> [Tuple|Acc] end, [], {'_', '_'}).
[{x,1},{y,2},{z,3}]

We are folding over the whole database (all shards) using the full prefix wildcard {'_', '_'}.

The following are examples of prefix wildcards:

{'_', '_'} - matches all full prefixes
{foo, '_'} - matches all subprefixes of Prefix foo
{foo, x} - matches the subprefix x of prefix foo

Notice that the pattern {'_', bar} is NOT allowed.

Leapsight/plum_db

PlumDB

Installation

Requirements

Configuration

Usage

Standalone testing

Running a 3-node cluster