Replicated Block Device backed by FoundationDB

Primary language: Go. License: Apache License 2.0 (Apache-2.0).

FoundationDB Block Device

What is this

This is an implementation of a block device in userspace which uses FoundationDB as a backend. It provides a replicated block device for non-replicated workloads so they can benefit from transparent block-level replication and enhanced fault tolerance.

Inspired by spullara/nbd

Is it fast?

I ran a small benchmark against a 2-node FoundationDB cluster (Linux running on MacBooks with SSDs, not tuned for FDB at all). An FIO benchmark on a 1 GB file reached 10K random read/write IOPS with 4 KB blocks, with latency below 10 ms (direct I/O was used). Sequential reads were able to saturate a 1 Gbit network link.

Postgres running in VirtualBox showed 900 TPS on the TPC-B pgbench workload with a 1 GB database.

How does it work with concurrent volume mounts?

Currently, ownership of a volume is tracked with lease tokens. When a new client attaches, a FoundationDB transaction transfers ownership to it, and any in-flight write requests from the old client are discarded.

Current status

It's an early version. Several important features are not implemented yet (such as IOPS limits and volume size estimation), but it works and it's relatively fast!

How to use

Commands are documented in the CLI:

$ ./fdbbd --help
NAME:
   fdbbd - block device using FoundationDB as a backend. 
   Our motto: still more performant and reliable than EBS

USAGE:
   fdbbd [global options] command [command options] [arguments...]

VERSION:
   0.1.0

COMMANDS:
     create   Create a new volume
     list     List all volumes
     attach   Attach the volume
     delete   Delete the volume
     help, h  Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --help, -h     show help
   --version, -v  print the version

Getting started

  1. Set up a FoundationDB cluster.
  2. Build the driver:
$ sh build.sh
  3. Create a new volume:
$ ./fdbbd create --size 1GB myvolume
  4. If the nbd kernel module is not loaded, load it:
$ sudo modprobe nbd
  5. Attach the volume to the system:
$ sudo ./fdbbd attach --bpt 4 myvolume /dev/nbd0
  6. Create a directory to mount the volume:
$ mkdir nbdmount
  7. Create a file system on your block device. XFS is a good option:
$ sudo mkfs.xfs /dev/nbd0
  8. Mount the attached volume:
$ sudo mount /dev/nbd0 nbdmount/
  9. Done! You have a replicated volume!

What's inside

This project uses the Network Block Device (NBD) kernel module underneath. A Unix pipe is used to talk to the kernel, and the driver translates the NBD protocol into FoundationDB calls.

Roadmap

A few features are planned for future releases, ordered by importance:

  1. Bulk insert support via batch transactions
  2. IOPS isolation
  3. CSI implementation
  4. Snapshots
  5. Volume size estimation (using roaring bitmaps or similar)
  6. Client-side encryption
  7. Control panel