Rocksplicator is a set of C++ libraries and tools for building large-scale, RocksDB-based stateful services. Its goal is to help application developers solve common difficulties of building such services, such as data replication and cluster management. With Rocksplicator, application developers can focus on their application logic and do not need to deal with data replication or cluster management.
Rocksplicator includes:
- RocksDB replicator: a library for real-time RocksDB data replication. It supports three replication modes: async, semi-sync, and sync.
- Cluster management library and tool for RocksDB replicator-based stateful services
- Async fbthrift client pool and fbthrift request router
- A stats library for maintaining & reporting server stats
- A set of other small tool classes for building C++ services.
An introduction to Rocksplicator can be found in our presentation at the 2016 Annual RocksDB meetup at FB HQ and in our @Scale presentation (starting at 17:30).
Currently, we have 5 different online services based on Rocksplicator running at Pinterest. Together they comprise nearly 20 clusters and over 1600 hosts, and process tens of PB of data per day.
The third-party dependencies of Rocksplicator can be found in docker/Dockerfile.
Docker is used for building Rocksplicator. Follow the Docker installation instructions to get Docker running on your system.
You can build your own Docker image (useful if you want to change the Dockerfile and test it locally):
cd docker && docker build -t rocksplicator-build .
Or pull the image we uploaded:
docker pull angxu/rocksplicator-build:latest
Initialize the git submodules:
cd rocksplicator && git submodule update --init
Enter the Docker build environment. Replace <SOURCE-DIR> with the path to your rocksplicator checkout (we assume the repo is under $HOME/code/), and make sure $HOME/docker-root is an existing directory.
docker run -v <SOURCE-DIR>:/rocksplicator -v $HOME/docker-root:/root -ti angxu/rocksplicator-build:latest bash
Run the following command in the docker bash to build Rocksplicator:
cd /rocksplicator && mkdir -p build && cd build && cmake .. && make -j
Run the following command in the docker bash to build Rocksplicator and run the tests:
cd /rocksplicator && mkdir -p build && cd build && cmake .. && make -j && make test
There is an example counter service under examples/counter_service/, which demonstrates a typical usage pattern for RocksDB replicator.
The cluster management tool rocksdb_admin.py is under rocksdb_admin/tool/.
Before using the tool, generate the Python client code for the Admin interface as follows:
cd /rocksplicator/rocksdb_admin/tool/ && ./sync.sh
host_file is a text file listing all hosts in the cluster, one host per line, in the format "ip:port:zone". For example: "192.168.0.101:9090:us-east-1c".
python rocksdb_admin.py new_cluster_name config --host_file=./host_file --segment=test --shard_num=1000 --overwrite
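A malformed host_file entry is easier to catch before it reaches the config command. The following is a minimal standalone Python 3 sketch of such a check, based only on the "ip:port:zone" format described above. It is not part of rocksdb_admin.py; the names `Host`, `parse_host_line`, and `parse_host_file` are hypothetical, and skipping blank lines is an assumption rather than documented tool behavior. Splitting on ":" also assumes IPv4 addresses or hostnames.

```python
from typing import List, NamedTuple


class Host(NamedTuple):
    """One host_file entry (hypothetical helper type)."""
    ip: str
    port: int
    zone: str


def parse_host_line(line: str) -> Host:
    """Parse one "ip:port:zone" entry; raise ValueError if it is malformed."""
    parts = line.strip().split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected ip:port:zone, got %r" % line)
    ip, port, zone = parts
    # int() raises ValueError for a non-numeric port.
    return Host(ip, int(port), zone)


def parse_host_file(text: str) -> List[Host]:
    """Parse host_file contents, skipping blank lines (an assumption)."""
    return [parse_host_line(l) for l in text.splitlines() if l.strip()]
```

For example, `parse_host_file(open("host_file").read())` returns one `Host` per line, or raises `ValueError` naming the offending entry.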
python rocksdb_admin.py cluster_name ping
python rocksdb_admin.py cluster_name remove_host "ip:port:zone"
python rocksdb_admin.py cluster_name promote
python rocksdb_admin.py cluster_name add_host "ip:port:zone"
python rocksdb_admin.py cluster_name rebalance
python rocksdb_admin.py "cluster" load_sst "segment" "s3_bucket" "s3_prefix" --concurrency 64 --rate_limit_mb 64
python rocksdb_admin.py cluster_name remove_host old_ip:old_port:zone_a
python rocksdb_admin.py cluster_name promote
python rocksdb_admin.py cluster_name add_host new_ip:new_port:zone_a
python rocksdb_admin.py cluster_name rebalance
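The last four commands above read as a host-replacement sequence: remove the old host, promote, add the new host in the same zone, then rebalance. The source does not label them, so that interpretation is an assumption. Under that assumption, here is a small Python 3 sketch that assembles the same four commands for a given cluster and host pair, so they can be reviewed before being run. The helper names `replace_host_commands` and `format_command` are hypothetical; only the subcommands shown above are used.

```python
import shlex
from typing import List


def replace_host_commands(cluster: str, old_host: str,
                          new_host: str) -> List[List[str]]:
    """Build the four rocksdb_admin.py invocations, in the order shown above."""
    tool = ["python", "rocksdb_admin.py", cluster]
    return [
        tool + ["remove_host", old_host],
        tool + ["promote"],
        tool + ["add_host", new_host],
        tool + ["rebalance"],
    ]


def format_command(cmd: List[str]) -> str:
    """Render an argument list as a shell-safe command line for review."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

Printing `format_command(c)` for each returned command gives a dry-run listing. Running them (for instance with `subprocess.run(c, check=True)` from rocksdb_admin/tool/) is left to the operator.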