CapsuleDB is a key-value store built specifically for use with the Global Data Plane and Paranoid Stateful Lambda system. It is inspired by level-tree databases, such as RocksDB, but uses DataCapsules as its backing storage medium. Read the full description of CapsuleDB here. This repo serves as the main development point for future CapsuleDB work.
CapsuleDB uses Bazel as its main building tool like the PSL project. We have a Docker container that includes everything you need to get started. Using the Docker container is highly recommended, and if you decide to go baremetal you are on your own.
To begin, run the following two commands:
MY_PROJECT=/path/to/repo
docker run -it --rm \
-v bazel-cache:/root/.cache/bazel \
-v "${MY_PROJECT}":/opt/my-project \
-w /opt/my-project \
keplerc/paranoid-asylo:latest
Given that you will be running these two commands every time you start working on CapsuleDB, I would recommend writing a script. The commands should result in you being dropped into an interactive session within the Docker container. Specifically, you should see the following:
root@<container_hash>:/opt/my-project#
At this point, executing the ls
command should list the folders in the repo. From here, you can execute Bazel commands as normal. To ensure everything is working correctly, start with the following basic sanity check:
bazel run //src:test_core
This should result in 1999 key-value pairs being written and retrieved from CapsuleDB. The stats at the end should read as:
Num levels at end: 3
First key l0: 1901
Last key l0: 1950
L0 max size: 100
First key l1: 1801
Last key l1: 999
L1 max size: 1000
First key l2: 1
Last key l2: 188
L2 max size: 10000
no.of.keys not found is:0
no.of.keys found is:1999
size of test_map: 1999
Congratulations! You have just executed your first puts and gets against CapsuleDB!
CapsuleDB is equipped with an enclave interface, allowing it to be securely run within an Intel SGX enclave. It is built using Asylo, so that theoretically, once Asylo supports other trusted execution environment backends, it will be immediately portable. However, Asylo only supports Intel SGX at the moment, and thus so do we.
This codebase contains two slightly different enclave based tests, test_enclave_basic.cc
and test_enclave_interface.cc
, both of which can be found in the test_files
directory. test_enclave_basic.cc
tests the basic functionality of the enclave itself, linking directly to it. In contrast, test_enclave_interface.cc
goes through a separate interface designed for better modularity. They can be run with the following commands:
bazel run //src:test_enclave_basic_sgx_sim --
bazel run //src:test_enclave_interface_sgx_sim --
If SGX hardware is available, you can use the following commands instead:
bazel run //src:test_enclave_basic_sgx_hw --
bazel run //src:test_enclave_interface_sgx_hw --
Some guidance on writing new enclave tests: To use the interface, you must use the provided constructor and the finalize()
method. In addition, to use the interface you must mark //src/enclave:enclave_interface
as a dependency in the relevant BUILD
file. Then, ensure you have all the correct Asylo and SGX dependencies. Finally, ensure you include the relevant abseil functions at the top of your test, as well as the relevant absl
dependencies, so that your test correctly retrives the enclave name. See the test_enclave_interface.cc
file and the test_enclave_interface
target in src/BUILD
for an example.
Note that the enclave interface differs from the way the enclave is implemented in the PSL project. There, a function called handle()
manages the actual processing of the CapsulePDUs. While also doable here, CapsuleDB's current model does not require it. Given that it is unclear whether there are any additional performance benefits, I did not implement it at this time.
Until this point, CapsuleDB and its associated tests have essentially been run as an application, it runs, finishes, and exits. However, databases rarely operate that way, instead running as a server or service. Using ZeroMQ, CapsuleDB can be run as a server, receiving and processing requests regardless of their source. You can run the basic networking test using two different terminals using the following commands:
MY_PROJECT=/path/to/capsuledb_repo
docker run -it --rm \
-v bazel-cache:/root/.cache/bazel \
-v "${MY_PROJECT}":/opt/my-project \
-w /opt/my-project \
keplerc/paranoid-asylo:latest
You should now be in an interactive session within the container. In a separate terminal window, execute docker ps
to find the name of the container. In your second terminal, run docker exec -it <container_name> bash
to gain a second terminal within the container.
In your first terminal window, you will run the server. Run the following command:
bazel run //src/network:capsuleDB_server
Then, in your second terminal, execute the test with:
bazel run //src:test_network
If it is successful, you should see the following result:
Connecting to hello world server...
Connected
Sending payload
Start Get
Sending get request
Receiving get request
Retrieved value: Test value
Switching back to your first terminal, you will also be able to monitor and see CapsuleDB handling the incoming requests. When running the test, you should see the following in your server window:
Starting server
Server ready
Received new message
Received put request
Payload key: Test key and value: Test value
PUT key=Test key, value=Test value
Received new message
Received get request
GET key=Test key
To empty the database, simply stop the process in the server window and start it again with the same bazel run command. If you would like to test using an enclave version of CapsuleDB, start the server with the following command instead:
bazel run //src/network:capsuleDB_enclave_server
This server should also be used for YCSB benchmarking; instructions and bindings can be found here.
CapsuleDB is made up of several core files stored in src/core
. There you will find the main CapsuleDB logic, including the main engine file, the indexing logic, and the block logic. You will also come across several BUILD
files used to identify different Bazel packages. For more information on packages, I highly recommend you look through this part of the Bazel docs.
In addition to the core files, you will see two other types of folder. The first is the src/shared
folder which contains several important files that are used repeatedly across CapsuleDB. For example, src/shared/capsule.h
holds the kvs_payload
structure. That folder already has a BUILD
file with targets for each of the shared files ready to go, all you need to do is add //src/shared:desiredtarget
to the deps
section of your target. The other type of folder is a feature folder, such as src/network
or src/enclave
. This folders represent subpackages that extend the basic CapsuleDB functionality. They must have their own BUILD files.
You may also notice a BUILD
file in the root of the src
directory. This file should only be used to write tests. Place your tests in the src/test_files
directory and then use this BUILD
file to build them. The reasoning behind this is that integration tests will be crossing packages so often it makes more sense to have the BUILD
file at a higher level. Frankly I am not sure this is actually the best practice, but the Bazel docs are so convoluted I am not sure they know what the best practice is either.
Since there are many of us working on CapsuleDB simultaneously, please make sure you are doing your work on the correct branch! For large additions, please follow the naming convention of dev-featurename
for your branch. Then, place all of your additional code in a new folder (for an example see the network and enclave folder) along with a BUILD file. Finally, submit a pull request when it is ready to go!
There is lots to do on CapsuleDB! Here is a short list of things that still needs to be dealt with, in order of priority.
- Integrate DC server
- Integrate signing
- Improve block search algorithm (binary search?) for reads
- Investigate Bloom Filters
- Update serialization so it does not use a counter
- Remove CapsuleDB implementation currently in PSL and convert to either Git submodule or use a Bazel load targeting this repo / version (hooray for modularity! 🎉)
- Update README with description of how CapsuleDB actually works under the hood
- Explore adding a wiki for CapsuleDB
- Add option to compress CapsuleBlocks by using block write time as timestamp
- Add failure restoration / startup from index
- Add key deletion (must be local, no mechanism to do so from DC yet)
Lead: William Mullen
Contributors:
- Nivedha Krishnakumar
- Brian Wu
- Willis Wang
- Ryan Teoh