Transient file system service featuring highly parallel indexing on both file data and file system metadata.
This software was developed, in part, under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy/National Nuclear Security Administration. Please see the accompanying LICENSE.txt for further information. DeltaFS is still under development.
- Serverless design featuring zero dedicated metadata servers and no global file system namespace.
- Application-owned metadata service harnessing compute nodes to handle metadata and achieve highly agile scalability.
- Freedom from unjustified synchronization among HPC applications that do not need to use the file system to communicate.
- Write-optimized LSM-based metadata representation with file system namespace snapshots as the basis of inter-job data sharing and workflow execution.
- A special directory type with an embedded stripped-down streaming Map-Reduce pipeline.
- A file system as no more than a thin service composed by each application at runtime to provide a temporary view of a private namespace backed by a stack of immutable snapshots and a collection of shared data objects.
- Simplified data center storage consisting of multiple independent underlying object stores, providing flat namespaces of data objects, and oblivious of file system semantics.
DeltaFS runs on Linux, Mac OS, and most UNIX platforms for development and local testing. Production deployments require Linux. DeltaFS is mostly written in C++. C++11 is not required to compile the DeltaFS code, but is used when the compiler supports it. C++14 and later are currently not used.
Our paper deltafs_pdsw15 provides an overview of the file system, and our other two papers, deltafs_pdsw17 and deltafs_sc18, provide an overview of the Indexed Massive Directory in DeltaFS.
Compiling DeltaFS requires a recent C/C++ compiler, cmake, make, mpi, snappy, glog, and gflags. Compiling some of DeltaFS' dependencies requires a recent autoconf, automake, and libtool.
On Ubuntu 14.04 LTS or later, you may use the following to prepare the system environment for DeltaFS.
sudo apt-get install gcc g++ make # Alternatively, this can also be clang
sudo apt-get install autoconf automake libtool pkg-config
sudo apt-get install cmake cmake-curses-gui
sudo apt-get install libsnappy-dev libgflags-dev libgoogle-glog-dev
sudo apt-get install libmpich-dev # Alternatively, this can also be libopenmpi-dev
sudo apt-get install mpich
For Mac OS, see README.mac.md for instructions.
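After installing the packages, a quick check along the following lines can confirm that the build tools are on the PATH. This is a minimal sketch, not part of DeltaFS: `missing_tools` is a hypothetical helper, `mpicc` is assumed to be the name of the MPI compiler wrapper, and the libraries (snappy, glog, gflags) are not checked.

```shell
# Print each named tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools taken from the requirements above; mpicc is an assumed wrapper name.
missing=$(missing_tools gcc g++ make cmake autoconf automake pkg-config mpicc)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All build tools found"
fi
```

If clang is used instead of gcc, replace `gcc g++` with `clang clang++` in the list.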
DeltaFS assumes an underlying object storage service to store file system metadata and file data. This underlying object store can simply be a shared parallel file system such as Lustre, GPFS, PanFS, or HDFS. For high performance, however, we suggest a scalable object storage service; DeltaFS currently supports Ceph RADOS.
On Ubuntu 14.04 LTS or later, RADOS can be installed via apt-get.
sudo apt-get install librados-dev
Distributed DeltaFS instances require an RPC library to communicate with each other. We currently use Mercury, which supports multiple network backends, such as MPI, bmi on tcp, and cci on a variety of underlying network abstractions including verbs, tcp, sock, and raw eth.
Please follow the online Mercury documentation to install Mercury and one or more of its backends. To start, we suggest using bmi as the network backend. Compiling Mercury may also require installing openpa, depending on whether <stdatomic.h> is present.
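One way to find out whether OpenPA is likely to be needed is to ask the C compiler to preprocess a file that includes <stdatomic.h>. The sketch below is an assumption about how to probe for the header, not Mercury's own detection logic; `has_stdatomic` is a hypothetical helper, and the compiler defaults to `cc` unless `CC` is set.

```shell
# Return 0 if the C compiler can find <stdatomic.h>, non-zero otherwise.
has_stdatomic() {
  printf '#include <stdatomic.h>\n' | "${CC:-cc}" -E -x c - >/dev/null 2>&1
}

if has_stdatomic; then
  echo "stdatomic.h found: OpenPA is likely unnecessary"
else
  echo "stdatomic.h not found: install OpenPA before Mercury"
fi
```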
# BMI
git clone http://git.mcs.anl.gov/bmi.git && cd bmi
./prepare && ./configure --enable-shared --enable-bmi-only
make && sudo make install
# OpenPA -- only needed when <stdatomic.h> is absent
git clone https://github.com/pmodels/openpa.git && cd openpa
./autogen.sh && ./configure --enable-shared
make && sudo make install
# Mercury
git clone --recurse-submodules https://github.com/mercury-hpc/mercury.git && cd mercury
mkdir build && cd build
cmake -DBUILD_SHARED_LIBS=ON \
-DMERCURY_USE_CHECKSUMS=OFF -DNA_USE_BMI=ON ..
make && sudo make install
After all software dependencies are installed, we can proceed to build DeltaFS. DeltaFS uses cmake, and we suggest an out-of-source build. To do that, create a dedicated build directory and run the 'ccmake' command from it:
cd deltafs
mkdir build
cd build
ccmake -DDELTAFS_COMMON_INTREE=ON ..
Type 'c' multiple times and choose suitable options. Recommended options are:
BUILD_SHARED_LIBS ON
BUILD_TESTS ON ## <-- turn this off to skip building tests
CMAKE_BUILD_TYPE RelWithDebInfo
CMAKE_INSTALL_PREFIX /usr/local
CMAKE_PREFIX_PATH ## <-- this can be empty
DELTAFS_BENCHMARKS OFF
DELTAFS_COMMON_INTREE ON ## <-- this must be ON
DELTAFS_MPI ON ## <-- this must be ON
PDLFS_GFLAGS ON
PDLFS_GLOG ON
PDLFS_MARGO_RPC OFF
PDLFS_MERCURY_RPC ON
PDLFS_RADOS ON ## <-- only if deltafs needs to run on rados
PDLFS_SNAPPY ON
PDLFS_VERBOSE 1
Once you exit the CMake configuration screen and are ready to build the targets, do:
make
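For scripted builds where the interactive ccmake screen is inconvenient, the recommended options above can be passed to plain `cmake` as `-D` arguments. The following is a minimal sketch under that assumption: `deltafs_cmake_args` is a hypothetical helper, and PDLFS_RADOS defaults to OFF here because the list above enables it only when DeltaFS needs to run on RADOS.

```shell
# Print the recommended options as cmake -D arguments, one per line.
# Optional first argument: ON to enable RADOS support (default OFF).
deltafs_cmake_args() {
  cat <<EOF
-DBUILD_SHARED_LIBS=ON
-DBUILD_TESTS=ON
-DCMAKE_BUILD_TYPE=RelWithDebInfo
-DCMAKE_INSTALL_PREFIX=/usr/local
-DDELTAFS_BENCHMARKS=OFF
-DDELTAFS_COMMON_INTREE=ON
-DDELTAFS_MPI=ON
-DPDLFS_GFLAGS=ON
-DPDLFS_GLOG=ON
-DPDLFS_MARGO_RPC=OFF
-DPDLFS_MERCURY_RPC=ON
-DPDLFS_RADOS=${1:-OFF}
-DPDLFS_SNAPPY=ON
-DPDLFS_VERBOSE=1
EOF
}

# Run from the build directory:
#   cmake $(deltafs_cmake_args) .. && make
```

Pass `ON` as the first argument (`deltafs_cmake_args ON`) to enable RADOS support.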
To test DeltaFS on a local machine using the local file system to store file system metadata and file data, we can run two DeltaFS server instances and then use a DeltaFS shell to access the namespace. See README.rados.md for testing DeltaFS against Ceph RADOS.
mpirun -n 2 ./build/src/server/deltafs-srvr -v=1 -logtostderr
This will start two DeltaFS server instances that store file system metadata in /tmp/deltafs_outputs and file data in /tmp/deltafs_data. Please remove these two folders if they exist before running DeltaFS. The two DeltaFS server instances will begin listening on TCP ports 10101 and 10102.
env "DELTAFS_MetadataSrvAddrs=127.0.0.1:10101&127.0.0.1:10102" "DELTAFS_NumOfMetadataSrvs=2" \
./build/src/cmds/deltafs-shell -v=1 -logtostderr
This will start a DeltaFS shell and instruct it to connect to the DeltaFS servers we previously started. Currently, this is just a simple shell that allows us to create directories, copy files from the local file system to DeltaFS, and cat files in DeltaFS.
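The value of DELTAFS_MetadataSrvAddrs in the command above is a list of host:port pairs joined by '&'. A small helper can build that string for a given number of servers. This is a sketch only: `deltafs_srv_addrs` is a hypothetical function, and it assumes that servers on one host listen on consecutive ports starting at a base port, which the source shows only for the two-server case.

```shell
# Join COUNT addresses of the form HOST:PORT with '&',
# using consecutive ports starting at BASE_PORT (an assumption).
deltafs_srv_addrs() {
  host=$1 base_port=$2 count=$3
  addrs=""
  i=0
  while [ "$i" -lt "$count" ]; do
    addrs="${addrs}${addrs:+&}${host}:$((base_port + i))"
    i=$((i + 1))
  done
  printf '%s\n' "$addrs"
}

deltafs_srv_addrs 127.0.0.1 10101 2
# prints 127.0.0.1:10101&127.0.0.1:10102
```

The result can then be placed in the environment, for example `env "DELTAFS_MetadataSrvAddrs=$(deltafs_srv_addrs 127.0.0.1 10101 2)" "DELTAFS_NumOfMetadataSrvs=2" ./build/src/cmds/deltafs-shell -v=1 -logtostderr`.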
Currently, applications have to explicitly link to the DeltaFS user library (include/deltafs_api.h) in order to call DeltaFS. Alternatively, DeltaFS may be invoked implicitly by a preload library that intercepts the file system calls made by an application and redirects them to DeltaFS. We have developed one such library, available at https://github.com/pdlfs/pdlfs-preload.