This repository is for developer documentation related to various VELOC / SCR components.
All open issues for the components can be viewed on the Components Project Board.
Repo | Version | Docs? | Testing? |
---|---|---|---|
KVTree | ? | ? | ? |
AXL | ? | ? | ? |
spath | ? | ? | ? |
filo | ? | ? | ? |
shuffile | ? | ? | ? |
redset | ? | ? | ? |
er | ? | ? | ? |
KVTree: Recursive key-value structure
Documentation:
Each KVTree object contains a list of key/value pairs. Each key is a string, and each value is another kvtree object. This makes it a nested data structure, similar to a Python dict or Perl hash. The library provides functions to serialize a kvtree object to and from a file. It also optionally provides MPI send / recv functions to transfer an object from one process to another.
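To make the nesting concrete, here is a minimal Python sketch of the same idea. It is not the KVTree API: the helper `kv_set`, the key names, and the use of JSON as a file format are all assumptions made for illustration.

```python
import json

def kv_set(tree, *keys):
    """Walk the given keys, creating empty subtrees as needed.

    Returns the innermost subtree. kv_set is a hypothetical helper,
    not a function from the KVTree library.
    """
    node = tree
    for key in keys:
        node = node.setdefault(str(key), {})
    return node

# Every value is itself a tree; a "leaf" is a key whose subtree is empty.
tree = {}
kv_set(tree, "RANK", 0, "FILE", "ckpt.0.dat")
kv_set(tree, "RANK", 1, "FILE", "ckpt.1.dat")

# JSON stands in for the library's own serialization format here.
encoded = json.dumps(tree, sort_keys=True)
decoded = json.loads(encoded)
```

Because keys are always strings and values are always trees, the round trip through a text format loses nothing, which is what makes this shape convenient to write to a file or send between processes.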
spath: represent and manipulate file system paths
Documentation:
Create an spath object from a string.
The library includes functions to extract components (such as dirname, basename).
It can create an absolute path or compute a relative path from a source path to a destination path.
It can also simplify a path (e.g., convert `../foo//bar` to `foo/bar`).
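The operations listed above have close analogues in Python's standard `posixpath` module, which is used below purely as a stand-in. None of these calls are spath functions, the example paths are made up, and spath's simplification rules may differ from `normpath` in edge cases.

```python
import posixpath

path = "/scratch/run1//ckpt/../ckpt.3/rank_0.dat"

# Simplify: collapse repeated slashes and resolve ".." components.
simple = posixpath.normpath(path)

# Extract components.
directory = posixpath.dirname(simple)
filename = posixpath.basename(simple)

# Relative path from a source directory to a destination directory.
relative = posixpath.relpath("/scratch/run1/ckpt.3", start="/scratch/run1/cache")

# Build an absolute path by prepending a base directory.
absolute = posixpath.join("/scratch/run1", "ckpt.3/rank_0.dat")
```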
rankstr: split processes into groups in which all members have the same input string
Rankstr uses a bitonic sort as a scalable way to identify process groups. One use is to create a communicator of ranks that all share the same storage device; rank 0 of that communicator can then create a directory and use a barrier to tell the other ranks that it exists. It is also used to split processes by failure group (for example, a failure group of NODE splits MPI_COMM_WORLD into subgroups based on hostname).
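The grouping itself can be sketched without MPI. The function below is a single-process illustration, assuming the strings of all ranks are available in one list; it uses Python's built-in sort where rankstr uses a distributed bitonic sort, and `split_by_string` is a hypothetical name, not part of the rankstr API.

```python
def split_by_string(strings):
    """strings[r] is the input string of rank r.

    Returns one (group_id, group_rank, group_size) tuple per rank.
    """
    # Sorting (string, rank) pairs makes ranks with equal strings adjacent.
    ordered = sorted((s, r) for r, s in enumerate(strings))

    members = []      # members[g] lists the ranks in group g
    previous = None
    for s, r in ordered:
        if s != previous:
            members.append([])   # a new string starts a new group
            previous = s
        members[-1].append(r)

    result = [None] * len(strings)
    for group_id, ranks in enumerate(members):
        for group_rank, r in enumerate(ranks):
            result[r] = (group_id, group_rank, len(ranks))
    return result

# Example: five ranks keyed by hostname, as in a NODE failure group.
hostnames = ["node2", "node1", "node2", "node1", "node3"]
groups = split_by_string(hostnames)

# The rank with group_rank 0 in each group could create a shared directory.
leaders = [r for r, (_, group_rank, _) in enumerate(groups) if group_rank == 0]
```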
AXL: Asynchronous transfer library
Documentation:
AXL transfers a file from one path to another using synchronous or asynchronous methods. Transfers are only supported between storage tiers; AXL does not (yet) support movement within a storage tier (such as between two compute nodes). Asynchronous methods include pthreads, the IBM BB API, and Cray DataWarp. AXL creates directories for destination files.
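The difference between the synchronous and asynchronous styles can be sketched with Python threads, loosely analogous to the pthreads method. This is not the AXL API: `copy_file`, `transfer_sync`, `transfer_async`, and the paths in `demo` are all hypothetical names chosen for this example.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor

def copy_file(src, dst):
    """Copy one file, creating the destination directory if needed."""
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    shutil.copyfile(src, dst)
    return dst

def transfer_sync(pairs):
    """Block until every (src, dst) pair has been copied."""
    return [copy_file(src, dst) for src, dst in pairs]

def transfer_async(pairs, pool):
    """Start the copies on worker threads and return handles to wait on."""
    return [pool.submit(copy_file, src, dst) for src, dst in pairs]

def demo():
    """Copy one file from a 'cache' tier to a 'pfs' tier asynchronously."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "cache", "ckpt.dat")
        os.makedirs(os.path.dirname(src))
        with open(src, "wb") as f:
            f.write(b"checkpoint bytes")
        pairs = [(src, os.path.join(root, "pfs", "run1", "ckpt.dat"))]
        with ThreadPoolExecutor(max_workers=2) as pool:
            handles = transfer_async(pairs, pool)
            # The caller is free to do other work here before waiting.
            copied = [h.result() for h in handles]
        with open(copied[0], "rb") as f:
            return f.read()
```

The asynchronous form returns immediately with handles, so an application can resume computing while the copy proceeds and only wait when it needs the transfer to be complete.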
FILO: File flush and fetch, coordinating file transfers with MPI
Documentation:
Each process in a communicator registers a list of source and destination paths. FILO then computes the union of destination directories and creates them in advance, using minimal mkdir() calls. It executes AXL transfers, optionally using a sliding window for flow control. It records an ownership map of which rank flushed which file (in the rank2file file). This map is used to fetch those files back to their owner ranks during a restart.
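The two bookkeeping steps described above, the directory union and the ownership map, can be sketched in a single process as follows. This assumes every rank's file list is visible in one place, which FILO instead achieves with MPI. The function names and paths are hypothetical, and the in-memory dictionary only stands in for the rank2file file, whose real format is not shown here.

```python
import os

def destination_dirs(per_rank_files):
    """Union of destination directories across all ranks, each listed once.

    per_rank_files[r] is rank r's list of (src, dst) pairs.
    """
    return sorted({os.path.dirname(dst)
                   for files in per_rank_files
                   for _, dst in files})

def ownership_map(per_rank_files):
    """Map each destination file to the rank that flushed it."""
    return {dst: rank
            for rank, files in enumerate(per_rank_files)
            for _, dst in files}

def owned_by(owners, rank):
    """Files that a given rank should fetch back on restart."""
    return sorted(dst for dst, r in owners.items() if r == rank)

per_rank_files = [
    [("/cache/r0/a.dat", "/pfs/ckpt.1/a.dat")],
    [("/cache/r1/b.dat", "/pfs/ckpt.1/b.dat"),
     ("/cache/r1/c.log", "/pfs/ckpt.1/logs/c.log")],
]

dirs = destination_dirs(per_rank_files)   # each directory appears once
owners = ownership_map(per_rank_files)
```

Deduplicating first means each directory is created once for the whole job rather than once per process, which matters when many ranks write into the same directory.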
Redset: Encode/decode a set of files with a redundancy method
Documentation:
Redset will create the redundancy data needed for a set of files. It can rebuild a file with provided redundancy information.
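As an illustration of encoding and rebuilding, the sketch below uses XOR parity, one common redundancy scheme. This section does not say which schemes Redset implements, so treat XOR as an assumption; the function names are hypothetical and the files are modeled as in-memory byte strings.

```python
def xor_blocks(blocks):
    """XOR byte strings together, treating shorter ones as zero-padded."""
    out = bytearray(max(len(b) for b in blocks))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def encode(files):
    """Compute the parity (redundancy data) for a set of files."""
    return xor_blocks(files)

def rebuild(survivors, parity, length):
    """Recover the single missing file from the survivors and the parity.

    length is the original size of the missing file, which must be
    recorded separately because the parity is padded to the longest file.
    """
    return xor_blocks(list(survivors) + [parity])[:length]

files = [b"alpha", b"be", b"gamma!!"]
parity = encode(files)

# Suppose the second file is lost; rebuild it from the other two.
recovered = rebuild([files[0], files[2]], parity, len(files[1]))
```

A single XOR parity tolerates the loss of one file in the set; surviving more losses requires a stronger code or more redundancy data.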
Shuffile: Shuffle files between MPI ranks
Documentation:
Files are registered with shuffile. During a restart, shuffile moves each file to its 'owning' MPI rank.
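The planning step behind such a shuffle can be sketched as below, assuming we know each file's owning rank, the node where the file currently sits, and the rank-to-node mapping of the restarted job. The function and its three inputs are hypothetical and do not reflect the shuffile API or its on-disk records.

```python
def plan_moves(file_owner, file_location, rank_node):
    """List the (file, from_node, to_node) moves needed after a restart.

    file_owner:    file name -> rank that owns the file
    file_location: file name -> node where the file currently resides
    rank_node:     rank -> node that rank runs on in the restarted job
    """
    moves = []
    for name, owner in sorted(file_owner.items()):
        src = file_location[name]
        dst = rank_node[owner]
        if src != dst:            # files already on the right node stay put
            moves.append((name, src, dst))
    return moves

file_owner = {"ckpt.0": 0, "ckpt.1": 1, "ckpt.2": 2}
file_location = {"ckpt.0": "nodeA", "ckpt.1": "nodeB", "ckpt.2": "nodeC"}

# After the restart, ranks 1 and 2 have swapped nodes.
rank_node = {0: "nodeA", 1: "nodeC", 2: "nodeB"}

moves = plan_moves(file_owner, file_location, rank_node)
```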
ER: Encode + Rebuild
ER combines shuffile and redset behind a single interface. SCR and VeloC use both libraries, and ER simplifies the rebuilding steps. On a restart, shuffile is first used to move files back to their owning ranks according to the new rank-to-node mapping. Redset is then used to rebuild any files still missing after the shuffle.