a simple volumetric datastore for dense 3D data
WARNING! Bossphorus is NOT stable and NOT tested. Use at your own risk, and always keep a backup copy of your data someplace safe.
For more information, see our Features page.
bossphorus simplifies data-access patterns for data that do not fit into RAM. When you write a 100-gigabyte file, bossphorus automatically slices your dataset into bite-sized pieces.
When you request small pieces of your data for analysis, bossphorus intelligently serves only the parts you need, leaving the rest on disk.
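To make the idea concrete, here is a minimal sketch of how a cutout request could be mapped onto fixed-size blocks so that only the overlapping pieces are read. It illustrates the general technique and is not bossphorus's actual implementation; the function name `blocks_for_cutout` and the half-open `[start, stop)` convention are assumptions made for this example.

```python
from itertools import product
from typing import List, Tuple

Coord = Tuple[int, int, int]


def blocks_for_cutout(start: Coord, stop: Coord, block_size: Coord) -> List[Coord]:
    """Return the index of every block that overlaps the half-open box [start, stop).

    Hypothetical helper for illustration only; not part of bossphorus.
    """
    per_axis = []
    for lo, hi, size in zip(start, stop, block_size):
        if hi <= lo:
            raise ValueError("stop must be greater than start on every axis")
        first = lo // size        # block containing the first requested voxel
        last = (hi - 1) // size   # block containing the last requested voxel
        per_axis.append(range(first, last + 1))
    # Every combination of per-axis block indices is a block to read.
    return list(product(*per_axis))


# A small request that straddles one block boundary on the first axis
# touches only two blocks, no matter how large the full dataset is.
needed = blocks_for_cutout((200, 0, 0), (300, 10, 10), (256, 256, 256))
print(needed)
```

Blocks outside the returned list never need to leave the disk, which is what keeps small reads cheap on large volumes.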
You can either run bossphorus using Python on your host machine, or use the provided Dockerfile to run bossphorus in a Docker container.
docker build -t bossphorus .
mkdir ./uploads
This exposes a simplified wrapper to run bossphorus in a container.
source alias
bossphorus $(pwd)/uploads
You can run bossphorus in demo-mode by omitting the path to your uploads directory. Data saved to bossphorus using this method will be destroyed when you end the bossphorus process! Use only when testing bossphorus out.
pipenv install
mkdir ./uploads
python3 ./run.py
pip3 install -U bossphorus
You can modify the top-level variables in bossphorus/config.py to change where bossphorus stores its data by default and how large each file is by default.
A word of warning: larger values of BLOCK_SIZE reduce the number of parallel threads needed to read a small file, but they also increase RAM usage per read. 256³ is probably a good default, unless you have a very good reason to change it.
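The tradeoff can be estimated with a little arithmetic. The sketch below assumes cubic blocks, one byte per voxel, and that every block overlapping a request is loaded into memory in full. These are simplifying assumptions for illustration, not a description of bossphorus internals, and `read_cost` is a hypothetical helper.

```python
from typing import Tuple

Coord = Tuple[int, int, int]


def read_cost(start: Coord, stop: Coord, block_edge: int,
              bytes_per_voxel: int = 1) -> Tuple[int, int]:
    """Estimate (blocks touched, bytes loaded) for the half-open box [start, stop).

    Assumes cubic blocks of side `block_edge` that are read whole.
    """
    blocks = 1
    for lo, hi in zip(start, stop):
        # Number of blocks spanned along this axis.
        blocks *= (hi - 1) // block_edge - lo // block_edge + 1
    bytes_loaded = blocks * block_edge ** 3 * bytes_per_voxel
    return blocks, bytes_loaded


# The same 100-voxel cube read with two different block sizes.
for edge in (64, 256):
    n_blocks, n_bytes = read_cost((0, 0, 0), (100, 100, 100), edge)
    print(f"edge={edge}: {n_blocks} blocks, {n_bytes / 2**20:.0f} MiB loaded")
```

Under these assumptions, a block edge of 64 touches 8 blocks and loads 2 MiB, while an edge of 256 touches a single block and loads 16 MiB: fewer reads, more memory per read.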
That's a great question! bossphorus is certainly not the most performant, nor is it the most secure. And it's not versioned or distributed. If you're looking for a volumetric datastore, I would recommend looking below at the Alternatives section for some really well-engineered systems.
The primary advantage of bossphorus is that its API is identical to that of bossDB. If you anticipate your data growing from a few gigabytes now to a few terabytes later, you can get used to the bossDB ecosystem (intern, ingest, and many more tools) now, and then invest in real bossDB architecture later on with a seamless transition.
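As a rough illustration of what a shared API means in practice, the sketch below builds a cutout URL in the style of the bossDB v1 REST interface, so that only the host changes when moving between a local server and a hosted one. The path layout reflects my understanding of the bossDB cutout endpoint and has not been checked against bossphorus. The host, port, and the collection, experiment, and channel names are all hypothetical.

```python
from typing import Tuple

Range = Tuple[int, int]


def cutout_url(host: str, collection: str, experiment: str, channel: str,
               resolution: int, xs: Range, ys: Range, zs: Range) -> str:
    """Build a bossDB-style cutout URL (assumed layout, for illustration only)."""
    extents = "/".join(f"{lo}:{hi}" for lo, hi in (xs, ys, zs))
    return (
        f"{host}/v1/cutout/{collection}/{experiment}/{channel}/"
        f"{resolution}/{extents}/"
    )


# Hypothetical local server address; switching to another deployment
# would only change this one string.
LOCAL_HOST = "http://localhost:5000"

url = cutout_url(LOCAL_HOST, "my_collection", "my_experiment", "my_channel",
                 0, (0, 128), (0, 128), (0, 16))
print(url)
```

Client code that builds requests this way needs no other changes when the host is swapped, which is the migration path described above.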
bossphorus borrows its indexing pattern from bossDB, a cloud-native database that can store way more data than bossphorus ever could. If your day-to-day routine includes multiple terabytes of volumetric data, bossDB may be for you.
Project | Description | If you want... |
---|---|---|
bossDB | Petabyte-scale, Cloud-Native Volumetric Database | ...faster IO speed and infinite scalability |
DVID | Distributed, Versioned, Image-oriented Dataservice | ...versioned data |
When you make any changes to outward-facing APIs or services, you must update the documentation. To do so, run the following:
cd website/ # enter the docusaurus dir
yarn # install dependencies
GIT_USER=XXXX yarn run publish-gh-pages # build and upload the documentation