Vennbase

A (pretty much WIP) disk-efficient multimedia database that partitions data by content type.

The features below are documented for reference purposes, and some may not be implemented yet.

Querying the database

# Let the 'venn' alias be:
alias venn="nc 127.0.0.1 1834 -qv"

Creating a record with `save`

Request:

save <content-type> <n>
<tag-1>
<tag-2>
...
<tag-n>
<binary-data>

Response OK:

OK <created-uuid>

Response Error:

ERROR None
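
The request layout above can be assembled by hand. The following is a minimal sketch, assuming that `<n>` is the number of tags, that every header line ends with a single `\n`, and that the binary data follows the last tag line directly. `build_save_request` is a hypothetical helper for illustration, not part of Vennbase.

```rust
/// Builds the bytes of a `save` request following the documented layout:
/// `save <content-type> <n>`, one tag per line, then the raw data.
/// Assumption: lines are terminated by a single `\n`.
pub fn build_save_request(content_type: &str, tags: &[&str], data: &[u8]) -> Vec<u8> {
    // Header line: command, content type, and the number of tag lines that follow.
    let mut request = format!("save {} {}\n", content_type, tags.len()).into_bytes();
    for tag in tags {
        request.extend_from_slice(tag.as_bytes());
        request.push(b'\n');
    }
    // The binary payload is appended untouched after the last tag line.
    request.extend_from_slice(data);
    request
}
```

The returned buffer can be written to the TCP connection in one call, which keeps the header and the payload on the same connection.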

Examples:

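Storing a new image with tags.

# Send the header line first, then the raw image bytes, over a single connection.
{ printf "save image/png len=692521 tags=['pink' 'anime' 'rock']\n"; cat ./data/image.png; } | venn

Storing an image without tags.

img_len=$(wc -c < ./data/image.png)
# printf expands the variable; $'...' strings do not.
{ printf 'save image/png len=%s\n' "$img_len"; cat ./data/image.png; } | venn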

Querying records with `query`

Vennbase queries are written in a custom query language, similar to the logic expressions you already know:

query skip=<n> limit=<m>
<query>

Response OK:

OK <n>
<uuid-1>
<uuid-2>
<uuid-3>
...
<uuid-n>

Response Error:

ERROR 0
<empty>
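
A client has to check that the number of IDs it receives matches the count in the header. The following is a minimal sketch of that check, assuming the response is UTF-8 text with one UUID per line. `QueryResponse` and `parse_query_response` are hypothetical names, not part of Vennbase.

```rust
/// Result of a `query` request, following the response layouts above.
#[derive(Debug, PartialEq)]
pub enum QueryResponse {
    Ok(Vec<String>),
    Error,
}

/// Parses the text of a `query` response.
/// Returns `None` when the header is unknown or the ID count does not match.
pub fn parse_query_response(text: &str) -> Option<QueryResponse> {
    let mut lines = text.lines();
    let mut header = lines.next()?.split_whitespace();
    match header.next()? {
        "OK" => {
            let count: usize = header.next()?.parse().ok()?;
            // Every remaining non-empty line is treated as one record ID.
            let ids: Vec<String> = lines
                .filter(|line| !line.is_empty())
                .map(str::to_string)
                .collect();
            if ids.len() == count {
                Some(QueryResponse::Ok(ids))
            } else {
                None
            }
        }
        "ERROR" => Some(QueryResponse::Error),
        _ => None,
    }
}
```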

Examples:

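Retrieving images tagged anime, plus videos that are not tagged anime.

venn <<< $'query (mime:image/* && tag:anime) || (mime:video/* && !tag:anime)'

Retrieving images and videos tagged pink or anime, skipping the first 20 results and returning at most 10.

venn <<< $'query skip=20 limit=10 (tag:\'pink\' || tag:\'anime\') && (mime:image/* || mime:video/*)'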

Fetching records with `get`

General request:

get <id> [<width|auto>x<height|auto>]

Non-image types will ignore the <width>x<height> parameter.

Response OK:

<mimetype> <size>
<...data>

Response when record doesn't exist:

NOT_FOUND 0
<empty>

Response on error:

ERROR 0
<empty>
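
Because the header is text and the body is binary, a client should split the response at the first newline only. The following is a minimal sketch, assuming that the header line always ends with `\n` (even when the body is empty) and that `<size>` is the body length in bytes. `GetResponse` and `parse_get_response` are hypothetical names, not part of Vennbase.

```rust
/// The three response shapes documented for `get`.
#[derive(Debug, PartialEq)]
pub enum GetResponse<'a> {
    Found { mimetype: String, data: &'a [u8] },
    NotFound,
    Error,
}

/// Splits a raw `get` response into its header and body.
/// Returns `None` if the header is malformed or the body is shorter than `<size>`.
pub fn parse_get_response(raw: &[u8]) -> Option<GetResponse<'_>> {
    // Only the first newline separates header from body; later ones are data.
    let newline = raw.iter().position(|&b| b == b'\n')?;
    let header = std::str::from_utf8(&raw[..newline]).ok()?;
    let body = &raw[newline + 1..];
    let (first, second) = header.split_once(' ')?;
    match first {
        "NOT_FOUND" => Some(GetResponse::NotFound),
        "ERROR" => Some(GetResponse::Error),
        mimetype => {
            let size: usize = second.parse().ok()?;
            let data = body.get(..size)?;
            Some(GetResponse::Found { mimetype: mimetype.to_string(), data })
        }
    }
}
```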

Examples:

Fetching the record with ID f81d4fae-7dec-11d0-a765-00a0c91e6bf6 but keeping only the header line.

venn <<< $'get f81d4fae-7dec-11d0-a765-00a0c91e6bf6' | head -n 1
# returns
image/png 69524

Saving only the body of the same record to a file.

venn <<< $'get f81d4fae-7dec-11d0-a765-00a0c91e6bf6' | awk 'NR>1' > ./image.png

Obtaining the record metadata with `meta`

Record metadata consists of the record's tag list and a set of pre-defined metadata fields. The record's MIME type is also returned.

meta <id>

Response OK:

OK <mimetype> <tags-number>
<...n-tags>
<...metadata>

Response Not Found:

NOT_FOUND

Response Error:

ERROR
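
The OK response mixes a counted tag list with trailing metadata lines, so the tag count in the header tells a client where the tags end. The following is a minimal sketch, assuming one tag per line. The format of the metadata lines is not documented here, so they are kept verbatim. `Meta` and `parse_meta_ok` are hypothetical names, not part of Vennbase.

```rust
#[derive(Debug, PartialEq)]
pub struct Meta {
    pub mimetype: String,
    pub tags: Vec<String>,
    /// Remaining lines, kept as-is because their format is not specified.
    pub metadata: Vec<String>,
}

/// Parses an `OK` response to `meta`; any other response yields `None`.
pub fn parse_meta_ok(text: &str) -> Option<Meta> {
    let mut lines = text.lines();
    let mut header = lines.next()?.split_whitespace();
    if header.next()? != "OK" {
        return None;
    }
    let mimetype = header.next()?.to_string();
    let tag_count: usize = header.next()?.parse().ok()?;
    // The first `tag_count` lines after the header are tags.
    let tags: Vec<String> = lines.by_ref().take(tag_count).map(str::to_string).collect();
    if tags.len() != tag_count {
        return None;
    }
    // Whatever follows the tags is treated as metadata.
    let metadata = lines.map(str::to_string).collect();
    Some(Meta { mimetype, tags, metadata })
}
```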

Database and partitions

A .vennbase database file contains information about the database with the following structure:

Length	Content
16 bytes	A version string with the form `vennbase@version`
32 bytes	The Database name
64 bits	Database creation timestamp
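
The three fields above add up to a fixed 56-byte header. The following is a minimal sketch of decoding it, assuming that strings shorter than their field are padded with trailing NUL bytes (the padding scheme is not stated here). The timestamp is read as a little-endian signed 64-bit integer, as described elsewhere in this document. `DatabaseHeader` and `parse_database_header` are hypothetical names, not part of Vennbase.

```rust
/// 16-byte version string + 32-byte name + 8-byte timestamp.
pub const HEADER_LEN: usize = 16 + 32 + 8;

#[derive(Debug, PartialEq)]
pub struct DatabaseHeader {
    pub version: String,
    pub name: String,
    pub created_at_ms: i64,
}

/// Decodes a fixed-width UTF-8 field, dropping trailing NUL padding (an assumption).
fn fixed_str(bytes: &[u8]) -> Option<String> {
    let end = bytes.iter().rposition(|&b| b != 0).map_or(0, |i| i + 1);
    std::str::from_utf8(&bytes[..end]).ok().map(str::to_string)
}

/// Parses the start of a .vennbase file; returns `None` on short or invalid input.
pub fn parse_database_header(bytes: &[u8]) -> Option<DatabaseHeader> {
    let header = bytes.get(..HEADER_LEN)?;
    let version = fixed_str(&header[..16])?;
    let name = fixed_str(&header[16..48])?;
    // Timestamps are little-endian signed 64-bit integers.
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&header[48..56]);
    Some(DatabaseHeader { version, name, created_at_ms: i64::from_le_bytes(ts) })
}
```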

Database partitions are stored as .vennpart files in the same directory as the .vennbase database. Each partition holds the records of a single multimedia content type and has the following structure:

Length	Content
64 bits	Partition creation timestamp
64 bits	Last partition compaction timestamp
—	List of record structures

Each record has the following layout:

Length	Content
1 bit	A bit indicating whether this record is active or not.
7 bits	Record bit flags (reserved for future use; must be zero)
16 bytes	The ID (UUID v4) of the record
64 bits	Unsigned record length (`l`) in bytes
`l` bytes	The actual record data

Inactive records will be deleted in the next database compaction.
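
Since the active bit and the seven flag bits share one byte, each record starts with a 25-byte fixed part followed by its data. The following is a minimal sketch of walking a record list and keeping only active records. It assumes the active bit is the most significant bit of the first byte, which is not specified here. `Record`, `read_record`, and `active_records` are hypothetical names, not part of Vennbase.

```rust
/// Assumption: the active bit is the most significant bit of the first byte.
pub const ACTIVE_BIT: u8 = 0b1000_0000;

#[derive(Debug, PartialEq)]
pub struct Record<'a> {
    pub active: bool,
    pub id: [u8; 16],
    pub data: &'a [u8],
}

/// Reads one record from the front of `bytes` and returns it with the rest.
pub fn read_record(bytes: &[u8]) -> Option<(Record<'_>, &[u8])> {
    const FIXED: usize = 1 + 16 + 8; // flags byte + UUID + length
    let fixed = bytes.get(..FIXED)?;
    let active = (fixed[0] & ACTIVE_BIT) != 0;
    let mut id = [0u8; 16];
    id.copy_from_slice(&fixed[1..17]);
    let mut len = [0u8; 8];
    len.copy_from_slice(&fixed[17..25]);
    let len = u64::from_le_bytes(len);
    let rest = &bytes[FIXED..];
    // Reject records whose declared length runs past the available bytes.
    if len > rest.len() as u64 {
        return None;
    }
    let (data, rest) = rest.split_at(len as usize);
    Some((Record { active, id, data }, rest))
}

/// Collects the active records from a record list, skipping inactive ones.
pub fn active_records(mut bytes: &[u8]) -> Option<Vec<Record<'_>>> {
    let mut records = Vec::new();
    while !bytes.is_empty() {
        let (record, rest) = read_record(bytes)?;
        if record.active {
            records.push(record);
        }
        bytes = rest;
    }
    Some(records)
}
```

A compaction pass could use the same walk and write back only the records it returns.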

Please note:

All Vennbase data is stored in little-endian format.
All Vennbase strings are UTF-8 encoded.

Vennbase data types

Timestamps

Vennbase timestamps are signed 64-bit integers representing the number of milliseconds since the UNIX epoch (UTC). Be aware of this when converting Venn timestamps to other formats, such as ECMAScript time values, which are stored as IEEE 754 doubles and limited to ±8.64 × 10^15 milliseconds around the epoch.

Vennbase uses the chrono crate to generate its timestamps. For more information, refer to its documentation.

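/// Milliseconds since the UNIX epoch (UTC), as a signed 64-bit integer.
/// The wrapped `i64` is inferred from `timestamp_millis()`; the real definition may differ.
pub struct VennTimestamp(i64);

impl VennTimestamp {
    /// Returns the current UTC time with millisecond precision.
    pub fn now() -> Self {
        VennTimestamp(chrono::Utc::now().timestamp_millis())
    }
}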
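
When exchanging timestamps with ECMAScript code, it helps to check the range before converting, since ECMAScript time values only cover ±8,640,000,000,000,000 milliseconds around the epoch. The following is a minimal sketch using only the standard library. It operates on raw `i64` millisecond values, and the function names are hypothetical, not part of Vennbase.

```rust
/// Largest magnitude of an ECMAScript time value, in milliseconds.
pub const ECMASCRIPT_MAX_MS: i64 = 8_640_000_000_000_000;

/// Returns true if the millisecond value fits in an ECMAScript `Date`.
pub fn fits_ecmascript_date(millis: i64) -> bool {
    (-ECMASCRIPT_MAX_MS..=ECMASCRIPT_MAX_MS).contains(&millis)
}

/// Encodes a timestamp as 8 little-endian bytes, matching the on-disk byte order.
pub fn encode_timestamp(millis: i64) -> [u8; 8] {
    millis.to_le_bytes()
}

/// Decodes 8 little-endian bytes back into a millisecond timestamp.
pub fn decode_timestamp(bytes: [u8; 8]) -> i64 {
    i64::from_le_bytes(bytes)
}
```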

To do

Implement a bounded in-memory cache, similar to PostgreSQL's shared_buffers. Currently, all key-value lookups are served from memory, which can cause performance issues with large databases.

paoloose/vennbase
