A (pretty much WIP) disk-efficient multimedia database that partitions data by content type.
The features below are documented for reference and may not be implemented yet.
# Let the 'venn' alias be:
alias venn="nc 127.0.0.1 1834 -qv"
Request:
save <content-type> <n>
<tag-1>
<tag-2>
...
<tag-n>
<binary-data>
Response OK:
OK <created-uuid>
Response Error:
ERROR None
Examples:
Store a new image in the database.
{ printf "save image/png len=692521 tags=['pink' 'anime' 'rock']\n"; cat ./data/image.png; } | venn
Storing an image without tags.
img_len=$(wc -c < ./data/image.png)
{ printf 'save image/png len=%s\n' "$img_len"; cat ./data/image.png; } | venn
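As a rough illustration of the `save` request layout described above, the following Rust sketch assembles the request bytes from a content type, a tag list, and a payload. The function name `build_save_request` is hypothetical, and the sketch assumes that lines end with `\n` and that the binary data starts immediately after the last tag line; the request description does not spell out either detail.

```rust
/// Builds the bytes of a `save` request: a header line, one line per tag,
/// then the raw payload.
/// Assumption: lines end with `\n` and the payload starts right after the
/// last tag line.
pub fn build_save_request(content_type: &str, tags: &[&str], data: &[u8]) -> Vec<u8> {
    let mut request = Vec::with_capacity(data.len() + 64);
    // Header line: `save <content-type> <n>`, where n is the number of tags.
    request.extend_from_slice(format!("save {} {}\n", content_type, tags.len()).as_bytes());
    for tag in tags {
        request.extend_from_slice(tag.as_bytes());
        request.push(b'\n');
    }
    // The payload is appended verbatim; it may contain arbitrary bytes.
    request.extend_from_slice(data);
    request
}
```

The resulting buffer could then be written to a TCP connection to the server (port 1834 in the alias above).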
Vennbase queries are written in a custom query language that resembles familiar boolean logic expressions:
query skip=<n> limit=<m>
<query>
Response OK:
OK <n>
<uuid-1>
<uuid-2>
<uuid-3>
...
<uuid-n>
Response Error:
ERROR 0
<empty>
Examples:
Retrieving images tagged anime, or videos not tagged anime.
venn <<< $'query (mime:image/* && tag:anime) || (mime:video/* && !tag:anime)'
Retrieving images and videos tagged pink or anime, skipping the first 20 results and returning at most 10.
venn <<< "query skip=20 limit=10 (tag:'pink' || tag:'anime') && (mime:image/* || mime:video/*)"
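A client has to turn the `OK <n>` response into a list of IDs. The sketch below shows one way to do that in Rust. The function name `parse_query_response` is hypothetical, and it assumes `\n`-terminated lines and that the count on the status line equals the number of ID lines that follow.

```rust
/// Parses a `query` response into the list of matching record IDs.
/// Assumption: the count on the status line equals the number of ID lines.
pub fn parse_query_response(response: &str) -> Result<Vec<String>, String> {
    let mut lines = response.lines();
    let status = lines.next().ok_or("empty response")?;
    let mut parts = status.split_whitespace();
    match parts.next() {
        Some("OK") => {}
        Some(other) => return Err(format!("server replied {}", other)),
        None => return Err("missing status".to_string()),
    }
    let count: usize = parts
        .next()
        .ok_or("missing count")?
        .parse()
        .map_err(|e| format!("bad count: {}", e))?;
    // Read exactly `count` ID lines and reject a short response.
    let ids: Vec<String> = lines.take(count).map(str::to_string).collect();
    if ids.len() != count {
        return Err(format!("expected {} ids, got {}", count, ids.len()));
    }
    Ok(ids)
}
```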
General request:
get <id> [<width|auto>x<height|auto>]
Non-image types will ignore the <width>x<height> parameter.
Response OK:
<mimetype> <size>
<...data>
Response when record doesn't exist:
NOT_FOUND 0
<empty>
Response on error:
ERROR 0
<empty>
Examples:
Downloading a record with ID f81d4fae-7dec-11d0-a765-00a0c91e6bf6 but ignoring the body.
venn <<< $'get f81d4fae-7dec-11d0-a765-00a0c91e6bf6' | head -n 1
# returns
image/png 69524
Downloading the same record and saving only the body to a file.
venn <<< $'get f81d4fae-7dec-11d0-a765-00a0c91e6bf6' | tail -n +2 > ./image.png
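Line-oriented shell tools are a convenient way to split the header from the body, but a client can also rely on the `<size>` field. The Rust sketch below parses a complete `get` response held in memory. The `GetResponse` type and `parse_get_response` function are hypothetical names, and the sketch assumes the header line ends with a single `\n` and that `<size>` is the body length in bytes.

```rust
/// Outcome of a `get` request.
#[derive(Debug, PartialEq)]
pub enum GetResponse {
    Found { mime_type: String, body: Vec<u8> },
    NotFound,
    Error,
}

/// Splits a raw `get` response into its header and binary body.
/// Assumption: `<size>` is the body length in bytes.
pub fn parse_get_response(raw: &[u8]) -> Result<GetResponse, String> {
    // The header is everything up to the first newline; the body may be binary.
    let newline = raw.iter().position(|&b| b == b'\n').ok_or("missing header line")?;
    let header = std::str::from_utf8(&raw[..newline]).map_err(|e| e.to_string())?;
    let (first, second) = header.split_once(' ').ok_or("malformed header")?;
    match first {
        "NOT_FOUND" => Ok(GetResponse::NotFound),
        "ERROR" => Ok(GetResponse::Error),
        mime_type => {
            let size: usize = second.trim().parse().map_err(|_| "bad size".to_string())?;
            // Take exactly `size` bytes after the header; fail if fewer arrived.
            let body = raw
                .get(newline + 1..)
                .and_then(|rest| rest.get(..size))
                .ok_or("truncated body")?;
            Ok(GetResponse::Found { mime_type: mime_type.to_string(), body: body.to_vec() })
        }
    }
}
```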
Record metadata consists of the record's tag list and predefined metadata. The record's MIME type is also returned.
meta <id>
Response OK:
OK <mimetype> <tags-number>
<...n-tags>
<...metadata>
Response Not Found:
NOT_FOUND
Response Error:
ERROR
A .vennbase database file contains information about the database with the following structure:
Length | Content |
---|---|
16 bytes | A version string with the form vennbase@version |
32 bytes | The Database name |
64 bits | Database creation timestamp |
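To make the layout concrete, here is a minimal Rust sketch that decodes these 56 bytes. The names `DbHeader`, `fixed_str`, and `parse_db_header` are hypothetical. The sketch follows the document's notes that data is little-endian, strings are UTF-8, and timestamps are signed 64-bit integers; it additionally assumes that unused bytes in the fixed-width string fields are NUL padding, which the table does not state.

```rust
/// Parsed contents of a `.vennbase` header.
#[derive(Debug, PartialEq)]
pub struct DbHeader {
    pub version: String,
    pub name: String,
    pub created_at_ms: i64,
}

/// 16-byte version + 32-byte name + 8-byte timestamp.
const HEADER_LEN: usize = 16 + 32 + 8;

/// Decodes a fixed-width UTF-8 field.
/// Assumption: unused trailing bytes are NUL padding.
fn fixed_str(bytes: &[u8]) -> Result<String, String> {
    let end = bytes.iter().position(|&b| b == 0).unwrap_or(bytes.len());
    std::str::from_utf8(&bytes[..end])
        .map(str::to_string)
        .map_err(|e| e.to_string())
}

pub fn parse_db_header(bytes: &[u8]) -> Result<DbHeader, String> {
    if bytes.len() < HEADER_LEN {
        return Err(format!("header needs {} bytes, got {}", HEADER_LEN, bytes.len()));
    }
    // The creation timestamp is a little-endian signed 64-bit integer.
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&bytes[48..56]);
    Ok(DbHeader {
        version: fixed_str(&bytes[..16])?,
        name: fixed_str(&bytes[16..48])?,
        created_at_ms: i64::from_le_bytes(ts),
    })
}
```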
Database partitions are represented as .vennpart files in the same directory as the .vennbase database. Each partition represents a different multimedia content type.
Length | Content |
---|---|
64 bits | Partition creation timestamp |
64 bits | Last partition compaction timestamp |
— | List of record structures |
Each record has the following structure:
Length | Content |
---|---|
1 bit | A bit indicating whether this record is active or not. |
7 bits | Record bit flags (reserved for future use; must be zero) |
16 bytes | The ID (UUID v4) of the record |
64 bits | Unsigned record length (l) in bytes |
l bytes | The actual record data |
Inactive records will be deleted in the next database compaction.
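The record layout and the compaction rule can be sketched in Rust as follows. All names here (`Record`, `encode_record`, `decode_record`, `compact`) are hypothetical, and the sketch assumes the active bit is the most significant bit of the first byte, since the table does not fix the bit order. Lengths are written little-endian, as stated in the notes on data encoding.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub active: bool,
    pub id: [u8; 16],
    pub data: Vec<u8>,
}

/// Assumption: the active bit is the most significant bit of the first byte.
const ACTIVE_BIT: u8 = 0b1000_0000;
/// Flag byte (1) + UUID (16) + length (8).
const RECORD_HEADER_LEN: usize = 1 + 16 + 8;

pub fn encode_record(record: &Record, out: &mut Vec<u8>) {
    // The seven reserved flag bits are always written as zero.
    out.push(if record.active { ACTIVE_BIT } else { 0 });
    out.extend_from_slice(&record.id);
    out.extend_from_slice(&(record.data.len() as u64).to_le_bytes());
    out.extend_from_slice(&record.data);
}

/// Decodes one record from the front of `bytes`; returns it with the bytes consumed.
pub fn decode_record(bytes: &[u8]) -> Option<(Record, usize)> {
    let header = bytes.get(..RECORD_HEADER_LEN)?;
    let mut id = [0u8; 16];
    id.copy_from_slice(&header[1..17]);
    let mut len = [0u8; 8];
    len.copy_from_slice(&header[17..25]);
    let len = usize::try_from(u64::from_le_bytes(len)).ok()?;
    let data = bytes.get(RECORD_HEADER_LEN..)?.get(..len)?.to_vec();
    let record = Record { active: (header[0] & ACTIVE_BIT) != 0, id, data };
    Some((record, RECORD_HEADER_LEN + len))
}

/// Rewrites a record list, keeping only active records.
pub fn compact(mut bytes: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    while let Some((record, used)) = decode_record(bytes) {
        if record.active {
            encode_record(&record, &mut out);
        }
        bytes = &bytes[used..];
    }
    out
}
```

This version decodes every record into memory for clarity; a real compaction pass over large media files would more likely stream records from one file to another.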
Please note:
- All Vennbase data is stored in little-endian format.
- All Vennbase strings are UTF-8 encoded.
Vennbase timestamps are signed 64-bit integers representing the number of milliseconds since the UNIX epoch (UTC). Keep this in mind when converting Vennbase timestamps to other formats: ECMAScript, for example, stores time values as IEEE 754 doubles, which represent integers exactly only up to 2^53, and restricts valid dates to ±8.64 × 10^15 ms around the epoch.
Vennbase uses the chrono crate to generate its timestamps. For more information, refer to its documentation.
impl VennTimestamp {
    pub fn now() -> Self {
        VennTimestamp(chrono::Utc::now().timestamp_millis())
    }
}
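When handing a Vennbase timestamp to an ECMAScript consumer, the range difference can be checked explicitly. The sketch below uses only the standard library. The function name `to_ecmascript_time` is hypothetical, and it takes the raw `i64` millisecond value rather than a `VennTimestamp`, because the wrapper's accessors are not shown here.

```rust
/// Largest magnitude, in milliseconds, that an ECMAScript `Date` accepts
/// (100,000,000 days on either side of the epoch).
const ECMASCRIPT_MAX_MS: i64 = 8_640_000_000_000_000;

/// Converts a millisecond timestamp into an ECMAScript time value, or returns
/// `None` when it falls outside the range a `Date` can hold.
pub fn to_ecmascript_time(millis: i64) -> Option<f64> {
    if (-ECMASCRIPT_MAX_MS..=ECMASCRIPT_MAX_MS).contains(&millis) {
        // Every integer in this range is below 2^53, so the cast is exact.
        Some(millis as f64)
    } else {
        None
    }
}
```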
- Implement in-memory caching with shared_buffers like PostgreSQL. Currently, all key-value lookups are in-memory, which can cause performance issues with large databases.