fubarnetes/calldown

ZFS Support

fabianfreyer opened this issue · 3 comments

Probably the first backing file system we want to support.

Spec

State

All calldown datasets shall live under a single base dataset, denoted here by <base>:

  • <base>/storage - extract fetched layers here
  • <base>/storage/empty@extracted - the empty base dataset
  • <base>/storage/<layer_hash>@extracted - extracted dataset including all sub-datasets
  • <base>/runtime - clone images here from <base>/storage and start jails on them
  • <base>/runtime/<container_id> - root filesystem of a Jail
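The layout above can be captured in a small naming helper so that dataset paths are built in one place. This is only a sketch: the `Layout` type and its method names are hypothetical, and only the path shapes are taken from the list above.

```rust
/// Dataset naming for the layout above. `Layout` and its methods are
/// hypothetical names; only the path shapes come from the spec.
pub struct Layout {
    base: String,
}

impl Layout {
    pub fn new(base: &str) -> Self {
        // Tolerate a trailing slash on the base dataset name.
        Layout { base: base.trim_end_matches('/').to_string() }
    }

    /// `<base>/storage/<layer_hash>`
    pub fn storage_dataset(&self, layer_hash: &str) -> String {
        format!("{}/storage/{}", self.base, layer_hash)
    }

    /// `<base>/storage/<layer_hash>@extracted`
    pub fn storage_snapshot(&self, layer_hash: &str) -> String {
        format!("{}@extracted", self.storage_dataset(layer_hash))
    }

    /// `<base>/storage/empty@extracted`
    pub fn empty_snapshot(&self) -> String {
        self.storage_snapshot("empty")
    }

    /// `<base>/runtime/<container_id>`
    pub fn runtime_dataset(&self, container_id: &str) -> String {
        format!("{}/runtime/{}", self.base, container_id)
    }
}
```

For example, with a made-up base of `zroot/calldown`, `Layout::new("zroot/calldown").runtime_dataset("c1")` yields `zroot/calldown/runtime/c1`.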

Operations

Extract

Extracting an image would look roughly like the following pseudocode:

let mut lower_snapshot = "<base>/storage/empty@extracted";
for layer in image {
    if !exists("<base>/storage/<layer_hash>@extracted") {
        // Clone the layer below, extract this layer on top, then freeze the result.
        let layer_dataset = lower_snapshot.clone_into("<base>/storage/<layer_hash>");
        layer.extract_over("<base>/storage/<layer_hash>");
        layer_dataset.snapshot("extracted");
    }
    // The next layer is stacked on top of this one.
    lower_snapshot = "<base>/storage/<layer_hash>@extracted";
}
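To make the control flow concrete, here is a sketch that plans the extract steps without touching ZFS. It assumes each layer is extracted on top of the snapshot of the layer below it, which is how I read the pseudocode. The `Op` enum and `plan_extract` are hypothetical names, and the set of existing snapshots is passed in rather than queried from the pool.

```rust
use std::collections::HashSet;

/// One planned step. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum Op {
    Clone { from_snapshot: String, to_dataset: String },
    ExtractLayer { layer_hash: String, into_dataset: String },
    Snapshot { dataset: String, name: String },
}

/// Plan the extraction of `layers` (bottom to top) under `base`.
/// `existing` holds the full names of snapshots that are already present.
/// Returns the planned ops and the snapshot of the topmost layer.
pub fn plan_extract(
    base: &str,
    layers: &[&str],
    existing: &HashSet<String>,
) -> (Vec<Op>, String) {
    let mut ops = Vec::new();
    let mut lower = format!("{}/storage/empty@extracted", base);
    for &hash in layers {
        let dataset = format!("{}/storage/{}", base, hash);
        let snapshot = format!("{}@extracted", dataset);
        if !existing.contains(&snapshot) {
            ops.push(Op::Clone {
                from_snapshot: lower.clone(),
                to_dataset: dataset.clone(),
            });
            ops.push(Op::ExtractLayer {
                layer_hash: hash.to_string(),
                into_dataset: dataset.clone(),
            });
            ops.push(Op::Snapshot { dataset, name: "extracted".to_string() });
        }
        // Whether freshly extracted or already present, this layer's
        // snapshot becomes the base for the next layer.
        lower = snapshot;
    }
    (ops, lower)
}
```

Layers whose `@extracted` snapshot already exists produce no ops, so shared lower layers are only extracted once.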

Start a Jail

// 1. Determine the storage "image" to use as the root filesystem.
let layer_hash = runtime_config.topmost_layer;
let basefs = "<base>/storage/<layer_hash>@extracted";
// 2. Clone it into a new runtime root filesystem
basefs.clone_into("<base>/runtime/<container_id>");
// 3. Set up other mounts
// 4. Start Jail
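Step 2 maps onto a single `zfs clone <snapshot> <dataset>` invocation. The sketch below only builds the command and does not run it. It uses the CLI purely for illustration, not as a proposal to shell out, and `clone_rootfs_command` is a hypothetical helper name.

```rust
use std::process::Command;

/// Build (but do not run) the `zfs clone` command that creates the
/// runtime root filesystem for a container from its topmost layer.
pub fn clone_rootfs_command(base: &str, layer_hash: &str, container_id: &str) -> Command {
    let snapshot = format!("{}/storage/{}@extracted", base, layer_hash);
    let rootfs = format!("{}/runtime/{}", base, container_id);
    let mut cmd = Command::new("zfs");
    cmd.arg("clone").arg(snapshot).arg(rootfs);
    cmd
}
```

Calling `.status()` on the returned `Command` would actually perform the clone, which requires the appropriate ZFS permissions on the host.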

Commit a Jail

After the Jail is stopped, committing runtime state creates a new storage layer with the current changes.

// 1. Get the runtime path and calculate the hash of the new layer
let runtime_rootfs = "<base>/runtime/<container_id>";
let hash = calculate_layer_hash(runtime_rootfs);
// 2. Snapshot the current runtime state 
let snap = runtime_rootfs.snapshot("extracted");
let storage_layer = snap.clone_into("<base>/storage/<hash>");
// 3a) Promote the cloned storage layer.
//     At the moment the dependency chain is
//         [base layer] -> [runtime]@extracted -> [new layer].
//     After promotion, this is reversed:
//         [base layer] -> [new layer]@extracted -> [runtime].
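storage_layer.promote();
//     Promotion transfers the cloned snapshot to the new layer, so
//     <base>/storage/<hash>@extracted already exists at this point and
//     no second snapshot is needed.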
// 3b) If we aren't going to reuse the runtime (e.g. to build another layer), delete it:
runtime_rootfs.destroy();
// 4. Update container config to add layer <hash>
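Expressed as `zfs` command lines, the commit sequence might look like the sketch below. It omits a second snapshot after promotion, because `zfs promote` is documented to hand the origin snapshot over to the promoted clone, so `<base>/storage/<hash>@extracted` should already exist by then. `commit_commands` and `zfs_argv` are hypothetical names, and the layer hash is assumed to have been computed beforehand.

```rust
/// Prefix an argument list with the `zfs` binary name.
fn zfs_argv(args: &[&str]) -> Vec<String> {
    let mut argv = vec!["zfs".to_string()];
    argv.extend(args.iter().map(|a| a.to_string()));
    argv
}

/// Command lines for committing a stopped jail's root filesystem as a
/// new storage layer. Nothing is executed here.
pub fn commit_commands(
    base: &str,
    container_id: &str,
    hash: &str,
    keep_runtime: bool,
) -> Vec<Vec<String>> {
    let runtime = format!("{}/runtime/{}", base, container_id);
    let runtime_snap = format!("{}@extracted", runtime);
    let layer = format!("{}/storage/{}", base, hash);

    let mut cmds = vec![
        // Snapshot the current runtime state.
        zfs_argv(&["snapshot", &runtime_snap]),
        // Clone it into the storage area.
        zfs_argv(&["clone", &runtime_snap, &layer]),
        // Reverse the dependency: the snapshot now belongs to the layer,
        // and the runtime dataset becomes its clone.
        zfs_argv(&["promote", &layer]),
    ];
    if !keep_runtime {
        // The runtime no longer has dependents, so it can be destroyed.
        cmds.push(zfs_argv(&["destroy", &runtime]));
    }
    cmds
}
```

Passing `keep_runtime = true` covers the case in 3b where the runtime is reused to build another layer.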

Package

This would walk through all changes between a layer and the layer immediately below it, using zfs diff <base>/storage/<base_layer>@extracted <base>/storage/<layer>@extracted or similar, and collect the following in a tarball:

  • whiteout list containing deleted files
  • changed / added files
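As a rough sketch of the first half of packaging, the function below sorts the output of `zfs diff -H` (tab-separated, no arrows) into a whiteout list and a list of files to archive. It assumes the usual change markers `-`, `+`, `M` and `R`, and treats a rename as a whiteout of the old path plus an addition of the new one. `LayerDelta` and `parse_zfs_diff` are hypothetical names. Unescaping of special characters and making paths relative to the layer root are left out.

```rust
/// Paths to record for one layer. Hypothetical type, for illustration only.
#[derive(Debug, Default, PartialEq)]
pub struct LayerDelta {
    /// Paths removed relative to the layer below.
    pub whiteouts: Vec<String>,
    /// Paths added or modified in this layer.
    pub changed: Vec<String>,
}

/// Classify the lines of `zfs diff -H` output.
pub fn parse_zfs_diff(output: &str) -> LayerDelta {
    let mut delta = LayerDelta::default();
    for line in output.lines() {
        let mut fields = line.split('\t');
        let (kind, path) = match (fields.next(), fields.next()) {
            (Some(kind), Some(path)) => (kind, path),
            _ => continue, // skip blank or malformed lines
        };
        match kind {
            "-" => delta.whiteouts.push(path.to_string()),
            "+" | "M" => delta.changed.push(path.to_string()),
            "R" => {
                // Rename: the old path disappears, the new path appears.
                delta.whiteouts.push(path.to_string());
                if let Some(new_path) = fields.next() {
                    delta.changed.push(new_path.to_string());
                }
            }
            _ => {} // unknown marker: ignore in this sketch
        }
    }
    delta
}
```

The `changed` list would then be fed to whatever writes the tarball, alongside the whiteout list.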

Implementation notes

  • The existing Rust libzfs bindings are zfsonlinux-specific. I doubt there will be any FreeBSD support, but I've opened whamcloud/rust-libzfs#63 about it.

  • We can generate bindings with bindgen for libzfs_core. However, these just implement a few transactions like clone, snapshot, delete_snapshots etc. They don't have any form of dataset enumeration or property inspection/manipulation.

  • For all other operations, we would have to shell out to zfs and zpool. That kind of sucks.

  • However, we can use ZFS Channel Programs to implement the missing functionality, since libzfs_core exposes lzc_channel_program.

  • We need to figure out some way to parse and generate nvlists. One option seems to be the libnv crate.
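Since the choice between libzfs_core bindings, channel programs and shelling out is still open, one way to defer it would be to write the operations above against a small trait. This is only a sketch of that idea: `ZfsBackend` and `InMemoryZfs` are hypothetical names, and the in-memory implementation exists purely so the logic can be exercised without a pool.

```rust
use std::collections::BTreeSet;

/// The handful of operations the spec needs. Hypothetical trait; a real
/// implementation could sit on libzfs_core, channel programs, or the CLI.
pub trait ZfsBackend {
    fn snapshot(&mut self, dataset: &str, name: &str) -> Result<(), String>;
    fn clone_snapshot(&mut self, snapshot: &str, target: &str) -> Result<(), String>;
    fn exists(&self, name: &str) -> bool;
}

/// Test double that only tracks dataset and snapshot names.
#[derive(Default)]
pub struct InMemoryZfs {
    names: BTreeSet<String>,
}

impl InMemoryZfs {
    /// Register a plain dataset (stands in for `zfs create`).
    pub fn create(&mut self, dataset: &str) {
        self.names.insert(dataset.to_string());
    }
}

impl ZfsBackend for InMemoryZfs {
    fn snapshot(&mut self, dataset: &str, name: &str) -> Result<(), String> {
        if !self.names.contains(dataset) {
            return Err(format!("no such dataset: {}", dataset));
        }
        if !self.names.insert(format!("{}@{}", dataset, name)) {
            return Err(format!("snapshot already exists: {}@{}", dataset, name));
        }
        Ok(())
    }

    fn clone_snapshot(&mut self, snapshot: &str, target: &str) -> Result<(), String> {
        if !self.names.contains(snapshot) {
            return Err(format!("no such snapshot: {}", snapshot));
        }
        if !self.names.insert(target.to_string()) {
            return Err(format!("dataset already exists: {}", target));
        }
        Ok(())
    }

    fn exists(&self, name: &str) -> bool {
        self.names.contains(name)
    }
}
```

Enumeration and property access are deliberately absent here, since those are exactly the parts libzfs_core does not cover.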

Have you looked at https://github.com/jmesmon/rust-libzfs for libzfs_core bindings? It may need to be adopted but it seems to provide a reasonable starting point.

@dsheets yes, I've looked at it. I agree it seems like a reasonable starting point, but it doesn't seem to be maintained at the moment and doesn't come with a license. Additionally, I'm not really sure I'd want to end up with my own libzfs bindings in-tree. Also, as the libzfs crate on crates.io is https://github.com/whamcloud/rust-libzfs, I think it would probably be best if that ended up supporting FreeBSD :)