rust-vmm/community

Crate Addition Request - seccompiler

Closed this issue · 8 comments

Crate Name

seccompiler

Short Description

Name origin: short for "seccomp compiler", or a play on the fact that it brings seccomp to applications, essentially "seccompiling" them (if that were even a word).

Seccomp filters are BPF programs that the kernel loads on the calling thread, via the prctl or seccomp syscalls, in order to restrict the system calls that the given thread can make. Writing BPF code is arguably not easy, so a higher-level language can be useful, for improved readability and maintainability.
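To give a feel for why hand-written BPF is hard to maintain, here is a minimal sketch of a filter that allows a single syscall on x86_64 and kills the process on anything else. The `SockFilter` struct mirrors the kernel's `struct sock_filter`, and the constants are the values from the Linux UAPI headers. The `allow_only` and `evaluate` helpers are hypothetical names introduced only for this illustration. `evaluate` is a tiny userspace interpreter for the three opcodes used, not the kernel's implementation.

```rust
/// One classic BPF instruction; mirrors the kernel's `struct sock_filter`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SockFilter {
    pub code: u16,
    pub jt: u8, // relative jump offset when the comparison is true
    pub jf: u8, // relative jump offset when the comparison is false
    pub k: u32, // immediate operand
}

// Opcode values from <linux/bpf_common.h>.
const BPF_LD_W_ABS: u16 = 0x20; // BPF_LD | BPF_W | BPF_ABS
const BPF_JMP_JEQ_K: u16 = 0x15; // BPF_JMP | BPF_JEQ | BPF_K
const BPF_RET_K: u16 = 0x06; // BPF_RET | BPF_K

// Values from <linux/seccomp.h> and <linux/audit.h>.
pub const SECCOMP_RET_ALLOW: u32 = 0x7fff_0000;
pub const SECCOMP_RET_KILL_PROCESS: u32 = 0x8000_0000;
pub const AUDIT_ARCH_X86_64: u32 = 0xC000_003E;

// Byte offsets of `nr` and `arch` inside `struct seccomp_data`.
const OFFSET_NR: u32 = 0;
const OFFSET_ARCH: u32 = 4;

/// Hypothetical helper: a hand-assembled filter that allows exactly one syscall.
pub fn allow_only(syscall_nr: u32) -> Vec<SockFilter> {
    let insn = |code, jt, jf, k| SockFilter { code, jt, jf, k };
    vec![
        insn(BPF_LD_W_ABS, 0, 0, OFFSET_ARCH),           // 0: load arch
        insn(BPF_JMP_JEQ_K, 1, 0, AUDIT_ARCH_X86_64),    // 1: x86_64? skip the kill
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS), // 2: wrong arch
        insn(BPF_LD_W_ABS, 0, 0, OFFSET_NR),             // 3: load syscall number
        insn(BPF_JMP_JEQ_K, 0, 1, syscall_nr),           // 4: match? fall through
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW),        // 5: allowed
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS), // 6: everything else
    ]
}

/// Illustration-only interpreter for the three opcodes used above.
/// Returns the filter's verdict, or `None` for anything it does not understand.
pub fn evaluate(prog: &[SockFilter], arch: u32, nr: u32) -> Option<u32> {
    let mut acc: u32 = 0;
    let mut pc = 0usize;
    while let Some(i) = prog.get(pc) {
        pc += 1;
        match i.code {
            BPF_LD_W_ABS => {
                acc = match i.k {
                    OFFSET_NR => nr,
                    OFFSET_ARCH => arch,
                    _ => return None,
                }
            }
            BPF_JMP_JEQ_K => pc += usize::from(if acc == i.k { i.jt } else { i.jf }),
            BPF_RET_K => return Some(i.k),
            _ => return None,
        }
    }
    None
}
```

Even in this tiny program the jump offsets have to be recomputed by hand whenever an instruction is added or removed, which is the maintainability problem a higher-level format is meant to address.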

Seccompiler is a project that makes seccomp easy to use, by providing a JSON file format that is used for expressing the seccomp filters, a compiler binary crate that transforms the JSON into BPF code and a small library interface that is used for installing the filters.

Let us take a closer look into the seccompiler components:

  • JSON format - a JSON file contains the filters for the entire process, split into thread categories. Here is a JSON file example. Here are the file format rules.
  • binary crate - The seccompiler binary compiles the JSON input into a binary file that contains a map from the thread name to the corresponding BPF filter.
./seccompiler
    --target-arch "x86_64"  # The CPU arch where the BPF program will run.
                            # Supported architectures: x86_64, aarch64.
    --input-file "x86_64_musl.json" # File path of the JSON input.
    --output-file "bpf_x86_64_musl" # Optional path of the output file.
                                    # [default: "seccomp_binary_filter.out"]
  • library crate - the seccompiler library exposes two helper functions, for deserializing and installing the BPF programs:
    • pub fn deserialize_binary<R: Read>(reader: R, bytes_limit: Option<u64>) -> Result<BpfThreadMap>
    • pub fn apply_filter(bpf_filter: BpfProgramRef) -> Result<()>
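As a rough sketch of how a VMM thread might consume the deserialized map, the snippet below looks up the filter for a thread category and hands it to an installer callback. Everything here is a stand-in for illustration: the `BpfProgram` and `BpfThreadMap` aliases only assume the "thread name to filter" shape described above, `install_for_thread` and `FilterError` are hypothetical names, and the callback takes the place of the real `apply_filter`, so no syscall is made.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for a compiled filter; the real type would hold BPF instructions.
pub type BpfProgram = Vec<u64>;
/// Assumed shape of the deserialized output: thread category -> filter.
pub type BpfThreadMap = HashMap<String, Arc<BpfProgram>>;

#[derive(Debug, PartialEq, Eq)]
pub enum FilterError {
    /// The compiled binary has no entry for the requested thread category.
    MissingCategory(String),
}

/// Hypothetical helper: find the filter for `category` and pass it to `apply`.
/// Fails closed, so a thread never starts unfiltered because of a typo in its name.
/// Returns the number of instructions handed to the installer.
pub fn install_for_thread<F>(
    filters: &BpfThreadMap,
    category: &str,
    mut apply: F,
) -> Result<usize, FilterError>
where
    F: FnMut(&[u64]),
{
    let prog = filters
        .get(category)
        .ok_or_else(|| FilterError::MissingCategory(category.to_string()))?;
    apply(prog.as_slice());
    Ok(prog.len())
}
```

In a real VMM each thread would call something like this as the first thing it does, with the category name it was built for (for example a hypothetical "vcpu" category), before touching any guest-controlled data.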

Based on the API defined above, there are a couple of usage models:

  • The compiled binary file is passed to the VMM at runtime. The JSON is compiled ahead of time.
  • The JSON is compiled at build time and the resulting binary is embedded in the VMM binary. Firecracker uses this model, leveraging cargo build scripts. Find out more about how Firecracker will be using seccompiler here.

With some modifications to seccompiler, other usage models are also possible (please comment if you would find them useful for your project):

  • The seccompiler library crate can expose an additional compile() function that VMMs can use to perform launch-time filter compilation, as opposed to the current build-time constraint. The VMM then needs to receive the JSON file as input.
  • The seccompiler library could also provide a library interface for expressing the filters as Rust code, instead of JSON. This is similar to the usage model of Cloud Hypervisor and Firecracker, prior to the API changes.
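To make the "filters as Rust code" option more concrete, here is a minimal sketch of what a programmatic interface could look like: a function that turns an allow list of syscall numbers into a BPF program. This is not the existing Firecracker or Cloud Hypervisor API and not a proposal for the final one. `compile_allowlist` is a hypothetical name, and the generated layout (a linear chain of equality checks) is just the simplest thing that works. The opcode and return-value constants are taken from the Linux UAPI headers.

```rust
use std::convert::TryFrom;

/// One classic BPF instruction; mirrors the kernel's `struct sock_filter`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SockFilter {
    pub code: u16,
    pub jt: u8,
    pub jf: u8,
    pub k: u32,
}

const BPF_LD_W_ABS: u16 = 0x20; // BPF_LD | BPF_W | BPF_ABS
const BPF_JMP_JEQ_K: u16 = 0x15; // BPF_JMP | BPF_JEQ | BPF_K
const BPF_RET_K: u16 = 0x06; // BPF_RET | BPF_K

pub const SECCOMP_RET_ALLOW: u32 = 0x7fff_0000;
pub const SECCOMP_RET_KILL_PROCESS: u32 = 0x8000_0000;
pub const AUDIT_ARCH_X86_64: u32 = 0xC000_003E;
pub const AUDIT_ARCH_AARCH64: u32 = 0xC000_00B7;

/// Hypothetical interface: build a filter that allows only the listed syscalls
/// on the given architecture and kills the process on anything else.
pub fn compile_allowlist(arch: u32, allowed: &[u32]) -> Result<Vec<SockFilter>, String> {
    // Jump offsets are 8 bits wide, so this naive layout supports at most 255 entries.
    let n = u8::try_from(allowed.len())
        .map_err(|_| format!("{} syscalls exceed the 255-entry jump range", allowed.len()))?;
    let insn = |code, jt, jf, k| SockFilter { code, jt, jf, k };

    let mut prog = vec![
        insn(BPF_LD_W_ABS, 0, 0, 4),                     // load seccomp_data.arch
        insn(BPF_JMP_JEQ_K, 1, 0, arch),                 // expected arch? skip the kill
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS), // wrong arch
        insn(BPF_LD_W_ABS, 0, 0, 0),                     // load seccomp_data.nr
    ];
    for (i, &nr) in allowed.iter().enumerate() {
        // On a match, jump over the remaining checks and the default kill,
        // landing on the final ALLOW. `i` is below `n`, so the cast is lossless.
        prog.push(insn(BPF_JMP_JEQ_K, n - i as u8, 0, nr));
    }
    prog.push(insn(BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS)); // no match
    prog.push(insn(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW));        // matched
    Ok(prog)
}
```

A caller would pass per-architecture syscall numbers, for example `compile_allowlist(AUDIT_ARCH_X86_64, &[0, 1, 60])` for read, write and exit on x86_64. A real interface would also need argument matching and per-rule actions, which this sketch leaves out.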

Why is this crate relevant to the rust-vmm project?

This crate has its origins in Firecracker's seccomp crate, which is undergoing major API changes in an attempt to improve seccomp filter maintainability and libc toolchain compatibility.

Firecracker, as well as other VMMs, uses seccomp to restrict the emulation and vCPU threads from making dangerous or unnecessary host system calls (execve, for example).

Cloud Hypervisor is using the same code as Firecracker for seccomp jailing, getting the code as a GitHub dependency.

Having the code in a rust-vmm crate will improve the consumption model. The current model of getting the code from GitHub at a fixed tag does not scale well with bugfixes and the API changes that we're bringing to Firecracker.

It's important to have this effort shared and deduplicated, as security is often a priority of VMMs.

+1

This will be great to have; my team would be keen to use this.

Our use case is a statically linked (musl) binary cross-compiled for x86 and aarch64. Because we're statically linking and using only a single version, the interface that allows specifying the syscall allow list directly in the code is the one we'd be keen to use, similar to how it's done in Firecracker at the moment. That'd be the library crate option, where we can keep everything in the Rust code and statically link to this library.

Thanks for your input, @jgowans. I think it would be nice to provide both the full library interface (similar to how Firecracker and Cloud Hypervisor are using it right now) and the seccompiler binary, so that the crate would satisfy more usage models.

I'm curious what folks from Cloud Hypervisor think about this. I remember you were also looking to keep the filters embedded in the VMM source code, without having to worry about external files.
Note, though, that this is also possible with Firecracker's upcoming usage model, which uses build scripts to run seccompiler and embed the resulting BPF in the VMM binary. But I agree that for some use cases this could be overkill. WDYT @rbradford?

Also, @jiangliu, out of the usage models outlined above, which one would you be interested in using?

In Cloud Hypervisor we want to continue to use the programmatic approach without external files or parsing JSON at runtime.

A compile! macro that builds the filter from an inline string or file at build time sounds great, but having to deploy multiple files is less desirable.

Repository was created: https://github.com/rust-vmm/seccompiler
I will post a PR with the README, including the full design, in the coming days.