cap-std

Capability-based version of the Rust standard library

A Bytecode Alliance project

The cap-std project is organized around the eponymous cap-std crate, and develops libraries to make it easy to write capability-based code.

There is also an independently maintained cap-std-ext crate, which includes further extension APIs for filesystem operations (including atomic create/replace, on Linux specifically) and for passing file descriptors to child processes.

Cap-std features protection against CWE-22, "Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')", which is #8 in the 2021 CWE Top 25 Most Dangerous Software Weaknesses. It can also be used to prevent untrusted input from inducing programs to open "/proc/self/mem" on Linux.

Capability-based security

Operating systems have a concept of resource handles, or file descriptors, which are values that can be passed around within and sometimes between programs, and which represent access to external resources. Programs typically have the ambient authority to request any file or network handle simply by providing its name or address:

let file = File::open("/anything/you/want.txt")?;

There may be access-control lists, namespaces, firewalls, or virtualization mechanisms governing which resources can actually be accessed, but those are typically coarse-grained and configured outside of the application.

Capability-based security seeks to avoid ambient authority, to make sandboxing finer-grained and composable. To open a file, one needs a Dir, representing an open directory it's in:

let file = dir.open("the/thing.txt")?;

Attempts to access paths not contained within the directory:

let hidden = dir.open("../hidden.txt")?;

dir.symlink("/hidden.txt", "misdirection.txt")?;
let secret = dir.open("misdirection.txt")?;

return PermissionDenied errors.

This allows application logic to configure its own access, without changing the behavior of the whole host process, setting up a separate host process, or requiring external configuration.
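
To make the rejected cases above concrete, here is a minimal sketch of a purely lexical version of the check, using only the standard library. The name check_stays_beneath is hypothetical and is not part of cap-std. cap-std resolves paths against the real filesystem, so it also catches symlink escapes such as the misdirection.txt case, which this sketch cannot see. The sketch also treats `..` lexically, which is only an approximation when symlinks are involved.

```rust
use std::io;
use std::path::{Component, Path};

/// Builds the kind of error the text above describes.
fn denied(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::PermissionDenied, msg)
}

/// Hypothetical helper (not cap-std's implementation): checks, by looking
/// only at the path string, that `path` never steps above the directory
/// it would be resolved against.
fn check_stays_beneath(path: &Path) -> io::Result<()> {
    // Number of directory levels below the starting directory.
    let mut depth: usize = 0;
    for component in path.components() {
        match component {
            // Absolute paths (and Windows prefixes) are rejected outright.
            Component::RootDir | Component::Prefix(_) => {
                return Err(denied("absolute paths are not allowed"));
            }
            Component::CurDir => {}
            // `..` is fine only while we are still below the start.
            Component::ParentDir => {
                depth = depth
                    .checked_sub(1)
                    .ok_or_else(|| denied("path escapes the directory"))?;
            }
            Component::Normal(_) => depth += 1,
        }
    }
    Ok(())
}
```

With this sketch, "the/thing.txt" and "a/../b" pass, while "../hidden.txt" and "/hidden.txt" are rejected with a PermissionDenied error.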

How do I obtain a Dir?

If every resource requires some other resource to obtain, how does one obtain the first resource?

There are currently three main ways:

  • Use the cap-directories crate to create Dirs for config, cache and other data directories.
  • Use the cap-tempfile crate to create Dirs for temporary directories.
  • Use Dir::open_ambient_dir to open a plain path. This function is not sandboxed, and may open any file the host process has access to.

Examples

There are several examples of cap-std in use:

  • As a sandbox: For a simple yet complete example of cap-std in action, see this port of tide, which uses cap-std to access static files and prevents path resolution from following symlinks outside of the designated root directory. The diff shows the kinds of changes needed to use this API.

  • As a general-purpose Dir type for working with directories: The io-streams crate uses cap-tempfile to create temporary directories for unit tests. Here, the main benefit of Dir is just convenience: Dir's API lets tests just say dir.open(...) instead of using open(path.join(...)) or dealing with chdir and global mutable state. The fact that it also sandboxes the unit tests is just a nice side effect.

  • As an application data store: See the kv-cli example for a simple example of a program using cap-directories and cap-std APIs to store application-specific data.

  • And, cap-std is a foundation for the WASI implementation in Wasmtime, providing sandboxing and support for Linux, macOS, Windows, and more.

What can I use cap-std for?

cap-std is not a sandbox for untrusted Rust code. Among other things, untrusted Rust code could use unsafe or the unsandboxed APIs in std::fs.

cap-std allows code to declare its intent and to opt in to protection from malicious path names. Code which takes a Dir from which to open files, rather than taking bare filenames, declares its intent to only open files underneath that Dir. And, Dir automatically protects against paths which might include .., symlinks, or absolute paths that might lead outside of that Dir.

cap-std also has another role, within WASI, because cap-std's filesystem APIs closely follow WASI's sandboxing APIs. In WASI, cap-std becomes a very thin layer, thinner than libstd's filesystem APIs because it doesn't need extra code to handle absolute paths.

How fast is it?

On Linux 5.6 and newer, cap-std uses openat2 to implement Dir::open with a single system call in common cases. Several other operations internally utilize openat2, O_PATH, and /proc/self/fd (though only when /proc is mounted, it's really procfs, and there are no mounts on top of it) for fast path resolution as well.

On FreeBSD 13.0 and newer, cap-std uses openat(O_RESOLVE_BENEATH) to implement Dir::open with a single system call in common cases. Several other operations internally utilize AT_RESOLVE_BENEATH and O_PATH for fast path resolution as well.

Otherwise, cap-std opens each component of a path individually, in order to specially handle .. and symlinks. The algorithm is carefully designed to minimize system calls, so opening red/green/blue performs just 5 system calls—it opens red, green, and then blue, and closes the handles for red and green.
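
The arithmetic behind the red/green/blue example can be sketched as a small model. It assumes the pattern in the example generalizes to any relative path made only of plain names: one open per component, and one close for every handle except the final one. The function fallback_syscalls is hypothetical and only counts; it performs no system calls and does not model `..` or symlinks, which cap-std handles specially.

```rust
use std::path::{Component, Path};

/// Hypothetical model (not cap-std code): returns `(opens, closes)` for
/// component-by-component resolution of a relative path of plain names.
/// Returns `None` for paths outside this simple model.
fn fallback_syscalls(path: &Path) -> Option<(usize, usize)> {
    let mut opens: usize = 0;
    for component in path.components() {
        match component {
            // Each plain name is opened relative to the previous handle.
            Component::Normal(_) => opens += 1,
            Component::CurDir => {}
            // `..`, absolute paths, and prefixes need special handling
            // that this model does not attempt.
            _ => return None,
        }
    }
    // Every intermediate handle is closed; the last one is returned.
    // An empty path has nothing to open, so it yields `None`.
    let closes = opens.checked_sub(1)?;
    Some((opens, closes))
}
```

For "red/green/blue" the model gives 3 opens and 2 closes, matching the 5 system calls described above.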

What about networking?

cap-std also contains a simple capability-based version of std::net, with a Pool type that represents a pool of network addresses and ports that can be accessed, which serves an analogous role to Dir. It's usable for basic use cases, though it's not yet very sophisticated.
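
As a rough illustration of the idea of a pool, and not of cap-std's actual Pool API, the following standard-library sketch keeps a set of permitted socket addresses and refuses anything else. The type AddrAllowList and its methods grant and check are hypothetical names, and returning PermissionDenied is an assumption chosen to mirror the Dir behavior described earlier.

```rust
use std::collections::HashSet;
use std::io;
use std::net::SocketAddr;

/// Hypothetical stand-in for the pool concept: the set of addresses
/// that code holding this value is allowed to reach.
#[derive(Default)]
struct AddrAllowList {
    allowed: HashSet<SocketAddr>,
}

impl AddrAllowList {
    /// Adds an address to the set of reachable addresses.
    fn grant(&mut self, addr: SocketAddr) {
        self.allowed.insert(addr);
    }

    /// Succeeds only for addresses that were previously granted.
    fn check(&self, addr: &SocketAddr) -> io::Result<()> {
        if self.allowed.contains(addr) {
            Ok(())
        } else {
            Err(io::Error::new(
                io::ErrorKind::PermissionDenied,
                "address is not in the pool",
            ))
        }
    }
}
```

Code that receives an AddrAllowList instead of a bare address string can only reach what it was handed, in the same way that code receiving a Dir can only open files beneath it.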

What is cap_std::fs_utf8?

It's similar to cap_std::fs, but uses camino for its Path types, so paths are always valid UTF-8. To use it, opt in by enabling the fs_utf8 feature and using cap_std::fs_utf8 in place of cap_std::fs.

There's also an experimental extension to fs_utf8 which allows losslessly encoding arbitrary host byte sequences within UTF-8 strings, using the arf-strings technique. To try this experiment, opt in by enabling the arf_strings feature.

Similar crates

cap-std provides similar functionality to the openat crate, with a similar Dir type with associated functions corresponding to *at functions. cap-std's Dir type performs sandboxing, including for multiple-component paths. And cap-std supports symlinks as long as they remain within the sandbox, while openat doesn't support following symlinks.

cap-std has some similar functionality to pathrs in that it also explicitly verifies that /proc has actual procfs mounted on it and nothing mounted on top, and it can also use openat2. And it has some similar functionality to unix_fd. However, cap-std uses RESOLVE_BENEATH-style resolution where absolute paths are considered errors, while pathrs and unix_fd use RESOLVE_IN_ROOT-style resolution, where absolute paths are interpreted as references to the base file descriptor. And overall, cap-std seeks to provide a portable std-like API which supports Windows in addition to Unix-like platforms, while pathrs provides a lower-level API that exposes more of the underlying openat2 options and only supports Linux, and unix_fd is specific to Unix-like platforms.

obnth is a new crate which appears to be very similar to cap_std::fs. It's not mature yet, and it doesn't support Windows. It does support openat2-like features such as RESOLVE_NO_XDEV, RESOLVE_NO_SYMLINKS, and RESOLVE_IN_ROOT, including emulation when openat2 isn't available.

Why use RESOLVE_BENEATH?

Capability-based security is all about granularity. We want to encourage applications and users to think about having separate handles for directories they need, so that they're isolated from each other, rather than in terms of having "root directories" containing multiple unrelated resources.

Also, some applications have "well known" absolute path strings present, such as "/etc/resolv.conf", and could accidentally use them within Dir methods. RESOLVE_BENEATH catches such errors early, rather than taking chances with user content inside the Dir.

And, RESOLVE_BENEATH handles symlinks within a Dir consistently. Accessing a symlink to an absolute path within a Dir is always an error. With RESOLVE_IN_ROOT, a symlink to an absolute path in a Dir may succeed, and potentially resolve to something different than it would when resolved through the process filesystem namespace.
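
The difference between the two styles can be shown with a small lexical model. This is a sketch under stated assumptions: Mode and resolve are hypothetical names, the function only manipulates path strings, and it does not follow symlinks or call openat2. It models RESOLVE_BENEATH by rejecting absolute paths and escaping `..` components, and RESOLVE_IN_ROOT by reinterpreting them relative to the base directory.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical selector for the two resolution styles discussed above.
#[derive(Clone, Copy)]
enum Mode {
    Beneath,
    InRoot,
}

/// Hypothetical lexical model: where would `path` land relative to `base`?
/// `None` means the lookup is treated as an error.
fn resolve(base: &Path, path: &Path, mode: Mode) -> Option<PathBuf> {
    let mut out = base.to_path_buf();
    // Number of levels currently below `base`.
    let mut depth: usize = 0;
    for component in path.components() {
        match component {
            Component::Prefix(_) => return None,
            Component::RootDir => match mode {
                // Beneath-style: absolute paths are errors.
                Mode::Beneath => return None,
                // In-root-style: "/" means the base directory itself.
                Mode::InRoot => {
                    out = base.to_path_buf();
                    depth = 0;
                }
            },
            Component::CurDir => {}
            Component::ParentDir => {
                if depth > 0 {
                    out.pop();
                    depth -= 1;
                } else if let Mode::Beneath = mode {
                    // Stepping above the base is an error.
                    return None;
                }
                // In-root-style: `..` at the base stays at the base.
            }
            Component::Normal(name) => {
                out.push(name);
                depth += 1;
            }
        }
    }
    Some(out)
}
```

In this model, resolving "/etc/resolv.conf" against a base of "/srv/data" fails in Beneath mode, while InRoot mode quietly yields "/srv/data/etc/resolv.conf", which is the kind of surprise the paragraphs above argue against.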

Minimum Supported Rust Version (MSRV)

This crate currently works on Rust 1.63, when default features are enabled. Some of the optional features have stricter requirements.