/mustang

Rust programs written entirely in Rust

Primary LanguageRustOtherNOASSERTION

mustang

Programs written entirely in Rust

Github Actions CI Status zulip chat crates.io page docs.rs docs

Mustang is a system for building programs built entirely in Rust, meaning they do not depend on any part of libc or crt1.o, and do not link in any C code.

Why? For fun! And to exercise some components built for other purposes (such as rustix) but which happen to also be part of what's needed to do what Mustang is doing. And in the future, possibly also for experimenting with new kinds of platform ABIs.

Mustang is organized as 4 crates:

  • origin, a Rust-idiomatic library for program and thread startup and shutdown
  • c-scape, the no_std part of a libc implementation
  • c-gull, the std-using part of a libc implementation
  • mustang, packages the above into a usable target

Mustang currently runs on Rust Nightly on Linux on x86-64, x86, aarch64, and riscv64. It aims to support all Linux versions supported by Rust, though at this time it's only tested on relatively recent versions. It's complete enough to run:

Mustang isn't about making anything safer, for the foreseeable future. The major libc implementations are extraordinarily well tested and mature. Mustang for its part is experimental and has lots of unsafe.

Usage

To use it, first install rust-src, which is needed by -Z build-std:

$ rustup component add rust-src --toolchain nightly

Then, set the RUST_TARGET_PATH environment variable to a path to the mustang/target-specs directory, so that you can name mustang targets with --target=.... For example, within a mustang repo:

$ export RUST_TARGET_PATH="$PWD/mustang/target-specs"

Then, in your own crate, add a dependency on mustang:

[dependencies]
mustang = "<current version>"

And add a mustang::can_run_this!(); to your top-level module (eg. main.rs). This does nothing in non-mustang-target builds, but in mustang-target builds arranges for mustang's libraries to be linked in.

mustang::can_run_this!();

Then, compile with Rust nightly, using -Z build-std and --target=<mustang-target>. For example:

$ cargo +nightly run --quiet -Z build-std --target=x86_64-mustang-linux-gnu --example hello
Hello, world!
$

That's a Rust program built entirely from Rust saying "Hello, world!"!

For more detail, mustang has an env_logger feature, which you can enable, and set RUST_LOG to see various pieces of mustang in action:

$ RUST_LOG=trace cargo +nightly run --quiet -Z build-std --target=x86_64-mustang-linux-gnu --example hello --features env_logger
[2021-06-28T06:28:31Z TRACE origin::program] Program started
[2021-06-28T06:28:31Z TRACE origin::threads] Main Thread[Pid(3916066)] initialized
[2021-06-28T06:28:31Z TRACE origin::program] Calling `.init_array`-registered function `0x5555558fb480(1, 0x7fffffffdb98, 0x7fffffffdba8)`
[2021-06-28T06:28:31Z TRACE origin::program] Calling `main(1, 0x7fffffffdb98, 0x7fffffffdba8)`
Hello, world!
[2021-06-28T06:28:31Z TRACE origin::program] `main` returned `0`
[2021-06-28T06:28:31Z TRACE origin::program] Program exiting
$

A simple way to check for uses of libc functions is to use nm -u, since the above commands are configured to link libc dynamically. If mustang has everything covered, there should be no output:

$ nm -u target/x86_64-mustang-linux-gnu/debug/examples/hello
$

C Runtime interop

To compile C code with a *-mustang-* target, you may need to tell the cc crate which C compiler to use; for example, for i686-mustang-linux-gnu, set the environment variable CC_i686-mustang-linux-gnu to i686-linux-gnu-gcc.

panic = "abort"

When using panic = "abort" in your Cargo.toml, change the -Z build-std to -Z build-std=panic_abort,std. See here for background.

Known Limitations

Known limitations in mustang include:

  • Dynamic linking isn't implemented yet. Similarly, position-independent static executables aren't implemented yet. Mustang defaults to creating dynamic executables with all libraries statially linked. Position-dependent static executables also work.
  • Many libc C functions that aren't typically needed by most Rust programs aren't implemented yet.
  • Enabling LTO doesn't work yet.
  • Unwinding isn't yet implemented on 32-bit arm, and catch_unwind does not yet work on 32-bit x86.

Background

Mustang is partly inspired by similar functionality in steed, but a few things are different. cargo's build-std is now available, which makes it much easier to work with custom targets. And Mustang is starting with the approach of starting by replacing libc interfaces and using std as-is, rather than reimplementing std. This is likely to evolve, but whatever we do, a high-level goal of Mustang is to avoid ever having to reimplement std.

Where does mustang go from here? Will it support feature X, platform Y, or use case Z? If origin can do program startup in Rust, and rustix can do system calls in Rust, what does it all mean?

And could mustang eventually support new ABIs that aren't limited to passing C-style argc/argv(/envp) convention, allowing new kinds of program argument passing?

Let's find out! Come say hi in the chat or an issue.

How does one port mustang to a new architecture?

  • Port rustix to the architecture, adding assembly sequences for making syscalls.
  • Port origin to the architecture, adding assembly sequences for program and thread primitives.
  • Create a target file in mustang/target-specs, by first following these instructions to generate a specification of a built-in target, and then:
    • change is-builtin to false
    • change dynamic-linking to false
    • add -nostartfiles and -Wl,--undefined=_Unwind_Backtrace to pre-link-args
    • add "vendor": "mustang" See other targets in the mustang/target-specs directory for examples.
  • Compile some of the programs in the examples directory, using the new target. Try nm -u on the binaries to check for undefined symbols which need to be implemented.
  • Add the architecture to tests/tests.rs.
  • Add CI testing to .github/workflows/main.yml, by copying what's done for other architectures.

How does one port mustang to a new OS?

One probably needs to do similar things as for a new architecture, and also write a new origin::rust implementation to handle the OS's convention for arguments, environment variables, and initialization functions.

Similar crates

c-scape has some similarities to relibc, but has a different focus. Relibc is aiming to be a complete libc replacement, while c-scape is just aiming to cover the things used by Rust's std and popular crates. Some parts of Relibc are implemented in C, while c-scape is implemented entirely in Rust.

c-scape is also similar to steed. See the Background for details.

The most significant thing that makes c-scape unique though is its design as a set of wrappers around Rust crates with Rust interfaces. C ABI compatibility is useful for getting existing code working, but once things are working, we can simplify and optimize by changing code to call into the Rust interfaces directly. This can eliminate many uses of raw pointers and C-style NUL-terminated strings, so it can be much safer.

Another similar project is tiny-std. Similar to steed, tiny-std contains its own implementation of std.