rust-lang playground 2023

Toying around a bit with rust-lang in 2023.

Why Rust? Because it promises the speed of C/C++ with type/memory safety, without the runtime overhead of golang. (Five or six threads for a simple Hello world...!?)

Install

Setting up Rust on Ubuntu/Jammy as a normal user:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Why? Because the rustc supplied with Ubuntu is slightly outdated, causing more headaches than stability.

This will install files in ~/.cargo, like ~/.cargo/bin/rustc and ~/.cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.142/src/lib.rs.

After ensuring your PATH is up to date, you can now create hello.rs:

fn main() {
    println!("Hello World!");
}

Compile and run:

$ rustc hello.rs
$ ./hello
Hello World!
$ stat -c%s hello
4249176

4MiB is rather large. We can make it a tad bit smaller:

$ rustc hello.rs --edition=2021 -C strip=symbols \
    -C lto=true -C opt-level=3 -C panic=abort
$ stat -c%s hello
301448
$ ./hello
Hello World!

Without the miscellaneous -C options, you'll get a (way) bigger binary. Until release, you'll probably want to stick with the defaults. (But see the use of Cargo.toml below.)

Note that adding -C panic=abort is less beneficial than it looks. The binary is linked against libgcc_s.so.1 regardless, even when we're not using stack unwinding on panic:

$ ldd hello
        linux-vdso.so.1 (0x00007ffe93d48000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ff685c23000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff6859fb000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ff685cb3000)
$ diff -pu <(nm -D hello-panic-unwind) <(nm -D hello-panic-abort)
--- /dev/fd/63      2023-05-04 12:01:40.967163712 +0200
+++ /dev/fd/62      2023-05-04 12:01:40.967163712 +0200
@@ -48,14 +48,12 @@
                  U sysconf@GLIBC_2.2.5
                  U __tls_get_addr@GLIBC_2.3
                  U _Unwind_Backtrace@GCC_3.3
-                 U _Unwind_DeleteException@GCC_3.0
                  U _Unwind_GetDataRelBase@GCC_3.0
                  U _Unwind_GetIP@GCC_3.0
                  U _Unwind_GetIPInfo@GCC_4.2.0
                  U _Unwind_GetLanguageSpecificData@GCC_3.0
                  U _Unwind_GetRegionStart@GCC_3.0
                  U _Unwind_GetTextRelBase@GCC_3.0
-                 U _Unwind_RaiseException@GCC_3.0
                  U _Unwind_Resume@GCC_3.0
                  U _Unwind_SetGR@GCC_3.0
                  U _Unwind_SetIP@GCC_3.0
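What -C panic=abort gives up can be shown with a small sketch. With the default unwind strategy, a panic can be intercepted using std::panic::catch_unwind; in a binary built with panic=abort the process would terminate at the panic! instead. The function name risky is made up for this illustration.

```rust
use std::panic;

// Hypothetical function that panics for one particular input.
fn risky(n: u32) -> u32 {
    if n == 0 {
        panic!("n must not be zero");
    }
    100 / n
}

fn main() {
    // With the default panic=unwind, catch_unwind returns Err for a panic.
    // With -C panic=abort, the process aborts at the panic! instead and
    // the second result is never produced.
    let ok = panic::catch_unwind(|| risky(4));
    let failed = panic::catch_unwind(|| risky(0));
    println!("ok = {:?}, failed.is_err() = {}", ok.ok(), failed.is_err());
}
```

The panic message itself is still printed to stderr by the default panic hook, whether or not the panic is caught.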

Why am I so obsessed with size? Because I'd like to use programs not only as microservices, but also as simple binaries. The concept of simple programs taking up 4MiB is ridiculous to me.

Variables and functions

For starters, this println! that we see is a macro, not a function. Because functions have a fixed arity, macros are used to support a varying number of arguments or differing argument types.

See variadics.
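As a rough illustration of how a macro gets around fixed arity, here is a minimal macro_rules! sketch. The macro name sum is invented for this example; it is not something from the standard library.

```rust
// Hypothetical macro: sums any number of u64 expressions.
macro_rules! sum {
    // No arguments: the sum is zero.
    () => { 0u64 };
    // One or more arguments (optional trailing comma): add the first
    // to the sum of the rest.
    ($first:expr $(, $rest:expr)* $(,)?) => {
        $first + sum!($($rest),*)
    };
}

fn main() {
    println!("{}", sum!());
    println!("{}", sum!(1));
    println!("{}", sum!(1, 2, 3));
}
```

The repetition $(, $rest:expr)* is what accepts an arbitrary number of arguments; a plain fn cannot express that.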

Calling a function might look like this:

fn add(a: u8, b: u8) -> u16 {
    // thread 'main' panicked at 'attempt to add with overflow'
    // run with `RUST_BACKTRACE=full` for a verbose backtrace
    //let c: u16 = (a + b) as u16;
    let c: u16 = (a as u16) + (b as u16);
    // We can do an explicit return
    return c;
    // Otherwise the last expression without a semicolon is the return value
    // (unreachable here, because of the return above)
    0xbeef
}

fn main() -> () {
    // See: https://doc.rust-lang.org/rust-by-example/hello/print.html
    println!("add = 0x{:x}", add(255, 255));
    // Exit with something other than 0?
    std::process::exit(1)
}

$ rustc func.rs
$ ./func
add = 0x1fe
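The overflow in the commented-out line of add() panics in a debug build; in a release build, where overflow checks are off by default, the plain + would wrap silently. The standard integer types offer explicit alternatives. A short sketch, with the helper names add_widening and add_checked chosen only for this example:

```rust
// Widening before the addition, as add() does, can never overflow a u16.
fn add_widening(a: u8, b: u8) -> u16 {
    u16::from(a) + u16::from(b)
}

// checked_add returns None on overflow instead of panicking.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    println!("widening:   0x{:x}", add_widening(255, 255));
    println!("checked:    {:?}", add_checked(255, 255));
    // wrapping_add wraps around modulo 256.
    println!("wrapping:   {}", 255u8.wrapping_add(255));
    // saturating_add clamps to u8::MAX.
    println!("saturating: {}", 255u8.saturating_add(255));
}
```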

What we've also learnt here is that we want to set RUST_BACKTRACE=full in the environment when running microservices. We do want full backtraces if something goes wrong.
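Besides the environment variable, a program can capture a backtrace itself through std::backtrace (available in recent stable toolchains). The sketch below assumes a made-up helper called report; it is only one possible way to attach a backtrace to an error message.

```rust
use std::backtrace::Backtrace;

// Hypothetical helper: formats an error report that always carries a backtrace.
fn report(msg: &str) -> String {
    // force_capture() ignores RUST_BACKTRACE;
    // Backtrace::capture() would honor the environment variable instead.
    let bt = Backtrace::force_capture();
    format!("error: {msg}\n{bt}")
}

fn main() {
    eprintln!("{}", report("something went wrong"));
}
```

Frames in a stripped binary will lack symbol names, so this is most useful when symbols are kept.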

Cargo.toml

For projects that are not toy examples, we'll use cargo and a Cargo.toml file.

Use cargo new to set up a directory:

$ cargo new helloproj
     Created binary (application) `helloproj` package
$ find helloproj/ -type f
helloproj/Cargo.toml
helloproj/src/main.rs

This includes the Hello world app we saw earlier and a Cargo.toml that looks like this:

[package]
name = "helloproj"
version = "0.1.0"
edition = "2021"

[dependencies]

This edition setting is important. Don't omit it: without it, Cargo falls back to the 2015 edition.

$ cd helloproj
$ cargo build
   Compiling helloproj v0.1.0 (rust-lang-playground-2023/helloproj)
    Finished dev [unoptimized + debuginfo] target(s) in 0.33s
$ ./target/debug/helloproj
Hello, world!

Setting default optimization options for the --release build in Cargo.toml:

[profile.release]
strip = true        # Automatically strip symbols from the binary
                    # (don't use for microservices, you want backtraces)
#opt-level = "z"    # Optimize for size?
lto = true          # Enable Link Time Optimization (LTO)
codegen-units = 1   # serial build, slow, but better opt
#panic = "abort"    # No debug stacktrace awesomeness?

Now we build using cargo build --release. The output is at ./target/release/helloproj.

Dependencies

Let's do this again, creating helloasm, but now we create a library instead.

We reimplement parts of "151-byte static Linux binary in Rust" (did I mention I like small things?), just to get a feel for Rust's low-level internals.

While still in the helloasm directory, we can add some dependencies:

$ cargo add syscalls
    Updating crates.io index
      Adding syscalls v0.6.10 to dependencies.
$ tail -n2 Cargo.toml
[dependencies]
syscalls = "0.6.10"

We rename main.rs to lib.rs and change its contents to:

use syscalls::{Sysno, syscall};

fn exit(n: usize) -> ! {
    unsafe {
        let _ignored_retval = syscall!(Sysno::exit, n);
        std::hint::unreachable_unchecked();
    }
}

fn write(fd: usize, buf: &[u8]) -> isize {
    // res is a Result<usize, Errno>
    let res = unsafe { syscall!(Sysno::write, fd, buf.as_ptr(), buf.len()) };
    // Map any error to -1
    match res {
        Ok(val) => val as isize,
        Err(_) => -1,
    }
}

#[no_mangle]
pub fn main() {
    write(1, "Hello, world!\n".as_bytes());
    exit(0);
}
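The write() wrapper returns the raw byte count, which may be smaller than buf.len() (a short write), and main() ignores it. A hedged sketch of a retry loop follows. To keep it runnable without the syscalls crate, the syscall is replaced by a closure; the names write_all and raw_write are illustrative only.

```rust
// Hypothetical helper: keeps calling `raw_write` until all of `buf` is written.
// `raw_write` stands in for the write() wrapper: it returns the number of
// bytes written, or a negative value on error.
fn write_all<F>(mut raw_write: F, mut buf: &[u8]) -> Result<(), &'static str>
where
    F: FnMut(&[u8]) -> isize,
{
    while !buf.is_empty() {
        match raw_write(buf) {
            n if n < 0 => return Err("write failed"),
            0 => return Err("write returned 0 bytes"),
            // Skip the part that was written and retry with the remainder.
            n => buf = &buf[n as usize..],
        }
    }
    Ok(())
}

fn main() {
    // Fake writer that accepts at most 4 bytes per call, to force short writes.
    let mut out: Vec<u8> = Vec::new();
    let res = write_all(
        |chunk| {
            let n = chunk.len().min(4);
            out.extend_from_slice(&chunk[..n]);
            n as isize
        },
        b"Hello, world!\n",
    );
    println!("{:?} {:?}", res, String::from_utf8_lossy(&out));
}
```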

We set the project output type to rlib in Cargo.toml:

[lib]
crate-type = ["rlib"]

I added a small Makefile for convenience, letting us extract main.o:

$ make
cargo build --release
   Compiling helloasm v0.1.0 (/home/walter/srcelf/rust-lang-playground-2023/helloasm)
    Finished release [optimized] target(s) in 0.09s
f=$(ar t target/release/libhelloasm.rlib | grep -vxF lib.rmeta) && \
  ar x target/release/libhelloasm.rlib "$f" && \
  mv "$f" main.o
$ objdump -dr main.o
...

0000000000000000 <main>:
   0:       48 8d 35 00 00 00 00    lea    0x0(%rip),%rsi        # 7 <main+0x7>
                        3: R_X86_64_PC32    .rodata..Lanon.fad58de7366495db4650cfefac2fcd61.0-0x4
   7:       b8 01 00 00 00          mov    $0x1,%eax
   c:       bf 01 00 00 00          mov    $0x1,%edi
  11:       ba 0e 00 00 00          mov    $0xe,%edx
  16:       0f 05                   syscall
  18:       b8 3c 00 00 00          mov    $0x3c,%eax
  1d:       31 ff                   xor    %edi,%edi
  1f:       0f 05                   syscall
  21:       0f 0b                   ud2

Okay. This demonstrates that we can write (close to) assembler code. This is totally not useful for common programming tasks.
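The disassembly can be read against the x86_64 Linux syscall convention: the syscall number goes in %eax, and the first three arguments in %edi, %rsi and %edx. A small sketch that checks the constants seen above; the constant names are chosen for this example and are not taken from the syscalls crate.

```rust
// x86_64 Linux syscall numbers, as seen in the `mov ...,%eax` instructions.
const SYS_WRITE: usize = 1; // mov $0x1,%eax
const SYS_EXIT: usize = 0x3c; // mov $0x3c,%eax

const MSG: &str = "Hello, world!\n";

fn main() {
    // %edi = 1 is the file descriptor (stdout), %edx = 0xe is the length.
    println!("write = {}, exit = {}", SYS_WRITE, SYS_EXIT);
    println!("len = {} (0x{:x})", MSG.len(), MSG.len());
}
```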

Can we call this from other Rust code?

If we go back to helloproj, we can add dependencies to our local project:

[dependencies]
helloasm = { path = "../helloasm" }

In this particular case, we used #[no_mangle] in helloasm to avoid getting a name like _ZN8helloasm4main17h7914df8e74e71984E.
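The mangled name is less opaque than it looks: after the _ZN prefix it is a sequence of length-prefixed path components, closed by an E, and the last component looks like a hash. A minimal sketch that splits such a name; split_legacy_symbol is a made-up helper, and it does not handle the escape sequences that real symbols can contain.

```rust
// Hypothetical helper: splits a legacy-mangled Rust symbol into its path
// components. Returns None if the input does not follow `_ZN<len><name>...E`.
fn split_legacy_symbol(sym: &str) -> Option<Vec<&str>> {
    let mut rest = sym.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts = Vec::new();
    while !rest.is_empty() {
        // Each component is a decimal length followed by that many bytes.
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        let len: usize = rest.get(..digits)?.parse().ok()?;
        let end = digits.checked_add(len)?;
        parts.push(rest.get(digits..end)?);
        rest = &rest[end..];
    }
    Some(parts)
}

fn main() {
    let sym = "_ZN8helloasm4main17h7914df8e74e71984E";
    println!("{:?}", split_legacy_symbol(sym));
}
```

For real work, an existing demangler is the better choice; this only shows the structure of the name.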

Now we'd get a duplicate name when changing our Hello world application:

fn main() {
    helloasm::main();
}

error: entry symbol `main` declared multiple times
 --> src/main.rs:1:1
  |
1 | fn main() {
  | ^^^^^^^^^
  |
  = help: did you use `#[no_mangle]` on `fn main`? Use `#[start]` instead

If we want to be able to call into helloasm from helloproj, we'll have to mangle (remove no_mangle) or rename the function:

#[no_mangle]
pub fn any_name_except_main() {
    write(1, "Hello, world, using syscalls!\n".as_bytes());
    exit(0);
}

Or, without #[no_mangle]:

pub fn main() {
    write(1, "Hello, world, using syscalls!\n".as_bytes());
    exit(0);
}

Either fix works:

$ ./target/debug/helloproj
Hello, world, using syscalls!

And that concludes basic cargo and library usage.

Keeping track of dependencies

Where do we keep track of source library versions, so that vulnerable components can be highlighted when security vulnerabilities are discovered?

Do we need a software bill of materials (SBOM)? Do we have to generate it ourselves? Can Rust keep track of library (crate) versions inside the binaries?

A quick glance at the crates shows cargo-auditable. Using it should be a matter of:

$ cargo install cargo-auditable cargo-audit

Build helloproj again, this time with cargo auditable build:

$ cargo auditable build

This adds a JSON blob to the binary:

$ objcopy --dump-section .dep-v0=/dev/stdout target/debug/helloproj | pigz -zd
{"packages":[
  {"name":"cc","version":"1.0.79","source":"crates.io","kind":"build"},
  {"name":"helloasm","version":"0.1.0","source":"local","dependencies":[9]},
  {"name":"helloproj","version":"0.1.0","source":"local","dependencies":[1],"root":true},
  {"name":"proc-macro2","version":"1.0.56","source":"crates.io","dependencies":[10]},
  {"name":"quote","version":"1.0.26","source":"crates.io","dependencies":[3]},
  {"name":"serde","version":"1.0.160","source":"crates.io","dependencies":[6]},
  {"name":"serde_derive","version":"1.0.160","source":"crates.io","dependencies":[3,4,8]},
  {"name":"serde_repr","version":"0.1.12","source":"crates.io","dependencies":[3,4,8]},
  {"name":"syn","version":"2.0.15","source":"crates.io","dependencies":[3,4,10]},
  {"name":"syscalls","version":"0.6.10","source":"crates.io","dependencies":[0,5,7]},
  {"name":"unicode-ident","version":"1.0.8","source":"crates.io"}]}

This is quite nice. It is the second thing to remember when building microservices with Rust, next to setting RUST_BACKTRACE=full.
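In the JSON blob, the numbers in each "dependencies" array appear to be indices into the packages list (helloproj lists [1], which is helloasm). Under that assumption, the full set of crates that end up in helloproj can be walked as sketched below. The table is copied by hand from the output above rather than parsed, and transitive_deps is a made-up helper.

```rust
use std::collections::BTreeSet;

// (name, indices of dependencies), copied from the JSON above.
const PACKAGES: &[(&str, &[usize])] = &[
    ("cc", &[]),
    ("helloasm", &[9]),
    ("helloproj", &[1]),
    ("proc-macro2", &[10]),
    ("quote", &[3]),
    ("serde", &[6]),
    ("serde_derive", &[3, 4, 8]),
    ("serde_repr", &[3, 4, 8]),
    ("syn", &[3, 4, 10]),
    ("syscalls", &[0, 5, 7]),
    ("unicode-ident", &[]),
];

// Hypothetical helper: collects the names of all packages reachable from `root`.
fn transitive_deps(root: usize) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(i) = stack.pop() {
        for &dep in PACKAGES[i].1 {
            // insert() returns true only the first time a name is seen.
            if seen.insert(PACKAGES[dep].0) {
                stack.push(dep);
            }
        }
    }
    seen
}

fn main() {
    // Index 2 is helloproj, the entry marked "root": true.
    for name in transitive_deps(2) {
        println!("{name}");
    }
}
```

A real tool would parse the embedded JSON instead; cargo-audit can read this data directly from the binary.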