Use bitsized types as if they were a feature of rust.

Primary language: Rust. License: Apache-2.0.

bilge: the most readable bitfields

Yes, this is yet another bitfield crate, but hear me out:

This is a bit better than what we had before.

I wanted a design fitting rust:

  • safe
    • types model as much of the functionality as possible and don't allow false usage
  • fast
    • like handwritten bit fiddling code
  • simple to complex
    • obvious and readable basic frontend, like normal structs
    • only minimally and gradually introduce advanced concepts
    • provide extension mechanisms

The lib is no-std (and fully const behind a "nightly" feature gate).

For some more explanations on the "why" and "how": blog post and reddit comments.

WARNING

Our current version is still pre-1.0, which means nothing is completely stable.

However, constructors, getters, setters and From/TryFrom should stay the same, since their semantics are very clear.

The nightly feature is tested on nightly-2022-11-03 and will not work on the newest nightly until const_convert comes back.

Usage

To make your life easier:

use bilge::prelude::*;

Infallible (From)

You can just specify bitsized fields like normal fields:

#[bitsize(14)]
struct Register {
    header: u4,
    body: u7,
    footer: Footer,
}

The attribute bitsize generates the bitfield, while 14 works as a failsafe, emitting a compile error if your struct definition doesn't declare 14 bits. Let's define the nested struct Footer as well:

#[bitsize(3)]
#[derive(FromBits)]
struct Footer {
    is_last: bool,
    code: Code,
}

As you can see, we added #[derive(FromBits)], which is needed for Register's getters and setters. Due to how rust macros work (outside-in), it needs to be below #[bitsize]. Also, bool can be used as one bit.

Code is another nesting, this time an enum:

#[bitsize(2)]
#[derive(FromBits)]
enum Code { Success, Error, IoError, GoodExample }

Now we can construct Register:

let reg1 = Register::new(
    u4::new(0b1010),
    u7::new(0b010_1010),
    Footer::new(true, Code::GoodExample)
);

Or, if we add #[derive(FromBits)] to Register and want to parse a raw register value:

let mut reg2 = Register::from(u14::new(0b11_1_0101010_1010));

And getting and setting fields is done like this:

let header = reg2.header();
reg2.set_footer(Footer::new(false, Code::Success));
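
For comparison, here is roughly what the same accesses look like when written by hand on a u16. This is only a sketch, not what bilge expands to: RawRegister and the constants are made-up names, and the bit layout (first declared field in the lowest bits) is inferred from the literal 0b11_1_0101010_1010 above.

```rust
/// Hand-rolled stand-in for the 14-bit `Register` (hypothetical, not bilge's output).
/// Assumed layout: header in bits 0..=3, body in bits 4..=10, footer in bits 11..=13.
#[derive(Clone, Copy, Debug, PartialEq)]
struct RawRegister(u16);

const HEADER_MASK: u16 = 0b1111;
const BODY_SHIFT: u32 = 4;
const BODY_MASK: u16 = 0b111_1111;
const FOOTER_SHIFT: u32 = 11;
const FOOTER_MASK: u16 = 0b111;

impl RawRegister {
    fn header(self) -> u16 {
        self.0 & HEADER_MASK
    }

    fn body(self) -> u16 {
        (self.0 >> BODY_SHIFT) & BODY_MASK
    }

    fn footer(self) -> u16 {
        (self.0 >> FOOTER_SHIFT) & FOOTER_MASK
    }

    /// Clear the footer bits, then write the new value (masked to 3 bits).
    fn set_footer(&mut self, footer: u16) {
        self.0 &= !(FOOTER_MASK << FOOTER_SHIFT);
        self.0 |= (footer & FOOTER_MASK) << FOOTER_SHIFT;
    }
}
```

Every getter and setter repeats a shift and a mask, and nothing stops a caller from passing a footer value that was meant for another field. That is the part the typed fields take over.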

Tuples and arrays of any kind are also supported:

#[bitsize(32)]
#[derive(FromBits)]
struct InterruptSetEnables([bool; 32]);

Which produces the usual getter and setter, but also element accessors:

let mut ise = InterruptSetEnables::from(0b0000_0000_0000_0000_0000_0000_0001_0000);
let ise5 = ise.val_0_at(4);
ise.set_val_0_at(2, ise5);
assert_eq!(0b0000_0000_0000_0000_0000_0000_0001_0100, ise.value);
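
What the element accessors do can be pictured with a small plain-Rust sketch. Flags, get and set are hypothetical names, and treating index 0 as the least significant bit is an assumption that matches the assert above; bilge's generated code may look different.

```rust
/// Plain-Rust stand-in for a 32-bit field of boolean flags (hypothetical).
struct Flags(u32);

impl Flags {
    /// Read the flag at `index`, where index 0 is the least significant bit.
    fn get(&self, index: usize) -> bool {
        assert!(index < 32, "index out of range");
        (self.0 >> index) & 1 == 1
    }

    /// Set or clear the flag at `index` without touching the other bits.
    fn set(&mut self, index: usize, on: bool) {
        assert!(index < 32, "index out of range");
        let mask = 1u32 << index;
        if on {
            self.0 |= mask;
        } else {
            self.0 &= !mask;
        }
    }
}
```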

Depending on what you're working with, only a subset of enum values might be clear, or some values might be reserved. In that case, you can use a fallback variant, defined like this:

#[bitsize(32)]
#[derive(FromBits, Debug, PartialEq)]
enum Subclass {
    Mouse,
    Keyboard,
    Speakers,
    #[fallback]
    Reserved,
}

which will convert any undeclared bits to Reserved:

assert_eq!(Subclass::Reserved, Subclass::from(3));
assert_eq!(Subclass::Reserved, Subclass::from(42));
let num = u32::from(Subclass::from(42));
assert_eq!(3, num);
assert_ne!(42, num);

or, if you need to keep the exact number saved, use:

#[fallback]
Reserved(u32),

assert_eq!(Subclass2::Reserved(3), Subclass2::from(3));
assert_eq!(Subclass2::Reserved(42), Subclass2::from(42));
let num = u32::from(Subclass2::from(42));
assert_eq!(42, num);
assert_ne!(3, num);
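
The difference between the two fallback flavours can be mimicked in plain Rust with handwritten From impls. The names Lossy and Lossless are hypothetical, and the sketch only reproduces the behaviour asserted above; it is not what the macros expand to.

```rust
/// Unit fallback: every undeclared value collapses into `Reserved`.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Lossy {
    Mouse,
    Keyboard,
    Speakers,
    Reserved,
}

impl From<u32> for Lossy {
    fn from(raw: u32) -> Self {
        match raw {
            0 => Lossy::Mouse,
            1 => Lossy::Keyboard,
            2 => Lossy::Speakers,
            _ => Lossy::Reserved,
        }
    }
}

impl From<Lossy> for u32 {
    fn from(value: Lossy) -> u32 {
        // `Reserved` is the fourth variant, so it always converts back to 3.
        value as u32
    }
}

/// Value-carrying fallback: the original number survives a round trip.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Lossless {
    Mouse,
    Keyboard,
    Speakers,
    Reserved(u32),
}

impl From<u32> for Lossless {
    fn from(raw: u32) -> Self {
        match raw {
            0 => Lossless::Mouse,
            1 => Lossless::Keyboard,
            2 => Lossless::Speakers,
            other => Lossless::Reserved(other),
        }
    }
}

impl From<Lossless> for u32 {
    fn from(value: Lossless) -> u32 {
        match value {
            Lossless::Mouse => 0,
            Lossless::Keyboard => 1,
            Lossless::Speakers => 2,
            Lossless::Reserved(raw) => raw,
        }
    }
}
```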

Fallible (TryFrom)

In contrast to structs, enums don't have to declare all of their bits:

#[bitsize(2)]
#[derive(TryFromBits)]
enum Class {
    Mobile, Semimobile, /* 0x2 undefined */ Stationary = 0x3
}

meaning this will work:

let class = Class::try_from(u2::new(2));
assert!(class.is_err());

If we wanted to compare the result with assert_eq! instead, we would first need to #[derive(Debug, PartialEq)] on Class, since assert_eq! needs those.

Let's do that, and use Class as a field:

#[bitsize(8)]
#[derive(TryFromBits)]
struct Device {
    reserved: u2,
    class: Class,
    reserved: u4,
}

This shows TryFrom being propagated upward. There's also another small help: reserved fields (which are often used in registers) can all have the same name.
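
The idea behind the propagated TryFrom (validate the raw value once, then access fields infallibly) can be sketched in plain Rust. PlainClass, PlainDevice and InvalidBits are hypothetical names, the layout with the first declared field in the lowest bits is an assumption, and none of this is bilge's generated code.

```rust
use std::convert::TryFrom;

/// Error carrying the rejected raw bits (hypothetical type).
#[derive(Debug, PartialEq)]
struct InvalidBits(u8);

#[derive(Debug, PartialEq, Clone, Copy)]
enum PlainClass {
    Mobile = 0,
    Semimobile = 1,
    // 2 is deliberately left undefined
    Stationary = 3,
}

impl TryFrom<u8> for PlainClass {
    type Error = InvalidBits;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(PlainClass::Mobile),
            1 => Ok(PlainClass::Semimobile),
            3 => Ok(PlainClass::Stationary),
            other => Err(InvalidBits(other)),
        }
    }
}

/// Assumed layout: bits 0..=1 reserved, bits 2..=3 class, bits 4..=7 reserved.
#[derive(Debug, PartialEq, Clone, Copy)]
struct PlainDevice(u8);

impl TryFrom<u8> for PlainDevice {
    type Error = InvalidBits;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        // The nested enum is the only fallible part, so its result decides
        // whether the whole word is accepted.
        PlainClass::try_from((raw >> 2) & 0b11).map_err(|_| InvalidBits(raw))?;
        Ok(PlainDevice(raw))
    }
}

impl PlainDevice {
    /// Infallible: `try_from` already rejected words with undefined class bits.
    fn class(self) -> PlainClass {
        PlainClass::try_from((self.0 >> 2) & 0b11).expect("validated in try_from")
    }
}
```

The check happens exactly once, at the conversion boundary; after that, the getter has no error path left to handle.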

Again, let's try to print this:

println!("{:?}", Device::try_from(0b0000_11_00));
println!("{:?}", Device::new(Class::Mobile));

And again, Device doesn't implement Debug:

DebugBits

For structs, you need to add #[derive(DebugBits)] to get an output like this:

Ok(Device { reserved_i: 0, class: Stationary, reserved_ii: 0 })
Device { reserved_i: 0, class: Mobile, reserved_ii: 0 }

For testing + overview, the full readme example code is in /examples/readme.rs.

Custom -Bits derives

One of the main advantages of our approach is that we can keep #[bitsize] pretty slim, offloading all the other features to derive macros. Besides the derive macros shown above, you can extend bilge with your own derive crates working on bitfields. An example of this is given in /tests/custom_derive.rs, with its implementation in tests/custom_bits.

Back- and Forwards Compatibility

The syntax is kept very similar to usual rust structs for a simple reason:

The end goal of this library is to support the adoption of LLVM's arbitrary bitwidth integers into rust, thereby allowing rust-native bitfields. Until then, bilge is using the wonderful arbitrary-int crate by danlehmann.

After all attribute expansions, our generated bitfield contains a single field, somewhat like:

struct Register { value: u14 }

This means you could modify the inner value directly, but it breaks type safety guarantees (e.g. unfilled or read-only fields). So if you need to modify the whole field, instead use the type-safe conversions u14::from(register) and Register::from(u14). It is possible that this inner type will be made private.

For some more examples and an overview of functionality, take a look at /examples and /tests.

Alternatives

benchmarks, performance, asm line count

First of all, basic benchmarking reveals that all alternatives mentioned here (besides deku) have about the same performance and line count. This includes a handwritten version.

build-time

Measuring the build time of the crate itself (both with and without its dependencies) yields these numbers on my machine:

                        debug   debug single crate   release    release single crate
bilge 1.67-nightly      8       1.8                  6          0.8
bitbybit 1.69           4.5     1.3                  13.5 [1]   9.5 [1]
modular-bitfield 1.69   8       2.2                  7.2        1.6

This was measured with cargo clean && cargo build [--release] --quiet --timings. Of course, the actual codegen time on an example project needs to be measured, too.

handwritten implementation

The common handwritten implementation pattern for bitfields in rust looks somewhat like benches/compared/handmade.rs, sometimes also throwing around a lot of consts for field offsets. The problems with this approach are:

  • readability suffers
  • offset, cast or masking errors could go unnoticed
  • bit fiddling, shifting and masking is done all over the place, in contrast to bitfields
  • beginners suffer, and I would argue even seniors do, since the real question is: "Why do we need to learn and debug bit fiddling if we can get most of it done by using structs?"
  • reimplementing different kinds of fallible nested-struct enum-tuple array field access might not be so fun
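
The point about masking errors going unnoticed is easy to reproduce. In the sketch below (hypothetical helper names, with a 7-bit field assumed to sit at bits 4..=10 of a u16), both setters compile and behave identically for valid input, but the first one silently corrupts the neighbouring field when the value is too large.

```rust
const BODY_SHIFT: u32 = 4;
const BODY_MASK: u16 = 0b111_1111;

/// Buggy: forgets to mask `body`, so an oversized value spills into the bits above.
fn set_body_unmasked(reg: u16, body: u16) -> u16 {
    (reg & !(BODY_MASK << BODY_SHIFT)) | (body << BODY_SHIFT)
}

/// Correct: the value is masked to 7 bits before being shifted into place.
fn set_body_masked(reg: u16, body: u16) -> u16 {
    (reg & !(BODY_MASK << BODY_SHIFT)) | ((body & BODY_MASK) << BODY_SHIFT)
}
```

With a dedicated 7-bit type for the field, the oversized value cannot be passed to the setter in the first place.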

modular-bitfield

The often used and very inspiring modular-bitfield has a few problems:

  • it is unmaintained and has a quirky structure
  • constructors use the builder pattern
    • makes user code unreadable if you have many fields
    • can accidentally leave things uninitialized
  • from_bytes can easily take invalid arguments, which turns verification inside-out:
    • modular-bitfield flow: u16 -> PackedData::from_bytes([u16]) -> PackedData::status_or_err()?
      • needs to check for Err on every single access
      • adds duplicate getters and setters with postfix _or_err
      • reinvents From<u16>/TryFrom<u16> as a kind of hybrid
    • bilge: usual type-system centric flow: u16 -> PackedData::try_from(u16)? -> PackedData::status()
  • big god-macro
    • powerful, but less readable to the devs of modular-bitfield
    • needs to cover many derives in itself, like impl Debug (other bitfield crates do this as well)
      • bilge: solves this by providing a kind of scope for -Bits-derives

and implementation differences:

  • underlying type is a byte array
    • can be useful for bitfields larger than u128
      • bilge: if your bitfields get larger than u128, you can most often split them into multiple bitfields of a primitive size (like u64) and put those in a parent struct which is not a bitfield

Still, modular-bitfield is pretty good, and I set out to build something equal to or hopefully better than it. Tell me where I can do better and I will try.

bitbybit

One of the libs inspired by the same crate is bitbybit, which is much more readable and up-to-date. Actually, I even helped and am still helping on that one as well. After experimenting and hacking around in their code though, I realized it would need to be severely changed for the features and structure I had in mind.

implementation differences (as of 26.04.23):

  • it can do read/write-only, array strides and repeat the same bits for multiple fields
    • bilge: these will be added the moment someone needs it
  • redundant bit-offset specification, which can help or annoy, the same way bilge's reserved fields can help or annoy

deku

After looking at a ton of bitfield libs on crates.io, I didn't find deku. I will still mention it here because it uses a very interesting crate underneath (bitvec). Currently (as of 26.04.23), it generates far more assembly and takes longer to run, since parts of the API are not const. I've opened an issue on their repo about that.

most others

Besides that, many bitfield libs try to imitate or look like C bitfields, even though these are hated by many. I argue most beginners would have the idea to specify bits with basic primitives like u1, u2, ... This also opens up some possibilities for calculation and conversion on those primitives.

Something similar can be said about bitflags, which, under this model, can be turned into simple structs with bools and enums.

Basically, bilge tries to convert bit fiddling, shifting and masking into more widely known concepts like struct access.
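
As a rough illustration of flags modelled as a struct of bools, here is a plain-Rust sketch with the conversions written by hand. Permissions, from_bits and to_bits are hypothetical names and the bit positions are assumptions; with bilge, the conversion code would come from the #[bitsize] attribute and the derives shown in the usage section rather than being written manually.

```rust
/// Hypothetical permission flags: bit 0 = read, bit 1 = write, bit 2 = execute.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Permissions {
    read: bool,
    write: bool,
    execute: bool,
}

impl Permissions {
    /// Decode the three lowest bits; higher bits are ignored in this sketch.
    fn from_bits(bits: u8) -> Self {
        Permissions {
            read: bits & 0b001 != 0,
            write: bits & 0b010 != 0,
            execute: bits & 0b100 != 0,
        }
    }

    /// Pack the flags back into a raw value.
    fn to_bits(self) -> u8 {
        (self.read as u8) | ((self.write as u8) << 1) | ((self.execute as u8) << 2)
    }
}
```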

About the name: a bilge is one of the "lowest" parts of a ship, nothing else to it :)

Footnotes

  1. This is just a weird rustc regression or my setup or something, not representative.