google/zerocopy

Default value for enums

Opened this issue · 2 comments

Project

It's not public but I am working on a high-throughput IO block device subsystem.

Use Case

I want to use enums for an operation log where the enum represents the operation type e.g. Put or Delete.

Current State

If I do

#[derive(FromBytes, FromZeroes, AsBytes, Unaligned)]
#[repr(u8)]
pub enum OperationType {
   Put = 1,
   Delete = 2
}

... I get an error that the enum must have all 256 variants.

Desired State

It sure would be nice if we could specify a default value somewhat like:

#[derive(FromBytes(default = 0), FromZeroes, AsBytes, Unaligned)]
#[repr(u8)]
pub enum OperationType {
   Unknown = 0,
   Put = 1,
   Delete = 2
}

This is similar to how serde handles this issue.

Thank you for the feature request! The FromBytes trait marks a type that can be soundly viewed from a buffer of arbitrary bytes. Those bytes may be owned, shared-borrowed, or mutably borrowed. How should this behave:

let bytes = &[42u8];
let ot: &OperationType = OperationType::from_bytes(bytes);

Both ot and bytes point to the same memory. OperationType doesn't have a variant that corresponds to the byte 42u8, and we can't mutate that memory during parsing because bytes is immutable. I don't see a way to reconcile these two issues.
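To make the conflict concrete, here is a minimal std-only sketch of what a default can do: it applies only to a copy of the byte, never to a reference into the buffer. The helper name from_byte_or_unknown is hypothetical and is not part of zerocopy.

```rust
/// The enum from the feature request, with an explicit fallback variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum OperationType {
    Unknown = 0,
    Put = 1,
    Delete = 2,
}

impl OperationType {
    /// Hypothetical helper (not a zerocopy API): converts a byte *by value*,
    /// mapping every unrecognized byte to `Unknown`.
    pub fn from_byte_or_unknown(b: u8) -> Self {
        match b {
            1 => Self::Put,
            2 => Self::Delete,
            _ => Self::Unknown,
        }
    }
}
```

Calling OperationType::from_byte_or_unknown(42) yields Unknown, whose discriminant is 0, while the source buffer still holds 42. A &OperationType pointing at that buffer would have to read 42 as a variant, which is why a default cannot be applied through a shared reference.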

That said, if your complaint is that defining 256 variants is irritating, I can only wholeheartedly agree. @joshlf perhaps we could provide an attribute macro to generate these excess variants.

Understood and thanks for the thoughtful reply. I suppose in general this is the issue with SerDe libraries ... there is an impedance mismatch between the semantics of the language into which the bytes are deserialized and the bytes themselves (and the attendant "wire" specification).

Instead of expecting the language to match perfectly to the serialized format e.g.:

#[repr(u8)]
pub enum OperationType {
  Foo = 0,
  Bar = 1,
  Baz = 2,
}

let bytes = &[1u8];
let ot: &OperationType = OperationType::from_bytes(bytes);

We could expect the language representation to include some way to detect a mismatch between the library's types and the binary representation. For example, Rust's SerDe library has you call deserialize from the Deserialize trait, and that method returns an error if deserialization fails. Zerocopy could offer methods that return the enum value if the byte matches a variant, and the raw bytes if it does not.

let bytes = &[3u8];
let ot: Enum<OperationType> = OperationType::from_bytes(bytes);
match ot.get() {
  Some(e) => {/*deserializes*/},
  None => {/*fails*/}
}
let raw = ot.raw();
let raw_mut = ot.raw_mut(); // Could return mutable slice containing just the value

... These method signatures are sketches for discussion. Tastes vary on specific method names and signatures :)
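For discussion, the idea above could be sketched in plain Rust roughly as follows. This is a std-only illustration, not zerocopy's API: the Enum wrapper, its method names, and the TryFrom<u8> impl are all assumptions. The wrapper stores a single raw byte, so every one of the 256 bit patterns is valid for it, and validation is deferred until get() is called.

```rust
use std::convert::TryFrom;
use std::marker::PhantomData;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum OperationType {
    Foo = 0,
    Bar = 1,
    Baz = 2,
}

impl TryFrom<u8> for OperationType {
    type Error = u8;
    // Returns the offending byte when it matches no variant.
    fn try_from(b: u8) -> Result<Self, u8> {
        match b {
            0 => Ok(Self::Foo),
            1 => Ok(Self::Bar),
            2 => Ok(Self::Baz),
            other => Err(other),
        }
    }
}

/// Hypothetical wrapper: same layout as `u8`, so any byte is a valid value.
#[repr(transparent)]
pub struct Enum<T> {
    raw: u8,
    _marker: PhantomData<T>,
}

impl<T: TryFrom<u8>> Enum<T> {
    /// Builds a wrapper from a copied byte.
    pub fn new(raw: u8) -> Self {
        Enum { raw, _marker: PhantomData }
    }

    /// Views a borrowed byte as a wrapper without copying it.
    pub fn from_ref(byte: &u8) -> &Self {
        // SAFETY: `Enum<T>` is `repr(transparent)` over `u8` (the marker is a
        // zero-sized, align-1 field), so both types share size, alignment,
        // and validity.
        unsafe { &*(byte as *const u8).cast::<Self>() }
    }

    /// Returns the variant if the raw byte matches one, `None` otherwise.
    pub fn get(&self) -> Option<T> {
        T::try_from(self.raw).ok()
    }

    /// Returns the raw byte whether or not it matches a variant.
    pub fn raw(&self) -> u8 {
        self.raw
    }

    /// Gives mutable access to the raw byte; requires a mutable wrapper.
    pub fn raw_mut(&mut self) -> &mut u8 {
        &mut self.raw
    }
}
```

With this shape, Enum::<OperationType>::from_ref(&bytes[0]) on the byte 3 gives a wrapper whose get() is None and whose raw() is 3. Note that raw_mut() is reachable only from an owned or mutably borrowed wrapper, so it does not conflict with the immutable-buffer case raised earlier in the thread.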