question: why is Buf32 #[repr(packed)]
Closed this issue · 11 comments
If I understand https://doc.rust-lang.org/nomicon/other-reprs.html#reprpacked correctly,
#[repr(packed)]
forces byte alignment. And it sounds like this may be problematic:
most architectures strongly prefer values to be aligned.
I very new to Rust (and system level programming), but this sounds scary to me.
Could you explain, why the Buf32
should be packed?
#[repr(packed)]
pub struct Buf32<H> {
pub ptr: *mut H,
pub len: u32,
pub cap: u32,
}
For Buf32
I think #[repr(packed)]
may not be necessary. But it’s also not problematic, because the fields happen to not require any padding: laying them out in order without padding puts each field at an offset from the start of the struct that is a multiple of its size. (u32
fields are 32-bit-aligned.)
The Tendril
struct also uses #[repr(packed)]
. It is also OK there for the same reason. But it is critical there because, when a Tendril represents a string of 8 bytes or less, two consecutive 32-bit fields are treated like a single [u8; 8]
buffer.
@kmcallister, could #[repr(packed)]
be removed on Buf32
?
Using packed
doesn't seem correct or necessary. It does technically allow you to run into UB in Rust (see rust issue 27060). It turns the struct into a minimum-alignment-1 struct, which is indeed incorrect.
It seems redundant, unless you have some very special use of Buf32? With its fields being of size 8, 4, 4 bytes (64-bit) and 4, 4, 4 bytes (32-bit), the struct already doesn't use any padding at all on common platforms.
@bluss Per rust-lang/rust#27060, packed
can only cause UB when it forces fields to be unaligned which it doesn’t for these specific structs, right?
See my comment above: I think packed
is not necessary on Buf32
but it is on Tendril
.
since align_of
reports 1 for Buf32, it seems to me that it allows the fields to be unaligned at the whims of the compiler, it's instructed that the struct only needs byte alignment (which is not true).
Even worse, generic custom code might ask align_of
and place the value depending on that reported alignment, then it's very easy to produce UB (technically, but it's hard to demonstrate since x86 allows it anyway..).
Ok, I’m pretty sure we can safely remove packed
from Buf32
, but again I’m more worried about Tendril
and its unsafe code linked above.
I would use repr(C)
instead, then fields are well aligned but you know that the fields are laid out in order, and then it's safe. I guess we are missing some representation alternatives here (fixed order but not C).
I suppose this unsafe code should probably use an union
now that that exists, but we want this library to run on stable Rust.
So, yeah. In the meantime we should probably remove packed
on Buf32
and replaced it with C
on Tendril
.
I'm looking into fixing Tendril and Buf32 by using repr(C). @SimonSapin, do you know what the Header struct is for?
Non-inline tendrils (any tendril with more than 8 bytes of data) have a heap allocated chuck of memory that starts with a Header
and continues with a bytes buffer that stores the string’s data. The header contains a refcount (which needs to be shared for reference-counted tendrils that share a buffer) and the capacity (size of the allocation, it’s there to keep size_of::<Tendril<_>>()
smaller).
#33 switches Tendril
to #[repr(C)]
, but leaves #[repr(packed)]
on Header
to keep the C API working. (An to avoid wasting 4 bytes of padding between the header and data in heap allocations, though maybe that’s not significant.)