Lokathor/bytemuck

Should `pod_collect_to_vec()` input type require `AnyBitPattern`?

MarijnS95 opened this issue · 8 comments

Disclaimer: I'm still fuzzy on the various traits and their requirements.

pod_collect_to_vec() requires NoUninit + AnyBitPattern on both the source slice type and the output Vec type. However, casts like cast_vec and cast_slice only require NoUninit on the source type (all of its bytes can safely be read, but it doesn't need to be zero-initializable) and require the full AnyBitPattern only on the output type (for pod_collect_to_vec(): presumably to first zero-initialize the output Vec before copying into it?).

bytemuck/src/allocation.rs

Lines 285 to 288 in 1039388

pub fn pod_collect_to_vec<
A: NoUninit + AnyBitPattern,
B: NoUninit + AnyBitPattern,
>(

Is this constraint desired (and if so, can it be more clearly documented?) or accidental (and can it be relaxed)?

Thanks for this crate by the way! It's solving a bunch of problems for us that have always been looming around the corner 😬

The NoUninit requirement on the destination type comes from the guts of the function itself:

pub fn pod_collect_to_vec<
  A: NoUninit + AnyBitPattern,
  B: NoUninit + AnyBitPattern,
>(
  src: &[A],
) -> Vec<B> {
  let src_size = size_of_val(src);
  // Note(Lokathor): dst_count is rounded up so that the dest will always be at
  // least as many bytes as the src.
  let dst_count = src_size / size_of::<B>()
    + if src_size % size_of::<B>() != 0 { 1 } else { 0 };
  let mut dst = vec![B::zeroed(); dst_count];

  let src_bytes: &[u8] = cast_slice(src);
  let dst_bytes: &mut [u8] = cast_slice_mut(&mut dst[..]);
  dst_bytes[..src_size].copy_from_slice(src_bytes);
  dst
}
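The dst_count expression in there is a ceiling division. A quick std-only check of the arithmetic (dst_count below is a standalone re-statement of that expression, not a bytemuck API):

```rust
// Standalone re-statement of the sizing arithmetic in pod_collect_to_vec:
// round up so the destination Vec covers at least src_size bytes.
fn dst_count(src_size: usize, dst_elem_size: usize) -> usize {
    src_size / dst_elem_size
        + if src_size % dst_elem_size != 0 { 1 } else { 0 }
}

fn main() {
    // 3 u16 elements = 6 bytes; u32 is 4 bytes, so 2 u32s (8 bytes) are
    // needed, and the trailing 2 bytes of the last u32 stay zeroed.
    assert_eq!(dst_count(6, 4), 2);
    // 8 bytes fit exactly into 2 u32s: no rounding, no padding.
    assert_eq!(dst_count(8, 4), 2);
}
```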

The actual copy from the Vec<A> into the new Vec<B> happens via two equally sized [u8] references. However, we can't create both of those [u8] references if either type has uninit bytes.

If we really had to, we could use slice pointers instead of slice references, but then the function itself would have to do one or two unsafe calls. Even within bytemuck, it's preferred when a function doesn't need to open up any unsafe of its own and we can just rely on the other traits/functions.

So, it's not a copy-paste error, but it's also not an absolute requirement of performing this operation. It's possible, in an absolute sense, to relax the restriction, but without a strong and clear motivating case I wouldn't want to relax the restriction "just to do it".

NoUninit requirement on the destination type

I should have been clearer and repeated the title once more: my main concern is with the input/source type. After all, src is only passed into cast_slice, which only requires NoUninit.

but without a strong and clear motivating case I wouldn't want to relax the restriction "just to do it".

I don't want to suddenly derive bytemuck::Pod, bytemuck::Zeroable on all my source types that are currently defined as bytemuck::NoUninit, if a simpler alternative is to just call cast_slice(src).to_vec(). It'd be more in line with what we are currently using for cast_slice and cast_vec.

The actual copy from the Vec<A> into the new Vec<B> happens via two equally sized [u8] references. However, we can't create both of those [u8] references if either type has uninit bytes.

Still fine (and it makes sense) to keep NoUninit, it is the AnyBitPattern on A/src that is the culprit.

EDIT: Here I wonder what this additional complexity is for, over just calling .to_vec() on the result from cast_slice(src)?

If we really had to, we could use slice pointers instead of slice references, but then the function itself would have to do one or two unsafe calls. Even within bytemuck, it's preferred when a function doesn't need to open up any unsafe of its own and we can just rely on the other traits/functions.

Fwiw, it looks like everything compiles just fine without any changes (no unsafe additions) beyond dropping AnyBitPattern from A.

Ahhhh, I understand the concern now (I think).

Opened #178, which I think fixes what you're after.

Here I wonder what this additional complexity is for, over just calling .to_vec() on the result from cast_slice(src)?

The way it's written, casting both vecs to bytes and then copying as bytes lets you go up in alignment (e.g. u16 to u32) without fail.
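To make the alignment point concrete, here's a std-only sketch (no bytemuck; widen_to_u32_le is a hypothetical helper) of the "copy into a zeroed, wider Vec" behavior. It uses from_le_bytes so the expected values are deterministic, whereas pod_collect_to_vec does a single native-endian memcpy:

```rust
// Std-only sketch of the "widen with zero padding" step: 6 input bytes become
// two u32s, and the tail of the last u32 stays zero because the destination
// starts zero-filled.
fn widen_to_u32_le(src: &[u8]) -> Vec<u32> {
    let mut out = vec![0u32; (src.len() + 3) / 4]; // round up, like dst_count
    for (dst, chunk) in out.iter_mut().zip(src.chunks(4)) {
        let mut bytes = [0u8; 4];
        bytes[..chunk.len()].copy_from_slice(chunk); // short tail stays zero
        *dst = u32::from_le_bytes(bytes);
    }
    out
}

fn main() {
    assert_eq!(widen_to_u32_le(&[1, 0, 0, 0, 2, 0]), vec![1, 2]);
}
```

Because the copy goes byte-wise into a freshly allocated Vec<u32>, the source's alignment never matters; reinterpreting the source bytes in place as &[u32] could fail on a misaligned pointer.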

Ahhhh, I understand the concern now (I think).

Opened #178, which I think fixes what you're after.

Yes, that's exactly it! Lots of unrelated changes though; I found and approved what is relevant :)

The way it's written, casting both vecs to bytes and then copying as bytes lets you go up in alignment (e.g. u16 to u32) without fail.

Right, of course, that is what the Error and unwrap are for (within (try_)cast_*). But it means to_vec() might be more efficient when we only need to go down, from some &[T] to a Vec<u8>, by saving on the zero-initialization (which I hope to_vec() does not do, via MaybeUninit... TBD).
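For the downcast direction, a std-only sketch (to_bytes_le is a hypothetical helper; cast_slice(src).to_vec() would instead do a single native-endian copy rather than the per-element to_le_bytes here):

```rust
// Std-only sketch of the downcast direction: &[u32] -> Vec<u8> needs no
// zero-initialization at all, which is why cast_slice(src).to_vec()
// suffices there. to_le_bytes is used for deterministic output.
fn to_bytes_le(src: &[u32]) -> Vec<u8> {
    src.iter().flat_map(|v| v.to_le_bytes()).collect()
}

fn main() {
    assert_eq!(to_bytes_le(&[1, 2]), vec![1, 0, 0, 0, 2, 0, 0, 0]);
}
```

To my knowledge, slice::to_vec copies into freshly allocated capacity rather than zero-filling it first, so the downcast path shouldn't pay for initialization twice.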

Lots of unrelated changes though

Clippy is always coming after me with new warnings :(

But it means to_vec() might be more efficient when we only need to go down, from some &[T] to a Vec

If you statically know that the alignment won't go up then yeah just using cast_slice(data).to_vec() will likely be more efficient.

I seem to recall that this fn was long ago suggested by someone that wanted to go up in alignment (u8 to u32, or maybe it was f32 to f32x4?)

Stricter clippy lints are nice at times (clippy keeps getting better and more complete at pointing out code smells in a codebase) but are sometimes overly pedantic and unnecessarily block unrelated PRs. Fwiw, uninlined_format_args was demoted to pedantic.

Yeah, we statically know the alignment always goes down to u8, effectively using it as bytes_of on a &[T]; we currently need an owned Vec but hope to relax that to a borrow at some point.

I seem to recall that this fn was long ago suggested by someone that wanted to go up in alignment (u8 to u32, or maybe it was f32 to f32x4?)

That's what I understand from reading the original issue #49 (PR #50). At first I just thought this was a neat helper but now understand its applicability more clearly. I'll use it in places where alignment requirements go up (and where we need an allocation).

Oh and @Lokathor, while I have you here there are a few questions that I don't feel like spawning new issues/discussions for (but can definitely do if desired!).

NoUninit on nested structs with arrays

It seems I cannot derive(NoUninit) on a struct that uses another derive(NoUninit) struct within an array:

#[repr(C)]
#[derive(Clone, Copy, NoUninit)]
struct Foo(pub u32);

#[repr(C)]
#[derive(Clone, Copy, NoUninit)]
struct Bar {
    my_foo: Foo,
    my_foo_arr: [Foo; 4], // Error: doesn't implement Pod
    blah: u32,
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1d5f5e8a37533c81109d18d97b720e61

Is this something to do with #141?

io::Cursor helpers etc?

For now I'm writing code along the lines of:

let mut blah = vec![Blah::zeroed(); num_blah];
stream.read_exact(bytemuck::cast_slice_mut(&mut blah))?;

(This is still better than the vec![0u8; ...]; read_exact(); from_raw_parts() approach that was there previously, which didn't care about alignment.)
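If bytemuck isn't on hand, the same read pattern can be sketched with only std (read_u32s is a hypothetical helper; the bytemuck version above instead reads straight into the zeroed Vec's byte view, skipping the per-element decode):

```rust
use std::io::{Cursor, Read};

// Std-only sketch of reading `count` fixed-size records from a stream:
// read the raw bytes with read_exact, then decode element by element.
fn read_u32s<R: Read>(stream: &mut R, count: usize) -> std::io::Result<Vec<u32>> {
    let mut bytes = vec![0u8; count * 4];
    stream.read_exact(&mut bytes)?;
    Ok(bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes(c.try_into().unwrap()))
        .collect())
}

fn main() -> std::io::Result<()> {
    let mut cur = Cursor::new(vec![1, 0, 0, 0, 2, 0, 0, 0]);
    assert_eq!(read_u32s(&mut cur, 2)?, vec![1, 2]);
    Ok(())
}
```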