rust-lang/rust

attempted to leave type (enum) uninitialized, which is invalid

Closed this issue · 9 comments

This code worked fine a couple of months ago:

#[derive(Copy, Clone, Debug)]
enum Fruit {
    Apple,
    _Banana,
}

fn foo() -> Fruit {
    unsafe {
        let mut r = mem::uninitialized();
        ptr::write(&mut r as *mut Fruit, Fruit::Apple);
        r
    }
}

Now (1.44.1), this code crashes with:

thread 'main' panicked at 'attempted to leave type `Fruit` uninitialized, which is invalid', ...

This code looks perfectly valid to me: we create a slot for a variable, write to that slot exactly once, and return the value.

Play

lcnr commented

This code is sadly not perfectly valid. mem::uninitialized can set r to be any possible value, while only 0 and 1 are actually expected. The compiler can use this info to optimize your code, for example by storing some data in the unused bits of r.

To implement this without causing undefined behavior, you can use
MaybeUninit here (which prevents the compiler from assuming that r only requires the lowest bit):

use std::mem::MaybeUninit;
use std::ptr;

#[derive(Copy, Clone, Debug)]
enum Fruit {
    Apple,
    _Banana,
}

fn foo() -> Fruit {
    unsafe {
        let mut r = MaybeUninit::uninit();
        ptr::write(r.as_mut_ptr(), Fruit::Apple);
        r.assume_init()
    }
}

fn main() {
    println!("{:?}", foo());
}

Closing as expected behavior

Maybe technically it is UB, but practically that code worked fine for many years, and suddenly broke.

MaybeUninit was stabilized less than a year ago. I did not switch my code to MaybeUninit because of that reason: to avoid forcing users to switch to the latest compiler (people complain when they have to upgrade the compiler because of tiny library change). And suddenly the library broke.

And even if it is that bad, then why panic at runtime, instead of compilation error? The latter is much more damaging because it's harder to spot, harder to debug: if mem::uninitialized() call is not on the main execution branch, it may crash only in certain rare circumstances.

This is not how backwards compatibility should work.

Anyway, I have already patched the library, if you think this is not an issue, let be it.

Also, CC @RalfJung who has implemented this check in #66059.

UB is a property of the execution, not always possible to statically determine (e.g., you may have gated the mem::uninitialized on a zero-sized check or so and that happened to be sufficient to avoid UB in all cases), so this must be a runtime panic rather than a compile time error.

This code looks perfectly valid to me

Unfortunately, the code is not valid. It violates one of the fundamental rules of Rust:

It is the programmer's responsibility when writing unsafe code to ensure that any safe code interacting with the unsafe code cannot trigger these behaviors. unsafe code that satisfies this property for any safe client is called sound; if unsafe code can be misused by safe code to exhibit undefined behavior, it is unsound.
[...]

  • Producing an invalid value, even in private fields and locals. "Producing" a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation. The following values are invalid (at their respective type):
    • A discriminant in an enum not included in the type definition.

These rules are as fundamental to Rust works as its type system. Unlike the type system, we cannot reliably ensure that you follow these rules when writing unsafe code. It is the responsibility of unsafe code authors to ensure that their code follows these rules.

Some more background on Undefined Behavior:

This code worked fine a couple of months ago:

You were lucky. But this code had a bug all along, and that bug could have surfaced any time you change any of your code, update a dependency, or update the compiler.
Newer compilers got better at detecting this particular kind of bug.

And even if it is that bad, then why panic at runtime, instead of compilation error? The latter is much more damaging because it's harder to spot, harder to debug: if mem::uninitialized() call is not on the main execution branch, it may crash only in certain rare circumstances.

Agreed, a static check would be better. That's why we have a lint that can detect some cases of using mem::uninitialized the wrong way, added in #63346 and expanded since then. (Try replacing your type by bool, for example. Then the lint should fire.) Unfortunately, it cannot detect your case. Static detection is harder that dynamic checks, and in general static detection of this issue is impossible (see halting problem). That's why we have both: the more user-friendly static check, and the more reliable dynamic check.

Your case seems detectable though. Feel free to report a bug to improve that lint (and Cc me).

MaybeUninit was stabilized less than a year ago. I did not switch my code to MaybeUninit because of that reason: to avoid forcing users to switch to the latest compiler (people complain when they have to upgrade the compiler because of tiny library change).

https://crates.io/crates/maybe-uninit could help with this.

I should also mention that this UB is not theoretical, it has already lead to SIGILL errors in the past.

I'll stop now.^^ Sorry for the multipost.

@RalfJung thank you for the explanations and your work on improving Rust safety!