Unsound construction of uninitialized values

Question

Unsound construction of uninitialized values

Ralith opened this issue 5 years ago · 7 comments

https://github.com/jeromefroe/lru-rs/blob/master/src/lib.rs#L222-L223 uses MaybeUninit in a way that is exactly equivalent to the old mem::uninitialized which was deprecated for being practically always unsound. The correct way to use MaybeUninit is to never call assume_init until you have written a legal value to it. In the linked section of code, values of type K and V are obtained from uninitialized memory; these values might have types like &'static [u8] or ! for which this operation is insta-UB, for example.

Answer 1 · 2020-02-05T15:02:26.000Z

Hey @Ralith! Thanks for calling this out. Based on the documentation, it is not possible to create a struct by calling MaybeUninit::uninit::<Struct>() and then writing to its fields. Consequently, if I understand the documentation correctly, it's not currently possible for us to first write legal values to head and tail, which are structs of the type LRUEntry, and then call assume_init.

With that being said though, after the call to assume_init, we immediately write legal values to the next and previous pointer fields of the head and tail of the cache respectively. Furthermore, we only ever access these fields of the head and tail, we never access the key or value fields of the entries and we never access the previous and next pointers of the head and tail respectively either. Therefore, since we never access fields which are uninitialized, I believe the code is safe. Please let me know if this matches what you were thinking, or if I misunderstood something along the way. Any suggestions on how we can perhaps not need to use unsafe code at all would also be extremely welcome!

Answer 2 · 2020-02-05T16:24:38.000Z

Consequently, if I understand the documentation correctly, it's not currently possible for us to first write legal values to head and tail, which are structs of the type LRUEntry, and then call assume_init.

head and tail are raw pointers, not structs, but indeed you cannot initialize only a subset of the struct that contains them. Even if there was a way to, this would still be insta-UB because it constructs a K and a V from uninitialized memory. If you only needed pointers, you could initialize them to null instead.

Therefore, since we never access fields which are uninitialized, I believe the code is safe.

This is unfortunately false. I did mean it when I said insta-UB. Referring again to the docs:

Calling this when the content is not yet fully initialized causes immediate undefined behavior.

I haven't fully absorbed the design of this library so I'm not in a good position to recommend new designs. Using raw pointers to construct an intrusive linked list seems reasonable in general, but obtaining a K and V from uninitialized memory as is currently done means that constructing an LruCache is insta-UB. A straw alternative would be to wrap the key/val fields in Option so that you could initialize safely with None instead.

Answer 3 · 2020-02-05T18:51:40.000Z

head and tail are raw pointers, not structs, but indeed you cannot initialize only a subset of the struct that contains them.

Indeed, I should have been more clear and specified that the structs pointed to by head and tail could not be initialized with MaybeUninit.

This is unfortunately false. I did mean it when I said insta-UB.

You're correct. While I think in practice the code would be okay because we don't access the uninitialized fields and implement Drop manually for the cache to ensure the compiler doesn't try to drop them, we have no such guarantees.

A straw alternative would be to wrap the key/val fields in Option so that you could initialize safely with None instead.

I actually like this idea and put up a PR taking this approach in #71. Any thoughts or feedback you may have would be welcome.

Answer 4 · 2020-07-14T03:24:45.000Z

Is it still the case? I can't find any dangerous .assume_inits in the code anymore

Answer 5 · 2020-07-14T12:49:23.000Z

A lot of the MaybeUnit code was changed in #74 and now we only call assume_init when returning a value to the caller so I believe this issue can be closed. I'll leave it open for now though so others can comment if they think differently.

Answer 6 · 2020-08-01T20:17:12.000Z

The new approach looks good to me too.

Answer 7 · 2020-08-02T14:30:17.000Z

Great! I'll go ahead and close this issue then.