rust-lang/unsafe-code-guidelines

Validity of references: Memory-related properties

RalfJung opened this issue · 114 comments

Discussing the memory-related properties of references: does &T have to point to allocated memory (with at least size_of::<T>() bytes being allocated)? If yes, does the memory have to contain data that satisfies the validity invariant of T?

If the answer to both of these questions is "yes", one consequence is that &! is uninhabited: There is no valid reference of type &!.

Currently, during LLVM lowering, we add a "dereferencable" attribute to references, indicating that the answer to the first question should be "yes". This is a rather unique case in that this is the only case where validity depends on the contents of memory. This opens some new, interesting questions:

  1. I mentioned above that size_of::<T>() many bytes need to be dereferencable. How do we handle unsized types? We could determine the size according to the metadata and the type of the unsized tail. For slices, that's really easy, but for trait objects this involves the vtable, so it would introduce yet another kind of dependency of validity on memory. However, vtables must not be modified, and they are never deallocated (right?), so this is a fairly weak form of dependency: if a pointer was a valid vtable pointer once, then it always will be.

    With more exotic forms of unsized types, this becomes less easy. extern type we can mostly ignore: we cannot even dynamically know its size, so we basically have to assume it is 0 and check dereferencability for that. But what about custom DSTs? I don't think we want to make validity depend on executing arbitrary user-defined code. We could just check validity for the sized prefix of such an unsized type, but that would introduce an inconsistency between primitive DSTs and user-defined custom DSTs. Is that a problem?

    For unsized types, even the requirement that the pointer be well-aligned becomes subtle, because determining the alignment has the same issues as determining the size.

  2. What about validity of ManuallyDrop<&T>? ManuallyDrop<T> certainly shares all the bit-level properties of T, because we perform layout optimization on it. But does ManuallyDrop<&T> have to be dereferencable?

Note that this is not about aliasing or provenance; those should be discussed separately -- a bunch of open issues already exist for provenance in general and stacked borrows specifically.

CURRENT STATE: The thread is long and there were many positions without a good summary. My own latest position can be found here.

Notice that there is an alternative that would make validity not depend on memory, while maintaining the dereferencable attribute: the Stacked Borrows aliasing model includes an operation on references called "retagging" that, among other things, raises UB if the reference is not dereferencable. So, if we answer the two questions from the OP with "yes" and "no", respectively, we could equivalently say that validity does not make any requirements about references being dereferencable, but the aliasing model does. That would make validity be a property that only depends on raw bits, not on memory, which would simplify the discussion elsewhere (and resolve #50 (comment)).

With this approach, the properties of ManuallyDrop<&T> would be determined by whether retagging descends into the fields of ManuallyDrop or not.

Concerning the second question, my personal thinking is that we should not require the pointed-to memory to be valid itself.

One good argument for making things UB with a strict validity invariant is bug-checking tools, but in this case actually doing recursive checking of references all the time is really costly, and probably makes it near impossible to actually develop useful tools.

On the other hand, it is very useful when writing unsafe code to be able to pass around a &mut T to some uninitialized data, and have another function write into that reference to initialize it. If we say that valid references must point to valid data, this pattern becomes UB. As a consequence, then, we should offer tons of new APIs in libstd that take raw pointers instead of references, so that they can be used for initialization.

Some examples:
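
For instance, something along these lines (a sketch only; init_with is a made-up name, not an existing or proposed libstd API):

use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical initialization API: it takes a raw pointer instead of `&mut T`,
/// so the caller never has to materialize a reference to uninitialized memory.
///
/// # Safety
/// `dst` must be non-null, aligned, and point to `size_of::<T>()` writable bytes;
/// any previous contents are overwritten without being dropped.
unsafe fn init_with<T>(dst: *mut T, value: T) {
    ptr::write(dst, value);
}

fn main() {
    let mut slot = MaybeUninit::<u32>::uninit();
    // Only a raw pointer to the uninitialized slot is ever handed out.
    unsafe { init_with(slot.as_mut_ptr(), 42) };
    assert_eq!(unsafe { slot.assume_init() }, 42);
}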

So @arielb1, for example, has traditionally maintained that having &T require that its referent is valid would invalidate far too much code. I'm inclined to agree. I think that our idea for ! patterns kind of assuaged my concerns about how to handle &!, so I feel comfortable with making the validity invariant shallow. (The argument about &mut T to an uninitialized T is also strong.)

I am also intrigued by this comment from @RalfJung :

Notice that there is an alternative that would make validity not depend on memory,

That seems like a very good property to have. I am inclined to pursue this approach, personally.

I (personally) think that not considering & to be "pointers" is the only sensible solution here (similar to how C++ does it: references behave more like plain values than pointers). My motivating example is that given a function like this:

fn generic<T>(foo: &T) {
    // body
}

it is way too easy to end up with something that will most likely indefinitely stay UB for T = ! where code would otherwise be valid for all other Ts. Making &! uninhabited avoids this problem altogether and we may be able to relax this later on if we figure out that:

  1. There are convincing use cases for &! not to be uninhabited;
  2. We can figure out how to make all the safe constructs for &T defined for &!.

it is way too easy to end up with something that will most likely indefinitely stay UB for T = ! where code would otherwise be valid for all other Ts.

Could you come up with such an example - that is UB for T = ! but not UB for say T = bool and foo containing a pointer to an invalid bool?

@arielb1 I do not think I'm able to come up with an example (is there one?) where it would not be UB if the bool had an invalid bit pattern, but at least it is possible to produce a reference to anything valid at all for &bool.

I realized since I last wrote the comment that, in order to obtain &!, unsafe code is necessary one way or the other (even though @RalfJung says this is a property of the safety system, not value validity system, and those are independent). With that in mind, I’m fine with whatever ends up being decided here.

We talked about this at the all-hands. @cramertj expressed interest in &! being uninhabited to be able to optimize functions away for being dead code. @Centril noted that in particular related to matches, if we just make validity of &T recursive, there is no question about automatically going below reference types in a match, such as in

fn foo<T>(x: &!) -> T { match x { } }

Even in unsafe code, the match can never cause issues on its own: the reference would already be invalid, and hence you'd have UB earlier.

I believe we should handle all types consistently, meaning that if &! is uninhabited (from a validity perspective, not just from a safety perspective), then we should also say that &bool is UB if it does not point to a valid bool, and so on.

One issue with this is that this makes validity very hard to check for in a UB checker like Miri, or in a valgrind tool. You'd have to do a recursive walk following all the pointers. Also, it is unclear how much optimizations benefit from this (beyond removing dead code for &!) because a value that used to be valid at some point, might become invalid later when the contents of memory change.
Also, new hard questions then pop up about the interaction with Stacked Borrows, where I think it might be hard to make sure that transitively through the pointer chain, all the aliasing works out the right way. Retagging is currently a key ingredient for this, but if we do this transitively we'd have to Retag references that are stored in memory, which I don't think we want to do -- magically modifying memory seems like a bad idea.

it is way too easy to end up with something that will most likely indefinitely stay UB for T = ! where code would otherwise be valid for all other Ts. Making &! uninhabited avoids this problem altogether.

I don't understand what you are saying here. Making &! uninhabited makes strictly more programs UB? How is that supposed to solve problems with programs being UB?

Also one thing @Centril brought up at the all-hands: we need more data. In particular, we should figure out if there are interesting patterns of unsafe code that rely on having references to invalid data, and that would be too disruptive to convert to raw pointers or too widely used to break.

One issue with requiring references to be transitively valid: we have a whole bunch of existing reference-based APIs, such as for slices, that we could then not use. I expect this to cause a lot of trouble with existing code, but I am not sure.


Another proposal for references that enables @cramertj's optimizations could be: if a reference's validity depends on memory in complex ways, we will need a notion of "bitstring validity". (Avoiding that is one argument for shallow validity, IMO.) We could define validity of a reference to require that the pointee is bitstring valid. This makes checking validity feasible and enables some optimizations. However, it would mean that &! is uninhabited while &&! is not.

CAD97 commented

Another data point is rust-lang/rfcs#2645 (FCP-merge), which theoretically will allow transmuting between &mut T and &mut MaybeUninit<T>, removing some of the pressure of wanting to use &mut T to point to uninitialized memory.

I'm in favor of the validity invariant of &_ being shallow, but the borrowing model requiring some amount of deep validity (though it could potentially be just one dereference deep) at most usages of the type, including reborrows for function calls.

@CAD97 that RFC is not necessary for the transmute you mentioned -- it is only needed if we want to pass things by value. In memory, T and MaybeUninit<T> are already the same (as in, they have the same size and alignment; there might be a difference for layout optimizations).
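
For concreteness, a sketch of the conversion in the harmless direction, relying only on that size/alignment equality:

use std::mem::MaybeUninit;

// Going from `&mut T` to `&mut MaybeUninit<T>` only needs the two types to
// have the same size and alignment; the opposite direction is the one that
// hands out a `&mut T` to possibly-uninitialized memory.
fn as_uninit_mut<T>(r: &mut T) -> &mut MaybeUninit<T> {
    unsafe { &mut *(r as *mut T as *mut MaybeUninit<T>) }
}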

Data point for a use-case where a reference to invalid data is useful: https://users.rust-lang.org/t/am-i-triggering-undefined-behavior-here/28364/10

tl;dr:

/// A byte array with the same size as type `T`
///
/// If `T: !Sized`, this size may be dynamic.
#[repr(packed)]
pub struct Bytes<T: ?Sized> {
    buf: ManuallyDrop<T>,
}

where Bytes<T> is never created directly, but rather is always used via a reference such as &Bytes<T> or &mut Bytes<T>.

Counterpoint: if MaybeUninit<T> accepted T: ?Sized the reference would be valid, and thus this wouldn't be an issue.

@petertodd can you walk me through why you think that &Bytes<T> is invalid in that case? I don't see any UB in the code you show in the post.

"A byte array with the same size as type T" where T: ?Sized... that's weird.^^

We don't have a story currently for uninitialized unsized data -- and given some of the plans for custom DSTs, it will not be possible to support that in general (namely, for "thin" DSTs).

I don't follow. I don't see anything wrong with Bytes<T>. IIUC, @petertodd always uses it when T is properly initialized. If anything, there would be something wrong in the conversions from &[u8] to &Bytes<T> and vice versa, but I don't see why @petertodd cannot use &[MaybeUninit<u8>] there instead, although depending on #71 that might not be necessary and @petertodd's code might be correct.

@gnzlbg the issue is in this function:

impl<T> Bytes<T> {
    pub fn from_slice(buf: &[u8]) -> &Bytes<T> {
        assert_eq!(buf.len(), mem::size_of::<T>());
        unsafe {
            &*Self::from_ptr(buf.as_ptr() as *const T)
        }   
    }
}

I can use that to turn a &[u8; 1] into a &Bytes<bool> even if the array contains a 3.
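
For example, a sketch of that problematic construction (building on the from_slice shown above):

// Hypothetical demonstration: under a recursive validity rule for references,
// producing this reference would already be UB if `raw` contains e.g. 3,
// even though the reference is never dereferenced.
fn demo(raw: &[u8; 1]) -> &Bytes<bool> {
    Bytes::from_slice(raw)
}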

That function should be unsafe, and then the caller needs to assert the validity of buf. Or is there an expectation that we will ever be able to do better than that?

If we don't make validity of references recursive (and my opinion still is that we should not :D ), then that code is just fine.

@gnzlbg So the full use-case where I came up with that beast is in-place "deserialization", e.g. for memmapped files (which as @RalfJung correctly noted elsewhere have issues with other processes changing them, but for sake of argument assume that has been solved).

As we know, for many types not all bit-patterns are valid. Thus we can have an API along the lines of:

unsafe trait Validate {
   type Error;
   /// Validate that `buf` represents a valid instance of `Self`.
   fn validate(buf: &Bytes<Self>) -> Result<(), Self::Error>;
}

The validate() function is safe to call, because Bytes can only be (safely) created from byte slices for which all bytes in the slice are initialized; Validate is unsafe to implement, because other code depends on it actually working.
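
As a sketch of how an implementation might look (the as_bytes accessor and the error type are made up for illustration; they are not part of the code above):

/// Hypothetical error type for the sketch below.
pub struct InvalidBool;

// Sketch only: assumes `Bytes<T>` exposes its raw bytes via some
// hypothetical `as_bytes(&self) -> &[u8]` accessor.
unsafe impl Validate for bool {
    type Error = InvalidBool;

    fn validate(buf: &Bytes<Self>) -> Result<(), Self::Error> {
        // Only the bit patterns 0 and 1 are valid for `bool`.
        match buf.as_bytes()[0] {
            0 | 1 => Ok(()),
            _ => Err(InvalidBool),
        }
    }
}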

This is why Bytes has a safety problem: the whole point is to verify that the bytes are valid for the type in question.

Unsized Types

So how do unsized types fit into all this? See https://users.rust-lang.org/t/am-i-triggering-undefined-behavior-here/28364/9

As @RalfJung correctly notes, it's not clear that you can always/will always be able to determine the size of an unsized type value from pointer metadata. However it's certainly possible to do this for a subset of types, such as slices. So I simply made a Pointee trait for the subset of types that can do this - essentially implementing part of what a custom DST API would do.

But MaybeUninit doesn't (yet) support unsized types, which leads me to this problem.

Alternative Solution

Wrap a pointer instead:

struct Bytes<'a, T: ?Sized> {
    marker: PhantomData<&'a ()>,
    ptr: *const T,
}

Which is fine for me - not quite as nice an API for what I'm doing, as I'll need BytesMut etc., but it'll work. I'm just bringing all this up to give an example of a potential use-case where validity of references matters.

See rust-lang/rust-memory-model#2 for some earlier discussion of basically the same subject.

Also see prior discussion at rust-lang/rust-memory-model#12.

Apologies if this is the wrong issue, it's what was linked a few times when talking about passing uninitialized memory as byte slices (such as to Read).

There's been an increase in claiming that the following should be assumed to be UB:

let mut buf: [u8; 8192] = mem::uninitialized();
io.read(&mut buf)

Many network-related applications already do this, because zeroing the buffer isn't free, but rather can appear noticeably in profiles. However, there's no "UB-free" API to read from a socket with uninitialized memory (there's libc::read I suppose). As long as an API doesn't exist, realistically applications will just ignore the possibly-future-UB claims. And the more that do that, the harder it becomes to declare this as actually UB (and cause those applications to do horrible things with a new compiler).
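
For reference, the lower-level, reference-free version looks roughly like this (a sketch assuming a Unix target and the libc crate; it sidesteps the question rather than settling it):

use std::mem::MaybeUninit;
use std::os::unix::io::RawFd;

// Read into possibly-uninitialized bytes without ever creating a `&mut [u8]`
// that points to uninitialized memory. Error handling elided.
fn read_uninit(fd: RawFd, buf: &mut [MaybeUninit<u8>]) -> isize {
    unsafe { libc::read(fd, buf.as_mut_ptr().cast::<libc::c_void>(), buf.len()) }
}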

@seanmonstar this is almost the right issue, could you move that post to #71 ? The primary issue with your snippet is around integers, not references.

Here I am assuming that you know exactly which code io.read will call, that you carefully audited it to ensure it does not inspect the contents of buf before they get initialized, and (if that code is in another crate) that you got a guarantee from the crate author that the code will keep behaving that way in the future. In particular, this means that this is not a generic Read implementation. That case, however, is also unrelated to references -- see rust-lang/rust#62102.

Here I am assuming that you know exactly which code io.read will call, and that you carefully audited it to ensure it does not inspect the contents of buf before they get initialized

Yea, there's unsafe fn prepare_uninitialized_buffer in AsyncRead that zeros the memory by default, but specific implementations can choose to leave it alone, such as impl AsyncRead for TcpStream.

Please move any discussion of uninitialized integers over to #71.

I have written some notes on values and representations in Rust that are relevant here.

Namely, I propose to interpret the validity invariant as arising from the fact that some bit patterns do not represent any value. This works great for almost everything, except for the memory-related properties of references: the representation relation can only talk about the raw bytes on the Abstract Machine that make up some value, it cannot refer to the contents of memory. This is basically another argument towards making the validity invariant not depend on memory contents, which was already previously suggested because it simplifies the model, because it can help checking (by doing caching), and because we still need a notion of "bit-string validity" when talking e.g. about layout optimizations (so if validity depends on memory, we end up with three invariants: representation/bit-string invariant, validity invariant, safety invariant).

As a result of this, I lean even stronger towards making the validity invariant mechanism not interact with memory. Given that, to also make &! insta-UB, I propose that we special-case references to uninhabited types in the representation relation of references. Basically, we could imagine they require infinite alignment (impossible to satisfy). The representation relation already has to "look" at the T in &T to determine alignment, so also checking inhabitedness does not seem completely unreasonable.

This would leave dereferencable. But that attribute is anyway not justifiable based only on the validity invariant, because it means "dereferencable for the entire duration of the function", and the validity invariant is only checked on uses and "typed copies" of the value. (Which seems okay for me but is being discussed at #84.) So it probably makes more sense to let the aliasing model do that. Currently, Stacked Borrows does justify dereferencable for references it retags, which are those passed around "bare" (as opposed to inside a compound type).

To make things more complicated, shared references (when retagged by Stacked Borrows) are subject to some additional inadvertent "checking" to figure out where there are UnsafeCell. So an &Result<Cell<i32>, usize> with an invalid discriminant would actually cause UB on retagging (without ever dereferencing the reference) just because we are trying to find the UnsafeCells in there. This only applies to enums though, and it makes me wonder if we shouldn't rather just check the type of the enum for Freeze to avoid having UB "sometimes" when a shared reference points to something invalid. (UPDATE: Stacked Borrows has since then been changed to not behave like this any more. There is no extra "validity checking" for shared references now.)

As a result of this, I lean even stronger towards making the validity invariant mechanism not interact with memory. Given that, to also make &! insta-UB, I propose that we special-case references to uninhabited types in the representation relation of references. Basically, we could imagine they require infinite alignment

Hmm, this is clever :)

How would it interact with rust-lang/rust#54987?

CAD97 commented

Naively, I'd think that the layout engine could put the ! type at the "end" of a type's layout, meaning everything that is not ! is laid out "normally". Then, because ! obviously isn't actually provided, the optimization removes the stack space for the ! that doesn't exist, leaving a "partial allocation" that can be manipulated for the partial initialization.

The issue with "infinite alignment" though is the reason that ! is currently implemented as zero-size and minimal-alignment: Result<T, !> needs to exist and be the same as T. If we naively say that ! has infinite alignment, then so does Result<T, !>.

The rules may be able to "just happen" if we can find some way to say ! has infinite size and/or alignment while still guaranteeing that its statically-guaranteed-nonexistent status means the space for it is never actually allocated, even though actually allocating it would require an impossibility. (What would Vec::<!>::with_capacity do?)

The issue with "infinite alignment" though is the reason that ! is currently implemented as zero-size and minimal-alignment: Result<T, !> needs to exist and be the same as T. If we naively say that ! has infinite alignment, then so does Result<T, !>.

One way to think about ! being infinite alignment would be if the alignment calculation of sum types ignored all uninhabited parts.

Unfortunately we already have uninhabited types in stable via empty enums, and they have alignment 1, so if this were done consistently it wouldn't be backwards compatible.
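
For reference, the current behavior (an observation of today's rustc, not a guarantee):

use std::mem::{align_of, size_of};

// An empty enum is an uninhabited type available on stable Rust.
enum Never {}

fn main() {
    // Today it is a zero-sized type with alignment 1...
    assert_eq!(size_of::<Never>(), 0);
    assert_eq!(align_of::<Never>(), 1);
    // ...and the uninhabited `Err` variant is optimized away, so the layout
    // of `Result<u8, Never>` collapses to that of `u8`.
    assert_eq!(size_of::<Result<u8, Never>>(), 1);
}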

How would it interact with rust-lang/rust#54987?

I don't think it would interact at all. Rust won't let you do anything with uninitialized fields of a partially initialized struct.

Or do you mean that (String, !) would have "infinite alignment"? No, we couldn't do that. Similar to how we don't mark (String, !) as uninhabited.

One way to think about ! being infinite alignment would be if the alignment calculation of sum types ignored all uninhabited parts.

I am not sure what your backwards compatibility concern is with empty enums. We don't guarantee that size and alignment of repr(Rust) types stays the same. And C doesn't really have uninhabited types, so I don't think we have to worry about repr(C) here.

The more interesting case is something like ([u32; 0], !), which currently is a 4-aligned ZST. So Result<T, ([u32; 0], !)> is also at least 4-aligned. But that might be something we could change as well.

I am not sure what your backwards compatibility concern is with empty enums. We don't guarantee that size and alignment of repr(Rust) types stays the same. And C doesn't really have uninhabited types, so I don't think we have to worry about repr(C) here.

That's a good point. Although I'd say changing from alignment 1 to, say, alignment 4 is still a lesser class of change than 1 to ∞. :)

edit: Notably, code that does calculations around alignment and works fine with small alignments could run into arithmetic overflow problems if some types have "infinite" (maxint) alignment.

The more interesting case is something like ([u32; 0], !), which currently is a 4-aligned ZST. So Result<T, ([u32; 0], !)> is also at least 4-aligned. But that might be something we could change as well.

Makes sense to me.

@RalfJung Do you actually mean infinite alignment or e.g. usize::MAX? Actual infinity can't be represented by the return type of align_of etc., so we'd run into problems there, similar to the issues with align_of_val(&extern type).

usize::MAX would work fine, I suppose, since references can't point to address zero and that's the only address aligned to usize::MAX.

I was thinking more abstractly and actually meant infinite. And I am not sure about the consequences of actually reporting such alignment for ! in align_of. This is more of a mathematical trick.

In #158 (comment), @HadrienG2 brings up some interesting and reasonable cases that are UB if references need to point to valid data:

  • Given *const Option<T>, say you want a *const T.
  • In a macro that works for any tuple size, given a raw ptr to the tuple, get a raw ptr to the element.

We could solve this by enabling users to avoid references for these steps, but let's leave that for elsewhere. The interesting option for this issue is relaxing the validity invariant for references so that merely having a reference pointing to invalid data is not UB, along the lines of what I proposed earlier.

@HadrienG2: the one good argument for this UB that has surfaced in the discussion so far is references to uninhabited types, like &! but also &(u32, !). There's a benefit from making those uninhabited as well, which is exactly what happens if you say "references have to point to valid data". So, my question for you is: in any of these cases, is there a chance that the type the reference points to is uninhabited? Like, could this be *const Option<(u32, !)> (using a non-zero-sized uninhabited type here to make things interesting), or a tuple involving !, or so? Since your use-case is deserialization I assume no because you can't deserialize a !, but I do not know how you have set up your traits. Could Abomonation ever be implemented for an uninhabited type?

@RalfJung Implementing Abomonation can go wrong in plenty of other ways than ! and is therefore soon to become unsafe (once my PR for that is accepted). We could specifically clarify in the safety contract that it is illegal to implement it for uninhabited types, and I don't foresee anyone having an issue with that. As you say, (de)serializing an uninhabited type makes no sense, since no value of that type can exist.

struct Foo { x: Option<(u32, !)> }

is a perfectly valid type that I might want to serialize with abomonation, so I'm not sure whether banning such types is the answer. No user might write that type directly, but it might happen as a consequence of generics, e.g., using a Result<u32, !> somewhere.

Good point, I did not think about that one.

In any case, since @RalfJung asked for it in #158, here is a collection of cases where abomonation has "reference to invalid data" UB and I have no idea what to do about it:

  • Given a NonNull<Struct>, get a NonNull<Field> without going through an &mut Field.
    • Should be resolved by &raw, once it stabilizes.
  • Given a NonNull<(A, B, C, ...)>, get NonNull<A>, NonNull<B>, NonNull<C>, ... without generating invalid references via (ref mut a, ref mut b, ref mut c, ...).
    • This may seem identical to the previous case, but isn't. Currently, our codegen for tuples is based on declarative macros, and these cannot generate the tuple.0, tuple.1, tuple.2, ... sequence, leading to weird pattern-matching-based workarounds.
    • Possible ways to resolve this include &raw support for pattern matching (raw ref?), switching to procedural macros (which would be a big jump up in compile time and code complexity for abomonation), improvements in declarative macros, improvements to tuple field accesses, or variadic/tuple generics.
  • Given a pointer to a Rust enum, such as Option<T>, perform pattern-matching on alternatives and get a pointer to each field of enum variants without materializing Rust references to them.
    • Currently, the only other enum with built-in abomonation support is Result<T, E>, but there is a crate called abomonation_derive which aims to allow easy abomonation implementations on user-defined enum types. It currently works by generating pattern matching statements.
    • As above, a possible solution would be &raw support in pattern matching, in combination with an assertion that pattern-matching partially invalid data is okay.
  • Given a pointer to a slice like NonNull<&[u8]>, or to a slice-like type like NonNull<&str>, get access to the length of the slice without materializing an invalid slice reference.
    • One way would be to stabilize the layout of fat pointers, in combination with &raw.
  • Given a pointer to a container class, such as Vec or String, gain access to the allocation length in spite of the allocation pointer being invalid.
    • I think this might already be okay, because containers use NonNull<T> or Unique<T> and these have no data validity invariant as long as they aren't accessed.

Thanks @HadrienG2, that's a great list!

One way would be to stabilize the layout of fat pointers, in combination with &raw.

That's overkill; we just need to provide len and get_unchecked methods on raw slices. This was suggested before and met with general agreement; I think the only reason these do not exist yet on nightly is that nobody took the time to write them.^^

Given a pointer to a container class, such as Vec or String, gain access to the allocation length in spite of the allocation pointer being invalid.

That is mostly a question of turning what is right now a "happens to be" property of Vec into a stable guarantee.


@gnzlbg good point. So it looks like disallowing references to uninhabited types basically already entirely kills using references for abomonation. That's interesting because it makes you wonder if there is any benefit in terms of unsafe code one could write for allowing references to invalid data except for uninhabited types (my proposal above). There are benefits in terms of being able to check this with reasonable effort at run-time, and in some sense this simplifies the rules (no unbounded recursion through the heap to determine validity of a value -- the validity invariant is just a consequence of the value representation relation); but it also makes "uninhabited" a primitive notion for the purpose of defining validity invariants, as opposed to it being just defined in terms of other stuff and not itself even appearing in the spec.

@cramertj you were the one, at the last All-Hands, to bring up the case of &!-taking methods and wanting to eliminate them as dead code during compilation. Could you say a bit more about what those methods look like and how they arise?
In particular, the following function could be erased as unreachable even if reference validity is fully shallow:

fn foo(x: &!) {
  match *x {}
}

After all, the first thing this function does is enter an unreachable match, hence the entire function is unreachable. I'm wondering if that could be "good enough"?

@RalfJung

So it looks like disallowing references to uninhabited types basically already entirely kills using references for abomonation.

IIUC @HadrienG2 above, abomonation does not want to deal with or construct references. Right now, it has to construct references because of limitations in the language (e.g. lack of &raw). Fixing those limitations to allow abomonation to avoid using references is orthogonal to whatever the rules of references end up being.

does &T have to point to allocated memory (with at least size_of::<T>() bytes being allocated)?

And can we factor platform-specific constraints into this? For example x86_64-unknown-linux-gnu is limited to the lower half of the address space, the kernel owns the upper half. Even less if we take the non-canonical addresses into account. Since no allocation can exist in that region one cannot safely construct a reference to that.

This could be used as a much larger niche for layout optimizations compared to the current non-null restriction.

Similarly: if T has an alignment > 1 then &T could have many niches in the lower bits.

And can we factor platform-specific constraints into this? For example x86_64-unknown-linux-gnu is limited to the lower half of the address space, the kernel owns the upper half. Even less if we take the non-canonical addresses into account. Since no allocation can exist in that region one cannot safely construct a reference to that.

But that would mean we couldn't use Rust to write code that runs in kernel space, which seems bad.

Similarly: if T has an alignment > 1 then &T could have many niches in the lower bits.

That is definitely already the case, the compiler just does not exploit it. AFAIK we have consensus on the alignment point. This open issue is about how much more than that we have.

We should have some mechanism to invoke mem::size_of_val::<dyn Trait>(x) when x: &dyn Trait, or maybe x: &MaybeUninit<dyn Trait> or similar, has a valid vtable pointer but an invalid data pointer.

As an aside, if T is unsized then &T is always still sized, but actually &MaybeUninit<T> stays unsized, which looks like a bug. https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b1a7f3336fec5b0dcf1d426252df14e4

@burdges you misunderstood the error message. MaybeUninit (and unions in general) can't have unsized fields. So this doesn't work,

// note: not calling, just trying to instantiate the generic function will fail
std::mem::size_of_val::<MaybeUninit<[u8]>>;

We should have some mechanism to invoke mem::size_of_val::<dyn Trait>(x) when x: &dyn Trait, or maybe x: &MaybeUninit<dyn Trait> or similar, has a valid vtable pointer but an invalid data pointer.

Yes we should, but that's a job for https://doc.rust-lang.org/nightly/std/mem/fn.size_of_val_raw.html I think -- so unrelated to this discussion.
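
Roughly like this (a nightly-only sketch using the unstable layout_for_ptr feature):

#![feature(layout_for_ptr)]
use std::fmt::Debug;
use std::mem::size_of_val_raw;

/// Determine the pointee size from the vtable part of the metadata alone;
/// the data half of the fat pointer is never dereferenced, so it may dangle.
///
/// # Safety
/// The vtable part of `p` must be valid.
unsafe fn dyn_size(p: *const dyn Debug) -> usize {
    size_of_val_raw(p)
}

fn main() {
    let x = 5u64;
    let p = &x as &dyn Debug as *const dyn Debug;
    assert_eq!(unsafe { dyn_size(p) }, 8);
}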

And can we factor platform-specific constraints into this? For example x86_64-unknown-linux-gnu is limited to the lower half of the address space, the kernel owns the upper half. Even less if we take the non-canonical addresses into account. Since no allocation can exist in that region one cannot safely construct a reference to that.

But that would mean we couldn't use Rust to write code that runs in kernel space, which seems bad.

Would kernel code be written against the x86_64-unknown-linux-gnu target? Wouldn't it have its own target, similar to how embedded bare-metal and embedded Linux on the same architecture are separate targets?

Maybe it would, I don't know.^^

Would kernel code be written against the x86_64-unknown-linux-gnu target? Wouldn't it have its own target, similar to how embedded bare-metal and embedded Linux on the same architecture are separate targets?

Indeed, there's the x86_64-linux-kernel tier 3 target for that.

And can we factor platform-specific constraints into this? For example x86_64-unknown-linux-gnu is limited to the lower half of the address space, the kernel owns the upper half.

On 32bit x86 it is possible to have a 3GB/1GB split. (CONFIG_VMSPLIT_3G) There is even a patch that allows programs to use the full 4GB address space: https://lwn.net/Articles/39925/

That's why I limited my question to x86_64. The most recent sysv ABI spec says

3.3.2 Virtual Address Space

Although the AMD64 architecture uses 64-bit pointers, implementations are only required
to handle 48-bit addresses. Therefore, conforming processes may only use addresses from
0x00000000 00000000 to 0x00007fff ffffffff.
Processes begin with three logical segments, commonly called text, data, and stack.
Use of shared libraries add other segments and a process may dynamically create segments.

With newer processors the kernel supports more, but to maintain backwards compatibility processes have to opt into that on every mmap call. glibc does not add an address hint, so processes using malloc remain limited to 47 bits; considering the spec requirements and the kernel's guarantee not to break userspace, I assume that will remain so.

But we don't even need all those bits. Niches are not about bits but value ranges. Currently the null pointer optimization only uses a single value (0). Restricting references to the lower half of the address space leaves room for address space growth while still giving us an additional 2^63 niches. And these niches are easier to use since they don't depend on the alignment of the pointee.

The only wrinkle I see would be ZSTs, since they don't need backing memory. Is there anything safe that could create &zst with arbitrarily high addresses?

Please note that all of this discussion of niches for references is off-topic here, it belongs to #76. Could you please post a summary there? Then I can fold the posts here.

I would support making &T require pointing to a valid T. In a compiler I am working on, validity checks are at a much lower level than in Rust. The compiler can assume that it can yank full values out of dereferenceable pointers (provided the validity range is at least large enough for one value). It would be nice to assume that if you read out of an &Option<T> derived from a &T (where T is a niche-optimization candidate type, such as a reference, a NonNull pointer, bool, or anything with an xlang-representable validity invariant) it won't magically become None. (Under UCG, a transmute from a type with a niche to an Option-like enum of that type is well-defined, and the scenario in question does not produce an invalid Option, only an invalid T.) Requiring &T to refer to a valid value would allow such an optimization to take place, in the absence of any other source of undefined behaviour not permitted under the Unsafe Code Guidelines.

@chorman0773 sure it would be "nice to assume", but it also causes a lot of problems, as discussed in the thread above. You talk very abstractly about potential optimizations; what we still haven't seen in this thread is a concrete example of some code and a desirable and realistic optimization that only works if references must point to valid data.

#245 is an example of a case where making validity depend on the contents of memory is a problem: that means that a value that was valid at some point, can cease to be valid later.

This is the main reason why I think that validity of references and Box should not at all depend on the current state of memory. It should be solely a property of the reference itself.

I actually have an example of code that could benefit from having passthrough validity.
Consider the following basic Rust code (for exposition).

fn copy_into<T: Copy>(r: &T, a: &mut [T]) {
    for v in a{
        *v = *r;
    }
}

Under the rules of the Rust language, it may be perfectly valid to hoist the dereference of r outside of the loop. In particular, rustc can do this.
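
For concreteness, the hoisted form would look roughly like this (a sketch of the transformed program, not actual compiler output):

fn copy_into_hoisted<T: Copy>(r: &T, a: &mut [T]) {
    // The read of `*r` has been hoisted out of the loop; the question is
    // whether this typed read is allowed even when `a` is empty and `*r`
    // holds an invalid value.
    let tmp = *r;
    for v in a {
        *v = tmp;
    }
}
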
However, in lccc, because r is dereferenceable for the size of T, a read from the pointer has type T. This may produce invalid values if T has an "xlang-representable niche" (such as nonnull, aligned, nonzero, range). When a is an empty array, this can cause undefined behaviour where there would not otherwise be any (reading from r produces an invalid value, which, if proven, yields the invalid value, which is a fancy way of saying core::hint::unreachable_unchecked()).
This would make such a transformation invalid whenever T is a dependent type (which, in the example, it is until generic instantiation) or is known to have some form of niche tag.
Boxes can be really fun to deal with, and I will look into how such optimizations could interact with ManuallyDrop<Box<_>> but certainly this should be a fair optimization to apply to references.

CAD97 commented

Just to say it again, in order to justify that optimization it's not required that the validity of the pointee be a validity requirement of &_, if I've understood everything correctly.

There's two main ways to make the optimization valid:

  1. The optimization is valid because it's always valid to read from a reference. This requires (probably, mod option 2) the reference to have the validity requirement that it is dereferencable for the size of the pointee, and that it points to some memory, though its contents may potentially be uninitialized. In all existing proposed models for Rust, memory is untyped, and types are a property of memory accesses. So if the virtual machine performs an untyped read of the pointee (roughly, reading [MaybeUninit<u8>; {size_of::<T>()}]), then the read is always valid. It then gains the ability to type the read memory at T by observing that a has a nonzero length. Making up a surface-level interpretation of the optimization,

    fn copy_into<T: Copy>(r: &T, a: &mut [T]) {
        let _0: [MaybeUninit<u8>; size_of::<T>()] = mem::transmute_copy(r);
        for v in a {
            *v = mem::transmute::<_, T>(_0);
        }
    }
  2. Extending validity through the aliasing model. The short version is, whenever a reference is provided to a function (thus, implicitly reborrowed), that asserts extra information about the validity of the pointee beyond just the (bitlevel) validity of the reference itself. This could be as little as "the pointer is dereferencable and describes (at least) size_of_val(reference) bytes," it could require the (shallow) bit validity of the pointee, or it could even recursively require the validity of the pointee and any of its pointees. Depending on how aggressive the required properties are, it could validate the above optimization on its own, or just shift the "dereferencable / readable" property from an invariant validity requirement to a requirement of reborrowing. Though, it is worth noting that you can barely do anything to a reference without reborrowing it (basically just drop, dereference, or cast to pointer, unless I'm missing something), so it sort of still becomes an informal part of reference validity, if not part of the formal bit-level validity.

As I understand it, vague consensus is that the bit-level validity of references should just be that of pointers, and everything else is derived from the aliasing rules for references. This would include reborrowing of references requiring that they're dereferencable for the correct amount of memory, thus validating the optimization as described in way 1 above.

xlang, the IR for lccc, has "typed" memory, in that it can assign a particular type to memory as it reads it. This is needed because I reason about pointers in a similar way to C and to a slightly weaker version of restrict (which can be manually tweaked depending on the soundness of the requirements; note that this is not soundness of Rust), as well as to allow pointer-tagged TBAA (which is not applicable here because I wouldn't emit the strict_alias tag for Rust pointers). In particular, the applicable rules are that converting an lvalue (place in Rust) into an rvalue (value in Rust), and assigning an rvalue into an lvalue, cannot load/store an invalid bit pattern according to the niches of types, nor load/store uninit (which is effectively LLVM poison) through a type with a niche (both operations are considered a load/store of invalid, which is a value that causes undefined behaviour just by existing, and indicates an unreachable control path if it can be detected).
It's possible I can experiment with dragging around untyped memory; however, convert reinterpret (which is effectively what transmute would become) and derive convert reinterpret (which is pointer transmuting) seem like they could have non-zero optimizer overhead, especially with all the other languages where the transformation (or an equivalent) is valid. This is particularly the case when T is a pointer type (and therefore, somewhat, applies when T is dependent).
As pointers aren't just bytes (and the UCG proposed model at least treats pointers in memory as more than just arbitrary bytes), inventing a pointer out of thin air is not actually something I can do, nor particularly want to allow (that could be used to invent valid pointers from tagged-invalid pointers that kinda point at something, either an object, null, or past-the-end, but really don't because they are tagged invalid, and it would basically allow unrestricted laundering of pointers, even beyond that of C++'s std::launder).
I don't necessarily think UCG needs to adopt a typed memory model, but I think it should at least consider an implementation with typed memory (even if rust's rules prevent anything more than superficial optimizations based on that or prevent me from meming the language into JVM bytecode because reasons).
If there are significant blockers on this, that's fair, and I can probably figure out a workaround that maintains this class of optimizations (this is still in the works), but I want to avoid compromising other equally important optimizations in different languages using the same IR.
A lot of this is theoretical, as the compiler in question is still a heavy work in progress, though I have a pretty good understanding of what optimizations I want to allow, what optimizations each language I'm supporting will permit (currently 4 planned: Rust, C, C++, and a work-in-progress language with some syntax derived from Rust, called Laser), how the underlying intermediate representation will be structured, and what rules I can put in place to satisfy the first without breaking the second. Keeping memory typed, especially with things that may be pointers (and a dependent type like T, or, as it's represented, %0, definitely can be a pointer), is really one of the easiest ways to keep both C and C++ happy, while being incredibly aggressive in how I use pointer tags to optimize, especially since things like noalias may not be soundly propagatable to the codegen, if there is a secondary IR at the codegen level (hecking llvm).

I don't necessarily think UCG needs to adopt a typed memory model, but I think it should at least consider an implementation with typed memory

Rust allows transmuting using unions, which is incompatible with typed memory, or at least the way C has it.

Rust allows transmuting using unions, which is incompatible with typed memory, or at least the way C has it.

To a certain extent, so does C. In this case, all typed memory means is that if I'm told I have a *dereferenceable(size T) T, I can read a T out of it, not just size T bytes which may or may not be easily reconstructible into a T (see again pointers, and particularly how two values of incompatible pointer types can be bitwise equal).
And implementations with stronger typed memory are possible; they just take more effort and lose some advantages.

I have no idea what xlang and lccc are, it would help if you could phrase things in more Rusty terms and not assume tons of extra knowledge. :) If you just want to discuss IR design in general, that's great, but please do not do it in this issue -- open a thread on Zulip or internals.rust-lang.org instead.

As far as copy_into is concerned, I agree with @CAD97 -- the optimization does not require validity to propagate through references, just that references be dereferencable, which basically everyone agrees on. It is true that the resulting program is somewhat awkward to write in Rust, but an IR could easily have a notion of "variables holding T or maybe not", basically like a built-in MaybeUninit<T>. Then this transformation could be represented very directly.

I don't necessarily think UCG needs to adopt a typed memory model, but I think it should at least consider an implementation with typed memory

I feel very strongly that typed memory is a huge mistake. It adds a very complicated bit of "shadow state" that the programmer cannot directly interact with -- on top of all the shadow state that we need to handle aliasing, which is already complex enough. And also it has basically no benefits that I know of, given our strong aliasing model. C uses it to get type-based alias analysis, but we get something much better without typed memory.

"A compiler can share the IR between Rust and C/C++" is really not an argument as far as I am concerned; I do not want all the bad design decisions of C/C++ to "infect" Rust. (They already do, through LLVM, and that's already hard enough to handle.^^)

In this case, all typed memory means is that if I'm told I have a *dereferenceable(size T) T, I can read a T out of it, not just size T bytes which may or may not be easily reconstructible into a T (see again pointers, and particularly how two values of incompatible pointer types can be bitwise equal).

This is not "typed memory". "Typed memory" means that data stored in memory is typed one way or another. What you ask for, that all references point to data that matches their types, does not require memory itself to remember any types because the reference already knows its type.

I feel very strongly that typed memory is a huge mistake. It adds a very complicated bit of "shadow state" that the programmer cannot directly interact with -- on top of all the shadow state that we need to handle aliasing, which is already complex enough. And also it has basically no benefits that I know of, given our strong aliasing model. C uses it to get type-based alias analysis, but we get something much better without typed memory.

I think that the claim is not that Rust should adopt a typed memory model, but it should be compatible with an implementation that has typed memory. Compare this to how Rust targets a number of different architectures, most of which do not have the Rust memory model. The decisions made on these platforms should not "infect" Rust, but at the same time Rust should try not to set things up so that a common operation cannot be compiled without horrific performance.

In this case, @chorman0773 is talking about lowering operations to their xlang IR. I also don't know what that is, but apparently it has typed memory, and I should hope this doesn't completely make a Rust implementation impossible. As long as the IR has some escape hatches, they can be invoked whenever Rust does something type-unsafe to memory. It is not clear to me whether such escape hatches would need to be invoked all the time though.

For clarification of terms (some have equivalents in rust, others may not or aren't expressible in such terms, or I am unaware of the terms rust uses):

  • Object is simply a region of storage allocated for the purpose of storing some value of some type for some lifetime (static/let declarations introduce objects, allocation calls introduce objects). Note that the typing of an object does not imply strict aliasing; I simply consider a region of storage to have a particular type, but you can pull really anything out of that region until you tell me strict aliasing applies.
  • lvalue is an expression that designates an object, called a place in Rust.
  • Dependent type is a type which depends on the instantiation of a generic parameter which has not been instantiated (so T in fn identity<T>(T) -> T is dependent, as is &T, Vec<T>, or something considerably more complex, but, for example, [u8; size_of::<*mut T>()] is not dependent if T is Sized (if T is not Sized, it is), since the size of *mut T is known without instantiating T). Instantiation is the replacement of a generic parameter with a concrete type/lifetime/value.

lccc is a compiler I am working on that will be able to compile Rust, and xlang is the name of the IR it uses. I'm using it to offer opposition to the direction of this thread.

This is not "typed memory". "Typed memory" means that data stored in memory is typed one way or another. What you ask for, that all references point to data that matches their types, does not require memory itself to remember any types because the reference already knows its type.

This is fair, though my point is that it's beneficial to "know" I can perform typed memory accesses through references, especially when dealing with pointers or dependent types that can be or contain pointers. As I say, I am strongly opposed to allowing things to just "invent" pointer values from raw memory except in very specific circumstances (std::launder being one of them, and that is incredibly limited). I do technically use objects as a way to reason about this, and that implies typed-memory semantics (though as I say, this doesn't implicitly bring along strict aliasing).

"A compiler can share the IR between Rust and C/C++" is really not an argument as far as I am concerned; I do not want all the bad design decisions of C/C++ to "infect" Rust. (They already do, through LLVM, and that's already hard enough to handle.^^)

I'm trying to avoid as much as possible issues where the semantics of one language interfere with the other, but observing that maintaining certain types of optimizations is a goal is generally a good argument. I would add that a number of the "bad" design decisions were the respective standards committee keeping the number of possible implementations wide and not making things difficult for a particular implementation. I actually have an interest in making Rust easily specifiable and, at least in my opinion, the current direction of the thread may impose limitations on people seeking to implement the language; this seems to be a limiting factor with my implementation, given a number of other optimizations I have an interest in maintaining. Particularly, pointers are really annoying to reason about, so as I have mentioned, I want to limit how they are invented. Even in Rust, it's possible for a &U and a &mut T to overlap, provided U at some level contains (or may contain, as a dependent type) an UnsafeCell.

uninit T is an IR type that mirrors MaybeUninit<T>, but it's really fun to specify conversions between them, especially in the presence of types that do or can have niches (and pointers, again pointers are really fun). I don't want to implement or make use of anything I can't specify.

I think that the claim is not that Rust should adopt a typed memory model, but it should be compatible with an implementation that has typed memory.

Indeed, this is what I was attempting to state: Rust shouldn't gain typed memory, but should permit implementations that either benefit from, or necessarily enforce, typed memory. There is also an advantage to typed memory in Rust, at least how I see it, and that is reasoning about UnsafeCell. I distinguish between an UnsafeCell<T> and the inner T at a type level, with a mechanism similar to how C++ distinguishes between a most-derived object and a subobject (even when the objects are pointer-interconvertible, which effectively means they share the same address). The advantage of this is that I can apply (weaker) readonly optimizations to a &UnsafeCell, and more particularly, to a dependent type &T, where T either may be UnsafeCell or contain an UnsafeCell. (And of course, as I have mentioned, pointers are hecking annoying to reason about without typed memory.)

At the relevant level, assuming that I can not only read a [u8; size_of::<T>()] out of a &T, but actually read a T (particularly when T is known to be copyable, aka, for Rust purposes, T: ::core::marker::Copy), has benefits, especially when T is or can be a pointer, without having to deal with the shenanigans of inventing pointers (and allowing the invention of pointers).

Indeed, this is what I was attempting to state: Rust shouldn't gain typed memory, but should permit implementations that either benefit from, or necessarily enforce, typed memory.

I don't see how a language without typed memory can be implemented on top of an IR with mandatory typed memory. If the things forbidden by typed memory are forbidden in a language, as would be necessary for it to be implementable using an IR with mandatory typed memory, then that language has typed memory.

I agree with @bjorn3. I also think we already have enough on our plate to define a semantics that is compatible with LLVM(*) and that poses a reasonable trade-off between programmer intuition and UB-enabled optimizations -- we really don't need another set of constraints^^. And finally I think additionally imposing constraints for compatibility with third-party IRs will just mean we make suboptimal choices for Rust. I'm sorry, but that is just not a design goal I think we should consider at all.

Every bit of UB we add will bite some programmer at some point, and will make the language harder to reason about. I proposed a simple framework for precisely formalizing type-based invariants, but that framework is insufficient for the kind of invariant you are asking for. So not only will this mean UB for a bunch of unsafe code that would otherwise be fine, it would also mean a notably more complicated model for Rust.

I do appreciate you wanting to help give a precise spec to Rust and your IR, and you clearly did think a lot about this. But that on its own IMO does not justify all the extra cost this would impose -- and as sketched above, the optimization you are asking for is compatible with the proposed semantics, so it's not like we are leaving performance on the table here. If we can reduce UB by making the IR a bit more complicated, I think that is a great trade-off.

(*) I'd love to even get LLVM out of the picture, but that is not realistic.

uninit T is an IR type that mirrors MaybeUninit<T>, but it's really fun to specify conversions between them, especially in the presence of types that do or can have niches (and pointers, again pointers are really fun). I don't want to implement or make use of anything I can't specify.

Yeah, conversions are tricky. That's why I did not propose to make it a separate type, but rather to decouple "this variable has type X" from "this variable can be assumed to be valid at type X" in the IR. You can think of this like an extension of the flag that tracks whether some variable is initialized (and thus needs any memory associated with it at all), i.e., an extension of what is called StorageLive/StorageDead in Rust -- there could also be StorageValid.

That said, I don't think it would be so hard to specify the conversions either. But really this is getting waaaaay off-topic for this thread, and deep into "how do I design a compiler IR". An interesting topic, but not the topic of the discussion at hand.

Addendum:

I would add that a number of the "bad" design decisions were either the respective standards committee keeping the number of possible implementations wide, and not making things difficult for a particular implementation.

Indeed, that is likely the key reason for many of the bad design decisions in C/C++ -- the standard seems to consider supporting a wide range of implementations, and ease of compiler writing, way more important than basic comprehensibility by programmers. The result is a language that permits a huge number of implementations, but makes it basically impossible to write a non-trivial UB-free program. This is exactly the mistake that we should not repeat with Rust. That's why every bit of extra UB needs a really good justification, one that outweighs all the extra pain it will cause programmers.

C/C++ feel to me like standards written primarily by and for compiler writers, entirely disregarding that there are other consumers of the standard that vastly outnumber compiler writers: programmers. If we can make the spec simpler to understand and work with for programmers in a way that makes optimizations a bit harder but not impossible, then I think we should always take that opportunity -- there'll only be a few compilers that will have to solve that problem, compared to the number of programmers that benefit.

It seems pretty inevitable that Rust will spend almost all of its "UB pain budget" on the aliasing model, so everything else should be as boring and simple as we can possibly make it. That's the main reason why I think "validity of a value" should only be concerned with the bits making up the value representation, and not complicate things by taking into account memory.

Indeed, that is likely the key reason for many of the bad design decisions in C/C++ -- the standard seems to consider supporting a wide range of implementations, and ease of compiler writing, way more important than basic comprehensibility by programmers. The result is a language that permits a huge number of implementations, but makes it basically impossible to write a non-trivial UB-free program. This is exactly the mistake that we should not repeat with Rust. That's why every bit of extra UB needs a really good justification, one that outweighs all the extra pain it will cause programmers.

I think this hits the nail on the head: a language spec is a balance between the needs of the backend - compiler writers and optimization - vs the programmer's mental model. Certainly the programmer should be the higher priority, but compiler writers should also have a seat at the table.

Seeing a compiler writer's perspective of the Rust memory model has opened my eyes a little. It sounds like the situation might be dire, disabling a huge number of optimizations in most situations, so I think it is worthwhile to at least consider the situation enough to ensure that there is a not-horrible option available.

I also think we already have enough on our plate to define a semantics that is compatible with LLVM(*) and that poses a reasonable trade-off between programmer intuition and UB-enabled optimizations -- we really don't need another set of constraints^^.

That sounds like a good way to tie yourself to LLVM decisions and inherit LLVM constraints.

(*) I'd love to even get LLVM out of the picture, but that is not realistic.

And that sounds like an interesting way to make Rust uncompilable. I don't think Rust needs to go nearly as far as C/C++ in supporting compiler writer convenience, but we have to acknowledge that backends (and varieties of backends) exist, and Rust's value as a high performance programming language is to some extent predicated on this interface being reasonably smooth and not incurring undesirable overhead or limiting optimizations (not just optimizations in LLVM architecture) too much.

But to get back to the topic, I think that it is problematic that an arbitrary safe function fn(&mut bool) cannot be assumed not to leave 3 in the input, even though a fn(bool) -> bool can be assumed not to return 3. This is an optimizer problem, since it prevents inventing (typed) reads. But even as a programmer I find it extremely unintuitive that the assertion does not necessarily hold (and as an optimizer I would like to assume that it does):

let mut x = Some(true);
if let Some(ref mut bool_ref) = x {
    f(bool_ref);
}
assert!(x.is_some());

If we can't declare a function that sets the referent to 3 to be UB, it breaks reasoning about safe code. This is a tricky subject to talk about though, because I know that if I turn the crank on the Stacked Borrows model I will get that the assertion doesn't necessarily hold, and so one can turn around and say my mental model is broken. But at the type level it's an Option of bool and I didn't say the function f had permission to change the tag. The fact that this depends on the nature of the niche optimization makes this especially uncomfortable; is it really an optimization if it changes the meaning of code (safe code with no references to niches or layout)?

Ralf has brought up that the issue with this example is that there is no particular operation that is clearly UB, and additionally this makes UB less easily dynamically checkable. That's a shame but I don't think that this should be a primary factor. Ah, well, maybe I'm just reinventing "Types as Contracts", but I don't really know what it means if they aren't.
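Filling in one possible body for f makes the worry concrete. This is only a sketch: it contains the very write whose status is under debate, and it relies on the non-guaranteed niche encoding of Option<bool> (Some(false) = 0, Some(true) = 1, None = some niche value such as 2), so nothing here should be relied upon:

fn f(b: &mut bool) {
    // The disputed write: stores a value outside bool's valid range.
    unsafe { *(b as *mut bool as *mut u8) = 2 };
}

fn main() {
    let mut x = Some(true);
    if let Some(ref mut bool_ref) = x {
        f(bool_ref);
    }
    // If None happens to be encoded as the byte 2, the write above is read
    // back as the None discriminant and this assertion fails.
    assert!(x.is_some());
}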

I don't think Rust needs to go nearly as far as C/C++ in supporting compiler writer convenience, but we have to acknowledge that backends (and varieties of backends) exist,

This is precisely what I'm saying. Rust should consider the fact that other compilers with different requirements exist or may exist, and llvm should 100% be an implementation detail of rustc.

But even as a programmer I find it extremely unintuitive that the assertion does not necessarily hold (and as an optimizer I would like to assume that it does):

I wasn't considering the programmer side of this, but I do agree. It would be somewhat annoying if passing a reference to the inner field of a niche-optimized Option could change the Option itself (beyond just the field). (I also 50% hate this from lccc, because I want the enclosing Option to be unreachable from a pointer to the inner T.)

I would add that the code snippet would require some unsafe, as it is definitely UB if you store 3 into an actual most-derived bool (direct let/static, or really anything not inside an Option, or any other enum that UCG says is niche-optimizable), based on the current rules surrounding validity. For example, if you did

let mut b = false;
f(&mut b);

and f in some way stored 3, that would be UB, so such an f would be unsound if it were a safe fn. However, I don't believe this negates the overall point.

Hm, setting aside the question of whether this change would be desirable, I think I have a solution to how to make the following code UB with an operational model:

let mut b = true;
let x = &mut b as *mut bool as *mut u8;
unsafe { *x = 3 };

The property we would like to establish is that when the lifetime of the &mut b ends, b is again valid at type bool, but in an operational model we can't just say that, so instead, when &mut b as *mut bool is executed, we push on the borrow stack of b a record that, when popped, reasserts the bool invariant at this byte. That would make the code as written not UB, but if you access a parent of the reference in a way that forces the lifetime of &mut b to end, or if b was an argument to the function and the function ends, then UB would occur at that point. That would make the option setting code in the previous post UB, as well as:

fn foo(b_ref: &mut bool) {
    let x = b_ref as *mut bool as *mut u8;
    unsafe { *x = 3 };
} // UB occurs here, at the `FnBarrier`

I would add that the code snippet would require some unsafe, as it is definitely UB if you store 3 into an actual most-derived bool (direct let/static, or really anything not inside an Option, or any other enum that UCG says is niche-optimizable), based on the current rules surrounding validity. For example, if you did

let mut b = false;
f(&mut b);

and f in some way stored 3, that would be UB, so such an f would be unsound if it were a safe fn. However, I don't believe this negates the overall point.

I'm deliberately eliding the code of function f here, because it represents an arbitrary safe Rust function. By "safe" I mean that it doesn't have unsafe in its signature, even though it can have unsafe in the body, and we are considering optimization and reasoning opportunities for the (entirely safe) code surrounding this function.

As far as I understand, the Stacked Borrows model with shallow validity of references does not yield that your code is UB, and that's exactly the problem. If you were to read the value of b (at type bool, since b is type bool) after calling the function, then UB would occur, but as long as you don't look at it or you cast it to a &mut u8 before reading it, it is UB free.

Honestly, that may be reasonable (it wouldn't be implemented like that, but my model would make it UB). I can play around with replacing the optimization with one that reads an uninit T; my primary worry is going from Uninit to something with a niche, as well as pointers (though probably asserting either null or invalid for uninit pointers is reasonable; as I say, I do not like inventing pointers). I would definitely prefer having typed reads allowed, as it makes reasoning about pointers far easier, but as I mentioned, if there are compelling blockers to doing that, it may be possible to work around it. (As a side note, I have a very specific operation that exists solely for the purposes of manipulating pointers.)

By "safe" I mean that it doesn't have unsafe in its signature

What I was arguing is that it cannot be a safe function without being unsound, because it could be used to cause/allow UB in safe code if it was (writing 3 into a bool at the top level, which produces an invalid value, either period, or when it is used). Reasoning about how an unsound function affects safe code is pointless imo, because it already causes/allows UB in said safe code.

By "safe" I mean that it doesn't have unsafe in its signature

What I was arguing is that it cannot be a safe function without being unsound, because it could be used to cause/allow UB in safe code if it was (writing 3 into a bool at the top level, which produces an invalid value, either period, or when it is used). Reasoning about how an unsound function affects safe code is pointless imo, because it already causes/allows UB in said safe code.

Ah, now you've got me confused. If we have the following function:

fn f(b: &mut bool) {
    unsafe { *(b as *mut bool as *mut u8) = 2 }
}

then we can say that the function is unsound, because it is possible to use it to write safe code that has UB:

let mut b = true;
f(&mut b);
let c = b; // boom

However, the function f does not itself have UB, and there are programs that never exercise that UB:

let mut b = true;
f(&mut b);
let c = unsafe { *(&mut b as *mut bool as *mut u8) };
assert_eq!(c, 2);

My understanding of the nature of UB is that any program that does not have UB is required to be compiled to have the same semantics as the Rust abstract machine, which means that it is illegal to write a compiler that optimizes the assertion to false, because the RAM says it's true. Whether unsound functions are involved or not seems to be irrelevant, but I could be wrong. In particular, the soundness of a function is not something Miri can detect, because it is an assertion that universally quantifies over program states and inputs to the function.

While it is reasonable to take a stance that says we only guarantee that your program compiles correctly if you have no unsound functions, that is a stronger assumption than simply having no UB at runtime, and it greatly decreases the value of having a dynamic UB checker if that's not the precondition for correct compilation.

While it is reasonable to take a stance that says we only guarantee that your program compiles correctly if you have no unsound functions, that is a stronger assumption than simply having no UB at runtime

I'm not saying that in this case; I'm just saying that reasoning about what safe code does is unreasonable in the presence of unsound functions. Safe code can only be effectively reasoned about by the programmer if unsafe code behaves correctly. Worrying about whether optimizations can affect that reasoning, beyond any UB actually caused as the result of unsound uses of unsafe, is likewise unreasonable. Certainly with something like a reference, if you were allowed to, changing a Some(&T) to None by writing a null pointer through a reference to the inner &T would be correct, as the niche-"optimization" of Option<&T> is guaranteed and stable. While this would be surprising, under the basic rules of Rust this would be a correct result of f. Whether or not this is valid for Option<bool> is up in the air, but under UCG we could assume that similar reasoning as for Option<&T> applies, and an unsound f could turn a Some(b) into None (in the absence of anything saying that unsafe code trying such is UB).

Certainly with something like a reference, if you were allowed to, changing a Some(&T) to None by writing a null pointer into a reference that refers to the inner &T would be correct, as the niche-"optimization" of Option<&T> is guaranteed and stable.

Hm, I would like clarification on this point. The layout optimization of Option<&mut T> using a niche is guaranteed, but I hope that it is not yet stabilized that writing 0 via the inner &mut T reference is guaranteed not UB, as my proposal above would make it UB. (It certainly looks wrong to have it not be UB to write null to a guaranteed non-null reference.)

CAD97 commented

The current Rust documentation doesn't explicitly specify one way or the other. The general working consensus is that if you have (a pointer derived from) &mut T, you're (currently) only allowed to write valid instances of T into the memory described by the pointer.

I've forgotten the original example and probably shouldn't be trying to think this hard when I should be going to bed, but could someone remind me why a rule along those lines ("a write through a raw pointer is a typed copy, typed not at the pointee type of the pointer, but at the pointee type of the reference that is directly above the raw pointer on the borrow stack") is undesirable for the model, or insufficient to solve the discussed example in an acceptable manner? To me right now that feels like a formalizable description of the reasonable intuition here.
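To illustrate how such a rule would apply to the running example (hypothetical semantics, not the current model):

let mut b = true;
let p = &mut b as *mut bool as *mut u8;
// Under the suggested rule, this write would be a *typed* copy at type bool,
// because the reference directly above `p` on the borrow stack is a &mut bool.
// Storing 3 would therefore be immediate UB at this point, rather than only
// becoming UB when the bool is later read.
unsafe { *p = 3 };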

I think this hits the nail on the head: a language spec is a balance between the needs of the backend - compiler writers and optimization - vs the programmer's mental model. Certainly the programmer should be the higher priority, but compiler writers should also have a seat at the table.

We have almost exclusively compiler writers at the table. ;) I think we do not need to worry about hearing compiler writers here, given that it's mostly Rust (and other) compiler developers discussing the problems. The voice that needs amplifying is that of programmers. I am mostly lending my perspective as a PL researcher here -- people formally studying the language are yet another consumer of the spec, and like programmers, simplicity of the spec is more important for us than simplicity of the compiler.

Seeing a compiler writer's perspective of the Rust memory model has opened my eyes a little. It sounds like the situation might be potentially dire, disabling a huge number of optimizations in most situations, so I think it is worthwhile to at least consider the situation enough to ensure that there is a not-horrible option available.

So far, not a single interesting optimization has been shown that is disabled by the proposal to not require recursive validity. The examples brought up this week are unaffected by that choice. So I really don't agree with your rhetoric here.

Your use of "huge" and "dire" is entirely unjustified from all I can tell. 0 is not a "huge number". (Of course in theory there are infinitely many programs that cannot be optimized due to this, but that is true for every single bit of behavior we make not UB, so it is not a useful argument. Hence my qualification with "interesting".)

But to get back to the topic, I think that it is problematic that an arbitrary safe function fn(&mut bool) can not be assumed to leave 3 in the input, even though fn(bool) -> bool can.

Now we are opening an entirely different can of worms -- none of the proposals previously on the table in this thread achieves this! In particular, "validity of references requires them to point to valid data" is insufficient to achieve this. Please stop mixing up entirely separate issues. You are now talking about #84. This does require typed memory in the way I defined it above -- with the memory remembering the type of the data stored in it. And indeed the proposal to have Stacked Borrows remember some type information does exactly that.

Most of the rest of the discussion looks off-topic due to this. Maybe you want to start a new thread, with a clear problem statement (e.g., "I want to enable this optimization")? I am getting a bit lost here, and goalposts keep changing as well: @digama0 just switched to a different piece of code they want to be UB. So now we have 3 threads mixed in 1:

  • the optimization by @chorman0773, which works both with and without recursive validity (but without recursive validity, is slightly more taxing for the IR)
  • the UB by @digama0, which does not yet have an optimization example that it would enable -- but anyway this is not UB either with or without recursive validity; it actually requires typed memory to be UB (and this issue has already been brought up at #84)
  • and the original discussion about whether validity should be recursive

GitHub issues are not a great forum for discussion to begin with, but they become even worse when people start mixing threads, so I'd really appreciate if you could stop doing that and open new issues instead. Thanks. :) I am happy to discuss the trade-offs involved here, I just would like to not make the life of whoever will have to summarize this thread unnecessarily hard. Also it seems unlikely at this point that new people joining the discussion will actually read everything above, so a new thread is more honest about what the established arguments are.

That sounds like a good way to tie yourself to LLVM decisions and inherit LLVM constraints.

Indeed, and we should do whatever we can to prevent this. But tying ourselves to LLVM and another IR certainly does not make it any better.

Also note that for validity of references in particular, LLVM imposes no constraints, so for the discussion at hand this is entirely irrelevant. The places where this comes up are mostly pointer arithmetic, integer-pointer casts and aliasing (and for the latter, LLVM is basically broken, so I have ignored it so far).

Your use of "huge" and "dire" is entirely unjustified from all I can tell. 0 is not a "huge number". (Of course in theory there are infinitely many programs that cannot be optimized due to this, but that is true for every single bit of behavior we make not UB, so it is not a useful argument. Hence my qualification with "interesting".)

To clarify, I don't mean to ring alarm bells or make a rhetorical argument. I'm saying that it is not clear what the situation is: I can see scenarios where it is dire and others where there is no problem at all, and I can't currently tell which is real because the spec is still up in the air.

Sorry for picking an off topic example. It was the result of trying to figure out what actual piece of code would demonstrate the difference between shallow and deep validity, using &mut bool validity specifically. But I think your current position wants depth 1 validity? From a formal perspective that seems like an odd choice, but I guess this is some kind of compromise position considering the optimizations that are seen in practice?

But I think your current position wants depth 1 validity? From a formal perspective that seems like an odd choice, but I guess this is some kind of compromise position considering the optimizations that are seen in practice?

How are you counting the depth here?

I am proposing that whether a value v is valid for type T should depend only on v and T, but not on the current state of memory. From a formal perspective, this will give us a bunch of reasoning principles that we would otherwise not get -- for example, if a value is valid at some point during the execution, then it will also always remain valid in the future.

Maybe from a different perspective this is an odd choice, but I think there is a perspective from which this is a very natural choice.
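To make that concrete, here is a sketch of what the "bits-only" notion of validity would and would not say (whether the reference creation itself is permitted is left to the aliasing model under this proposal; other proposals in this thread would make the same line UB):

use std::mem::MaybeUninit;

let slot: MaybeUninit<bool> = MaybeUninit::uninit();
// Under "validity does not depend on memory", r is a valid value of type
// &bool purely by virtue of its bits (non-null, well-aligned); the fact that
// the pointee byte is uninitialized does not enter into it. Actually reading
// *r would still be UB -- the claim is only about the value r itself.
let r: &bool = unsafe { &*slot.as_ptr() };
let _ = r;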

Now are are opening an entirely different can of worms -- none of the proposals previously on the table in this thread achieves this! In particular, "validity of references requires them to point to valid data" is insufficient to achieve this. Please stop mixing up entirely separate issues. You are now talking about #84. This does require typed memory in the way I defined it above -- with the memory remembering the type of the data stored in it. And indeed the proposal to have Stacked Borrows remember some type information does exactly that.

I won't elaborate on it more in this thread, but my proposal could enforce this property without requiring a typed memory.

Thinking more about deep validity, as a programmer this seems like the obvious choice. It's how we teach references: A reference to a T is a pointer to a valid T for some lifetime. When you have a reference you can always dereference it without worrying about null pointer exceptions or segfaults. So I would like to know more about the issues with it.

  • One point I hear is that this makes lots of things UB, and it's better to have less UB. I get that, but safe code is always going to satisfy this, and unsafe code needs the validity property to be simple so that it gets into people's heads. "References point to valid data" is easy to understand, "References point to bitwise valid data" is less so. Furthermore the fact that the rules are apparently different in unsafe land, where all the types are mere guidelines and UB is hiding behind every reference, seems like a pretty big footgun. We already have raw pointers if we need not-necessarily-valid pointers.

  • Deep references are hard to check dynamically. I think this is not a problem most of the time, unless you are transmuting a struct with embedded recursive references. If it's just a simple struct or a primitive data type then you only have to check the bits of that struct. Is there data on the cost of this in Miri?

  • Shallow validity simplifies the formal semantics. Does it really? The usual way to define a pointer would be to say something like x points to some v which is valid at type T, and this is a deep validity property. Shallow validity requires some step indexing or other approximative definition of validity, and all the safe operations will only be approximately safe. I'm sure there is a way to make it work, but it certainly doesn't look easy.

I am proposing that whether a value v is valid for type T should depend only on v and T, but not on the current state of memory. From a formal perspective, this will give us a bunch of reasoning principles that we would otherwise not get -- for example, if a value is valid at some point during the execution, then it will also always remain valid in the future.

At least in the simple case, a shared reference should have that property already, because while borrowed no one else can touch the data. Similarly for a unique reference. UnsafeCell complicates this, but I think it is possible even in this case to have type safety over time assuming no UB causing operations.

Without having time to respond to all of your points in detail, I think you are referring to a proposal I made earlier that I am not considering the best option any more, namely "references must point to data that is bit-valid but not fully valid". Indeed that was a compromise proposal, and notably it violates the principle "validity should not depend on memory" that I posited above. In this comment I have explained what is still my current position. That post and the ones below it contain a bunch of arguments for not making validity depend on memory.

Shallow validity simplifies the formal semantics. Does it really? The usual way to define a pointer would be to say something like x points to some v which is valid at type T, and this is a deep validity property. Shallow validity requires some step indexing or other approximative definition of validity, and all the safe operations will only be approximately safe. I'm sure there is a way to make it work, but it certainly doesn't look easy.

I agree that "references must point to data that is bit-valid but not fully valid" is an odd semantics (it would certainly not require step-indexing though, to the contrary -- recursive validity might require step-indexing, due to its, well, recursive nature, but shallow validity does not). However, I hope we can agree that "validity does not depend on memory at all" is strictly simpler than any kind of validity that does depend on memory one way or another. The question is whether it is "good enough" from an optimization perspective.

What shallow validity doesn't give you is a proof that this function is safe:

fn f(x: &mut u8) -> u8 { *x }

That's perhaps the crux of my confusion. How can a programmer reason about anything if they are always one step away from possible UB?

That's perhaps the crux of my confusion. How can a programmer reason about anything if they are always one step away from possible UB?

The point of the validity invariant is for the compiler to do optimizations. You seem to be looking for an invariant that helps the programmer do reasoning, but that is a different invariant! I called it safety invariant in the past, but the name is up for debate. Everyone agrees that the safety invariant for references is fully recursive, and I have formalized this to be able to reason about unsafe code (in a step-indexed model, FWIW, and I think that is required).

But the discussion here is about the validity invariant, not the safety invariant. As the blog post I linked above argues, I do not think there is a good reason to make the two the same, and indeed that is not even possible for all types (e.g. Vec) and would lead to undecidable UB for other types (e.g. fn).
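For example, here is a sketch of why Vec's invariant can only be a safety (library) invariant and not a bit-level validity invariant; it deliberately violates Vec::from_raw_parts's documented contract purely to illustrate the point:

fn main() {
    let mut buf = vec![0u8; 4];
    let ptr = buf.as_mut_ptr();
    std::mem::forget(buf);
    // This Vec claims a length and capacity of 100 even though only 4 bytes
    // were allocated. That breaks Vec's library/safety invariant, but every
    // field (pointer, length, capacity) is an ordinary bit pattern, so no
    // check that looks only at the value's bits could reject it -- the
    // invariant inherently depends on the state of the allocator.
    let v: Vec<u8> = unsafe { Vec::from_raw_parts(ptr, 100, 100) };
    std::mem::forget(v); // never use or drop the bogus Vec
}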

But the discussion here is about the validity invariant, not the safety invariant.

As I have mentioned prior in different topics, I don't think there should be a distinction, because it makes it easier to reason about undefined behaviour, one of the worst things a programmer can do. But that is neither here nor there, and that's all I'll say on that.

My argument is that this should be upgraded to a validity invariant, because it preserves an optimization that could otherwise only be performed manually, with varying levels of ease, while not sacrificing certain types of optimizations that can be argued to be more important (the example I gave was trivial, and a decent programmer could easily lift it manually, but more complex examples with a similar basis exist; there is a reason why I do not want people inventing pointers, or to allow the invention of pointers). Recursive validity makes things easier to reason about from the compiler side, particularly with pointers or dependent types that may be or contain pointers. And because you already admit it as a safety invariant, I don't think it's unreasonable to make it a validity invariant despite concerns about the burden it imposes on programmers: as a safety invariant, programmers are already burdened by it, and changing safety to validity here would allow compilers to perform optimizations that depend on knowing they can manipulate typed accesses through references (as shown here).

As I have mentioned prior in different topics, I don't think there should be a distinction, because it makes it easier to reason about undefined behaviour, one of the worst things a programmer can do. But that is neither here nor there, and that's all I'll say on that.

I think such a distinction is absolutely required (my blog post argues for why not making the distinction is impossible in some cases and undesirable in others), and the lang-team has officially decided that the two invariants are different for str. If you want to have "validity = safety" for all types, I'd invite you to open a new issue and lay down your arguments for why you think it would be better to not make this distinction, and how you think to overcome the issues I laid out in my blog post.

If you only want "validity = safety" for references but are fine with them being different for, e.g., str, Vec, or fn(), then please argue why references warrant special treatment here.

My argument is that this should be upgraded to a validity invariant, because it maintains an optimization that could be performed manually with varying levels of ease

For example? So far the set of interesting optimizations that require recursive validity is very sparse.

why I do not want people just inventing pointers or to allow the invention of pointers

You keep talking about "inventing pointers"; I do not see the connection to the discussion here. Most of the time I assume incorrectly-typed references arise through raw pointer casting and then turning the resulting raw pointer back to a reference, or by creating references to not-yet initialized data. In neither case is a new pointer being "invented".

Also I do not buy the argument that making validity harder to prove (by making stricter requirements) will make it easier for programmers to reason about unsafe code. That seems paradoxical to me. Clearly the easiest model for unsafe code authors would be to say that there is no validity invariant ever for any type; then they simply would not have to worry about this problem. The more validity requirements we add, the harder it gets for programmers. Ergo we should make validity as relaxed as possible and as simple as possible to state precisely. In particular, there is a large benefit to having validity be concerned only with the bits of the value in question, and not depend on the contents of memory -- a point that you have not commented on at all.

and the lang-team has officially decided that the two invariants are different for str.

Well, that's another kind of distinction that you've blogged about before, namely language vs library UB. I don't think the compiler should ever be able to trust library UB-freedom, but I don't yet see a compelling reason why all language-sourced safety properties can't be validity properties too.

If you only want "validity = safety" for references but are fine with them being different for, e.g., str, Vec, or fn(), then please argue why references warrant special treatment here.

For str, the language should presumably just see it as the same as a [u8]. For vec, it's the same as any other unsafe code managing memory layouts directly. For fn(), the obvious safety property is that it points to a function with the right signature, and so the compiler could make assumptions about that, although I don't know what nontrivial facts it could derive since you can always just magic up a function like a JIT.

Also I do not buy the argument that making validity harder to prove (by making stricter requirements) will make it easier for programmers to reason about unsafe code. That seems paradoxical to me. Clearly the easiest model for unsafe code authors would be to say that there is no validity invariant ever for any type; then they simply would not have to worry about this problem. The more validity requirements we add, the harder it gets for programmers.

Constraints are a two-way street; they provide an obligation on one hand and an assumption on the other. If all references are recursively valid, then I can make the same assumptions about references in safe code as unsafe code. That seems like a good thing, since you aren't having to learn about a separate world where all the rules are different, a la the "Tootsie Pop" model. Instead, the different invariants are expressed with different types, i.e. unsafe code makes much heavier use of raw pointers because they have the appropriate weak invariants, as compared to references which have strong invariants.

In particular, there is a large benefit to having validity be concerned only with the bits of the value in question, and not depend on the content memory -- a point that you have not commented on at all.

If you know nothing other than the bits, then it's just an aligned nonzero number. That seems barely any different from a raw pointer.

One operation that seems useful would be switching between a T and a Box at the ABI level, i.e. moving large objects into memory or adding/removing indirections. If references are not recursively valid, then one of those has less information than the other, and so one or the other direction will not be provable by the compiler. I think similar things apply to other kinds of layout transformations as well. Just because LLVM doesn't do much with it now doesn't mean it isn't useful for a compiler to be able to reason about memory, and the weak validity invariant you are proposing makes pretty much all rust memory inaccessible to compiler reasoning.
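A source-level sketch of the kind of indirection change meant here (a hypothetical compiler-internal rewrite, not something rustc is known to do today):

struct Big([u8; 4096]);

// By value: the caller conceptually hands over the whole object.
fn by_val(b: Big) -> u8 { b.0[0] }

// Behind an added indirection. Rewriting by_val into by_ref needs no
// knowledge about memory; rewriting by_ref back into by_val requires knowing
// that the pointee is readable and unchanged for the duration, which is where
// validity/aliasing assumptions about the reference come in.
fn by_ref(b: &Big) -> u8 { b.0[0] }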

Well, that's another kind of distinction that you've blogged about before, namely language vs library UB. I don't think the compiler should ever be able to trust library UB-freedom, but I don't yet see a compelling reason why all language-sourced safety properties can't be validity properties too.

Validity invariant and safety invariant are exactly analogous to language UB and library UB -- to the extent that I proposed to rename them to language invariant and library invariant. If you agree that "language UB != library UB", then I don't think there is a consistent way to at the same time argue "language/validity invariant = library/safety invariant".

For str, the language should presumably just see it as the same as a [u8].

See, so you do agree that validity and safety invariants can differ. The same can easily be done for references.

Constraints are a two way street; they provide an obligation on one hand and an assumption on the other.

I think that is the key misunderstanding here: this is just not true for validity invariants.

All public-facing functions of an API may assume that all arguments satisfy the safety invariant. So changing the validity invariant will just mean unsafe code authors have more things to prove, but it will not mean they get any extra assumptions.

If all references are be recursively valid, then I can make the same assumptions about references in safe code as unsafe code.

You can make that assumption even with my proposal for a validity invariant. No need to require recursive validity.

If you know nothing other than the bits, then it's just an aligned nonzero number. That seems barely any different from a raw pointer.

True. And that is by design. All the difference comes from the aliasing model.

("You" here being the compiler. As I have said again and again, this has no impact on what unsafe code authors can assume, just on what they can prove.)

One operation that seems useful would be switching between a T and a Box at the ABI level, i.e. moving large objects into memory or adding/removing indirections. If references are not recursively valid, then one of those has less information than the other, and so one or the other direction will not be provable by the compiler. I think similar things apply to other kinds of layout transformations as well. Just because LLVM doesn't do much with it now doesn't mean it isn't useful for a compiler to be able to reason about memory, and the weak validity invariant you are proposing makes pretty much all rust memory inaccessible to compiler reasoning.

Given that Box guarantees that its contents stay at a fixed address in memory, such an optimization will only be possible in very controlled circumstances, where the compiler has a lot of insight into all code interacting with the Box. So I doubt the extra assumptions gained from validity would help much here.

Validity invariant and safety invariant are exactly analogous to language UB and library UB -- to the extend that I proposed to rename them to language invariant and library invariant. If you agree that "language UB != library UB", then I don't think there is a consistent way to at the same time argue "language/validity invariant = library/safety invariant".

The difference is that the compiler knows about language UB, but a library can have whatever invariants it likes and trigger language UB if those invariants are not satisfied (or not, although in that case declaring library UB seems like overkill). The rust language has a spec (or so we would like), and that spec defines language UB; the absence of which would be the safety invariant. Anything less than this (the validity invariant) is just leaving optimizations on the table.

See, so you do agree that validity and safety invariants can differ. The same can easily be done for references.

I don't disagree that it is possible to do this; I just question whether it is a good idea.

Given that Box guarantees that its content stay at a fixed address in memory, such an optimziation will only be possible in very controlled circumstances, where the compiler has a lot of insight into all code interacting with the Box. So I doubt the extra assumptions gained from validity would help much here.

I don't mean literally Box, but rather a compiler-invented type that acts like Box, in the sense that there is some new memory and the data is owned as if it existed at type T but is heap allocated. The compiler may well know that the address is only accessible within one function or just a few for such a circumstance. These kinds of things happen when you are setting up the ABI for a function call, for example, and can't pass a large object in a register (or perhaps not even on the stack, although I don't think Rust attempts any optimization along those lines).

It appears that your argument is that compilers should have everything they need given the Stacked Borrows aliasing model, but if the reference isn't even known to be valid, how does it help to know that no one else can see the data (except when they can, because UnsafeCell makes everything harder)? It's too bad that Rust never took on the responsibility of doing what LLVM does, because the model is apparently quite a departure from the C-like aliasing model, and I would like to see some evidence that most/all of LLVM's optimizations (or more abstractly, the set of general optimizations that have been identified as worthwhile in a compiler) are applicable without dereferenceable and with SB. (This is obviously difficult evidence to procure, but until then I think it is good to hear a compiler dev that says "this causes big problems for me" as a proxy for such evidence.)

CAD97 commented

but if the reference isn't even known to be valid

But it is known to be valid. And here, valid means

  • it's dereferencable*
  • it's nonnull
  • it's aligned
  • it has provenance over at least size_of_val(self) bytes at the described location (this follows from the aliasing model)

That is exactly the required information for the compiler to insert a read of the described location. The only thing it can't do is introspect any of the bytes.
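For instance, these guarantees are what justify hoisting a read out of a branch; a sketch of the transformation, written as source code for illustration:

fn read_if(cond: bool, r: &i32) -> i32 {
    if cond { *r } else { 0 }
}

// Because `r` is known to be dereferencable for the whole call, a compiler
// may speculate the read regardless of `cond`:
fn read_if_hoisted(cond: bool, r: &i32) -> i32 {
    let v = *r; // speculatively inserted read: guaranteed not to fault
    if cond { v } else { 0 }
}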

As soon as user code actually reads the memory, it also is now guaranteed

  • the read memory is (bit/language) valid for the pointee type.

I'd like to see an actual optimization that you think is impossible without recursive language validity of pointee types. Here I feel it necessary to point out that "I can't do the optimization within so-and-so simplified model" is different from "this optimization can't be justified." The former is significantly less important, because there are vastly more unsafe code authors than optimization authors.


* oh jeez the Arc::drop issue terrifies me because the dereferencable only lasts through last use and not the end of the function

† this is my own basic understanding and may be incorrect; do not take it at face value; in any case this is a nonguaranteed property of the language

comex commented

I don't mean literally Box, but rather a compiler invented type that acts like Box, in the sense that there is some new memory and the data is owned as if it existed at type T but is heap allocated.

Can you clarify? "Not literally Box" would seem to apply when the source code has a T and the compiler wants to pass it around as CompilerInventedBox<T>. But in that case the language-level rules for &T are irrelevant since there are no language-level references. Those rules would matter more in the opposite direction: when the source code has &T or Box<T> and the compiler wants to pass it as T. But in that case it is literally Box (or a reference, which has the same guarantee about preserving pointer identity).

comex commented

@digama0

I would like to see some evidence that most/all of LLVM's optimizations (or more abstractly, the set of general optimizations that have been identified as worthwhile in a compiler) are applicable without dereferenceable and with SB.

rustc does use dereferenceable, as the rules being discussed here should justify it. There are not-fully-resolved concerns about dereferenceable lasting too long, but those concerns also caused miscompilations in C++ so it's not a Rust-specific issue.

More trickily, rustc also uses noalias on function parameters… or at least would be using it if not for longstanding LLVM optimizer bugs (which might actually get fixed soon; there's a mammoth patch going through review). As I understand it, while noalias is only a gross approximation of Stacked Borrows, it is believed to be a sound approximation, i.e. any program which satisfies Stacked Borrows must satisfy LLVM's definition of noalias (in the places where rustc uses it). Hopefully someday it will be possible to provide more precise aliasing information to LLVM's optimizer.

@CAD97

I'd like to see an actual optimization that you think is impossible without recursive language validity of pointee types.

Well, since you mentioned speculative dereferencing, one example would be speculatively dereferencing twice, from &&i32 to i32.

Alternately, consider speculative dereferencing's stronger cousin, "argument promotion", an LLVM transformation that changes function arguments from pointer-to-T to T. This is valid if it can prove that (a) the function implementation only loads from the pointer, without storing to it or observing its identity; and (b) the pointer is always dereferenceable. This can be performed twice on the same argument, but then the pointer would have to be known to be doubly dereferenceable. That can sometimes be proven from context, but it would be easier if it were always guaranteed. Currently there is no way to express this property in LLVM IR anyway, but in theory there could be, or the same optimization could be implemented on the MIR level.
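A source-level sketch of what promoting twice would look like (hypothetical transform; LLVM's actual pass works on its own IR, not on Rust):

// Original: only ever loads through the argument, never stores or compares it.
fn get(x: &&i32) -> i32 { **x }

// Promoted once: the outer indirection is replaced by its pointee.
fn get_promoted_once(x: &i32) -> i32 { *x }

// Promoted twice: requires knowing the inner reference is also dereferenceable
// -- exactly the "doubly dereferenceable" property discussed above.
fn get_promoted_twice(x: i32) -> i32 { x }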

I'd like to see an actual optimization that you think is impossible without recursive language validity of pointee types.

So the concern I raised was over optimizations in a generic context, where a dependent type may be a pointer. In the IR I use, pointers are incredibly special, and inventing them out of a bunch of bytes that may not actually have been a pointer is not a thing I want to allow. This may not be a total concern, and I am looking into alternative solutions.
Another concern is reachability, but #84 would be sufficient to make the reachability concern a non-issue (in brief, it would be valid to assume that passing out a &mut T to the inner field of an Option<T> cannot alter the variant of the Option itself, even when T is a dependent type that may be a niche-optimization candidate).

Depending on the results of both, it may be reasonable to withdraw my concern on this issue.

CAD97 commented

In the IR I use, pointers are incredibly special, and inventing them out of a bunch of bytes that may not actually have been a pointer is not a thing I want to allow.

FWIW, it's safe code to write rand::<usize>() as *const T in Rust, so you can invent completely arbitrary pointers in Rust so long as you don't dereference them. If your IR can't support that, it can't support compiling even just all safe Rust code correctly. It's not a reference, it's a pointer type, but it is still a pointer.

When I'm referring to "inventing pointers" I'm specifically referring to inventing pointers to things. Any invented pointer is basically always either null or invalid, so it doesn't actually point to something, and saying it does is pretty much a poor idea (and a great way to get annihilated by compiler optimizations). *const T is fine because it doesn't promise anything that contradicts null_or_invalid (doesn't really promise anything at all).

When I'm referring to "inventing pointers" I'm specifically referring to inventing pointers to things. Any invented pointer is basically always either null or invalid, so it doesn't actually point to something, and saying it does is pretty much a poor idea (and a great way to get annihilated by compiler optimizations). *const T is fine because it doesn't promise anything that contradicts null_or_invalid (doesn't really promise anything at all).

Just to make sure that I fully understand what you are asserting here...

It is very common in low-level programming (think embedded or OS development) to communicate with CPU peripherals by reading and writing data at mutually agreed-upon memory addresses. This pattern is called memory-mapped IO, and the most convenient way to implement it in a higher-level programming language is 0xB0000 as *mut u8.

In a higher-level context, using dynamically loaded libraries (of the dlopen() kind) involves taking memory addresses from the OS and entrusting them to point to data of a specific type which wasn't there at program compilation time. This functionality is commonly used for plugins, live code updates, dispatching a single standardized API to multiple video card drivers... among many other things.

Depending on what you are asserting here, even plain memory allocation might be invalid in your model, since after all it involves getting a *mut u8 from the OS, writing a bunch of stuff in there that may include pointers, and later reinterpreting that *mut u8 as a *mut T.

In that sense, I think that supporting some sort of "please assume that this bunch of bytes is a valid pointer" pattern is a prerequisite for systems programming.
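For instance, a typical memory-mapped IO access looks like this (a sketch; the address 0xB0000 and its meaning are platform-specific assumptions, and this is only meaningful in a freestanding/OS-development context where that physical address is mapped):

// Write a byte to a device-defined address. A volatile write is used so the
// compiler cannot elide or reorder the access.
let mmio: *mut u8 = 0xB0000 as *mut u8;
unsafe { mmio.write_volatile(0x41) };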

Depending on what you are asserting here, even plain memory allocation might be invalid in your model, since after all it involves getting a *mut u8 from the OS, writing a bunch of stuff in there that may include pointers, and later reinterpreting that *mut u8 as a *mut T.

In that sense, I think that supporting some sort of "please assume that this bunch of bytes is a valid pointer" pattern is a prerequisite for systems programming.

This is an interesting pattern. Assuming that you actually want to get references and not just pointers to random locations, it seems you need to assume that the invented number does not overlap any memory under the control of the rust allocator, because otherwise you can arbitrarily corrupt the state of the machine. So this has to be UB.

In the Stacked Borrows model, I guess this is worked around by the fact that the RAM makes a sort of "claim" to parts of memory that it has allocated, through the tags it associates with memory locations, and any writes to those locations will invalidate the tags that are there and cause UB. It nevertheless seems difficult as an unsafe code author to ensure that an operation like *(0x80000 as *mut u8) = 42 is not UB, though, because Rust isn't providing any indication of what memory areas are under its control; for all you know every byte in memory is owned by something up the call stack, and then your write will be UB.

This is an interesting pattern. Assuming that you actually want to get references and not just pointers to random locations, it seems you need to assume that the invented number does not overlap any memory under the control of the rust allocator, because otherwise you can arbitrarily corrupt the state of the machine. So this has to be UB.

Under the intptr model, doing a ptr-to-int cast makes the compiler assume that any future int-to-ptr cast (I think for as long as the original pointer is valid) may create a pointer to the memory pointed to by the previously cast pointer.
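A sketch of the pattern being described, reading integer-pointer casts in that "exposing" way:

let mut x = 42u8;
let p = &mut x as *mut u8;
let addr = p as usize;   // ptr-to-int cast: the allocation's address is now exposed
let q = addr as *mut u8; // int-to-ptr cast: may legitimately point back to x
unsafe { *q = 7 };       // so the compiler must assume this write can hit x
assert_eq!(x, 7);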