CTSRD-CHERI/llvm-project

Pointer equality succeeds surprisingly

davidchisnall opened this issue · 8 comments

At some point in the last few years, pointer equality was redefined from identity to address equality. So far, this has caused confusion for everyone who I've helped write CHERI C in a scenario with temporal safety. In particular, the revoker clear the tag and so we end up with variations of:

void *a = malloc(...);
free(a);
void *b = malloc(...);
if (a == b)
 ...

This comparison is expected to always fail, yet can succeed because of the change to pointer equality. Worse, it is trivial to use intrinsics or casts to ptraddr_t to get the address if that's what you want to compare, but it is nontrivial to build the exact comparison. The above can be rewritten as:

if ((ptraddr_t)a == (ptraddr_t)b)

This is portable and does what you want if you want only address comparison.

This problem is much worse in C++, where the == often comes appears in a template and so cannot be trivially replaced with something else.

Please can we revert to the less-confusing behaviour?

I don't think we ever used exact equality since it broke cases such as realloc() checking for in-place updates - we only compared address+tag.
I think for equality comparisons including the tag should be generally compatible, but it caused quite a few issues for relational comparisons (e.g. things like x+offset > end or while (ptr < (void*)-1).

In a 100% CHERI-aware setting, I think I agree that stricter comparisons make sense (although I am still a bit concerned about the relational operators in case arithmetic makes something become unrepresentable).
It should be possible to get the stricter behaviour using the -cheri-comparison=exact compiler flag (it will emit CExEq for MIPS), but it appears this flag is unimplemented for RISC-V... https://cheri-compiler-explorer.cl.cam.ac.uk/z/orG8hb

Would you want the strict "compare all capability bits" behaviour or the somewhat relaxed address+tag semantics?

I don't think we ever used exact equality since it broke cases such as realloc() checking for in-place updates - we only compared address+tag.

The realloc case was why we originally decided to use exact comparison. If we realloc does in-place updates then it either:

  • Doesn't resize (sizeclass-based allocators) and originally returned the larger bounded.
  • Does resize, in which case you need the comparison to fail because callers need to know to update stashed copies of the pointer.

That said, the realloc comparison is actually UB and recent clang optimises it to false for non-CHERI platforms.

I think for equality comparisons including the tag should be generally compatible, but it caused quite a few issues for relational comparisons (e.g. things like x+offset > end or while (ptr < (void*)-1).

I am mostly happy for ordered comparisons to use only the address, because they don't imply substitutability.

Would you want the strict "compare all capability bits" behaviour or the somewhat relaxed address+tag semantics?

For equality, I want it to be all bits. For less / greater comparisons, there isn't a total ordering that makes sense. Consider:

  • A has address 5, bounds 4-10, read only.
  • B has address 5, bounds 3-8, write only.
  • C has address 6, bounds 4-10, read only.

Clearly A != B, but I don't think that there's a way of defining an ordering such that A<B or A > B makes sense. A < C and B < C both seem understandable, if not totally obvious. With C++20's addition of weak and partial orderings as core abstractions, I hope that we can define that pointers are something weaker than strongly ordered for CHERI C++.

The equality test is as it is for two reasons: it is more compatible with real-world software and it follows the actual C standard.

On the former, it may be true that getting the behaviour you want is simple, but patching it over and over is a pain and those uses of a magic CHERI-specific ptraddr_t are not upstreamable.

On the latter, C99 says:

Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

This very clearly states that in a CHERI world the address must be used and bounds not used. Using the tag would likely also break uintptr_t uses in the real world, and I do not want different semantics between real pointer and uintptr_t equality, uintptr_t is already weird enough as it is.

Plus, the use-after-free, whether realloc or not, is UB, and recent GCC fortification options break in the same way as CHERI if you skip updating pointers when the address doesn’t change after realloc.

The equality test is as it is for two reasons: it is more compatible with real-world software and it follows the actual C standard.

The former, at least, is the opposite of my experience. Both are equally compatible with code that does not use any CHERI features, the address comparison breaks code as soon as you do any bounds manipulation (including sub-object bounds inserted by the compiler).

Going through the points in the spec in turn:

Two pointers compare equal if and only if both are null pointers

Fine with bitwise identity. With address-only comparison, a tagged capability at address 0 will compare equal to a null pointer constant.

both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function.

With address comparison, an untagged capability containing the address of object A is not a pointer to object A, so may not use this rule to compare equal to a pointer to that object and may not compare equal under any of the other rules.

both are pointers to one past the last element of the same array object,

This holds in both cases.

or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

I believe C23 clarifies that this is not part of the 'if and only if' and is allowed but not required. This breaks any provenance model, including the ones that all compilers depend on. This is one of Peter Sewell's examples: all modern C/C++ compilers contain cases where they will optimise assuming that these will compare not-equal. Any code that requires this behaviour has been miscompiled for years.

This very clearly states that in a CHERI world the address must be used and bounds not used

I don't know how you reach this conclusion, since this paragraph does not say that and it was explicitly written to allow segmented architectures to include the segment ID + offset in equality comparison and not require mapping to a linear address space.

Using the tag would likely also break uintptr_t uses in the real world, and I do not want different semantics between real pointer and uintptr_t equality, uintptr_t is already weird enough as it is.

How so? It means that the following compare not-equal:

uintptr_t ptr = (uintptr_t)&obj;
uintptr_t addr = (uintptr_t)(ptraddr_t)&obj;

This is desirable behaviour, because one can be cast back to a usable pointer and the other cannot. If they compare equal, then this introduces a lot of footguns into any code that uses sealing incredibly hard to write with correct security guarantees: a tag-clearing unseal operation will end up with an untagged capability that will compare equal to something that has provenance.

I think you would find it very hard to find a programmer who would find the semantics of this to be expected:

bool is_valid(uintptr_t val)
{
   return val == (uintptr_t)&obj;
}

void use_thing(uintptr_t val)
{
  if (is_valid(val))
  {
    SomeType *ptr = (SomeType*)val;
    ptr->someField = 12; // <- Crash because ptr is untagged, yet compares equal to `&obj`.
  }
}

I certainly have no come across anyone for whom this is expected behaviour.

Note that compliance with a de-facto C standard has always been a higher-priority goal than conformance with the letter of the de-jure C standard for CHERI C.

Plus, the use-after-free, whether realloc or not, is UB, and recent GCC fortification options break in the same way as CHERI if you skip updating pointers when the address doesn’t change after realloc.

Yes, I said almost that:

That said, the realloc comparison is actually UB and recent clang optimises it to false for non-CHERI platforms.

The equality test is as it is for two reasons: it is more compatible with real-world software and it follows the actual C standard.

The former, at least, is the opposite of my experience. Both are equally compatible with code that does not use any CHERI features, the address comparison breaks code as soon as you do any bounds manipulation (including sub-object bounds inserted by the compiler).

Subobject bounds isn't clear-cut like you portray. There are cases where including the bounds helps, but there are also cases where it hurts (e.g. using a pointer as a key and you're inconsistent about taking a pointer to a field vs casting/doing pointer arithmetic).

Going through the points in the spec in turn:

Two pointers compare equal if and only if both are null pointers

Fine with bitwise identity. With address-only comparison, a tagged capability at address 0 will compare equal to a null pointer constant.

A tagged capability at address 0 isn't a real issue, you can't realistically have an object at address 0 in traditional de-facto C and CHERI C doesn't remove most of those footguns.

both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function.

With address comparison, an untagged capability containing the address of object A is not a pointer to object A, so may not use this rule to compare equal to a pointer to that object and may not compare equal under any of the other rules.

An untagged capability containing the address of object A isn't a valid pointer in the first place and must have come from one of:

  1. Out-of-bounds arithmetic: UB
  2. Using a pointer to an object whose lifetime has ended and been revoked: UB
  3. Casting an integer to a pointer: implementation-defined

Point 3 is the only legitimate case, and for that we're free to define the semantics however we choose. We choose this because, empirically, this appears to work best. This is particularly true if uintptr_t is involved, where whether you get a tag or not can depend on the exact arithmetic being performed and the optimisation level. Otherwise you have things like if (((uintptr_t)p & PAGE_MASK) == 0) as standard upstream code which then doesn't work because the bounds get inherited on the left-hand side and, if you happen to have a pointer to a low page, probably stays representable and keeps its tag.

both are pointers to one past the last element of the same array object,

This holds in both cases.

or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

I believe C23 clarifies that this is not part of the 'if and only if' and is allowed but not required. This breaks any provenance model, including the ones that all compilers depend on. This is one of Peter Sewell's examples: all modern C/C++ compilers contain cases where they will optimise assuming that these will compare not-equal. Any code that requires this behaviour has been miscompiled for years.

This very clearly states that in a CHERI world the address must be used and bounds not used

I don't know how you reach this conclusion, since this paragraph does not say that and it was explicitly written to allow segmented architectures to include the segment ID + offset in equality comparison and not require mapping to a linear address space.

Whilst compilers may ignore this somewhat and it may have since been relaxed, it is clear that the intent is (still) that, if you have bool equal(void *a, void *b) { a == b; } the comparison being performed is "do these refer to the same memory location". In a segmented world the segment ID is included in the determination of what that memory location is. In a CHERI world the bounds are not part of determining the memory location, only whether you can access it. In a world where sealing and memory capabilities are disjoint, one could argue that the capability "kind" should be taken into account, with integers being memory-kinded so as to only perturb sealing things, but our architectures do not have such a distinction.

Using the tag would likely also break uintptr_t uses in the real world, and I do not want different semantics between real pointer and uintptr_t equality, uintptr_t is already weird enough as it is.

How so? It means that the following compare not-equal:

uintptr_t ptr = (uintptr_t)&obj;
uintptr_t addr = (uintptr_t)(ptraddr_t)&obj;

This is desirable behaviour, because one can be cast back to a usable pointer and the other cannot.

You see it as desirable. I see it as "it depends", and in practice our experience is that the "comparing capabilities with different metadata but the same address" case is common. There are always going to be cases where using the other interpretation would help, but the concern is what's more compatible, both with the intent of the standard (where there's already the notion that two pointers can compare equal but have different provenance) and real-world code.

If they compare equal, then this introduces a lot of footguns into any code that uses sealing incredibly hard to write with correct security guarantees: a tag-clearing unseal operation will end up with an untagged capability that will compare equal to something that has provenance.

Yes, programmers need to be careful when dealing with untrusted input and use the exact equality intrinsic if this is relevant to their use case. But "it's easy for people to accidentally write new incorrect CHERI-specific code" doesn't trump large C/C++ corpus compatibility, because if CHERI C/C++ is too incompatible with how real-world code uses C/C++ that makes it less likely it'll be adopted and there won't be any CHERI platforms for you to write your potentially-buggy CHERI sealing code for outside of perhaps some niche microprocessors.

I think you would find it very hard to find a programmer who would find the semantics of this to be expected:

bool is_valid(uintptr_t val)
{
   return val == (uintptr_t)&obj;
}

void use_thing(uintptr_t val)
{
  if (is_valid(val))
  {
    SomeType *ptr = (SomeType*)val;
    ptr->someField = 12; // <- Crash because ptr is untagged, yet compares equal to `&obj`.
  }
}

This only works in non-CHERI C because of the special nature of pointer-to-integer casts in the de-facto (and proposed) provenance models; if you call use_thing with a one-past-the-end pointer that happens to alias obj, the only reason this isn't UB is because the cast of &obj to uintptr_t exposes it. This isn't testing the notion of equality so much as the notion of integer-to-pointer casts in non-CHERI C provenance models. If you change all your uintptr_ts to void * then use_thing is UB when called with a one-past-the-end pointer that aliases obj, and compilers optimise based on that. So really your concern is "uintptr_t is weird", which is true in CHERI C, because there are no good options, only empirically less bad ones, and rules like "don't break (uintptr_t)p & mask (in)equality" in order to not be incompatible with swathes of real-world code restrict our options. In a language without uintptr_t you probably could have more luck imposing stricter semantics, but I don't see how you can feasibly do that for C/C++. For us to change it, someone would have to do a large-scale study of how disruptive the change would be; I'm struggling to think of cases where we've had to patch upstream software because of the current (in)equality semantics. And of course those changes to use ptraddr_t would not be upstream able, so we'd have to maintain them until CHERI becomes a deployed architecture, because upstreams don't tend to be receptive to patches that introduce new weird types defined by niche research architectures.

I certainly have no come across anyone for whom this is expected behaviour.

Note that compliance with a de-facto C standard has always been a higher-priority goal than conformance with the letter of the de-jure C standard for CHERI C.

Plus, the use-after-free, whether realloc or not, is UB, and recent GCC fortification options break in the same way as CHERI if you skip updating pointers when the address doesn’t change after realloc.

Yes, I said almost that:

That said, the realloc comparison is actually UB and recent clang optimises it to false for non-CHERI platforms.

Subobject bounds isn't clear-cut like you portray. There are cases where including the bounds helps, but there are also cases where it hurts (e.g. using a pointer as a key and you're inconsistent about taking a pointer to a field vs casting/doing pointer arithmetic).

If you're using the pointer as a key then you probably want to be explicitly casting to the address explicitly. If you are using address identity and you set a value using a bounded pointer and query it with an unbounded pointer then things will probably work with address comparison, but the converse is tricky because you may then experience out-of-bounds accesses because you expect to be able to do the same operations on the queried value as on the stored key.

A tagged capability at address 0 isn't a real issue, you can't realistically have an object at address 0 in traditional de-facto C and CHERI C doesn't remove most of those footguns.

Neither of these is true:

  • In de-jure C, a null pointer is not necessarily a bit pattern of zeroes, it is simply the bit pattern that the implementation uses to represent the conversion from an ICE of 0 to a pointer type.
  • In de-facto C, 0 is often a mappable address. On AMD GPUs, it's a valid stack address, in kernels and embedded systems it's often a real location. Modern kernels typically map a no-access page there but I've encountered embedded systems that have an MMIO device starting at address 0. CHERI can differentiate between a null pointer and a pointer to this location, non-CHERI C has to be very careful about compiler optimisations when using it.

Point 3 is the only legitimate case, and for that we're free to define the semantics however we choose. We choose this because, empirically, this appears to work best. This is particularly true if uintptr_t is involved, where whether you get a tag or not can depend on the exact arithmetic being performed and the optimisation level. Otherwise you have things like if (((uintptr_t)p & PAGE_MASK) == 0) as standard upstream code which then doesn't work because the bounds get inherited on the left-hand side and, if you happen to have a pointer to a low page, probably stays representable and keeps its tag.

That's an interesting example, but one that could be trivially fixed by changing uintptr_t to size_t, whereas there is no trivial and portable fix for the cases where the user expects the converse.

You see it as desirable. I see it as "it depends", and in practice our experience is that the "comparing capabilities with different metadata but the same address" case is common. There are always going to be cases where using the other interpretation would help, but the concern is what's more compatible, both with the intent of the standard (where there's already the notion that two pointers can compare equal but have different provenance) and real-world code.

That is also my concern and my experience is that there is a rapid uptick in cases where the address comparison causes surprising and incredibly hard-to-debug failures once you start using CHERI features in the codebase. These failures depend on dynamic data values and the symptom is often very far from the cause. In contrast, the failures that I've seen from the converse are things that can be detected via compiler warnings (comparison of provenance-carrying uintptr_t can never succeed).

Point 3 is the only legitimate case, and for that we're free to define the semantics however we choose. We choose this because, empirically, this appears to work best. This is particularly true if uintptr_t is involved, where whether you get a tag or not can depend on the exact arithmetic being performed and the optimisation level. Otherwise you have things like if (((uintptr_t)p & PAGE_MASK) == 0) as standard upstream code which then doesn't work because the bounds get inherited on the left-hand side and, if you happen to have a pointer to a low page, probably stays representable and keeps its tag.

That's an interesting example, but one that could be trivially fixed by changing uintptr_t to size_t, whereas there is no trivial and portable fix for the cases where the user expects the converse.

I agree this is trivially fixable by changing the underlying types, however, there are cases where the compiler can't necessarily know what the correct behaviour should be (we can't easily look through intermediate variables (even constexpr ones) and performing a "all-bits" comparison will lead to surprising behaviour. It's a tradeoff on which code we want to break - and in my experience the alignment checking code is quite common.
Additionally changing uintptr_t to size_t may not necessarily be upstreamable since it's not guaranteed to be large enough for an address (although in practice it will be pretty much anywhere).

You see it as desirable. I see it as "it depends", and in practice our experience is that the "comparing capabilities with different metadata but the same address" case is common. There are always going to be cases where using the other interpretation would help, but the concern is what's more compatible, both with the intent of the standard (where there's already the notion that two pointers can compare equal but have different provenance) and real-world code.

That is also my concern and my experience is that there is a rapid uptick in cases where the address comparison causes surprising and incredibly hard-to-debug failures once you start using CHERI features in the codebase. These failures depend on dynamic data values and the symptom is often very far from the cause. In contrast, the failures that I've seen from the converse are things that can be detected via compiler warnings (comparison of provenance-carrying uintptr_t can never succeed).

I would be extremely happy if we could actually diagnose all these cases in the compiler, but as of today I am not convinced this is always possible. I fully agree that we should support your desired behaviour, but I don't think we can do this by default since I am certain flipping the default will silently break existing code.

That being said, I think we should fix -cheri-comparison=exact to work on Morello+RISC-V and you could enable that flag for all CHERI-aware software? I'd like to see it as the default, but with the current compiler I fear it breaks too much real-world code.

That being said, I think we should fix -cheri-comparison=exact to work on Morello+RISC-V and you could enable that flag for all CHERI-aware software? I'd like to see it as the default, but with the current compiler I fear it breaks too much real-world code.

This would definitely be great for the embedded code and we could turn it on by default in our build system.

I worry a bit about the ABI implications here for systems with more-separate compilation, especially with respect to ODR issues. std::set<void*>::insert, for example, would produce ODR violations if compiled in two compilation units with different settings, so we might need to either have an ELF flag indicating the mode or change name mangling if we want it to be more than an experimental flag.