CTSRD-CHERI/llvm-project

Less hacky approach for LLVM address spaces and purecap ABIs

jrtc27 opened this issue

LLVM defines the default address space as 0, with all others reserved for target-specific meaning. All our targets leave address space 0 as DDC-relative/authorised integer addresses and use address space 200 (an arbitrary choice, though some places still hard-code 200) for capabilities. The data layout allows the program, globals and alloca address spaces to be set to non-zero values, but many of the places that hard-code address space 0 are creating pointers to which none of those three applies, and every one of them is wrong when targeting a pure-capability ABI. As a result we carry hacks that hard-code an address space or borrow one of the unrelated program/globals/alloca address spaces in places where it makes no sense, and we have had to make various intrinsics polymorphic so that they work with capabilities. The current state is not suitable for upstreaming. This leaves us with several options:

  1. Change the language to give less special meaning to address space 0, so that there is a configurable default address space (this may or may not extend to the syntax, such that i8* means a pointer in the default address space rather than address space 0)
  2. Change our use of address spaces such that 0 is used for capabilities in purecap ABIs (and something else for integers if we want to keep that support)
  3. Introduce something orthogonal to address spaces to represent capabilities vs integers

Option 3 seems like the conceptually nicest approach, but is totally intractable. Option 2 is arguably what the current language mandates, but it is a total pain for backends and may not be compatible with GlobalISel: the address space for pointer types is a constant in TableGen sources, so you would need multiple copies of each pattern, each predicated on the ABI type but otherwise identical. SelectionDAG is probably just as awful. That leaves option 1, which seems the most doable but requires buy-in from upstream that this is the right path forwards, as it is a significant change to the language specification.
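
To make the gap concrete, here is a toy Python model of the situation (not LLVM code). It reads the `A<n>`, `P<n>` and `G<n>` specifiers out of a data layout string and then picks an address space for a new pointer. The layout string, the `AddrSpaces` class, `parse_addr_spaces`, `pointer_addrspace` and the `kind` labels are all made up for illustration. The `default` parameter stands in for the configurable default of option 1 and does not correspond to any existing data layout specifier.

```python
from dataclasses import dataclass

# Illustrative purecap-style layout string, assumed for this sketch.
PURECAP_LAYOUT = "e-m:e-p200:128:128-i64:64-n64-S128-A200-P200-G200"


@dataclass(frozen=True)
class AddrSpaces:
    alloca: int = 0       # A<n>
    program: int = 0      # P<n>
    global_vars: int = 0  # G<n>


def parse_addr_spaces(datalayout: str) -> AddrSpaces:
    """Pull the A<n>, P<n> and G<n> specifiers out of a data layout string.

    Anything not specified stays at 0. Lower-case specifiers such as
    p200:... are deliberately ignored here.
    """
    found = {"A": 0, "P": 0, "G": 0}
    for spec in datalayout.split("-"):
        if spec[:1] in found and spec[1:].isdigit():
            found[spec[0]] = int(spec[1:])
    return AddrSpaces(found["A"], found["P"], found["G"])


def pointer_addrspace(kind: str, spaces: AddrSpaces, default: int = 0) -> int:
    """Choose an address space for a new pointer of the given kind.

    Only allocas, functions and globals have a data layout entry. Every
    other kind falls back to `default`, which models the hard-coded 0
    today and a configurable default under option 1 (hypothetical).
    """
    by_kind = {
        "alloca": spaces.alloca,
        "function": spaces.program,
        "global": spaces.global_vars,
    }
    return by_kind.get(kind, default)


if __name__ == "__main__":
    spaces = parse_addr_spaces(PURECAP_LAYOUT)
    print(pointer_addrspace("alloca", spaces))              # 200
    print(pointer_addrspace("other", spaces))               # 0
    print(pointer_addrspace("other", spaces, default=200))  # 200
```

The middle call is the problematic case: a pointer that is neither an alloca, a function nor a global gets address space 0 even though every specifier in the layout says 200. Borrowing one of the three existing entries for it would give the right number for the wrong reason, which is the kind of hack described above.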