riscv/riscv-isa-manual

Clarify PMP behaviour for accesses split across pages

Timmmm opened this issue · 16 comments

Suppose you have a virtual memory access that spans the boundary between two pages that are not contiguously mapped to physical memory. This means you'll be doing two accesses. Suppose each sub-access falls fully within a different PMP region: the first access is fully within PMP region 0, and the second access is fully within PMP region 1 (and both regions have the appropriate permission bits set). Like this:

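[Image: diagram of the access, with the overall access drawn as a blue box and the two sub-accesses drawn as red boxes]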

The spec is not entirely clear whether this should succeed or not. It says:

The lowest-numbered PMP entry that matches any byte of an access determines whether that access succeeds or fails. The matching PMP entry must match all bytes of an access, or the access fails

But it's not clear whether it is talking about the overall access (the blue box) or each of the red boxes.
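For reference, the quoted rule can be sketched for a single access as below. This is only an illustrative model, not spec text: `PmpEntry`, `pmp_check`, and the half-open address ranges are hypothetical names and simplifications (real PMP entries use TOR/NA4/NAPOT encodings), and the fall-through case assumes an S-mode or U-mode access with at least one PMP entry implemented.

```python
from typing import NamedTuple, Sequence


class PmpEntry(NamedTuple):
    """Hypothetical, simplified PMP entry covering [lo, hi)."""
    lo: int      # first byte covered (inclusive)
    hi: int      # one past the last byte covered (exclusive)
    perms: str   # subset of "rwx"


def pmp_check(entries: Sequence[PmpEntry], addr: int, size: int, perm: str) -> bool:
    """Return True if ONE access of `size` bytes at physical `addr` passes."""
    for e in entries:  # index 0 is the lowest-numbered (highest-priority) entry
        matches_any_byte = addr < e.hi and addr + size > e.lo
        if not matches_any_byte:
            continue
        # The lowest-numbered entry matching any byte decides the outcome,
        # and it must match all bytes of the access.
        matches_all_bytes = e.lo <= addr and addr + size <= e.hi
        return matches_all_bytes and perm in e.perms
    # Assumption: S/U-mode access that matches no entry fails.
    return False


entries = [PmpEntry(0x1000, 0x2000, "rw"), PmpEntry(0x2000, 0x3000, "rw")]
inside_region_0 = pmp_check(entries, 0x1FFC, 4, "r")   # all 4 bytes in entry 0
straddling = pmp_check(entries, 0x1FFE, 4, "r")        # entry 0 matches only 2 bytes
```

The open question in this issue is what counts as the "access" passed to a check like `pmp_check`: the whole instruction-level access, or each piece it is split into.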

It does say:

Note that a single instruction may generate multiple accesses, which may not be mutually atomic. ... Notably, instructions that reference virtual memory are decomposed into multiple accesses.

But it's not clear if this is talking about multiple accesses due to page table walking, or due to splitting across page boundaries.

Generally it would benefit from defining "access"!

This means you'll be doing two accesses.

Not necessarily. A memory-access instruction with a misaligned effective address may give rise to multiple accesses, but it also might not. (The page-crossing aspect is a red herring; it's possible and valid to implement page-crossing accesses as a single access.) It's also valid for this situation to give rise to multiple memory accesses, if the implementation so chooses.

And that gets to the heart of your question. If the implementation performs only one access, then the PMP constraint about the entire access fitting within the PMP applies to that access. If the implementation performs multiple accesses, then the PMP constraint applies to each individual access. That is to say, the blue box and the two red boxes are both legal outcomes.
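A minimal sketch of those two outcomes, assuming a misaligned load whose bytes straddle two adjacent PMP regions. The tuple layout `(lo, hi, perms)`, the helper names, and the choice to split at the region boundary are all illustrative assumptions, not anything the spec defines.

```python
from typing import List, Tuple

# Hypothetical PMP model: (lo, hi_exclusive, perms); index 0 has highest priority.
Entry = Tuple[int, int, str]


def pmp_check(entries: List[Entry], addr: int, size: int, perm: str) -> bool:
    for lo, hi, perms in entries:
        if addr < hi and addr + size > lo:  # entry matches at least one byte
            return lo <= addr and addr + size <= hi and perm in perms
    return False  # assumption: S/U-mode access with no matching entry fails


def split_at(addr: int, size: int, boundary: int) -> List[Tuple[int, int]]:
    """Split [addr, addr+size) into pieces that do not cross `boundary`."""
    if addr < boundary < addr + size:
        return [(addr, boundary - addr), (boundary, addr + size - boundary)]
    return [(addr, size)]


entries = [(0x1000, 0x2000, "rw"), (0x2000, 0x3000, "rw")]
addr, size = 0x1FFE, 4  # misaligned load straddling the two regions

# Implementation A: performs one access, so the check covers all 4 bytes.
as_one_access = pmp_check(entries, addr, size, "r")

# Implementation B: performs two accesses, each checked on its own.
pieces = split_at(addr, size, 0x2000)
as_two_accesses = all(pmp_check(entries, a, s, "r") for a, s in pieces)

print(as_one_access, as_two_accesses)
```

In this model the same instruction fails under implementation A and succeeds under implementation B, which is the pair of legal outcomes described above.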

Ok now I'm even more confused! :-D

So are you saying that in the example the overall physical read/write (both red boxes) could be a "single access" even though they are not contiguous? I always assumed an "access" would always be at least contiguous. What exactly is an "access"?

This is still very unclear to me.

No, what Andrew was saying is that an overall access by a load/store instruction may be performed as one memory access or may be broken up (aka decomposed) into pieces and performed as multiple memory accesses.

Yes but the example I gave MUST be decomposed into pieces because it is discontiguous in physical memory.

So the point is that a single memory access that straddles two PMP regions will be checked as one access

How? PMP checks can only be done on physical addresses, and the physical addresses accessed are two discontiguous chunks.

while individual "piece" accesses may each fall in just one region or the other - and will each be individually checked.
..
So, IFF decomposed, they each individually must pass or fail both PMP checks and MMU checks (regardless if they are contiguous or not).

Yes, they are individually checked, but are they part of the same "access" or not? That's important because of this bit from the spec:

The lowest-numbered PMP entry that matches any byte of an access determines whether that access succeeds or fails. The matching PMP entry must match all bytes of an access, or the access fails

The fundamental issue is that the spec talks about "access" but we have at least two types of "access" and it isn't clear which it is talking about:

  • The access that the instruction requested (which may be in physical or virtual memory).
  • The actual accesses of physical memory, of which there can be 1 or more if it needs to be decomposed (or even if it doesn't).

Let me give two alternative specifications that would specify the desired behaviour and you can tell me which one is the intended one. :-)

Option 1

The lowest-numbered PMP entry that matches any byte of an access determines whether that access succeeds or fails. The matching PMP entry must match all bytes of an access, or the access fails. If an instruction's memory access is decomposed into multiple physical memory accesses (for example because it crosses a virtual page boundary and the pages are not mapped to contiguous physical memory), then all bytes of all of the decomposed accesses must match a single PMP entry or the overall access fails.

Option 2

The lowest-numbered PMP entry that matches any byte of an access determines whether that access succeeds or fails. The matching PMP entry must match all bytes of an access, or the access fails. If an instruction's memory access is decomposed into multiple physical memory accesses (for example because it crosses a virtual page boundary and the pages are not mapped to contiguous physical memory), then each physical memory access is independent. If any of them fail then the overall access fails, but they do not need to all match the same PMP entry.
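To make the difference concrete, here is a rough sketch of the two options as predicates over a list of decomposed physical pieces. Everything here is hypothetical naming for illustration (`deciding_entry`, `option1`, `option2`, the `(lo, hi, perms)` tuples), and it assumes S/U-mode behaviour where a piece that matches no entry fails.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, int, str]   # hypothetical (lo, hi_exclusive, perms)
Piece = Tuple[int, int]        # (physical address, size in bytes)


def deciding_entry(entries: List[Entry], addr: int, size: int, perm: str) -> Optional[int]:
    """Index of the entry that lets this piece pass, or None if the piece fails."""
    for i, (lo, hi, perms) in enumerate(entries):
        if addr < hi and addr + size > lo:  # lowest-numbered entry matching any byte
            ok = lo <= addr and addr + size <= hi and perm in perms
            return i if ok else None
    return None  # assumption: no matching entry means failure


def option1(entries: List[Entry], pieces: List[Piece], perm: str) -> bool:
    """All bytes of all pieces must match one and the same PMP entry."""
    idx = [deciding_entry(entries, a, s, perm) for a, s in pieces]
    return None not in idx and len(set(idx)) == 1


def option2(entries: List[Entry], pieces: List[Piece], perm: str) -> bool:
    """Each piece is checked independently; entries may differ between pieces."""
    return all(deciding_entry(entries, a, s, perm) is not None for a, s in pieces)


# Two discontiguous physical pieces, each inside a different PMP region.
entries = [(0x1000, 0x2000, "rw"), (0x8000, 0x9000, "rw")]
pieces = [(0x1FFE, 2), (0x8000, 2)]

result_option1 = option1(entries, pieces, "r")
result_option2 = option2(entries, pieces, "r")
```

The two predicates only disagree when every piece passes on its own but the pieces are decided by different entries, which is exactly the scenario in the first comment.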

@allenjbaum thanks for the table but I couldn't figure out the layout. Any chance you could post it again in markdown format or CSV?

It’s simply not true that physical discontinuity mandates that an access be broken up. Yes, it’s a practical choice to do so, but nothing in the spec says that it must be so. My earlier post lays out the valid options.

Ok but either way it could be decomposed into multiple accesses, so the question still stands. If it does decompose it into two physical accesses and they match different PMP regions, is the access required to fail, succeed, or either?

The second paragraph of my original post directly answers that question.

Ah ok, I think I didn't follow because you said the blue box is a valid outcome, but that's not a physical memory access. So would you say this is correct?

If an instruction's memory access is decomposed into multiple physical memory accesses then each physical memory access is independent. If any of them fail then the overall access fails, but they do not need to all match the same PMP entry.

A single physical access may read or write more than one discontiguous region of memory. For example if a virtual memory access crosses a page boundary and the pages are not mapped to contiguous physical memory, then two discontiguous regions of physical memory will need to be accessed. This may be done as a single access or decomposed into multiple accesses. If it is performed as a single discontiguous access then all the bytes in both regions (but not the bytes in-between) must be in a single PMP region in order for the access to succeed. From a software perspective the behaviour in this case is implementation-defined.

Yeah, the blue box comment was confusing in retrospect. What I was trying to say is that not breaking up the access is valid.

Indeed I agree with that description, but I'll add two things for clarification:

  • Although "if any of them fail then the overall access fails" is true in the sense that an exception will be raised, it is also the case that a subset of the original access may be performed--namely, for the subset of accesses that pass the PMP check, side effects may be actioned and store data may be written to memory. (If you consider that misaligned accesses may be trapped and emulated using a sequence of byte accesses, it makes sense why this might happen.)
  • The "implementation-defined" characterization is true, but the set of valid behaviors is quite heavily restricted (to the set of behaviors we've been discussing in this thread).
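The first point can be illustrated with a small simulation of a misaligned store that is emulated as a sequence of byte stores. This is a sketch under stated assumptions: the dictionary-backed memory, the `AccessFault` exception, the `(lo, hi, perms)` PMP tuples, and the low-to-high byte order are all hypothetical choices, and a real implementation may order or group the pieces differently.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, int, str]  # hypothetical (lo, hi_exclusive, perms)


class AccessFault(Exception):
    """Stands in for the access-fault exception raised on a failed PMP check."""


def pmp_check(entries: List[Entry], addr: int, size: int, perm: str) -> bool:
    for lo, hi, perms in entries:
        if addr < hi and addr + size > lo:
            return lo <= addr and addr + size <= hi and perm in perms
    return False  # assumption: S/U-mode access with no matching entry fails


def store_bytewise(entries: List[Entry], memory: Dict[int, int],
                   addr: int, data: bytes) -> None:
    """Emulate a store as independent byte accesses, lowest address first."""
    for offset, byte in enumerate(data):
        if not pmp_check(entries, addr + offset, 1, "w"):
            raise AccessFault(hex(addr + offset))
        memory[addr + offset] = byte  # this byte's side effect is already visible


# Region 0 is writable, region 1 is read-only.
entries = [(0x1000, 0x2000, "rw"), (0x2000, 0x3000, "r")]
memory: Dict[int, int] = {}

try:
    store_bytewise(entries, memory, 0x1FFE, b"\xAA\xBB\xCC\xDD")
    faulted = False
except AccessFault:
    faulted = True
```

After the fault, `memory` in this model already holds the two bytes that landed in the writable region, even though the instruction as a whole raised an exception.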

That makes sense, thank you!

a subset of the original access may be performed

Ah is this what defines an "access" - something that will be performed in its entirety or not at all?

If so that makes it much clearer. I'll try to make a PR at some point to add clarifying text.

I think we should follow the RVWMO spec’s terminology, which differs from what we’ve informally used in this thread. In that spec, we say that a memory-access instruction gives rise to potentially multiple memory operations. Yes, those operations are indivisible in the context of precise exceptions, coherent memory regions, etc.

Ah interesting. That seems slightly inconsistent with the PMP spec wording then. If a memory-access instruction (e.g. lw) leads to a single "access" that is performed using 1 or more atomic "operations", then this:

The lowest-numbered PMP entry that matches any byte of an access determines whether that access succeeds or fails. The matching PMP entry must match all bytes of an access, or the access fails.

suggests that all bytes must be in the same PMP entry, i.e. the example in my first comment MUST fail. But from this discussion that isn't the case so it should be worded something like this:

A memory access can be decomposed into one or more atomic memory operations (see Chapter ...). PMP checks are performed independently on each operation. The lowest-numbered PMP entry that matches any byte of a memory operation determines whether that operation succeeds or fails. The matching PMP entry must match all bytes of an operation, or the operation fails.

If any operation fails then the overall access will fail, however passing operations may cause side effects.

Yeah, RVWMO invented new terminology after the PMP spec was written. I figured that if we're going to end up tweaking the wording for clarity, we might choose to unify the terminology, too.