riscvarchive/riscv-platform-specs

Eventual completion for vacant memory

jonmasters opened this issue · 17 comments

The privileged spec says that PMA regions may be vacant. The current specifications are vague about how accesses to unpopulated (aka "vacant") memory are handled. They must mandate a behavior, e.g. all zeros, all ones, or an error response.

The privileged spec says:

Vacant regions are also classified as I/O regions but with attributes
specifying that no accesses are supported.

If no accesses are supported then you'll get an access fault.

I was going to say that this is already clear in the privileged spec but I think that you're right. Only precise PMA checker violations are access faults but the architecture allows for imprecise bus-error interrupts on things like vacant PMA regions. As you say, the platform should probably address this.

To add on to Paul's posts ... PMA violations (such as accessing a 'vacant' region) result in a precise Access Fault exception on the associated load/store instruction. That bit of arch is in good shape.

In contrast, the handling of "bus errors" (i.e. an error response from the system or a slave device in response to a transaction sent out into the system) is implementation-defined and not yet standardized by RV arch. On reads, one could translate that into a precise exception on the associated load. For writes, one can only take either an imprecise exception or an interrupt (or an NMI).

It is worth establishing a standard framework of options for reporting "bus errors" in this area. Also note that AIA reserves a local interrupt number for "bus errors", and the new "recoverable NMI" extension will establish some NMI-related standardization. But overall there is some work to be done.

To be clear, accesses to 'vacant' PMA regions are guaranteed to result in "completion" of the load/store instruction - in the form of a precise exception.

Greg - will the above additional wording regarding the handling of vacant PMA regions be pulled into the Priv spec?

Vacant PMA regions have no inherent difference from other types of PMA regions - other than allowing no accesses versus only certain types of accesses. Past that, the semantics of causing a precise Access Fault exception applies across the board when an access does not receive sufficient PMA access permission. If anything is missing or needs clarification in the Priv spec, it would be of a general nature that applies across all PMA-originating Access Faults (which I'm not seeing a need for).

One comment wrt Paul's last post which said that "the architecture allows for imprecise bus-error interrupts on things like vacant PMA regions". That scenario isn't possible. Since a vacant PMA region allows no accesses to proceed (and causes an Access Fault on all attempted accesses), a transaction is never sent out into the system to have a chance at encountering or receiving a bus-error response of any sort.

The architecture explicitly says that it is possible in the PMA section of the priv spec:

Precise PMA traps might not always be possible, for example, when
probing a legacy bus architecture that uses access failures as part of the discovery mechanism. In
this case, error responses from peripheral devices will be reported as imprecise bus-error interrupts.

If the PMA checker on an implementation is entirely contained in the hart and it is impossible to have (for example) an AXI DECERR then you're right that imprecise bus errors are impossible on that implementation. But it would be difficult for the hart to be omniscient about every single register on every device and whether it could return DECERR.

The Priv text is maybe less than clear. The PMA architecture, insofar as what happens when there is a PMA violation, is clear cut - an Access Fault on the memory-accessing instruction occurs. What I believe the text is trying to say is that if an access is allowed by PMAs, and hence is allowed to go out into the system, then the access may still fail due to a "bus error" - which may be reported as an imprecise "bus-error" interrupt.

In other words, the Priv spec is acknowledging that it may not be possible to catch all "bad" accesses at the PMA check level (and get precise PMA traps). Which is just what one would expect or hope to see acknowledged in the spec - namely that there is not an architectural expectation or requirement that ALL "bad" accesses can or must be caught by PMAs.
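
As a rough illustration of this reading, here is a minimal Python sketch of the two-stage model: a PMA check inside the hart that raises a precise access fault before anything is sent out, followed by a bus transaction that may still come back with an error response. All names (`Region`, `Hart`, `AccessFault`, `flaky_bus`) and the address ranges are hypothetical and not taken from the Priv spec. Treating an address not covered by any region as vacant is an assumption of the sketch, and queueing the bus error as a pending interrupt is only one of the implementation-defined reporting options.

```python
from dataclasses import dataclass, field


class AccessFault(Exception):
    """Models a precise Access Fault taken on the faulting load/store."""


@dataclass
class Region:
    base: int
    size: int
    readable: bool
    writable: bool

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


@dataclass
class Hart:
    regions: list   # PMA regions known to the hart's PMA checker
    bus: object     # callable(addr, is_write) -> True if OK, False on error response
    pending_bus_errors: list = field(default_factory=list)
    bus_transactions: int = 0

    def access(self, addr: int, is_write: bool) -> None:
        # Stage 1: PMA check. A violation is reported precisely, and no
        # transaction is ever sent out into the system.
        region = next((r for r in self.regions if r.contains(addr)), None)
        # Assumption: an address covered by no region is treated as vacant.
        allowed = region is not None and (
            region.writable if is_write else region.readable
        )
        if not allowed:
            raise AccessFault(hex(addr))

        # Stage 2: the access passed the PMA check and goes out on the bus.
        # It may still fail; here the failure is recorded as an imprecise,
        # interrupt-style event rather than raised on the instruction.
        self.bus_transactions += 1
        if not self.bus(addr, is_write):
            self.pending_bus_errors.append(addr)


def flaky_bus(addr: int, is_write: bool) -> bool:
    # Hypothetical device that returns an error response for one register.
    return addr != 0x1000_0004


hart = Hart(
    regions=[
        Region(0x0000_0000, 0x1000, readable=False, writable=False),  # vacant
        Region(0x1000_0000, 0x1000, readable=True, writable=True),    # I/O
    ],
    bus=flaky_bus,
)
```

In this sketch, `hart.access(0x10, is_write=False)` raises `AccessFault` without touching the bus, while `hart.access(0x1000_0004, is_write=True)` returns normally and leaves the address in `hart.pending_bus_errors`.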

If people feel that this is an incorrect reading of the spec and that the text says something different, then I can raise this with the authors of that text and see what they say (and, if desired, a clarification can be added to the text).

I believe that the notion of PMA is broader than you are interpreting. PMA is something that is a combination of the hart (e.g. it only supports N physical address bits so everything above that is vacant), the interconnect (e.g. certain regions only support 32b accesses or don't support AMOs), and the end device or memory (e.g. a certain region is read-only).
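
To make that broader interpretation concrete, here is a small Python sketch in which the effective attributes of an address are the intersection of what the hart, the interconnect, and the end device each support. The `Access` flags and the `effective_pma` helper are hypothetical names for illustration only; nothing here is defined by the Priv spec.

```python
from enum import Flag


class Access(Flag):
    NONE = 0
    READ = 1
    WRITE = 2
    AMO = 4


ALL = Access.READ | Access.WRITE | Access.AMO


def effective_pma(hart: Access, interconnect: Access, device: Access) -> Access:
    """An access type is supported only if every layer supports it."""
    return hart & interconnect & device


# A read-only device behind an interconnect that does not support AMOs.
rom_region = effective_pma(ALL, Access.READ | Access.WRITE, Access.READ)

# An address above the hart's physical address bits: the hart supports no
# accesses there, so the region is vacant regardless of the other layers.
beyond_pa_bits = effective_pma(Access.NONE, ALL, ALL)
```

Under this view, `rom_region` comes out as `Access.READ` and `beyond_pa_bits` as `Access.NONE`, i.e. a region that supports no accesses.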

If the PMA is only related to the hart's standpoint and the rest is called something else then I think that it should be described less abstractly and something needs to change about: "PMA traps might ... be reported as imprecise bus-error interrupts."

I think we're talking about very different arch models of what PMAs are and are not (corresponding to different interpretations of the "imprecise bus-error" words). I will raise this with the authors of that Priv section to get a clear resolution as to which model is the correct model (and then suitable clarification can be made to the text).

What happens on a "PMA checker violation" is well-defined in the priv spec (i.e. access fault), but what happens on an access to a memory location that does not result in a PMA checker violation, but does encounter some other error, is not well-defined. One might consider the PMA of the latter to be "vacant" even if the PMA checker doesn't say so.

Note that "vacant" PMA regions are explicitly defined as follows: "Vacant regions are also classified as I/O regions but with attributes specifying that no accesses are supported."

Must all vacant regions be represented by the PMA checker?

The Priv spec doesn't outright say that all of PA space must be covered by a PMA region, but it does say the following:

The physical memory map for a complete system includes various address ranges, some corresponding to memory regions, some to memory-mapped control registers, and some to vacant holes in the address space.

And:

The most important characterization of a given memory address range is whether it holds regular main memory, or I/O devices, or is vacant.

I would argue that the latter (bolstered by the former) essentially says that any given memory address region must be one of the three types. Plus, if there were a fourth type of address region that was simply not checked by the "PMA checker" within a hart, then one would expect explicit acknowledgement of this in the preceding statements.
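
A minimal Python sketch of that reading, assuming a hypothetical platform description that lists only the populated ranges: every hole is filled in as vacant, so any physical address classifies as exactly one of the three types. The names (`Kind`, `complete_map`, `classify`) and the 16-bit address space are made up for illustration.

```python
from enum import Enum


class Kind(Enum):
    MAIN_MEMORY = "main memory"
    IO = "I/O"
    VACANT = "vacant"


def complete_map(populated, pa_bits):
    """Return (base, size, kind) ranges covering the whole PA space.

    `populated` lists only memory and I/O ranges; every gap becomes VACANT.
    """
    out, cursor, limit = [], 0, 1 << pa_bits
    for base, size, kind in sorted(populated, key=lambda r: r[0]):
        if base < cursor:
            raise ValueError("overlapping ranges")
        if base > cursor:
            out.append((cursor, base - cursor, Kind.VACANT))
        out.append((base, size, kind))
        cursor = base + size
    if cursor > limit:
        raise ValueError("range exceeds physical address space")
    if cursor < limit:
        out.append((cursor, limit - cursor, Kind.VACANT))
    return out


def classify(full_map, addr):
    """Look up which of the three types an address belongs to."""
    for base, size, kind in full_map:
        if base <= addr < base + size:
            return kind
    raise ValueError("address outside physical address space")


full_map = complete_map(
    [
        (0x8000, 0x4000, Kind.MAIN_MEMORY),
        (0x1000, 0x0100, Kind.IO),
    ],
    pa_bits=16,
)
```

Here `classify(full_map, 0x2000)` yields `Kind.VACANT` even though the platform description never mentioned that address, which is the "no fourth type" behavior argued for above.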