Feedback on RISC-V N-Trace Specification 1.0.0 rc2.0 - chapter 8

Question

Feedback on RISC-V N-Trace Specification 1.0.0 rc2.0 - chapter 8

Closed this issue 4 months ago · 1 comments

ved-rivos commented 4 months ago

Section 8.1: "following Sv39/Sv48/Sv57 address generation rules"
needs to be updated. The paged virtual-memory system does not provide rules
for address generation but for address validity checks. Suggest update as follows:
"Address transmission in RISC-V N-trace is in compliance with the IEEE-5001
Nexus Standard. Per the IEEE-5001 Nexus Standard, the most significant
address bits that are zero are not transmitted. Additionally, for RV64,
RISC-V N-trace introduces an extension that allows for the omission of
identical most significant bits during address transmission. This extension
exploits the rules set forth by the RISC-V paged virtual-memory system for
valid virtual addresses when SXLEN=64. For further details on this
optimization, please refer to the 'Virtual Addresses Optimization' section.
section 8.1 :
"Rules when generating addresses" -> "Rules for generating addresses"
The bullet 4 is not rules for generating addresses. The bullet 4 can be made
a sub-bullet of bullet 3.
Section 8.2 - "Trace encoder must..." the rules in this paragraph is not
complete. Applying this rule at encoder as stated will not lead to the
desired result or match the examples. For example, applying this rule will
lead to add address=0xFFFF_FFFE_3FFF_FFFE being encoded as 0x1FFF_FFFF
which when decoded will lead to a wrong address.

A more detailed and correct algorithm should be included. For example
something like follows.

The process for detecting and skipping most significant identical bits in an
address A is as follows.
1. Let A be the XLEN (64 or 32) bit wide address.
2. Let i be XLEN-1
3. if bits i:i-5 are not identical to bit i-6 then goto step "vi"
4. Let i = i - 6
5. if i > 6 go to step "ii"
6. Let B[i-1:0] = A[i:1]
7. Let p = (i-1) % 6
8. if i < XLEN-1, then set bits B[i+p-1:i] to the value of bit B[i-1]
9. transmit B
Section 8.4 - the statement "Field I-CNT..." seems to conflate the field in the
message with the counter in the encoder.
"The I-CNT field, present in most messages, transmits the value of the
I-CNT counter, which counts the number of halfwords used to encode
retired instructions."

"The I-CNT counter in the trace encoder is reset to 0, in accordance with
the IEEE-5001 Nexus Standard, under one of the following two conditions:
- When tracing starts or is restarted for any reason.
- After the I-CNT counter value has been transmitted in a message."
section 8.4.1 -> remove reference to "prefetch fault" as there is no
such exception.
Table 26 should indicate which messages are used for each of the cases in
BTM and in HTM. The examples in section 8.8 do not provide all details.
Specifically, the details around treatment of HIST/HREPEAT/B-COUNT are not
provided in this section or in the examples. For BTM, if the choice between
use of a ProgramTraceSync with code=4 or ResourceFull message is a
implementation choice then some guidance for that choice should be provided;
and if it is not then under which circumstances the choice is made should be
specified.
Section 8.6: Timestamp recording -> Timestamp reporting

Editorial:

Section 8.1 - "Normally (with above bit enabled..." ->
"Normally (without the above bit enabled or implemented), addresses with many
most significant bits set to 1 will be sent as long packets (as variable size
fields skip only the most significant bit set to 0). The following address,
0xFFFF_FFFF_8000_31F4 (a real address from the Linux kernel), will be encoded
as F-ADDR=0x7FFF_FFFF_C000_18FA (with the least significant 0-bit skipped).
Such a 63-bit variable field value will require 11 bytes to be sent (as we
have 6 MDO bits in each byte)."
Section 8.3 and 8.3.1

"When operating in HTM mode, the encoder does not generate messages for
conditional branches. Instead, it maintains a HIST register or accumulator to
record the outcomes of these branches, whether taken or not. Each conditional
branch contributes a single bit to the HIST register, as follows:
- A bit with a value of 1 is appended at the least significant position for a
  taken conditional branch.
- A bit with a value of 0 is appended at the least significant position for a
  not-taken conditional branch.
The HIST register may be implemented as a left-shift register. Initially, when
the HIST register is empty, bit 0 of the register is set to 1, with all other
bits set to 0. Subsequent conditional branches cause the register to shift left,
recording each taken or not-taken outcome in bit 0. The transition of the most
significant bit from 0 to 1 indicates the register is full. At this point, the
entire register, including the most significant bit — which serves as the
stop-bit — is transmitted. After transmission, the register is reset to its
initial, empty state. The HIST register is transmitted using a ResourceFull
message with the RCODE field set to either 1 or 2.

Decoders must initiate the interpretation of the HIST field starting from the
second most significant bit. The most significant bit, designated as the
stop-bit, is invariably set to 1. This second most significant bit—immediately
following the stop-bit—encodes the outcome of the first conditional branch
captured in the HIST register. Conversely, the least significant bit represents
the outcome of the last conditional branch prior to the transmission of the HIST
register.

When a HIST register is full, if its value is the same as that of the HIST
field transmitted in the last ResourceFull message, then the encoder may
increment an internal HREPEAT counter (history repeat counter) instead of
generating a ResourceFull message if the Repeated History Optimization is
enabled. See Repeated History Optimization section for further details."
Section 8.4 "Every retired instruction..." and its sub bullets->
"Every retired instruction increments the I-CNT counter by 1 for 16-bit
instructions and by 2 for 32-bit instructions. This increment applies to
all forms of branch instructions as well. However, instructions that either
raise exceptions or are interrupted prior to retirement do not increment
the I-CNT counter.
Section 8.4 "When I-CNT counter is full..." ->
"When the I-CNT counter reaches its maximum value, becoming full, it can be
reported in one of two ways:
- By using a ResourceFull message with RCODE=0. This method is applicable
  to both BTM and HTM.
- By using a Synchronizing message with SYNC=4 (Sequential Instruction
  Counter), which is exclusive to BTM. For HTM, since no SYNC code exists
  to report HIST register overflow, employing a ResourceFull message is
  necessary. Thus, using a ResourceFull message in HTM for I-CNT overflow
  reporting ensures consistency across reporting mechanisms."
section 8.4.1 "(... does not matter)" -> "(Specific operations are abstracted
as "..." as they do not matter for this example)"
Table 26 "but HIST and I-CNT should provide the PC " ->
"The I-CNT and HIST fields may be used determine the PC of the last
instruction retired before reset".
"If timestamp is enabled all" -> "If the timestamp is enabled, all"
"The TSTAMP field is a variable-length field and most ..." ->
"The TSTAMP field is a variable-length field, and most significant bits set
to 0 will not be transmitted. This approach provides good compression for
both relative and absolute timestamps."
Section 8.6
"The following rules must be observed..."->
"The following rules must be observed:
- If timestamps are enabled, ALL Synchronizing Messages must include
  an absolute TSTAMP value.
- It is not required for all non-synchronizing messages to always
  report a timestamp. Doing so may be opted for saving trace bandwidth or
  in the case of sending back-to-back messages.
- The absolute timestamp cannot exceed 64 bits (even with 1ps resolution,
  64-bit counters will overflow in about 584 years).
  - Implementations may choose a smaller counter. Trace tools may assume
    the timestamp will not overflow in a single session, although adding
    support for overflow is not significantly challenging.
- It is suggested that in multi-hart systems, all Trace Encoders use a
  shared timestamp (for better trace correlation), but it is not mandatory.
- In all cases, when an address is provided, the timestamp should reflect
  the time when an event leading to that particular address being sent
  occurred."
"If the above is not possible..."->
"If the above is not feasible, timestamps should at least be reported
consistently, ensuring that the time distance between distant events (for
example, a periodic timer interrupt) can be reliably calculated. It is necessary
to ensure that the time reported at exceptions/interrupt handlers reflects the
moment when the exception or interrupt was observed."
Make section 8.8 a sub-section of 8.6. In section 8.8 correct spelling of
"ProgramTraceSyn". Either explain the red color coding or remove it if it is
not significant. The examples seem to refer to foot notes 1, 2, and 3 but
the foot notes are missing. For case 6, what does "After P" mean?
The case 6 seems to imply that "trTeInstTracing" has pulses periodically
causing it to transition from 1->0 and 0->1 at P intervals. Is that
intended? Same question about trTeInstTracing dropping to 0 when SYNC TRIGGER
occurs.

Answer 1 · 2024-04-01T11:00:51.000Z

Addressed in 1.0.0_rc25 (or even earlier ...) version.