riscv/riscv-isa-manual

Don't write in prose

Timmmm opened this issue · 8 comments

One aspect of the spec that makes it quite hard to read is the prose-based style. A good (bad) example of this is CSR fields, e.g. mstatus. It's easy enough to find the field diagram:

image

But then suppose you want to know what MPIE does. You basically have to scan through the entire mstatus section and notice it in these big paragraphs:

image

The style just makes quickly finding information quite tedious. The same is true for instructions:

image

This style just makes it difficult to jump to the information. Very small changes can greatly improve this:

image

If written in this style it also makes it easier to hyperlink to instructions, CSRs and CSR fields. For example CSR diagrams could have each field be a link to its definition.

This style would also make #1397 easier.

Very small changes can greatly improve this

The title of your PR indicates you're on a stylistic bent that we don't necessarily agree with.
How about you make a PR to prove your point so we can evaluate it.

The very small changes I was referring to are demonstrated in the screenshot.

I'll make a PR for the base instructions as I suggested, and I'll try and do one for mstatus too since that is a bigger change and probably more debatable!

I generally disagree that we want to change this aspect of the manual's style, but others can review the PR and see what they think.

Many of these changes will be superseded by the future addition of a one-instruction-per-page appendix. Both styles serve a purpose; which you want depends on whether you're using the manual for reference or for complete understanding.

I tend to agree with @Timmmm on that it's much harder to digest what particular instructions do.
See for example SLTI instruction description (https://github.com/riscv/riscv-isa-manual/blob/main/src/rv32.adoc?plain=1#L302):

SLTI (set less than immediate) places the value 1 in register rd if register rs1 is less than the sign-extended immediate when both are treated as signed numbers, else 0 is written to rd. SLTIU is similar but compares the values as unsigned numbers (i.e., the immediate is first sign-extended to XLEN bits then treated as an unsigned number). Note, SLTIU rd, rs1, 1 sets rd to 1 if rs1 equals zero, otherwise sets rd to 0 (assembler pseudoinstruction SEQZ rd, rs).

Even worse it's with what follows mixing multiple instructions in random order in the text:

ANDI, ORI, XORI are logical operations that perform bitwise AND, OR, and XOR on register rs1 and the sign-extended 12-bit immediate and place the result in rd. Note, XORI rd, rs1, -1 performs a bitwise logical inversion of register rs1 (assembler pseudoinstruction NOT rd, rs).

For comparison that's what Shakti folks made back in the day (https://shakti.org.in/docs/risc-v-asm-manual.pdf), see a screenshot below. What I like here is:

  • Clear layout of the mnemonic: slti rd, rs1, imm.
  • Clear explanation of operand roles: destination and sources register, immediate data.
  • Example of use with accompanying pseudocode: slti x5, x1, 2 # x5 ← x1 < 2

The pseudocode should reflect the operation, and the pseudocode for SLTI and SLTIU is identical, while the
semantics are clearly not.

I completely agree. And that example was not in a sense that Shakti documentation is better in any way. Instead, it's just a demonstration of certain aspects of that document. I.e. the way each instruction could be represented. Each instruction needs to have its own pseudocode precisely describing semantics.

Also the earlier pages imply the ABI requirements== architectural requirement

Again, I'm not saying we need to copy what Shakti folks did. Note, that document was last updated 3 or 4 years ago and so far I was not able to find its sources if they were ever published. So definitely there's a lot of room for improvement of that document. But if we get inspired by their work and improve on that in the upstream RISC-V documentation, there will be no need to invent derivatives.