Don't write in prose

Question

Don't write in prose

Timmmm opened this issue 2 months ago · 8 comments

One aspect of the spec that makes it quite hard to read is the prose-based style. A good (bad) example of this is CSR fields, e.g. mstatus. It's easy enough to find the field diagram:

But then suppose you want to know what MPIE does. You basically have to scan through the entire mstatus section and notice it in these big paragraphs:

The style just makes quickly finding information quite tedious. The same is true for instructions:

This style just makes it difficult to jump to the information. Very small changes can greatly improve this:

If written in this style it also makes it easier to hyperlink to instructions, CSRs and CSR fields. For example CSR diagrams could have each field be a link to its definition.

This style would also make #1397 easier.

Answer 1 · 2024-05-11T16:41:05.000Z

Very small changes can greatly improve this

The title of your PR indicates you're on a stylistic bent that we don't necessarily agree with.
How about you make a PR to prove your point so we can evaluate it.

Answer 2 · 2024-05-11T17:20:25.000Z

The very small changes I was referring to are demonstrated in the screenshot.

Answer 3 · 2024-05-13T09:42:28.000Z

I'll make a PR for the base instructions as I suggested, and I'll try and do one for mstatus too since that is a bigger change and probably more debatable!

Answer 4 · 2024-05-14T00:04:03.000Z

I generally disagree that we want to change this aspect of the manual's style, but others can review the PR and see what they think.

Many of these changes will be superseded by the future addition of a one-instruction-per-page appendix. Both styles serve a purpose; which you want depends on whether you're using the manual for reference or for complete understanding.

Answer 5 · 2024-05-16T20:48:11.000Z

I tend to agree with @Timmmm on that it's much harder to digest what particular instructions do.
See for example SLTI instruction description (https://github.com/riscv/riscv-isa-manual/blob/main/src/rv32.adoc?plain=1#L302):

SLTI (set less than immediate) places the value 1 in register rd if register rs1 is less than the sign-extended immediate when both are treated as signed numbers, else 0 is written to rd. SLTIU is similar but compares the values as unsigned numbers (i.e., the immediate is first sign-extended to XLEN bits then treated as an unsigned number). Note, SLTIU rd, rs1, 1 sets rd to 1 if rs1 equals zero, otherwise sets rd to 0 (assembler pseudoinstruction SEQZ rd, rs).

Even worse it's with what follows mixing multiple instructions in random order in the text:

ANDI, ORI, XORI are logical operations that perform bitwise AND, OR, and XOR on register rs1 and the sign-extended 12-bit immediate and place the result in rd. Note, XORI rd, rs1, -1 performs a bitwise logical inversion of register rs1 (assembler pseudoinstruction NOT rd, rs).

For comparison that's what Shakti folks made back in the day (https://shakti.org.in/docs/risc-v-asm-manual.pdf), see a screenshot below. What I like here is:

Clear layout of the mnemonic: slti rd, rs1, imm.
Clear explanation of operand roles: destination and sources register, immediate data.
Example of use with accompanying pseudocode: slti x5, x1, 2 # x5 ← x1 < 2

Answer 6 · 2024-05-16T21:56:44.000Z

Hmm, not so impressed with the Shakti docs. The pseudocode should reflect the operation, and the pseudocode for SLTI and SLTIU is identical, while the semantics are clearly not. Also the earlier pages imply the ABI requirements== architectural requirement

…

On Thu, May 16, 2024 at 1:48 PM Alexey Brodkin ***@***.***> wrote: I tend to agree with @Timmmm <https://github.com/Timmmm> on that it's much harder to digest what particular instructions do. See for example SLTI instruction description ( https://github.com/riscv/riscv-isa-manual/blob/main/src/rv32.adoc?plain=1#L302 ): SLTI (set less than immediate) places the value 1 in register *rd* if register *rs1* is less than the sign-extended immediate when both are treated as signed numbers, else 0 is written to *rd*. SLTIU is similar but compares the values as unsigned numbers (i.e., the immediate is first sign-extended to XLEN bits then treated as an unsigned number). Note, SLTIU *rd, rs1, 1* sets *rd* to 1 if *rs1* equals zero, otherwise sets *rd* to 0 (assembler pseudoinstruction SEQZ *rd, rs*). Even worse it's with what follows mixing multiple instructions in random order in the text: ANDI, ORI, XORI are logical operations that perform bitwise AND, OR, and XOR on register *rs1* and the sign-extended 12-bit immediate and place the result in *rd*. Note, XORI *rd, rs1, -1* performs a bitwise logical inversion of register *rs1* (assembler pseudoinstruction NOT *rd, rs*). For comparison that's what Shakti folks made back in the day ( https://shakti.org.in/docs/risc-v-asm-manual.pdf), see a screenshot below. What I like here is: - Clear layout of the mnemonic: slti rd, rs1, imm. - Clear explanation of operand roles: destination and sources register, immediate data. - Example of use with accompanying pseudocode: slti x5, x1, 2 # x5 ← x1 < 2 image.png (view on web) <https://github.com/riscv/riscv-isa-manual/assets/1618098/821be1f9-26a2-4a0b-8315-aef56ef69e33> — Reply to this email directly, view it on GitHub <#1396 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AHPXVJXSRE4XZSX355EAQIDZCULTBAVCNFSM6AAAAABHR7WGG6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMJWGE2TAMJSHE> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

Answer 7 · 2024-05-16T22:15:56.000Z

The pseudocode should reflect the operation, and the pseudocode for SLTI and SLTIU is identical, while the
semantics are clearly not.

I completely agree. And that example was not in a sense that Shakti documentation is better in any way. Instead, it's just a demonstration of certain aspects of that document. I.e. the way each instruction could be represented. Each instruction needs to have its own pseudocode precisely describing semantics.

Also the earlier pages imply the ABI requirements== architectural requirement

Again, I'm not saying we need to copy what Shakti folks did. Note, that document was last updated 3 or 4 years ago and so far I was not able to find its sources if they were ever published. So definitely there's a lot of room for improvement of that document. But if we get inspired by their work and improve on that in the upstream RISC-V documentation, there will be no need to invent derivatives.

Answer 8 · 2024-05-16T23:08:04.000Z

Thanks for the clarification. Yes, the format is one that has been followed for the scalar crypto (using actual Sail code), and that is the direction that the doc team is trying to go to. It gets really complicated when there are so many options and parameters (e.g. load/store: Mode? VM?, Hypervisor? Bare?/SV32?/SV39?48?57? PMP? ePMP? PMP granularity? misaligned granularity? PA size?) (let's not talk about vectors yet)

…

On Thu, May 16, 2024 at 3:16 PM Alexey Brodkin ***@***.***> wrote: The pseudocode should reflect the operation, and the pseudocode for SLTI and SLTIU is identical, while the semantics are clearly not. I completely agree. And that example was not in a sense that Shakti documentation is better in any way. Instead, it's just a demonstration of certain aspects of that document. I.e. the way each instruction could be represented. Each instruction needs to have its own pseudocode precisely describing semantics. Also the earlier pages imply the ABI requirements== architectural requirement Again, I'm not saying we need to copy what Shakti folks did. Note, that document was last updated 3 or 4 years ago and so far I was not able to find its sources if they were ever published. So definitely there's a lot of room for improvement of that document. But if we get inspired by their work and improve on that in the upstream RISC-V documentation, there will be no need to invent derivatives. — Reply to this email directly, view it on GitHub <#1396 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AHPXVJRKOBOVL4BERVK2IYTZCUV3DAVCNFSM6AAAAABHR7WGG6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMJWGI4TKNJZGE> . You are receiving this because you commented.Message ID: ***@***.***>