tajmone/hugo-book

Ch 17. Grammar: Various Problems


  • Get @tessman approval for these changes.
  • Fix document source.
  • Document the changes in CHANGES.md.
  • Document in ChangeLog.
  • Add commented annotation in source file.

In his annotated PDF, @roodyyogurt suggests that the following passage from Ch 17. Grammar is incorrect:

becomes:

000040: 2C 02 x2 x1 y2 y1
000046: 08 66 48 r2 r1
00004B: FF

where $r1r2 is the indexed routine address of DoGet.

The $FF byte marks the end of the current verb definition. Immediately following this is either another verb or xverb token, or a second $FF to indicate the end of the verb table.

and that the line:

000046: 08 66 48 r2 r1

… should be fixed to:

000046: 08 05 66 48 r2 r1

  • Check the amendment proposed above and fix the text accordingly.

Furthermore, @roodyyogurt commented on the sentence "The $FF byte marks the end of the current verb definition.":

Actually there are no $FF bytes at the end of verb definitions. All definitions are run after one another. In each grammar line the asterisk ($08) is followed by a byte that indicates the length of the grammar line excluding the length byte itself.

and regarding the sentence "or a second $FF to indicate the end of the verb table", Roody adds:

The grammar table ends in a single $FF byte. (Also, "verb table" should be "grammar table".)

  • Change "verb table" to "grammar table".
  • Work out which adjustments to the text these last two comments require.

Related Sections

The above considerations seem to be related to the definitions of the object and xobject grammar tokens ($66 and $67) in App. H, which contain the following admonition notes:

Removed as a token after grammar table is compiled so that object can refer to the object global variable.

and

Removed as a token after grammar table is compiled so that xobject can refer to the xobject global variable.

Some clarification of what exactly is meant by "Removed as a token after grammar table is compiled" would make the text less cryptic, and any reference to which Hugo versions were affected by these changes would be of great help to coders trying to implement alternative interpreters (which need to be backward compatible with games created with older Hugo versions).

Extra Byte Might Affect the Last Example Line Too

Note that by adding the suggested fix to the original example:

000040: 2C 02 x2 x1 y2 y1
000046: 08 66 48 r2 r1
00004B: FF

it should then become:

000040: 2C 02 x2 x1 y2 y1
000046: 08 05 66 48 r2 r1
00004C: FF

i.e. the address of the last line is affected too, and should be bumped up by one byte (00004B → 00004C).
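The address shift can be double-checked mechanically. Here is a minimal Python sketch, assuming only that each dump line starts where the previous one ends. The helper name `line_addresses` is made up for illustration, and zero bytes stand in for the unknown x/y/r values:

```python
def line_addresses(start, lines):
    """Return the start address of each dump line, given the bytes on each line."""
    addresses = []
    addr = start
    for chunk in lines:
        addresses.append(addr)
        addr += len(chunk)  # the next line begins right after this one
    return addresses

# Placeholder 0x00 bytes stand in for x2 x1 y2 y1 and r2 r1.
verb_line = bytes([0x2C, 0x02, 0, 0, 0, 0])
original = [verb_line, bytes([0x08, 0x66, 0x48, 0, 0]), bytes([0xFF])]
amended = [verb_line, bytes([0x08, 0x05, 0x66, 0x48, 0, 0]), bytes([0xFF])]

original_addrs = [f"{a:06X}" for a in line_addresses(0x40, original)]
amended_addrs = [f"{a:06X}" for a in line_addresses(0x40, amended)]
print(original_addrs)
print(amended_addrs)
```

Only the address of the final FF line differs between the two lists, since the extra byte sits in the line before it.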

However, if we also take into account the other fix proposal:

@roodyyogurt: Actually there are no $FF bytes at the end of verb definitions.

Then the third line should be removed altogether, leaving the example as:

000040: 2C 02 x2 x1 y2 y1
000046: 08 05 66 48 r2 r1

unless that FF was the (single) $FF marking the end of the grammar table (which it probably is, since it's on a line of its own).

This whole thing is a bit confusing, because we also have to drop the "$FF bytes at the end of verb definitions" rule.

Probably the best thing is to compile a Hugo example and look at its real binary output.

Length Byte Clarification

@roodyyogurt:

In each grammar line the asterisk ($08) is followed by a byte that indicates the length of the grammar line excluding the length byte itself.

In the fixed example this would be the 05 value:

000046: 08 05 66 48 r2 r1
           ^^

So, if I've understood correctly, this length byte counts every byte of the grammar line except itself, including the * token ($08). In the above example it's 5 (08, 66, 48, r2, r1) because the FF on the following line is not part of the grammar line but the marker for the end of the grammar table.
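That reading of the length byte can be written down as a small check. The Python sketch below only encodes the interpretation discussed here and is not taken from the Hugo compiler or engine sources; `length_byte_ok` is a hypothetical helper name, and zero bytes stand in for the unknown r2 r1 address bytes:

```python
def length_byte_ok(line):
    """Check a grammar line of the form: $08, length byte, remaining bytes.

    Assumption: the length byte counts the $08 token plus everything after
    the length byte, i.e. the whole line minus the length byte itself.
    """
    if len(line) < 2 or line[0] != 0x08:
        return False
    return line[1] == len(line) - 1

# Placeholder 0x00 bytes stand in for the unknown r2 r1 address bytes.
fixed_line = bytes([0x08, 0x05, 0x66, 0x48, 0x00, 0x00])
print(length_byte_ok(fixed_line))
```

Under the alternative reading, where $08 is not counted either, the same line would need a length byte of 04 instead, so a real compiled file should be enough to tell the two interpretations apart.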

Are these fixes related to changes in how Hugo handled grammar encodings? I.e. was the documentation never updated accordingly, so that it still describes previous Hugo versions?

Adding Comments to Hex Blocks?

I also think that adding comments next to those hex dumps could greatly improve the learning experience. E.g. something like:

000040: 2C 02 x2 x1 y2 y1   ;
000046: 08 05 66 48 r2 r1   ; '*' [line len] object routine# $r1r2
00004C: FF                  ; grammar table end

although finding a consistent convention might not be that easy.
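One way to keep the convention consistent would be to generate the commented dumps instead of aligning them by hand. A rough Python sketch follows, where the `annotate` helper and its `width` parameter are purely illustrative. Byte values are kept as text tokens so that symbolic placeholders like x2 or r1 survive, and the first row's comment is left empty, as in the example above:

```python
def annotate(start, rows, width=20):
    """Format (tokens, comment) rows as an addressed dump with aligned ';' comments."""
    out = []
    addr = start
    for tokens, comment in rows:
        hex_part = " ".join(tokens).ljust(width)  # pad so every ';' lines up
        out.append(f"{addr:06X}: {hex_part}; {comment}".rstrip())
        addr += len(tokens)  # one token per byte
    return out

rows = [
    (["2C", "02", "x2", "x1", "y2", "y1"], ""),
    (["08", "05", "66", "48", "r2", "r1"], "'*' [line len] object routine# $r1r2"),
    (["FF"], "grammar table end"),
]
dump = annotate(0x40, rows)
print("\n".join(dump))
```

Since the addresses are derived from the token counts, a later change to one line (such as the added length byte) would update the following addresses automatically.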