commonmark/commonmark-spec

Unify hyphen usage in [Hexadecimal numeric character references]

KiyanYang opened this issue · 2 comments

commonmark-spec/spec.txt

Lines 676 to 680 in 796666d

[Hexadecimal numeric character
references](@) consist of `&#` +
either `X` or `x` + a string of 1-6 hexadecimal digits + `;`.
They too are parsed as the corresponding Unicode character (this
time specified with a hexadecimal numeral instead of decimal).

The paragraph about hexadecimal numeric character references writes its range with a single hyphen, which differs from the rest of the spec. I suggest changing `a string of 1-6 hexadecimal digits` to `a string of 1--6 hexadecimal digits`. Here are examples from other locations:

commonmark-spec/spec.txt

Lines 661 to 667 in 796666d

[Decimal numeric character
references](@)
consist of `&#` + a string of 1--7 arabic digits + `;`. A
numeric character reference is parsed as the corresponding
Unicode character. Invalid Unicode code points will be replaced by
the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons,
the code point `U+0000` will also be replaced by `U+FFFD`.
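
For context, the two rules quoted so far can be sketched in a few lines of Python. This is only an illustration of the quoted text, not the reference implementation: the names `NUMERIC_REF` and `decode_numeric_refs` are hypothetical, and treating surrogates and values above `U+10FFFF` as "invalid Unicode code points" is an assumption.

```python
import re

# Grammar taken from the quoted spec text: "&#" + 1-7 decimal digits + ";",
# or "&#" + "X"/"x" + 1-6 hexadecimal digits + ";".
NUMERIC_REF = re.compile(r"&#(?:[Xx]([0-9A-Fa-f]{1,6})|([0-9]{1,7}));")
REPLACEMENT = "\ufffd"  # REPLACEMENT CHARACTER


def decode_numeric_refs(text):
    """Replace numeric character references in text with their characters."""

    def replace(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits, 10)
        # U+0000 is replaced per the quoted text. Assumption: surrogates and
        # values above U+10FFFF count as invalid code points.
        if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return REPLACEMENT
        return chr(code)

    return NUMERIC_REF.sub(replace, text)
```

Input that falls outside the digit limits, such as a decimal reference with eight digits, does not match the pattern and is left as literal text.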

commonmark-spec/spec.txt

Lines 1099 to 1109 in 796666d

An [ATX heading](@)
consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of unescaped `#` characters.
The opening sequence of `#` characters must be followed by spaces or tabs, or
by the end of line. The optional closing sequence of `#`s must be preceded by
spaces or tabs and may be followed by spaces or tabs only. The opening
`#` character may be preceded by up to three spaces of indentation. The raw
contents of the heading are stripped of leading and trailing space or tabs
before being parsed as inline content. The heading level is equal to the number
of `#` characters in the opening sequence.

commonmark-spec/spec.txt

Lines 2993 to 2994 in 796666d

An HTML block of types 1--6 can interrupt a paragraph, and need not be
preceded by a blank line.

commonmark-spec/spec.txt

Lines 4106 to 4110 in 796666d

An [ordered list marker](@)
is a sequence of 1--9 arabic digits (`0-9`), followed by either a
`.` character or a `)` character. (The reason for the length
limit is that with 10 digits we start seeing integer overflows
in some browsers.)
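
To check whether any other ranges use a single hyphen, something like the following sketch could be run over `spec.txt`. The function name and regular expressions are my own, not part of the spec tooling. It blanks out inline code spans first, so that literal text such as `0-9` in the ordered list marker paragraph is not flagged. It does not skip the spec's example blocks, so hits inside those would need manual review.

```python
import re

# A numeric range written with a single hyphen, e.g. "1-6" but not "1--6".
# The lookarounds reject a neighbouring digit or hyphen.
SINGLE_HYPHEN_RANGE = re.compile(r"(?<![\d-])\d+-\d+(?![\d-])")

# Inline code spans such as `0-9` are literal text, so they are ignored.
CODE_SPAN = re.compile(r"`[^`\n]*`")


def find_single_hyphen_ranges(text):
    """Return (line_number, match) pairs for ranges like 1-6 outside code spans."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Blank out code spans while keeping the line length unchanged.
        prose = CODE_SPAN.sub(lambda m: " " * len(m.group()), line)
        hits.extend((number, m.group()) for m in SINGLE_HYPHEN_RANGE.finditer(prose))
    return hits
```

Run against the excerpts quoted above, this should report only the `1-6` in the hexadecimal paragraph.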

I’d personally prefer to use Unicode characters in the spec, rather than ancient ASCII-era typewriter pseudocharacters 😅

jgm commented

Yes, it should indeed be `--`, which will turn into an en dash in the rendered version.