Divide section 3
r12a opened this issue · 3 comments
- Text segmentation
http://w3c.github.io/ilreq/#h_text_segmentation
I think section 3 could be divided into two subsections:
- word boundaries
- typographic units
Agreed. I also found it confusing that some of the content here (mentioning Unicode code points, characters, and extended grapheme clusters) seems to belong in section 2, "Indic orthographic syllable boundaries".
There are even passages that are inaccurate and entirely duplicated, such as:
A syllable includes a base consonant and any combination of the following characters in the text stream:
- sequences of consonants preceded by virama (i.e. conjuncts).
- vowel signs
- visarga, anusvara or candrabindu.
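To make the quoted definition concrete, here is a rough sketch of what a literal reading of it would match. It is an illustration only, not the document's algorithm: it assumes Devanagari, and the names `SYLLABLE` and `split_syllables` and the chosen code point ranges are my own. Read literally, the definition ignores independent vowels, nukta, and joiner characters, so this pattern does too.

```python
import re

# Assumed Devanagari ranges, chosen for illustration only.
CONSONANT = "[\u0915-\u0939]"   # KA..HA
VIRAMA = "\u094D"
VOWEL_SIGN = "[\u093E-\u094C]"  # dependent vowel signs AA..AU
MODIFIER = "[\u0901-\u0903]"    # candrabindu, anusvara, visarga

# A base consonant followed by any combination of:
# virama + consonant (conjunct), vowel sign, or modifier.
SYLLABLE = re.compile(
    f"{CONSONANT}(?:{VIRAMA}{CONSONANT}|{VOWEL_SIGN}|{MODIFIER})*"
)

def split_syllables(text):
    """Return the substrings that match the quoted definition, in order."""
    return SYLLABLE.findall(text)

# "hindi": HA + I, then NA + VIRAMA + DA + II
word = "\u0939\u093F\u0928\u094D\u0926\u0940"
print(split_syllables(word))
```

Anything the pattern does not cover, such as a word-initial independent vowel, is simply skipped by `findall`, which shows how much the quoted wording leaves out.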
@lianghai's point seems to have been fixed in 33e2c15
(When a change to the document fixes a particular issue, it would be helpful to say so in the comment that comes with that commit, including a link to the issue. It took me a while to figure out why I couldn't find that text, and then to check where it was removed.)
The division into subsections is still tbd.
Section 3 has been divided into two subsections.