w3c/ilreq

Divide section 3

r12a opened this issue · 3 comments

r12a commented
  1. Text segmentation
    http://w3c.github.io/ilreq/#h_text_segmentation

I think section 3 could be divided into two subsections:

  1. word boundaries
  2. typographic units

lianghai commented

Agreed. I also found it confusing to see certain content here (mentioning Unicode code points, characters, and extended grapheme clusters) that seemingly belongs in section 2, "Indic orthographic syllable boundaries".

There are even inaccurate and entirely duplicated passages, such as this one:

A syllable includes a base consonant and any combination of the following characters in the text stream:

  • sequences of consonants preceded by virama (i.e. conjuncts).
  • vowel signs
  • visarga, anusvara or candrabindu.
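
For illustration, one reading of the quoted definition can be sketched as a regular expression. This is only a rough sketch: it assumes Devanagari, covers just the most common code point ranges, imposes a fixed order (conjuncts, then vowel signs, then modifiers), and uses hypothetical names (`SYLLABLE`, `rough_syllables`). It is not ilreq's definition and not a Unicode segmentation algorithm.

```python
import re

# Assumption: Devanagari only, most common ranges only.
CONSONANT = "[\u0915-\u0939]"   # KA..HA
VIRAMA = "\u094D"               # sign virama (halant)
VOWEL_SIGN = "[\u093E-\u094C]"  # dependent vowel signs AA..AU
MODIFIER = "[\u0901-\u0903]"    # candrabindu, anusvara, visarga

# Base consonant, then any further consonants each preceded by virama
# (a conjunct), then optional vowel signs and modifiers.
SYLLABLE = re.compile(
    f"{CONSONANT}(?:{VIRAMA}{CONSONANT})*{VOWEL_SIGN}*{MODIFIER}*"
)


def rough_syllables(text: str) -> list[str]:
    """Return the substrings of text that match the sketch pattern."""
    return SYLLABLE.findall(text)
```

Because the quoted wording requires a base consonant, this sketch never matches a standalone independent vowel such as अ (U+0905), which is one way the wording falls short.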

r12a commented

@lianghai's point seems to have been fixed in 33e2c15

(When a change to the document fixes a particular issue, it would be helpful to say so in the commit comment, including a link to the issue. It took me a while to figure out why I couldn't find that text, and then to check where it was removed.)

The division into subsections is still tbd.

slata commented

Section 3 has been divided into two subsections.