n8willis/opentype-shaping-documents

[Khmer] Undefined non-terminals in syllable regexp

adrianwong opened this issue · 2 comments

We have:

MATRA_GROUP	= Z? M N?
SYLLABLE_TAIL	= (SM SM?)?

where M and SM are not defined.

Yeah; I was reading up on the syllable structure last week and noticed that. Almost certainly _M_ is meant to be _matra_ and _SM_ meant to be _syllablemodifier_ (both from the identification classes above). I'll double check, but that should be a simple fix.

The bigger concern, as in #126, is that there are at least four different upstream regex definitions for Khmer syllables. The W3C text/community group is working on a way to iron out the inconsistencies, but I'm opening a standalone issue here so that the bird's-eye strategy discussion (for these docs) happens in one place.

Fixed via 068b1f4.

As of last week's W3C-cg Khmer meeting, whatever other issues remain to be resolved in the regular expressions, the consensus is clear that the limit is one dependent vowel and at most two modifier signs, so this change should be stable.
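
For anyone who wants to sanity-check those limits, here is a minimal Python sketch, not the text of the docs or of any shaping engine. It assumes the reading suggested above (_M_ = _matra_, _SM_ = _syllablemodifier_), treats _Z_ and _N_ as opaque class labels, and invents the one-character codes and helper names (`CLASS_CODES`, `is_matra_group`, `is_syllable_tail`) purely for illustration.

```python
import re

# Hypothetical one-character codes for the shaping classes, so that a
# sequence of classes can be matched with an ordinary string regex.
CLASS_CODES = {
    "Z": "z",
    "N": "n",
    "matra": "m",             # assumed meaning of M
    "syllablemodifier": "s",  # assumed meaning of SM
}

# MATRA_GROUP   = Z? M N?      -> exactly one matra
# SYLLABLE_TAIL = (SM SM?)?    -> zero, one, or two syllable modifiers
MATRA_GROUP = re.compile(r"z?mn?")
SYLLABLE_TAIL = re.compile(r"(?:ss?)?")


def encode(classes):
    """Turn a list of class names into a string of one-character codes."""
    return "".join(CLASS_CODES[name] for name in classes)


def is_matra_group(classes):
    """True if the whole sequence is a single MATRA_GROUP."""
    return MATRA_GROUP.fullmatch(encode(classes)) is not None


def is_syllable_tail(classes):
    """True if the whole sequence is a valid SYLLABLE_TAIL."""
    return SYLLABLE_TAIL.fullmatch(encode(classes)) is not None
```

With these definitions, `is_matra_group(["matra", "matra"])` is rejected and `is_syllable_tail` accepts at most two `syllablemodifier` entries, which matches the limits stated above.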