[Khmer] Undefined non-terminals in syllable regexp
adrianwong opened this issue · 2 comments
adrianwong commented
We have:
MATRA_GROUP = Z? M N?
SYLLABLE_TAIL = (SM SM?)?
where M and SM are not defined.
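For anyone who wants to check this mechanically, here is a minimal sketch that scans rule bodies for symbols with no definition. The two rules are copied from above; the `DEFINED` set is an assumption standing in for the identification classes in the docs (it assumes `Z` and `N` are defined there), and `undefined_symbols` is a hypothetical helper, not part of any existing tooling.

```python
import re

# Rules quoted in this issue.
RULES = {
    "MATRA_GROUP": "Z? M N?",
    "SYLLABLE_TAIL": "(SM SM?)?",
}

# Assumption: names already defined elsewhere in the docs.
# This set is illustrative, not the full list of classes.
DEFINED = {"Z", "N", "matra", "syllablemodifier"}


def undefined_symbols(rules, defined):
    """Return {rule name: sorted list of symbols with no definition}."""
    known = set(defined) | set(rules)
    missing = {}
    for name, body in rules.items():
        # Pull out identifiers, ignoring regex operators such as ? ( ) |
        symbols = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", body)
        unknown = sorted(set(symbols) - known)
        if unknown:
            missing[name] = unknown
    return missing


print(undefined_symbols(RULES, DEFINED))
# prints {'MATRA_GROUP': ['M'], 'SYLLABLE_TAIL': ['SM']}
```

Under those assumptions, only `M` and `SM` are flagged.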
n8willis commented
Yeah; I was reading up on the syllable structure last week and noticed that. Almost certainly _M_ is meant to be _matra_ and _SM_ is meant to be _syllablemodifier_ (both from the identification classes above). I'll double-check, but that should be a simple fix.
The bigger concern, as in #126, is that there are at least four different upstream regex definitions for Khmer syllables. Obviously the W3C text/community group is working on a way to iron out the inconsistencies, but I'm opening a standalone issue here so the bird's-eye strategy discussion (for these docs) happens in one place.