schriftgestalt/GlyphsInfo

Non-AGL production names for Lao glyphs


A few entries in GlyphData.xml have production names that do not comply with the Adobe Glyph List specification. These problematic glyphs are all for the Lao script. Do you have any background on them? Are these missing from Unicode?

<glyph name="gha-lao" category="Letter" script="lao" production="laGha_" altNames="laGha" />
<glyph name="cha-lao" category="Letter" script="lao" production="laCha_" altNames="laCha" />
<glyph name="jha-lao" category="Letter" script="lao" production="laJha_" altNames="laJha" />
<glyph name="nya-lao" category="Letter" script="lao" production="laNya_" altNames="laNya" />
<glyph name="tta-lao" category="Letter" script="lao" production="laTta_" altNames="laTta" />
<glyph name="ttha-lao" category="Letter" script="lao" production="laTtha_" altNames="laTtha" />
<glyph name="dda-lao" category="Letter" script="lao" production="laDda_" altNames="laDda" />
<glyph name="ddha-lao" category="Letter" script="lao" production="laDdha_" altNames="laDdha" />
<glyph name="nna-lao" category="Letter" script="lao" production="laNna_" altNames="laNna" />
<glyph name="dha-lao" category="Letter" script="lao" production="laDha_" altNames="laDha" />
<glyph name="bha-lao" category="Letter" script="lao" production="laBha_" altNames="laBha" />
<glyph name="sha-lao" category="Letter" script="lao" production="laSha_" altNames="laSha" />
<glyph name="ssa-lao" category="Letter" script="lao" production="laSsa_" altNames="laSsa" />
<glyph name="lla-lao" category="Letter" script="lao" production="laLla_" altNames="laLla" />
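For anyone wanting to reproduce the check, here is a minimal sketch of how a production name can be tested against the AGL name-to-Unicode rules. It is a simplification, not the full specification: `AGL_SUBSET` is a tiny hypothetical stand-in for the real Adobe Glyph List, surrogate ranges are not excluded, and the function names are made up for illustration. Treating a name as non-compliant when any component fails to map is also an assumption of this sketch.

```python
import re

# Simplified forms of the two generic AGL name patterns:
# "uni" + one or more groups of 4 uppercase hex digits, or "u" + 4 to 6 hex digits.
UNI_RE = re.compile(r"^uni((?:[0-9A-F]{4})+)$")
U_RE = re.compile(r"^u([0-9A-F]{4,6})$")

# Hypothetical stand-in for the real Adobe Glyph List (assumption: a real
# check would load the complete list instead of these two entries).
AGL_SUBSET = {"A": 0x41, "space": 0x20}


def component_to_unicode(component):
    """Return the code points for one name component, or [] if it cannot be mapped."""
    if component in AGL_SUBSET:
        return [AGL_SUBSET[component]]
    match = UNI_RE.match(component)
    if match:
        digits = match.group(1)
        return [int(digits[i:i + 4], 16) for i in range(0, len(digits), 4)]
    match = U_RE.match(component)
    if match:
        return [int(match.group(1), 16)]
    return []


def name_to_unicode(name):
    """Map a glyph name to code points: drop the suffix after the first '.', split on '_'."""
    base = name.split(".", 1)[0]
    result = []
    for component in base.split("_"):
        result.extend(component_to_unicode(component))
    return result


def is_mappable(name):
    """Stricter check used in this sketch: every component must map to something."""
    base = name.split(".", 1)[0]
    return all(component_to_unicode(c) for c in base.split("_"))


for candidate in ("uni0E81", "laGha_", "laMaiEk.calt"):
    print(candidate, is_mappable(candidate), name_to_unicode(candidate))
```

Under these rules `laGha_` splits into the components `laGha` and an empty string, neither of which maps to a code point, which is why the entries above stand out.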

Likewise, I have trouble generating AGL-compliant production names for the following glyphs, which lack a production attribute. For many other glyphs in the data, using altNames works fine when a glyph has no production, but in the following cases the value for altNames cannot be mapped to a Unicode sequence.

<glyph name="maiEk-lao.small" category="Mark" subCategory="Nonspacing" script="lao" altNames="laMaiEk.calt" />
<glyph name="maiTho-lao.small" category="Mark" subCategory="Nonspacing" script="lao" altNames="laMaiTho.calt" />
<glyph name="maiTi-lao.small" category="Mark" subCategory="Nonspacing" script="lao" altNames="laMaiTi.calt" />
<glyph name="maiCatawa-lao.small" category="Mark" subCategory="Nonspacing" script="lao" altNames="laMaiCatawa.calt" />
<glyph name="karan-lao.small" category="Mark" subCategory="Nonspacing" script="lao" altNames="laKaran.calt" />
<glyph name="niggahita-lao.small" category="Mark" subCategory="Nonspacing" script="lao" altNames="laNiggahita.calt" />
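The fallback described above (use `production` if present, otherwise `altNames`) could look roughly like the following sketch. The `<glyphData>` wrapper, the helper name `candidate_production_name`, the final fallback to `name`, and the assumption that `altNames` is a comma-separated list are mine, not taken from the actual tooling.

```python
import xml.etree.ElementTree as ET

# Two entries copied from this issue, wrapped in a stand-in root element.
SAMPLE = """<glyphData>
<glyph name="gha-lao" category="Letter" script="lao" production="laGha_" altNames="laGha" />
<glyph name="maiEk-lao.small" category="Mark" subCategory="Nonspacing" script="lao" altNames="laMaiEk.calt" />
</glyphData>"""


def candidate_production_name(glyph):
    """Return (name, source): prefer production, then the first altNames entry, then name."""
    production = glyph.get("production")
    if production:
        return production, "production"
    # Assumption: altNames may hold several comma-separated names; take the first.
    first_alt = glyph.get("altNames", "").split(",")[0].strip()
    if first_alt:
        return first_alt, "altNames"
    return glyph.get("name"), "name"


root = ET.fromstring(SAMPLE)
candidates = {g.get("name"): candidate_production_name(g) for g in root.iter("glyph")}

for glyph_name, (candidate, source) in candidates.items():
    print(f"{glyph_name}: {candidate} (from {source})")
```

For the `.small` marks this yields names such as `laMaiEk.calt`, whose base `laMaiEk` is neither an AGL name nor a `uniXXXX` form, so the fallback does not help here.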

production="laGha_"

This is a mistake. The names are based on input from Ben Mitchell. I have contacted him about this.

@mjansche pointed me to the Thai Wikipedia page about the Lao alphabet. It seems these extra letters are used in Sanskrit/Pali transliteration (similar to the extended/historical letters in Khmer etc.). But Unicode does not assign codepoints to them. Perhaps we should try to change this.

The Thai Wikipedia has images: gha cha jha nnya tta ttha dda ddha nna dha bha sha ssa lla

I don't have much information about the non-Unicode Lao characters, other than that they were introduced as a way of representing Pali-Sanskrit words more correctly than is otherwise possible using the current/official Lao alphabet.

There's a bit more here and here.

I'm hoping to uncover more resources and make a Unicode proposal at some point, but it'll not be for a while unfortunately.

FYI, Unicode has accepted a proposal for encoding these characters. But it will take a while to flow through the pipeline.

Are the slots fixed now, or is there any possibility the codepoints could change? If they're not going to change we can add the Pali characters to the Lao XML at any point. I was going to suggest reviewing the Lao XML anyway, as I'm not sure all the categories, subcategories and anchors are correct.

No, the codepoint assignments aren’t final yet.

They are properly encoded now.