Consolidation of Post-V2 Glyph Correction Suggestions
kenlunde opened this issue · 75 comments
For issues related to HKSCS-2016 and Hong Kong in general, please be sure to reference 香港電腦漢字參考字形 before posting an issue here.
The following table shows the glyphs that will be corrected in the next update, and unless otherwise noted, the corrections are from my own notes:
Character | Glyph Name | Description |
---|---|---|
U+50E7 僧 | uni50E7-HK | Adjust the lower-right component so that its middle horizontal stroke does not touch the right-most vertical stroke |
U+89E6 触 | uni89E6-CN | The vertical stroke of the 虫 component has uneven weight near the bottom, which appears only in the Light, Normal, and Regular weights |
U+8FD0 运 | uni8FD0-CN | The lower-left corner of the 云 component has a join error (a small gap) |
U+9EA4 麤 | uni9EA4-JP | The ExtraLight master needs to be adjusted so that the top-most vertical stroke does not penetrate the horizontal stroke below. |
U+25C4A 𥱊 | u25C4A-HK | Add the missing stroke to the top of the bottom component (see Issue #243) |
The following table shows the glyphs that were corrected in the Version 2.001 update, and unless otherwise noted, the corrections are from my own notes:
Character | Glyph Name | Description |
---|---|---|
U+3C54 㱔 | uni3C54-HK | Modify the 匕 component per HK conventions |
U+3D75 㵵 | uni3D75-HK | Modify the final stroke of the lower-right 人 component to curve inward |
U+451D 䔝 | uni451D-HK | Modify the 匕 component per HK conventions |
U+4894 䢔 | uni4894-CN | Remove the "feet" from the bottom of the box-like element |
U+48AE 䢮 | uni48AE-CN | Remove the "feet" from the bottom of the box-like element |
U+48B0 䢰 | uni48B0-CN | Remove the "feet" from the bottom of the box-like element |
U+4E31 丱 | uni4E31-CN | Modify the first stroke so that it has no foot |
U+56F9 囹 | uni56F9-TW | Modify the final stroke of the 人 component to curve outward |
U+585F 塟 | uni585F-HK | Modify the 匕 component per HK conventions |
U+58BA 墺 | uni58BA-TW | Add a hook to the lower-right of the 冂 component (see Issue #212) |
U+5BE9 審 | uni5BE9-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+5DB4 嶴 | uni5DB4-TW | Add a hook to the lower-right of the 冂 component (see Issue #212) |
U+611F 感 | uni611F-TW | Modify the 咸 component so that the horizontal stroke above 口 does not touch the vertical stroke to its left (see Issue #213) |
U+64AD 播 | uni64AD-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+64D9 擙 | uni64D9-TW | Add a hook to the lower-right of the 冂 component (see Issue #212) |
U+65DB 旛 | uni65DB-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+66AD 暭 | uni66AD-CN | Modify the four short diagonal strokes of the 臯 component so that they do not touch |
U+68B5 梵 | uni68B5-TW | Modify the final stroke of the upper-right 木 component to curve outward |
U+6A4E 橎 | uni6A4E-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+6F58 潘 | uni6F58-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+6F5C 潜 | uni6F5C-HK | Change the 日 component to 曰 |
U+6FB8 澸 | uni6FB8-TW | Modify the 咸 component so that the horizontal stroke above 口 does not touch the vertical stroke to its left (see Issue #213) |
U+700B 瀋 | uni700B-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+7131 焱 | uni7131-TW | Modify the final stroke of the top 火 component to curve outward |
U+720B 爋 | uni720B-CN | Modify the center vertical stroke so that it is thinner only within the box-like element (more obvious at heavier weights) |
U+72C5 狅 | uni72C5-TW | Modify the upper-right horizontal stroke to be slightly diagonal (see Issue #215) |
U+74A0 璠 | uni74A0-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+74F5 瓵 | uni74F5-TW | Modify the lower-left corner of the 瓦 component so that it conforms to TW conventions |
U+76A4 皤 | uni76A4-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+7FFA 翺 | uni7FFA-CN | Modify the four short diagonal strokes of the 臯 component so that they do not touch |
U+7FFB 翻 | uni7FFB-TW | Modify the 番 component so that the 釆 and 田 subcomponents do not touch (see Issue #214) |
U+8FCD 迍 | uni8FCD-JP | Remove the "feet" from the bottom of the box-like element |
U+8FDA 迚 | uni8FDA-JP | Remove the "feet" from the bottom of the box-like element |
U+8FE0 迠 | uni8FE0-JP | Remove the "feet" from the bottom of the box-like element |
U+8FE2 迢 | uni8FE2-JP | Remove the "feet" from the bottom of the box-like element |
U+8FE6 迦 | uni8FE6-JP, uni8FE6-CN & uni8FE6-JP90-JP | Remove the "feet" from the bottom of the box-like element |
U+8FE8 迨 | uni8FE8-JP | Remove the "feet" from the bottom of the box-like element |
U+8FEA 迪 | uni8FEA-JP & uni8FEAuE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+8FEB 迫 | uni8FEBuE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+8FF4 迴 | uni8FF4-JP & uni8FF4-CN | Remove the "feet" from the bottom of the box-like element |
U+8FFA 迺 | uni8FFA-JP & uni8FFAuE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+8FFD 追 | uni8FFD-JP & uni8FFDuE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+9002 适 | uni9002-JP | Remove the "feet" from the bottom of the box-like element |
U+9005 逅 | uni9005-JP & uni9005-CN | Remove the "feet" from the bottom of the box-like element |
U+9006 逆 | uni9006-JP & uni9006uE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+900C 逌 | uni900C-JP | Remove the "feet" from the bottom of the box-like element |
U+900E 逎 | uni900E-JP & uni900EuE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+9019 這 | uni9019-JP & uni9019-JP90-JP | Remove the "feet" from the bottom of the box-like element |
U+9020 造 | uni9020uE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+9025 逥 | uni9025-CN | Remove the "feet" from the bottom of the box-like element |
U+9027 逧 | uni9027-JP & uni9027-CN | Remove the "feet" from the bottom of the box-like element |
U+902A 逪 | uni902A-JP | Remove the "feet" from the bottom of the box-like element |
U+902D 逭 | uni902D-JP & uni902D-CN | Remove the "feet" from the bottom of the box-like element |
U+9032 進 | uni9032-CN & uni9032uE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+903C 逼 | uni903C-JP & uni903C-JP90-JP | Remove the "feet" from the bottom of the box-like element |
U+9041 遁 | uni9041-JP, uni9041-JP90-JP & uni9041uE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+9052 遒 | uni9052-JP & uni9052uE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+9053 道 | uni9053-JP & uni9053uE0101-JP | Remove the "feet" from the bottom of the box-like element |
U+905B 遛 | uni905B-JP | Remove the "feet" from the bottom of the box-like element |
U+9061 遡 | uni9061-JP & uni9061-JP90-JP | Remove the "feet" from the bottom of the box-like element |
U+9065 遥 | uni9065-JP | Remove the "feet" from the bottom of the box-like element |
U+9089 邉 | uni9089-CN | Remove the "feet" from the bottom of the box-like element |
U+9139 鄹 | uni9139-TW | Modify the final stroke of the right-most 人 component to curve outward |
U+9EF9 黹 | uni9EF9-CN | Revert to pre-V2 glyph; and add HK mapping override for U+2FCB ⿋ |
U+9F28 鼨 | uni9F28-TW | Modify the final stroke of the upper-right component to curve outward |
U+9FE9 鿩 | uni9FE9-CN | Change the upper-left component to be 魚 |
U+FA11 﨑 | uniFA11-CN | Adjust the 山 component to conform to CN conventions |
U+FE15 ︕ | uniFE15 | Make the dot smaller (see Noto CJK Issue #91) |
U+FE16 ︖ | uniFE16 | Compatibilize the dot of the ExtraLight and Heavy masters; and make the dot smaller (see Noto CJK Issue #91) |
U+FF01 ! | uniFF01 & uniFF01-CN | Make the dot smaller (see Noto CJK Issue #91) |
U+FF1F ? | uniFF1F & uniFF1F-CN | Compatibilize the dot of the ExtraLight and Heavy masters; and make the dot smaller (see Noto CJK Issue #91) |
U+1F250 🉐 | u1F250-JP | The upper-right component of the Heavy glyph is missing two subpaths |
U+202B7 𠊷 | u202B7-HK | Change the 攵 component to 夂 |
U+20C41 𠱁 | u20C41-HK | The final stroke of the 水 component should curve inward |
U+212FE 𡋾 | u212FE-HK | The upper-left component should be 𠮠 (⿱口⿰丿𠃌) |
U+21376 𡍶 | u21376-HK | Change the 攵 component to 夂 |
U+235CE 𣗎 | u235CE-HK | Fix the connection of the 入 component |
U+236BA 𣚺 | u236BA-HK | Change the left component to 未 |
U+27C12 𧰒 | u27C12-HK | The upper-left component should be 士 |
U+28455 𨑕 | u28455-JP | Remove the "feet" from the bottom of the box-like element |
U+2856B 𨕫 | u2856B-JP | Remove the "feet" from the bottom of the box-like element |
U+29B0E 𩬎 | u29B0E-HK | The top stroke of the bottom component should be horizontal |
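The rows above follow a regular pattern: the first column is `U+XXXX` plus the character, and each glyph name starts with `uni` plus four hex digits or `u` plus five or six. That makes them easy to cross-check mechanically. The following is only a sketch of such a check. The function name `check_row` and the naming pattern are assumptions inferred from the tables in this issue, not a documented Source Han convention.

```python
import re

# "U+XXXX <char> | <glyph names> | <description> |" (layout assumed from the tables above)
ROW = re.compile(r"^U\+([0-9A-F]{4,6}) (\S+) \| (.+?) \| ")
# "uniXXXX..." for BMP code points, "uXXXXX..." for supplementary ones (assumed pattern)
GLYPH = re.compile(r"^(?:uni([0-9A-F]{4})|u([0-9A-F]{5,6}))")

def check_row(row: str) -> list:
    """Return a list of problems found in one table row (empty if none)."""
    m = ROW.match(row)
    if m is None:
        return ["first column is not of the form 'U+XXXX <char>'"]
    code, char, names = m.groups()
    problems = []
    if len(char) != 1 or ord(char) != int(code, 16):
        problems.append(f"character does not match U+{code}")
    # Glyph names are separated by "," or "&", with or without spaces.
    for name in re.split(r"\s*(?:,|&)\s*", names):
        g = GLYPH.match(name)
        if g is None or (g.group(1) or g.group(2)) != code:
            problems.append(f"glyph name {name!r} does not encode U+{code}")
    return problems

# A well-formed row, and a made-up row whose code point matches neither
# the character nor the glyph name.
good = "U+50E7 \u50E7 | uni50E7-HK | Adjust the lower-right component |"
bad = "U+819A \u4E31 | uni4E31-CN | Modify the first stroke |"
print(check_row(good))  # []
print(check_row(bad))   # two problems reported
```

A check like this only validates the bookkeeping in the table. It says nothing about whether the glyph itself is correct.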
U+4EFD 份 - HK should use CN version, currently mapped to TW.
Perhaps 首's feet in the KR 道 glyph should be removed for consistency? (Like how the KR 造 glyph was corrected in the 2.000 update.)
Also these hidden Kangxi-style glyphs. For this image, glyph 62875 as seen on Font Book on the Mac. So it should be uni9020uE0101-JP. Only the 口 feet, not 吿 (where the central vertical stroke goes over) as a whole, don't confuse this hidden glyph with the public KR version which has 告 (the central vertical stroke does not go over) and has no feet on 口.
So overall, maybe fix 62853 (uni8FEAuE0101-JP), 62854 (uni8FEBuE0101-JP), 62861 (uni8FFDuE0101-JP), 62867 (uni900EuE0101-JP), 62882 (uni9041uE0101-JP), 62891 (uni9052uE0101-JP) and 62892 (uni9053uE0101-JP, already mentioned), which have feet. Decide if you want more consistency to those hidden glyphs, some of them used as KR mappings.
Font Book only gave me glyph numbers (in the Repertoire section, hover and you see the glyph numbers, they are in order), but I checked the glyph numbers against the mapping file, for example, 造, which is correct on my end, so should be correct for everything else.
There are many characters with "feet" remaining in the KR glyph where they are absent in the other versions, a few of them are: 迫、追、逭 (feet in CN glyph as well)、适、迪、這
It seems that the different versions are rather inconsistent with the "feet" in characters with the 辶 radical (I don't know if this is an intended design difference or not.)
In order of which discards the feet the most: TW/HK > CN > JP > KR.
E.g., for 逅 the feet are retained in all but TW/HK.
9FA9 and 9FCA for CN should be 艹
ref http://www.gb688.cn/bzgk/gb/newGbInfo?hcno=BCBF3BC7DCED3629F5E41CE02D9CFD55 page 112
@CNMan For U+9FE9, I would claim that Unicode's representative glyph should also be corrected, especially when checking the original proposal (L2/13-009). In particular, see the kaishu form on page 5 (page 6 of the PDF) that is ⿰魯爾.
For U+9FA9 龩, U+9FB3 龳, and U+9FCA 鿊, no action is necessary, because these are single-source (HK) ideographs. Although no action is necessary for a similar reason, you missed U+9FEA (KR). And yes, I am fully aware that some single-source ideographs have multiple region-specific glyphs. When push comes to shove, which was necessary for the Version 2.000 update, such glyphs are removed in order to make room for higher-priority glyphs.
Also, I'd like to point out that following standards such as GB/T 22321.1-2018 is not within the scope of the Source Han projects. I think of such standards as attempts to hammer square pegs into round holes, meaning that regional conventions are applied to ideographs that are not actually used in that particular region. It would be nice to do, but when dealing with a glyph set that is already full, practicality becomes necessary.
For Source Han Sans Hong Kong, I noticed that the 胡 part of 鬍 is not consistent with other characters that have a 胡 component.
Here is an example of the difference between the 胡 part of 鬍 when contrasted with 胡 as a standalone character as well as other characters with a 胡 component:
In addition, it seems that the 鬍 character in Hong Kong's version of Source Han Sans may not conform with the "standard" found in 香港小學學習字詞表. I have attached a picture of the character 鬍 based on the online version of 香港小學學習字詞表 here:
Finally... on this Thanksgiving Day in the USA, I wanted to take a moment and thank @kenlunde and everyone involved in the creation of Source Han Sans and Source Han Serif for bringing these fonts to life for the community!
@pan-asian-wok Thank you, and you're welcome. Adding a new HK glyph for U+9B0D 鬍, uni9B0D-HK, is now noted in Issue #206.
BTW, I am using 香港電腦漢字參考字形 as the reference for HK glyphs. Its scope is the same as for the Source Han typefaces, specifically Big Five plus HKSCS-2016. The scope of 香港小學學習字詞表 is far more limited. In fact, I found an error in the former during Source Han Sans Version 2.000 development that I reported, which has been corrected, and is noted on its last page (page 1014).
For Source Han Sans Hong Kong, I think the 系 of 遜 should have a hook.
Reference: 香港電腦漢字參考字形
And I think the 山 part of CN glyph for 﨑 should be unified.
The size of dot in fullwidth ? and ! also should be unified.
@Buernia Adding a new HK glyph for U+905C 遜, uni905C-HK, is now noted in Issue #206. Adjusting the CN glyph for U+FA11 﨑, uniFA11-CN, is now noted in the table in this issue. The dot in the glyphs for U+FF1F ?, uniFF1F and uniFF1F-CN, and U+FE16 ︖, uniFE16, is not compatible in the ExtraLight and Heavy masters, which means that the ExtraLight and Heavy weights look okay, but the five intermediate weights have inconsistent or seemingly random weights. This is also now noted in the table in this issue.
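For readers unfamiliar with why incompatible masters can look fine at the extremes but odd in between, here is a toy sketch. It assumes simple point-by-point linear interpolation and a made-up square "dot", and it assumes the incompatibility is a contour that starts at a different point in one master. The actual nature of the incompatibility in these glyphs is not stated in this thread, and the coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def interpolate(a: List[Point], b: List[Point], t: float) -> List[Point]:
    """Linearly interpolate two outlines point by point (t=0 gives a, t=1 gives b)."""
    if len(a) != len(b):
        raise ValueError("masters are incompatible: different point counts")
    return [((1 - t) * ax + t * bx, (1 - t) * ay + t * by)
            for (ax, ay), (bx, by) in zip(a, b)]

def area(outline: List[Point]) -> float:
    """Absolute area of a closed polygon (shoelace formula)."""
    total = 0.0
    for i, (x0, y0) in enumerate(outline):
        x1, y1 = outline[(i + 1) % len(outline)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

# Hypothetical square dots: 20 units wide in the light master, 60 in the heavy one.
light = [(-10, -10), (10, -10), (10, 10), (-10, 10)]
heavy = [(-30, -30), (30, -30), (30, 30), (-30, 30)]
# The same heavy shape, but with the contour starting at the opposite corner.
heavy_shifted = heavy[2:] + heavy[:2]

mid_ok = area(interpolate(light, heavy, 0.5))           # 1600.0
mid_bad = area(interpolate(light, heavy_shifted, 0.5))  # 400.0
collapsed = area(interpolate(light, heavy_shifted, 0.25))  # 0.0
print(mid_ok, mid_bad, collapsed)
```

Both versions of the heavy master draw the same square, so the two extreme weights look identical either way. Only the intermediate instances differ: with the shifted start point the dot shrinks to nothing at one quarter of the way and is still only light-master size at the midpoint.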
For HK glyph variant, according to my feeling (need further verification)
For the 牙 component in characters like 韓 and 偉 and 雅, I don't think the upper horizontal stroke is supposed to extend beyond the slanted stroke?
For characters like 實 or 貫, shouldn't it be two dots in the middle instead of a single straight line?
For the character 低 I am not sure if it is the most common placement to place the dot below the character like this
For characters like 警 or 權 or 獲 or 護 or 觀, the grass root above the character is different from other characters.
For the character 曼 I suppose it should be the component of "sun" on top?
For the character 棉 should it be a slant on top?
For characters like 香 or 書 or 遭 or 會, the central line within the sun component is not connected on both sides, is that a design choice or?
For the character 嶼, I don't think the center part is supposed to be a tilted dot instead of a vertical line?
For characters like 雷 or 霸 I think the rain component should be four dots toward same direction instead of toward center?
For the character 入 and characters that include this component like 內 or 納, I am not sure if it is a good idea to represent its difference against the character/component of 人 this way.
For the 王 component in characters like 徵, would it be better if those horizontal components are horizontal?
@c933103 Before dousing this issue with potentially meaningless comments, please first check against 香港電腦漢字參考字形.
And, after checking them all, I confirmed that none of your comments are actionable. Also, please don't post issues based on feelings. As I wrote above, please check the standard that is referenced in the previous paragraph. If you have an issue with that standard, take it up with the organization that is responsible for it, not with this project.
Apologies if I seem to be a bit harsh. I was in the midst of reading a new book entitled Zero Sum Game.
For Source Han Sans Hong Kong, the middle 八 in radical ⿋ should be 丷.
Reference: 香港電腦漢字參考字形
And HK glyph 戠 should use CN version.
Reference: 香港電腦漢字參考字形
@Marcus98T I am still thinking about this, hence the lack of a reply to you and to @lapomme.
I am also technically on vacation. 🥃
@kenlunde I also think the CN glyph of ⿋ should be replaced by the future HK glyph.
Below is the glyph in 通用规范汉字表 and GB 18030-2005.
@c933103 Congratulations, you have discovered that the standard forms as provided by EDB or provided in 香港電腦漢字參考字形 don't really match what many Hongkongers have learned since they were young. 🤷🏻‍♂️
May I suggest you contact 教育局課程發展處中國語文教育組 at the email ccdoc@edb.gov.hk and/or Chinese Language Interface Advisory Committee (CLIAC) at cliac@ogcio.gov.hk to complain about this issue.
@Buernia You're quite right. Actually, the CN glyph for U+9EF9 黹, uni9EF9-CN, was modified for V2, and needs to be reverted to its pre-V2 form, and also used for HK. U+2FCB ⿋ needs similar treatment, in terms of its HK mapping override. These changes are now reflected in the tables of the appropriate issues.
Codepoint | Region | Description |
---|---|---|
U+4E31 丱 | CN/TW/HK | The "└" component should be a single stroke for the 3 locales. |
U+55D7 嗗 | HK | The 骨 component is incorrect. JP glyph can be used instead. |
U+819A 膚 | HK | The 虍 component should look like the TW glyph and is different from the JP glyph. |
Codepoint | Region | Description |
---|---|---|
U+5DDF 巟 | HK | The last stroke of 乚 is incorrect. Seems that a new glyph is needed. |
U+614C 慌 | HK | Ditto |
(謊 included for reference)
Codepoint | Region | Description |
---|---|---|
U+3C54 㱔 | HK | The 匕 component needs to be adjusted to conform to HK's standard. |
U+451D 䔝 | HK | Ditto |
U+585F 塟 | HK | Ditto |
(此 and 死 included for reference)
@Buernia Adding a new HK glyph for U+905C 遜, uni905C-HK, is now noted in Issue #206. Adjusting the CN glyph for U+FA11 﨑, uniFA11-CN, is now noted in the table in this issue. The dot in the glyphs for U+FF1F ?, uniFF1F and uniFF1F-CN, and U+FE16 ︖, uniFE16, is not compatible in the ExtraLight and Heavy masters, which means that the ExtraLight and Heavy weights look okay, but the five intermediate weights have inconsistent or seemingly random weights. This is also now noted in the table in this issue.
I thought the question mark was adjusted in response to this issue? It may sound ironic, but I actually liked the change when I saw it. It's fine for ExtraLight and Heavy to have larger dots (they're usually used in short text and meant to be catchy), but for Light, Regular, and Bold, which are usually used in documents or long text, I would like the dot to be less eye-catching (incidentally, I also had the dots of the full-width glyphs shrunk in my fork; the original version looks a little bit too large). I was amazed when I saw this interpolation magic, and even planned to suggest similar changes to the full-width exclamation mark (but didn't, as I need some more time to experiment with it).
@tamcy Corrections for the CN glyph for U+4E31 丱 and the HK glyphs for U+3C54 㱔, U+451D 䔝, and U+585F 塟 are now reflected in the table at the beginning of this issue. Adding new HK glyphs for U+5DDF 巟, U+614C 慌, and U+819A 膚 is now reflected in the table at the beginning of Issue #206. And, adjusting the TW and HK mappings for U+55D7 嗗 is now reflected in the table at the beginning of Issue #202.
(Just FYI, I edited your post, to change "U+819A 丱" to "U+4E31 丱" in the first column of the first row of the first table.)
@tamcy With regard to the glyphs for U+FF01 ! and U+FF1F ?, to include their vertical presentation forms, U+FE15 ︕ and U+FE16 ︖, I will ask our designer to make the dot smaller, though I thought that I did as part of the Version 2.000 update. This is independent of making the dot in the glyphs for U+FE16 ︖ and U+FF1F ? compatible between the ExtraLight and Heavy masters. See the table at the beginning of this issue.
@extc The corrections for the HK glyphs for U+212FE 𡋾, U+235CE 𣗎, and U+27C12 𧰒 are reflected in the table at the beginning of this issue.
@Marcus98T & @lapomme I finally had time to deep-dive into the "foot" issue for ideographs that use Radical 162 (辶).
The following are the 17 affected glyphs, according to your list, and confirmed by me: uni9002-JP, uni9005-JP, uni9005-CN, uni900E-JP, uni9019-JP, uni902D-JP, uni902D-CN, uni9041-JP, uni9052-JP, uni8FEAuE0101-JP, uni8FEBuE0101-JP, uni8FFDuE0101-JP, uni900EuE0101-JP, uni9020uE0101-JP, uni9041uE0101-JP, uni9052uE0101-JP & uni9053uE0101-JP (and shown below in the same order).
I think that you missed three Extension A ideographs and one Extension B ideograph: uni4894-CN, uni48AE-CN (uni48AE-HK, shown in green, is okay), uni48B0-CN & u28455-JP.
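The glyph names listed above can be broken down mechanically, which helps when tallying a list like this one. The sketch below assumes a naming pattern inferred from the names in this thread: `uni` plus four hex digits or `u` plus five or six, an optional `u`-prefixed variation selector, then hyphen-separated tags. The `GlyphName` type and `parse` function are hypothetical helpers, not part of any Source Han tooling.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional

class GlyphName(NamedTuple):
    codepoint: int
    selector: Optional[int]  # variation selector code point, if any
    tags: tuple              # trailing tags such as ("JP",) or ("JP90", "JP")

NAME = re.compile(
    r"^(?:uni(?P<bmp>[0-9A-F]{4})|u(?P<sup>[0-9A-F]{5,6}))"
    r"(?:u(?P<vs>[0-9A-F]{5,6}))?"
    r"(?P<tags>(?:-[A-Z0-9]+)*)$"
)

def parse(name: str) -> GlyphName:
    """Split a glyph name into code point, variation selector, and tags."""
    m = NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized glyph name: {name}")
    cp = int(m.group("bmp") or m.group("sup"), 16)
    vs = int(m.group("vs"), 16) if m.group("vs") else None
    tags = tuple(t for t in m.group("tags").split("-") if t)
    return GlyphName(cp, vs, tags)

# The 17 glyph names from the comment above.
names = """
uni9002-JP uni9005-JP uni9005-CN uni900E-JP uni9019-JP uni902D-JP
uni902D-CN uni9041-JP uni9052-JP uni8FEAuE0101-JP uni8FEBuE0101-JP
uni8FFDuE0101-JP uni900EuE0101-JP uni9020uE0101-JP uni9041uE0101-JP
uni9052uE0101-JP uni9053uE0101-JP
""".split()

parsed = [parse(n) for n in names]
by_region = Counter(g.tags[-1] for g in parsed)
with_selector = [g for g in parsed if g.selector is not None]
print(len(parsed), dict(by_region), len(with_selector))
```

On this list the tally comes to 17 names, of which 15 end in `-JP` and 2 in `-CN`, and 8 carry the `uE0101` selector.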
My preliminary list only concerned the hidden JP glyphs covered in the Adobe Japan 1-6, but anyway thanks for the acknowledgement and some more uncovering. Yeah some of them were duplicates of the JP glyph and I also have even more affected glyphs (via Wiktionary, JP is default unless otherwise stated): 迍, 迚, 迠, 迢, 迦(JP and CN), 迨, 迪(KR), 迫(KR), 迴, 迺, 追(KR), 逆(JP and KR), 逌, 逥(CN), 逧(JP and CN), 逪, 進(KR), 逼, 遛, 遡, 遥 and 𨕫.
In Source Han Sans CN, the left foot is retained in 迵, 過, 適, 遹 but removed in 迥, 逈, 週, 避 (whereas the JP/KR ones are retained). The feet are present in 邉, whereas they are removed in the JP/KR ones.
TW/HK 週 might need to remove the left feet?
Makes me wonder if we have to remove all the feet for 口, 凵 and 𠃊 for the rest of all the glyphs, which obviously would take a long time to fix.
Another side note, there is 込 with a hidden 2 drops version, maybe assign it to the KR mapping.
EDIT:
@kenlunde OK. I have finished updating this list. The hole is already very deep and this is the final list.
@Marcus98T The more one continues to dig, the deeper the hole will become. As a result, I am extremely reluctant to pursue this.
My apologies for double posting, but I have finally finished the list and I want to sleep. No more digging for the late night.
I think if you are reluctant to pursue further, then maybe reinstate the feet in the JP/KR versions and remove the feet in the CN/TW/HK versions. Otherwise this definitive list is to correct the feet in the 辶 radical.
@Marcus98T I'll take another stab at this later today.
@Marcus98T @kenlunde Sorry to chime in. I didn't join the discussion because I'm not particularly interested in this matter so I can't provide any valuable input. Having noticed that this may open a can of worms, I would like to suggest to inspect the affected glyphs more thoroughly before taking action too early.
Firstly, I'm not sure if removing the feet of 逆 (at least for JP/KR) is a good idea, as that 凵 is far away from the bottom stroke that supports it. Yes, there isn't any foot in CN and TW's 逆, but check 縌 (U+7E0C) and 磀 (U+78C0). The feet are there.
This is similar to 過. There is a "left foot" on the 口 of 過 because the left side is relatively empty and the bottom of the 口 is still far away from the flat ㇏. The bottom-right foot is removed because it is occupied by the hook. So, if the left foot of 週 from the TW & HK glyph is to be removed, how about other glyphs like U+35FB 㗻 (CN/TW), U+3C05 㰅 (CN), U+4664 䙤 (CN), U+48AA 䢪 (CN)?
I can also spot characters that are not mentioned, e.g. 簉 (U+7C09 JP & KR).
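Since characters in this thread are often cited together with their code points, a small helper can confirm the pairing before posting. This is just a convenience sketch. The function name `codepoints` is made up for illustration.

```python
def codepoints(text: str) -> list:
    """Return 'U+XXXX' labels for every non-space character in text."""
    return [f"U+{ord(ch):04X}" for ch in text if not ch.isspace()]

# Escapes are used here so the example does not depend on how the
# characters were typed or copied.
print(codepoints("\u7C09"))             # ['U+7C09']
print(codepoints("\U00028455 \u4E31"))  # ['U+28455', 'U+4E31']
```

Pasting the characters themselves into the call works the same way, which makes it easy to catch a swapped or mistyped code point.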
Finally, the fate of certain types of "feet" has not been discussed:
Unless removing every foot of this kind is the ultimate goal of this project, getting rid of them prematurely may do harm more than good. Instead you may consider listing the possible scenarios (造遙 and 迥逆 can be treated differently), the glyphs concerned, and the action to be taken in the JP/KR group and CN/TW/HK group (the 辶廴 aren't shared anyway so the foot retaining rule could be different, e.g. retain the feet of 逆 for JP/KR but left out from CN/TW/HK; remove feet of 造 for all regions).
Having noticed that this may open a can of worms, I would like to suggest to inspect the affected glyphs more thoroughly before taking action too early.
@tamcy Thanks for the valuable input and supplement. While I'm not sure about the direction this design thing will take to make consistent, I should have just thought through rather than post fast and edit later. My apologies to @kenlunde and everyone else for rushing this issue.
Ah yes, it was late night and I have forgotten some other glyphs.
Maybe we should go step-by-step and just try solving the 辶 feet issue in 口 and 凵 first? Then move on to 咼, and so on in the next few months.
Edit: Thanks for reflecting them on the table. We'll do it slowly because I noticed the priority is to fix the HK glyphs.
U+807D 聽: it'd be more suitable for the HK font instance to use the TW glyph instead of the JP one. Seems that it was mapped to JP glyph because of the 十 component at the top-right corner (straight vertical stroke in HK vs slightly slanted in TW), but MoE Sung actually uses the vertical version too. This is just a style issue (unlike the bottom-left component which is a standard issue), both styles are acceptable for TW and HK. But I also included other characters having the 𢛳 component for investigation as they look inconsistent.
(EDIT 13 Apr 2019 : retracted and replaced with a new issue)
Strictly speaking, glyphs of the following characters are not correct for HK and TW.
Character | Region | Remarks |
---|---|---|
U+3D75 㵵 | HK=TW | The bottom-right stroke of the second 人 is different |
U+9139 鄹 | TW=HK | The bottom-right stroke of the second 人 is different (for this character, the SongTi font from TW MoE is wrong) |
U+56F9 囹 | TW=HK | Right stroke of 人 is different |
U+7131 焱 | TW=HK | The bottom-right stroke ㇏ of the first (top) 火. |
(1) U+9F28 鼨’s TW glyph could be adjusted to better match the convention.
(2) I noticed that the CN form of the lower part of 皐 and 臯 isn’t consistent:
It seems that the usual form used by U+66AD 暭, U+69F9 槹, U+7690 皐 and also U+7FFA 翺 are specified by the standardization body?
But I think U+66AD 暭, U+69F9 槹, U+7690 皐 and also U+7FFA 翺 can be and should be adjusted so that the form is synchronized. For U+69F9 槹 and U+7690 皐 HK glyphs could be used.
Some more HK glyph issues.
U+6F5C 潜: The 曰 component is different.
The following codepoints should be exclusive to HK, so I'm adding them here:
U+202B7 𠊷 and U+21376 𡍶: 夂 should be used instead of 攵.
U+20C41 𠱁: The last stroke of 水 needs adjustment (This is subtle, but as it's a common character in HK I'd hope that it could be fixed).
U+236BA 𣚺: The left side is incorrect, the character is composed with 未,成 and 母 (lit. not yet become a mother).
U+29B0E 𩬎: First stroke of the bottom component doesn't match HK's convention.
@tamcy The post-processing error of the Heavy glyph for U+1F250 🉐, u1F250-JP, has been noted. The actual Heavy master is okay.
Regarding the CN glyphs, here's a comment noting that "尔" itself was correct but the others were wrong.
However, all of the characters containing "尔" are wrong now.
This is also a problem for the TW glyphs.
@celestialphineas Thank you. Too bad you didn't report this a week or two earlier, because we missed the opportunity to fix this in Version 2.001. Interestingly, this also revealed the same issue in the ExtraLight weight of Kozuka Gothic CID+15365, which has been lurking for nearly 20 years:
We also just updated Kozuka Gothic to Adobe-Japan1-7, so we also missed an opportunity to fix it for the current release.
@kenlunde, will the glyphs containing "尔" be modified for CN, TW and HK in future versions? The issue was mentioned at /issues/99#issuecomment-162217951, and was mentioned again in this issue's comment.
@kenlunde, I know there's a stylistic difference between Sans and Serif, but the style of Sans (方體) is usually closer to Regular Script (楷體) than Serif's (宋體) is. In Regular Script, the "小" in "尔" is not connected to other components.
@al2m025304 It is worth pointing out that the regional conventions for HK use a connected form of 尔, which also applies when it is used as a component, meaning that your suggestion does not universally apply to Chinese (CN/TW/HK). For Version 2, we at least have consistency, though it is not in the direction that you prefer. And, in inspecting various Chinese fonts, I find plenty of inconsistency in whether the 小 connects to the horizontal stroke above or not.
In any case, to ensure that your comments are captured and considered for the next major revision, I added a new entry to the table at the beginning of Issue #222.
Yep, too late for Version 2.001. This will need to wait until the next update.
Which next update? v3 or v2.002?
I am also hoping the Kozuka Gothic "麤" can be fixed some other time in another minor update ASAP.
Unknown. I am hoping to issue a maintenance update later this year that will add mappings for the four Extension G ideographs, and possibly add two soon-to-be-registered GPOS features (to be tagged 'chws' and 'vchw'). These glyph corrections will be considered.
With regard to U+50E7 僧 uni50E7-HK, it is a fix to an existing glyph, which is why I edited out "major" in my reply. As soon as Version 2.001 has been released, I will edit the table at the top of this issue to reflect the adjustments for U+9EA4 麤 uni9EA4-JP (ExtraLight master) and U+50E7 僧 uni50E7-HK.
Kozuka Gothic, which is a commercial typeface, is on a completely different development cycle.
I see.
EDIT: I really don't understand how Source Han Sans and Kozuka font development cycle work.
@Marcus98T I didn't state that. Kozuka Gothic is much easier to update than Source Han Sans. There are only six fonts, and the U+9EA4 麤 issue affects only one of them. My point is that the glyph for this ideograph is in Supplement 4 (Adobe-Japan1-4), and the first version of Kozuka Gothic to support Supplement 4 was released in 2001, meaning that it took nearly 20 years for this issue to be found. In fact, it was found only because I decided to cross-reference the glyph per @celestialphineas' report against the corresponding Source Han Sans JP glyph. In other words, the fix is easy, and it shall be done the next time we update that typeface, but no one can make a remotely valid argument for the fix being an urgent one.
OK.
no one can make a remotely valid argument for the fix being an urgent one.
I have edited my above comment given my lack of understanding of the development and release cycle. Sorry for the misunderstanding.
@cathree3 Intentional. The KR glyph is actually a variant of the JP glyph, and maps to Adobe-Japan1-7 CID+13666, which corresponds to Adobe-Japan1 IVS <U+82B1,U+E0101>. A separate CN glyph is included for subtle balance adjustments.
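For anyone who wants to see how an IVS such as <U+82B1,U+E0101> is represented in text, the sketch below builds the two-code-point sequence in Python. It only shows the encoding. Whether the variant glyph is actually displayed depends on the font and the rendering software, which this example does not cover. The helper name `is_ivs_selector` is made up for illustration.

```python
import unicodedata

BASE = 0x82B1       # base ideograph from the IVS mentioned above
SELECTOR = 0xE0101  # variation selector from the IVS mentioned above

def is_ivs_selector(cp: int) -> bool:
    """True for the supplementary variation selectors U+E0100..U+E01EF."""
    return 0xE0100 <= cp <= 0xE01EF

# An IVS is simply the base character followed by the selector.
ivs = chr(BASE) + chr(SELECTOR)

print(len(ivs))                             # 2 code points
print([f"U+{ord(c):04X}" for c in ivs])     # ['U+82B1', 'U+E0101']
print(unicodedata.name(chr(SELECTOR)))      # VARIATION SELECTOR-18
print(is_ivs_selector(ord(ivs[1])))         # True
```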
A separate CN glyph is included for subtle balance adjustments.
I suggest adjusting the JP and KR versions to match the balance of the CN version for glyph sharing (EDIT: in the case of the KR version), so we have one less glyph to worry about.
We'll take it under advisement.
The KR form of 遭 has uneven thickness in the Heavy master. Not an issue for the rest of the locales.
The 門 in the KR version of 閼 (red) is smaller than the JP version. Please adjust the 門 so that it’s the same size as the JP version (blue). Kozuka Gothic is affected as well.
For 甆, the 提 should not be touching the 乙 at the bottom (JP glyph only). Kozuka Gothic is affected as well (right).
The CN version of 触 has a thickness error. To fix, do not try to have the thinner line between the 虫 rectangle in the heavy master, like in the JP version.
鷕 CN version (red) should be balanced based on JP glyph (blue), especially the 口.
For 名, I think the JP glyph should be adjusted such that the balance of 夕 is not so lopsided, and maybe the top of the 丶 should not touch the 丿, like Kozuka Gothic, in which this typeface is yet again affected in terms of the lopsidedness. Some Japanese fonts like Heisei Gothic and Hiragino Sans are not so lopsided.
And finally, feet removal time. I have decided that any 口 component within ⿵ like 門, or when 口 is on the left side of this ⿰ and there is a 丿 on the right side, then only the right feet be removed, keep the left feet.
For the 屰 component, I find the feet inconsistency between characters baffling, and I think they should be removed from these characters: 塑槊愬厥闕 (JP), 磀縌 (SC and TC/HK).
If you think it's not appropriate to deal with 屰, as it is admittedly a tough nut to crack, maybe revert the glyphs to 2.000 for 逆 and 遡 (JP and KR) (adding back the feet), then add the right feet for 阙闕 (SC version). I may have gone too far in deciding which feet to remove from which components.
I also find it a bit baffling that the 丿 is a bit lower in the 䒑 part, like 首前兹兼 (some glyphs like 前兼 are actually shared with CN), but 䒑 have symmetry in the JP version of 屰 and their components, so I was wondering if I could adjust the JP glyph for a little asymmetry to make consistent with the other characters.
Every component with 中 at the top of a ⿱ form should have the feet removed (JP and KR affected): 虫忠盅𨨩 (not exhaustive).
Also these characters need to have the feet removed: 慥 (JP, KR and CN) 糙簉 (JP and KR), 瓲 (JP only), 瀜 (虫 part, JP only), 起趈趡趞趦趥趭趯 (JP, KR and SC), 鼬鼫 (JP, KR, TC and HK), 鼯(TC only), 兘 (JP only), 廸廹廻廼廽 (JP only, as mentioned before), 搥縋槌磓鎚 (JP only), 螸 (the 欲 part), 訄 (CN only).
Remove only the right feet: 訚誾 (CN and JP), 勂, 斮 (EDIT: Also include 啟 (SC and TC)).
Then finally, for the 隹 component in all CN, TW and perhaps HK glyphs, the feet must be removed only when 隹 is on top of ⿱, similar to the JP glyphs which have no feet, so as to remove the noise when viewed on lower resolution screens.
Apologies for potential duplicates. Here are the affected glyphs, probably not exhaustive:
趡隼隻隽售集雋焦魋滙匯燞暹壅甕赝贋雘雟雥雧
焦 - 僬谯譙撨噍嶕潐憔嫶樵膲燋礁瞧穛蟭醮鐎劁鹪鷦顦蕉嶣癄趭
雋 - 儁擕懏檇臇觹鐫嶲寯
集 - 㗱襍潗㙫磼㠍㠎穕鏶雧
赝譍噟膺軈應鹰鷹
隼 - 㔼準㢑榫鎨鶽
壅饔罋甕㽫
隽 - 携槜镌鎸
隻 - 愯謢蒦篗
矍匷
矍攫玃戄彏欔䦆钁矡蠼躩貜籰
趯
雟巂儶攜孈瓗欈蠵纗觿讗鑴驨酅
蠽
奪奮
蒦擭嚄獲濩嬳瓁檴臒雘矆矱穫鹱鸌耯蠖艧彟彠護鑊鳠鱯韄頀劐籆
舊
戁臡
犫
犨
凖
犨
赝贋
璡
雙 and 讐 already have the feet removed, so no change is needed for those two. I'm not sure about 閵躙藺, as nothing is obstructing the feet.
In addition, the JP and KR glyphs are affected: 璡魋滙匯燞暹
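The "隹 on top of ⿱" rule above can be expressed as a check over ideographic description sequences. The following is only a minimal sketch: the `DECOMP` table is a tiny hand-written sample and `has_component_on_top` is a hypothetical helper, neither of which comes from the project's actual tooling or data.

```python
# Hand-written sample decompositions: (operator, part, part).
# A real check would need a full IDS database; this table is illustrative only.
DECOMP = {
    "集": ("⿱", "隹", "木"),
    "焦": ("⿱", "隹", "灬"),
    "礁": ("⿰", "石", "焦"),
    "推": ("⿰", "扌", "隹"),
}


def has_component_on_top(char, component, table):
    """Return True if `component` is the upper part of a ⿱ structure
    anywhere inside `char`, following the decompositions in `table`."""
    entry = table.get(char)
    if entry is None:
        # Not decomposed in the table: treat as an atomic component.
        return False
    operator, *parts = entry
    if operator == "⿱" and parts[0] == component:
        return True
    # Otherwise look inside each sub-component.
    return any(has_component_on_top(p, component, table) for p in parts)


for ch in DECOMP:
    print(ch, has_component_on_top(ch, "隹", DECOMP))
```

With this sample table, 集, 焦 and 礁 are flagged (隹 sits on top of a ⿱), while 推 is not, because its 隹 is on the right side of a ⿰.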
Also, I was wondering whether the CN version of 异 (red, current) could be adjusted so that it matches the balance of the JP glyph (blue, my proposal based on the JP glyph with 己 becoming 巳).
For convenience, I will quote some of my proposed corrections from the other sections:
The three glyphs require minor correction.
Only SC affected: 疝嵆 - straighten up the 山 part.
For the SC/TC/HK versions, the 山 of 峇 should not have the feet sticking out.
However, upon closer inspection, 专啭 actually need to have a slant at the top stroke to standardise with the others which have the slant, which should have fallen under glyph correction.
Standardise with the rest (SC glyphs only, TC/HK not affected; remove the decoration): 兓既旣黖
UPDATE: Some minor edits.
@Marcus98T In Heavy weights, it is not uncommon that a stroke becomes thinner when it passes through a box-like component. In fact, this is optically preferred (because making the stroke uniform, aka “mathematically correct”, actually looks uneven). Here are some examples showing the character U+8679 虹:
Typefaces used: FZLTHProGlobal Heavy (left) and AR ShuYuanSongH16GB HV (right)
From this point of view, U+906D 遭 may be a non-issue after all. The CN glyph of U+89E6 触, however, does seem to have an interpolation problem. Perhaps this is the same issue as the already documented U+720B 爋.
@RuixiZhang42 Yes, the issue that @Marcus98T pointed out about the KR glyph for U+906D 遭 is intentional, and a non-issue. The CN glyph for U+89E6 触, on the other hand, is a real issue, and will be fixed as we did for the CN glyph for U+720B 爋.
@kenlunde @RuixiZhang42 Yes, your argument about the optical stroke thinness is correct, but the other U+906D 遭 glyphs (especially the JP glyph) do not have the thinner-stroke-within-a-box, and that bothers me. All I am asking for is better consistency between the glyphs. Perhaps the top two strokes of 曹 in the KR version could be thinned to match the JP version? In any case, I have found that the top two strokes in the KR version are actually thicker than in the JP version, and that inconsistency is an issue.
I also found out Kozuka Gothic is affected.
And for U+89E6 触, as I said before, your argument about the optical stroke thinness is still correct, yet the JP glyph has no thinner-stroke-within-a-box in 虫. Therefore, to fix the CN version, I would like to bring the JP 虫 component over to the CN version, as in the mockup below.
I’ll be ok with having a thinner-stroke-within-a-box look, just that ALL locales need to have consistency, whether it’s JP, KR, CN, or any hidden Kangxi-style glyph. We can’t have a situation where one locale has the thinner-stroke-within-a-box while the others have uniform strokes.
@Marcus98T We have limited time and limited resources, and while perfection is a worthwhile goal, there are reasons, practical and otherwise, why it's not going to happen in the near future. If we had unlimited time and unlimited resources, it would be a different story, but alas we do not. I suggest that you focus your energies elsewhere.
With that said, the JP glyph for U+906D 遭 is based on Adobe-Japan1-7 CID+2810 (Supplement 0), and its working glyph name is uni906D-JP, and the KR glyph is based on Adobe-Japan1-7 CID+13896 (Supplement 4), and its working glyph name is uni906DuE0101-JP. You see the same difference in Kozuka Gothic because the Source Han Sans glyphs were derived from that typeface. The Kozuka glyphs date back almost 20 years. While it is an inconsistency, fixing it is nowhere close to being a high priority.
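The working glyph names quoted in this thread (uni906D-JP, uni906DuE0101-JP, u25E49-HK) appear to follow a simple pattern. The sketch below reconstructs that pattern purely from those examples; the function name `working_glyph_name` is hypothetical, and the project's real naming rules may have cases not covered here.

```python
from typing import Optional


def working_glyph_name(base: int, region: str, selector: Optional[int] = None) -> str:
    """Build a glyph name in the style seen in this thread.

    Assumed pattern (inferred from the quoted names, not from project docs):
    BMP code points are written as "uniXXXX", others as "uXXXXX", an optional
    variation selector is appended the same way, then "-REGION".
    """
    def part(cp: int) -> str:
        return f"uni{cp:04X}" if cp <= 0xFFFF else f"u{cp:X}"

    name = part(base)
    if selector is not None:
        name += part(selector)
    return f"{name}-{region}"


print(working_glyph_name(0x906D, "JP"))           # plain JP glyph for 遭
print(working_glyph_name(0x906D, "JP", 0xE0101))  # glyph tied to an IVS
print(working_glyph_name(0x25E49, "HK"))          # supplementary-plane code point
```

The second call shows how a glyph tied to the variation sequence <U+906D,U+E0101> carries the selector in its name while keeping the JP suffix, which matches the KR mapping described above.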
I didn’t know even huge companies have limited resources. If so, the most important thing I’d like to see in v2.002 is completing the removal of the feet as mentioned earlier. Especially the CN/TW/HK 隹 components placed on top, and a few others like 糙.
(EDIT: I think maybe drop the "confusion" reaction and we do one thing at a time, like what I said above. I think I will bring up the unification of glyphs much later when the time is right)
The Taiwan glyph, and possibly also the Hong Kong glyph, for U+7742 睂 are inconsistent with the Unicode Standard, although there are no government standards to reference. We can also see some discrepancies even between fonts intended for the same region (as in the CN column), but the China glyph is consistent with the Unicode Standard. See image below for explanation.
Although there is no official reference at all for the Hong Kong glyph, I suggest using the same as the Taiwan glyph (considering that the 華康標宋 design and the 新細明體 design are the same).
@S-Asakoto Being in CNS 11643 Plane 3 and not present in HKSCS, the ideograph U+7742 睂 is outside the Traditional Chinese scope of this project.
Some dot strokes become a straight line in the HK/TW versions of the font, which is inconsistent with the Unicode Standard. These are basically characters used in Simplified Chinese, but despite having no JP/KR source, they appear in JP/KR style. For example, U+4EB2 亲 and U+5E90 庐:
I know these are in CNS 11643 Plane 3 and are not present in HKSCS (so, as you would say, they're "outside the Traditional Chinese scope of this project"), but they seem inconsistent with other characters containing such components in the HK/TW standards.
On the other hand, for U+5E9D 庝, which is in CNS 11643 Plane 3, has no JP/KR source, and is not present in HKSCS, the dot is preserved -- or rather, the glyph is shared among all 5 fonts:
I hope this can be made an exception to the "out-of-scope" policy, and that a glyph shared by the HK/TW fonts can be created with the dot replacing the vertical bar.
@S-Asakoto I think it isn't very accurate to say that “dot strokes become a straight line” for the characters you mentioned. A better way to say it would be that these characters are considered “out of scope”, so adherence to the corresponding regional convention is not guaranteed.
As you know, the scope of Traditional Chinese for Taiwan in this project is limited to the characters defined in Big5 (i.e. CNS 11643 Planes 1 & 2), which means that only Big5 characters are guaranteed to adhere (mostly) to Taiwan MoE's conventions in the TW version. But the codepoint coverage of Source Han Sans is not limited to Big5, so it is very easy to find a character beyond Big5 that does not adhere to Taiwan MoE's standard.
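As a rough way to check whether a character falls inside Big5, one can try encoding it with Python's built-in `big5` codec. This is only an approximation: the codec's mapping table is an assumption standing in for the project's actual scope definition, and the helper name `in_big5` is made up for illustration.

```python
def in_big5(ch: str) -> bool:
    """Return True if `ch` can be encoded with Python's built-in big5 codec.

    Treat the result as a hint only; the codec's table may not match the
    exact character set the project uses to define its TW scope.
    """
    try:
        ch.encode("big5")
        return True
    except UnicodeEncodeError:
        return False


for ch in "中亲庐庝":
    print(f"U+{ord(ch):04X} {ch} in Big5 (per Python codec): {in_big5(ch)}")
```

Characters for which this returns False are the ones where, per the explanation above, conformance to the TW conventions is not guaranteed.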
It looks to me like the developer tries to find the best match from other regions for any out-of-scope characters. Take 擵 as an example:
擵 has two glyphs, namely uni64F5-JP (for the JP region) and uni64F5-CN (for CN). This character falls outside Big5, so there is no dedicated glyph for TW or HK, and unfortunately neither of the existing glyphs conforms to the TW standard (the JP glyph uses a straight line for the dot; in the CN glyph, 𣏟 is unified with 林). The developer decided to map the JP glyph rather than the CN one for TW, probably because the difference in the “𣏟” component is more apparent, so the JP glyph is considered the closest match.
Now, back to the three characters you mentioned.
U+4EB2 亲: JP and CN glyphs exist. JP is chosen for TW, probably because the design difference in 木 is more apparent (which I agree with).
U+5E90 庐: JP and CN glyphs exist. JP is chosen for TW, probably because the design difference in 戶 is more apparent (which I also agree with for TW, but HK should use the CN glyph instead. It doesn't, probably for the historical reason that a separate HK version didn't exist before v2.000. Before that, TW was named TWHK, so the same mapping for out-of-scope characters was used).
U+5E9D 庝: Only a CN glyph exists. There is no choice but to map all other regions to the CN glyph. Since the design of the “dot” is the same for CN and TW, the effect is that 庝 adheres to the MoE standard even though it isn't in the supported range.
So yes, the dot is kind of “preserved” for U+5E9D 庝, but this is purely coincidental. The same goes for why U+5E0D 帍 adheres to the MoE standard (戶's first stroke is 丿) but U+5554 啔 and U+623B 戻 don't, even though all of these are out of scope: U+5E0D 帍 has a JP glyph to map from, and coincidentally the JP design of 帍 is the same as (or very close to) that required by TW.
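The mapping behaviour described above can be summarised as a small lookup. This is a guess at the logic as seen from the outside, not the project's build process: `GLYPHS`, `CHOICES` and `resolve_glyph` are hypothetical, and the name uni5E9D-CN is assumed by analogy with the other glyph names.

```python
# Glyphs that exist per character, keyed by region (sample data only).
GLYPHS = {
    "擵": {"JP": "uni64F5-JP", "CN": "uni64F5-CN"},
    "庝": {"CN": "uni5E9D-CN"},  # glyph name assumed by analogy
}

# Hand-picked "closest match" when a region has no glyph of its own
# and more than one candidate exists (a per-character judgment call).
CHOICES = {
    ("擵", "TW"): "JP",
}


def resolve_glyph(char, region, glyphs, choices):
    """Pick the glyph shown for `char` in `region`."""
    available = glyphs[char]
    if region in available:
        # The region has its own dedicated glyph.
        return available[region]
    if len(available) == 1:
        # Only one glyph exists, so every region shares it.
        return next(iter(available.values()))
    # Several candidates: fall back to the hand-picked closest match.
    return available[choices[(char, region)]]


print(resolve_glyph("擵", "TW", GLYPHS, CHOICES))
print(resolve_glyph("庝", "TW", GLYPHS, CHOICES))
```

The second case is the "coincidental" one: 庝 ends up matching the TW convention only because the single available glyph happens to use the same dot design.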
@tamcy Oh, actually I had thought that the Unicode Standard would be followed even for out-of-scope characters if a T-/H-source is present. So that's why the character 睂 also looks that way in the TW/HK fonts.
I suggest adjusting the following six HK glyphs:
- u25E49-HK 𥹉
- u25E81-HK 𥺁
- u25E82-HK 𥺂
- u25E83-HK 𥺃
- u25EA6-HK 𥺦
- u25EBC-HK 𥺼
The design of the 米 component for these 6 glyphs follows the TW form, where the two slanted strokes at the bottom do not touch the middle of 十, which is not consistent with other HK glyphs. For this component, HK, CN, JP and KR share the same form, where the two slanted strokes at the bottom are touching the middle of 十:
While at it, I hope that u25E49-HK 𥹉 can be adjusted further to improve the proportions of the left and right components, and the spacing of the 又 component, as shown:
I'm not sure whether this is intended, but the JP version of 龜 U+9F9C looks as if it has 18 strokes instead of 16, and it is inconsistent with the corresponding glyph in Source Han Serif, in which each "leg" (ヨ) of the turtle is connected to the "shell" (メ) with a single stroke:
(left: Source Han Sans v2.001, right: Source Han Serif v1.001)
龝 U+9F9D and 龞 U+9F9E have similar issues.
Hi, I'd like to ask if a definitive decision was made regarding making quotation marks (U+2018, U+2019, U+201C, U+201D) proportional width for Chinese?
Most of the discussion for this was in notofonts/noto-cjk#5; it seemed like this change was under consideration for 2.000, however there was no follow-up comment and the issue was closed for being stale. I have skimmed through the various consolidation issues and the only mention I found was a brief discussion ending at #99 (comment).
My (very limited) understanding regarding quotation marks in Chinese is that European-style quotation marks are used for Simplified Chinese but not Traditional Chinese. If these quotation marks cannot be made proportional for all Chinese because of this, then I would suggest at least making them proportional for Traditional Chinese.
Mixing Chinese and English text is somewhat common in Hong Kong, e.g. most HK websites will have Traditional Chinese and English versions. Having full-width quotation marks in English text is quite an eyesore and (from a web design/development standpoint) it means another font like Roboto or Helvetica needs to be included to display English text.
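For background on why these four code points behave differently in CJK and Latin contexts: Unicode assigns them the East Asian Width class Ambiguous, meaning they may be treated as wide or narrow depending on context. The sketch below uses Python's standard `unicodedata` module to show this; the helper name `width_classes` is made up for illustration, and nothing here reflects how Source Han Sans itself decides glyph widths.

```python
import unicodedata

# The four quotation marks discussed above.
QUOTES = ["\u2018", "\u2019", "\u201C", "\u201D"]


def width_classes(chars):
    """Map each character to its Unicode East Asian Width class."""
    return {f"U+{ord(c):04X}": unicodedata.east_asian_width(c) for c in chars}


print(width_classes(QUOTES))       # quotation marks
print(width_classes(["A", "中"]))  # a Latin letter and an ideograph, for comparison
```

All four quotation marks report "A" (Ambiguous), whereas the Latin letter reports "Na" and the ideograph "W". Because the property alone does not settle the width, a font or layout engine has to pick one per language or context, which is what the request above is about.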