janlelis/unicode-display_width

Hangul Jamo Extended-B should be 0-width

ninjalj opened this issue · 1 comment

The Hangul Jamo Extended-B block at U+D7B0..U+D7FF contains jungseong and jongseong for Old Korean, and should be treated the same as the jungseong and jongseong at U+1160..U+11FF.
glibc's wcwidth() has treated the assigned characters in that block as zero-width since this commit:

commit 6e540caa21616d5ec5511fafb22819204525138e
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jun 16 08:29:40 2020 +0200

Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

diff --git a/localedata/charmaps/UTF-8 b/localedata/charmaps/UTF-8
index 14c5d4fa33..8cce47cd97 100644
--- a/localedata/charmaps/UTF-8
+++ b/localedata/charmaps/UTF-8
@@ -48920,6 +48920,8 @@ WIDTH
 <UABE8>        0
 <UABED>        0
 <UAC00>...<UD7A3>      2
+<UD7B0>...<UD7C6>      0
+<UD7CB>...<UD7FB>      0
 <UF900>...<UFA6D>      2
 <UFA70>...<UFAD9>      2
 <UFB1E>        0

Hi @ninjalj,

thank you for the PR and the background info. I have adapted the index in cdbb5de and have released v2.2.0 which treats this block as zero-width.
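
For readers who want to see what the two ranges in the glibc diff amount to, here is a minimal Ruby sketch. It is only an illustration: the constant `ZERO_WIDTH_JAMO_EXT_B` and the method `jamo_ext_b_zero_width?` are hypothetical names, not part of this gem's API, and the gem's real index is built differently. The ranges are copied from the diff above, so the unassigned gaps (U+D7C7..U+D7CA and U+D7FC..U+D7FF) are deliberately left out.

```ruby
# Hypothetical illustration only; not the gem's actual index or API.
# The two ranges are taken verbatim from the glibc charmap diff.
ZERO_WIDTH_JAMO_EXT_B = [
  0xD7B0..0xD7C6, # jungseong (medial vowels)
  0xD7CB..0xD7FB, # jongseong (final consonants)
].freeze

# Returns true if the given codepoint (an Integer) falls in one of the
# Hangul Jamo Extended-B ranges that glibc assigns a width of 0.
def jamo_ext_b_zero_width?(codepoint)
  ZERO_WIDTH_JAMO_EXT_B.any? { |range| range.cover?(codepoint) }
end
```

A call such as `jamo_ext_b_zero_width?("\u{D7B0}".ord)` checks a single character. Precomposed syllables in U+AC00..U+D7A3 are outside both ranges and keep their width of 2.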