Bold/Italic Unicode characters incorrect width
kalemi19 opened this issue · 3 comments
Example: 𝗕𝗼𝗹𝗱
JavaScript counts this as 8 characters (just like emojis, each bold character has a length of 2).
Ruby counts this word as 4 characters, causing an inconsistency with the frontend.
I just tried it with this gem, but Unicode::DisplayWidth.of("𝗕𝗼𝗹𝗱")
still returns 4.
Is this a bug or is there something I need to do in order to make it work for my use case?
Thank you
Hi @kalemi19,
Unfortunately, the Unicode standard does not define a definitive way to calculate the exact width of a string. However, it does provide EastAsianWidth.txt, which lists 𝗕
as a neutral/narrow letter.
Which method do you use to retrieve the character count in JavaScript?
The standard String.length property.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/length
P.S. Sorry for the late reply
The length returned by JavaScript is the number of 16-bit code units required to represent the string in UTF-16. You can use the unibits utility to get a lower-level view of the data:
𝗕 𝗼
U+1D5D5 U+1D5FC
35 D8 D5 DD 35 D8 FC DD
00110101 11011000 11010101 11011101 00110101 11011000 11111100 11011101
𝗹 𝗱
U+1D5F9 U+1D5F1
35 D8 F9 DD 35 D8 F1 DD
00110101 11011000 11111001 11011101 00110101 11011000 11110001 11011101
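Each code point (i.e. character) here is encoded as 4 bytes: two 16-bit code units, a high and a low surrogate, shown above in little-endian byte order (also see https://en.wikipedia.org/wiki/UTF-16). JavaScript counts every code unit, which is why each of these characters has a length of 2.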
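If the goal is simply to get the same number in Ruby that JavaScript reports, one option is to count UTF-16 code units directly. The following is a minimal sketch using only Ruby's built-in String#encode; the helper names utf16_length and utf16le_hex are made up for this example and are not part of unicode-display_width.

```ruby
# Count UTF-16 code units, which is what JavaScript's String length reports.
# Every code unit is 2 bytes, so halve the byte size of the UTF-16 encoding.
def utf16_length(str)
  str.encode("UTF-16LE").bytesize / 2
end

# Hex dump of the UTF-16LE bytes, similar to the listing above.
def utf16le_hex(str)
  str.encode("UTF-16LE").bytes.map { |b| format("%02X", b) }.join(" ")
end

# The four bold sans-serif letters from the example (U+1D5D5 U+1D5FC U+1D5F9 U+1D5F1).
word = "\u{1D5D5}\u{1D5FC}\u{1D5F9}\u{1D5F1}"

puts word.length                # => 4 (code points, Ruby's view)
puts utf16_length(word)         # => 8 (code units, JavaScript's view)
puts utf16le_hex("\u{1D5D5}")   # => 35 D8 D5 DD
```

Note that this is a count of code units, not a display width, so it only helps when the frontend check is based on String.length.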
What this library (unicode-display_width) does is assign a width to each code point, using each code point's East Asian Width property as one major factor (see https://www.unicode.org/reports/tr11/#Overview). As stated in the first comment, the bold characters are not classified as wide or fullwidth, which is why each of them is counted as just 1 terminal column wide.
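To illustrate the general idea of summing a per-code-point width, here is a toy sketch. It is not the gem's actual implementation: the WIDE_RANGES table and toy_display_width are hypothetical and cover only three ranges that EastAsianWidth.txt marks as Wide or Fullwidth, and the sketch ignores zero-width, combining, and control characters entirely.

```ruby
# Hypothetical, heavily simplified width table (an assumption for this sketch):
# a few ranges that EastAsianWidth.txt classifies as Wide (W) or Fullwidth (F).
WIDE_RANGES = [
  0x4E00..0x9FFF, # CJK Unified Ideographs (W)
  0xAC00..0xD7A3, # Hangul Syllables (W)
  0xFF01..0xFF60  # Fullwidth ASCII variants (F)
].freeze

# Sum a width per code point: 2 if it falls in a wide range, otherwise 1.
def toy_display_width(str)
  str.each_codepoint.sum do |cp|
    WIDE_RANGES.any? { |range| range.cover?(cp) } ? 2 : 1
  end
end

bold = "\u{1D5D5}\u{1D5FC}\u{1D5F9}\u{1D5F1}" # the bold example word
cjk  = "\u{65E5}\u{672C}"                     # two CJK ideographs

puts toy_display_width(bold) # => 4 (not in any wide range, 1 column each)
puts toy_display_width(cjk)  # => 4 (2 columns each)
```

The bold letters live in the Mathematical Alphanumeric Symbols block, outside any of the wide ranges, so each one contributes a width of 1 regardless of how many bytes or code units it takes to encode.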