ChrisCavs/t-writer.js

t-writer doesn't handle non-ascii characters

Closed · 1 comment

Complex scripts render one symbol, glyph or grapheme for a cluster of characters.

For example: కృష్ణ.

t-writer types out the five individual Unicode code points:

  1. క (\u0C15)
  2. ృ (\u0C43)
  3. ష (\u0C37)
  4. ్ (\u0C4D) and
  5. ణ (\u0C23)

instead of typing out కృ and then ష్ణ.

This is probably because t-writer iterates over each character in the string instead of iterating over the clusters of characters that make up a single displayed symbol, glyph or grapheme. To JavaScript, కృష్ణ is essentially an array of five characters: ["క", "ృ", "ష", "్", "ణ"].
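The difference can be reproduced outside t-writer. The sketch below is only an illustration: `splitClusters` is a hypothetical helper, not part of t-writer, and its rule (attach combining marks to the preceding character, and join a letter onto a cluster that ends in the Telugu virama U+0C4D) is a simplification that assumes Telugu input rather than full Unicode segmentation.

```javascript
// The example word, written with escapes so the code points are explicit.
const word = '\u0C15\u0C43\u0C37\u0C4D\u0C23'; // కృష్ణ

// Spreading a string iterates by code point, which is what a
// character-by-character typewriter effectively sees.
const codePoints = [...word];

// Assumption: only the Telugu virama is treated as a conjunct joiner.
const TELUGU_VIRAMA = '\u0C4D';

// Hypothetical helper: groups code points into display clusters.
function splitClusters(text) {
  const clusters = [];
  for (const ch of text) {
    const last = clusters.length - 1;
    const prev = clusters[last]; // undefined while no cluster exists yet
    const isMark = /^\p{M}$/u.test(ch);   // vowel signs, virama, other marks
    const isLetter = /^\p{L}$/u.test(ch);
    const joinsConjunct =
      isLetter && prev !== undefined && prev.endsWith(TELUGU_VIRAMA);

    if (prev !== undefined && (isMark || joinsConjunct)) {
      clusters[last] = prev + ch; // extend the current cluster
    } else {
      clusters.push(ch);          // start a new cluster
    }
  }
  return clusters;
}

console.log(codePoints.length);          // 5
console.log(splitClusters(word).length); // 2
```

With this rule the word splits into కృ and ష్ణ, which is the typing order requested above. Other scripts have their own joining rules, so this would need to be generalised before being relied on.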

What makes this hard is probably that JavaScript doesn't provide a simple way to iterate over grapheme clusters.
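Newer runtimes narrow this gap: `Intl.Segmenter` with `granularity: 'grapheme'` iterates over extended grapheme clusters. The sketch below assumes it may be missing and falls back to code points in that case; `graphemes` is a hypothetical helper name. Whether ష్ణ comes back as one segment or as ష్ followed by ణ depends on the Unicode data shipped with the runtime, so no specific output is shown.

```javascript
// Hypothetical helper: uses Intl.Segmenter when the runtime provides it,
// otherwise falls back to plain code-point iteration.
function graphemes(text) {
  if (typeof Intl !== 'undefined' && typeof Intl.Segmenter === 'function') {
    const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
    // Each item yielded by segment() has a `segment` string property.
    return Array.from(segmenter.segment(text), (part) => part.segment);
  }
  return Array.from(text); // fallback: one entry per code point
}

const sample = '\u0C15\u0C43\u0C37\u0C4D\u0C23'; // కృష్ణ
console.log(graphemes(sample));
```

Either way, the segments concatenate back to the original string, so a typewriter could append them one at a time without losing text.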

I am still trying to find a JavaScript library that can handle complex scripts (like all Indian languages, Arabic, Hebrew, East-Asian languages and so on).

One approach would be to add an option that tells the typewriter to 'watch out' for multi-character Unicode clusters and adjusts the typing logic slightly. However, not enough people use this library to warrant my investing more time in developing it. I appreciate you bringing this to my attention, though.
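A rough sketch of what such an option might look like follows. Everything in it is an assumption for illustration: `typingFrames` and `clusterAware` are made-up names, not t-writer's API, and the regular expression only knows about combining marks and the Telugu virama (U+0C4D).

```javascript
// Illustrative pattern: either a run of leading marks, or a base character
// plus its marks, extended by further letters that directly follow a virama.
const CLUSTER = /\p{M}+|\P{M}\p{M}*(?:(?<=\u0C4D)\p{L}\p{M}*)*/gu;

// Hypothetical function: returns the successive strings a typewriter
// would display, one entry per typing step.
function typingFrames(text, { clusterAware = false } = {}) {
  const units = clusterAware
    ? text.match(CLUSTER) || [] // one step per cluster
    : Array.from(text);         // one step per code point
  const frames = [];
  let shown = '';
  for (const unit of units) {
    shown += unit;
    frames.push(shown);
  }
  return frames;
}

const name = '\u0C15\u0C43\u0C37\u0C4D\u0C23'; // కృష్ణ
console.log(typingFrames(name).length);                         // 5
console.log(typingFrames(name, { clusterAware: true }).length); // 2
```

Keeping the option off by default would leave existing ASCII behaviour unchanged, and only callers who opt in would pay for the extra matching.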