Splits text written in ligature-forming scripts such as Thai and Khmer, as well as complex emoji, into an array of graphemes.
You can use this library in place of Array.from, which splits a string by code point rather than by grapheme.
$ npm install split-graphemes
// The emoji '👨‍👩‍👦‍👦' consists of four person emoji joined by Zero Width Joiners (ZWJ).
const codePoints = Array.from('👨‍👩‍👦‍👦') // ['👨', ZWJ, '👩', ZWJ, '👦', ZWJ, '👦']
// splitGraphemes interprets it as exactly one character!
const graphemes = splitGraphemes('👨‍👩‍👦‍👦') // ['👨‍👩‍👦‍👦']
Array.from('แแแปแแแแทแ') // ['แ', 'แ', 'แป', 'แ', 'แ', 'แ', 'แท', 'แ']
splitGraphemes('แแแปแแแแทแ') // ['แแแป', 'แแแแทแ']
splitGraphemes('ごん゙に゙ぢば') // ['ご', 'ん゙', 'に゙', 'ぢ', 'ば']
splitGraphemes('ใใใใใใใใใใ') // ['ใใ', 'ใใ', 'ใใ', 'ใใ', 'ใใ']
splitGraphemes('Hello') // ['H', 'e', 'l', 'l', 'o']
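To make the difference between code points and graphemes concrete, here is a minimal sketch of a simplified splitter. It is not the algorithm split-graphemes uses; `naiveSplit` is a hypothetical helper that only glues together code points joined by a ZWJ and appends combining marks to the preceding character. It does not handle every script (Khmer consonant stacks, for example, need more rules), which is the kind of case this library is meant to cover.

```javascript
// Hypothetical helper for illustration only; not part of split-graphemes.
const ZWJ = '\u200D';

function naiveSplit(text) {
  const clusters = [];
  let joinNext = false; // true when the previous code point was a ZWJ
  for (const cp of Array.from(text)) {
    // Attach ZWJs, whatever follows a ZWJ, and combining marks
    // (Unicode general category M) to the previous cluster.
    const attach = joinNext || cp === ZWJ || /^\p{M}$/u.test(cp);
    if (attach && clusters.length > 0) {
      clusters[clusters.length - 1] += cp;
    } else {
      clusters.push(cp);
    }
    joinNext = cp === ZWJ;
  }
  return clusters;
}

// Family emoji: man, woman, boy, boy joined by three ZWJs.
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F466}\u200D\u{1F466}';

console.log(Array.from(family).length); // 7 code points
console.log(naiveSplit(family).length); // 1 cluster
console.log(naiveSplit('Hello'));       // [ 'H', 'e', 'l', 'l', 'o' ]
```

Iterating with Array.from keeps surrogate pairs intact, so each emoji arrives as one code point before the joining rules are applied.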
The list of characters can be found here.