melink14/rikaikun

Ignore invisible characters when playing text-to-speech

Google Docs outputs Japanese text with zero-width characters such as U+200C (zero-width non-joiner) between every kanji. TTS just sends the selected text to be played, which means it is read out as individual characters.

Selection is determined by match length, but when we look up the word we of course sanitize the input.

Options:

  • Before playing speech, repeat sanitization from main lookup code.
  • Instead of sending back just match length and entries, we can send back the sanitized 'match' or headword that we used to perform the lookup.

Sample markup from Google Docs (invisible characters sit between the kanji):
<span class="kix-wordhtmlgenerator-word-node" style="font-size:16px;font-family:'HiraMinPro-W3';color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;"> ‌私‌は‌先‌週、‌校‌長‌の‌畑‌佐‌先‌生‌に‌お‌話‌を‌う‌か‌がっ‌た。‌現‌在‌畑‌佐‌先‌生‌は‌日‌本‌語‌学‌校‌の‌校‌長‌で‌</span>
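
A minimal sketch of the two options, for illustration only. The names `stripInvisible`, `textForSpeech`, `LookupResult`, and `sanitizeForLookup` are hypothetical and not taken from the rikaikun codebase, and the set of characters treated as invisible is an assumption; the real lookup code may sanitize differently.

```typescript
// Hypothetical helper: the character set below is an assumption.
// Covers zero-width space, non-joiner, joiner, word joiner, and BOM.
const INVISIBLE_CHARS = /[\u200B-\u200D\u2060\uFEFF]/g;

function stripInvisible(text: string): string {
  return text.replace(INVISIBLE_CHARS, '');
}

// Option 1: repeat the sanitization right before the text is spoken.
function textForSpeech(selectedText: string): string {
  return stripInvisible(selectedText);
}

// Option 2: return the sanitized match alongside the match length,
// so the caller can speak `match` instead of the raw selection.
interface LookupResult {
  matchLen: number; // length in the original text, invisible characters included
  match: string; // sanitized text used for the lookup
}

function sanitizeForLookup(raw: string): LookupResult {
  return { matchLen: raw.length, match: stripInvisible(raw) };
}

const fromDocs = '\u200C私\u200Cは\u200C先\u200C週';
console.log(textForSpeech(fromDocs)); // 私は先週
console.log(sanitizeForLookup(fromDocs).matchLen); // 8
```

Option 2 keeps a single source of truth for sanitization, while option 1 is the smaller change because only the TTS path is touched.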