mwh/dragon

support for utf-8 text

AriaMoradi opened this issue · 4 comments

I tried to drag some UTF-8 (Persian) text onto dragon while running it with --target,
but this was printed:

ÙÙØªÛ ÙÙÙز Ù
ردÙ
 Ù
ا تÙÙÛÙ
 ÙشدÙد ØÙÙ
عÙÛ ÙرÙØ·ÛÙ٠را ÙÙ
Û Ø¯Ø§ÙÙد ØÙباÛد تÙÙع داشت کرÙÙا ب٠اÛ٠زÙØ¯Û Ø§Ø² Ú©Ø´Ùر Ù
ا برÙدØبÛÙ Ù
رÛضÙØ§Û  Ù
٠اÙØ±Ø§Ø¯Û ÙستÙد ðÚ©Ù Û¶ÙرÙردÛ٠در

while the original text was:

وقتی هنوز مردم ما تفهیم نشدند ،ومعنی قرنطینه را نمی دانند ،نباید توقع داشت کرونا به این زودی از کشور ما برود،بین مریضهای  من افرادی هستند 🌀که ۶فروردین در

It looks like the problem is caused by the use of printf in the code; maybe using a UTF-8-compatible print function would fix it.

dasJ commented

I tried dragging Unicode text from Firefox to dragon while dragon was operating under LANG=C, LANG=en_US.utf8, and LANG=ja_JA.utf8, and did not encounter any mojibake problems.

mwh commented

It also works fine for me with everything I've been able to throw at it. I'm not sure what the original issue could be: dragon just treats the text as opaque byte strings, so if the encoding matches on both ends there shouldn't be any issue (but if they don't, it will definitely fail). printf has no issue with UTF-8 and there's no reason it should (while wprintf would).
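For anyone trying to reproduce this, here is a minimal sketch of how output like the one reported can arise without printf being at fault. It assumes (this is not confirmed anywhere in the thread) that some component on the receiving side read the UTF-8 bytes as Latin-1 and re-encoded them. `latin1_to_utf8` and `show_mojibake` are hypothetical helpers written for illustration; they are not part of dragon.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of dragon): re-encode `in` as if every byte
 * were a Latin-1 character. Writes a NUL-terminated UTF-8 string to `out`
 * and returns its length in bytes, excluding the terminator. */
size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    assert(out_size > 0);
    for (const unsigned char *p = (const unsigned char *)in; *p; p++) {
        /* Room for up to two output bytes plus the terminator. */
        assert(n + 3 <= out_size);
        if (*p < 0x80) {
            /* ASCII is identical in Latin-1 and UTF-8. */
            out[n++] = (char)*p;
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}

/* Print the text once untouched and once after the wrong-charset round trip.
 * printf("%s") copies the bytes as they are in both cases. */
void show_mojibake(const char *utf8_text)
{
    char garbled[256];

    latin1_to_utf8(utf8_text, garbled, sizeof garbled);
    printf("passed through: %s\n", utf8_text);
    printf("misdecoded:     %s\n", garbled);
}
```

As a concrete case, the Persian letter "و" is the two bytes D9 88 in UTF-8. Read as Latin-1, D9 is "Ù" and 88 is an invisible control character, so the helper emits C3 99 C2 88, which a UTF-8 terminal shows as a lone "Ù". That pattern looks similar to the output in the report, which would point at an encoding mismatch somewhere between the source application and the terminal rather than at the print call itself.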

To track this down we'd need more detail about the system, terminal, and application encodings, which applications were involved, and so on. If someone can find a case where this happens under reproducible conditions with matching terminal and application encodings, we can investigate, but otherwise I suspect the issue is outside dragon. I am going to close this for the moment; if someone finds a reproducible case, please comment or file another bug with it.