mrichards42/xword

Emoji clues are rendered incorrectly on Windows


If I set a clue to an emoji character in a format which supports UTF-8 (like JPZ), it fails to render correctly. Not sure whether it's a wxWidgets limitation or if we're mangling the text before passing it on to be rendered.

Verified that Emoji render correctly on Mac (using the UTF-8 .puz support from #154).

I believe the problem here is the platform-dependent size of wchar_t. On Mac/Linux it's 4 bytes and holds UTF-32, which is what XWord appears to have implemented. But on Windows it's only 2 bytes and is expected to hold UTF-16. As a result, UTF-8 to Unicode conversions (e.g. as done by utf8_to_unicode in puzstring.cpp) may try to cram a code point into a single 2-byte character when it actually needs a surrogate pair, corrupting any character above U+FFFF, which includes most emoji.

We could have Windows-specific implementations of UTF-8 encoding/decoding here (a rough sketch is below), though I'm not sure whether there would be wider difficulties or consequences.
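For illustration only, here's a minimal sketch of the kind of Windows-aware handling that would be needed when appending a decoded code point to a wide string; append_code_point is a hypothetical helper, not the existing puzstring.cpp code:

```cpp
#include <cstdint>
#include <string>

// Hypothetical sketch: append one Unicode code point to a std::wstring,
// emitting a UTF-16 surrogate pair when wchar_t is 2 bytes (Windows)
// and a single UTF-32 unit when it is 4 bytes (Mac/Linux).
static void append_code_point(std::wstring & out, uint32_t cp)
{
    if (sizeof(wchar_t) >= 4 || cp <= 0xFFFF)
    {
        // Fits in a single wchar_t on this platform.
        out.push_back(static_cast<wchar_t>(cp));
    }
    else
    {
        // Code points above U+FFFF (e.g. most emoji) need a surrogate pair.
        cp -= 0x10000;
        out.push_back(static_cast<wchar_t>(0xD800 | (cp >> 10)));   // high surrogate
        out.push_back(static_cast<wchar_t>(0xDC00 | (cp & 0x3FF))); // low surrogate
    }
}
```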

Ah good call, yes, that's almost certainly it. Probably worth scrapping those hand-rolled functions and using a library. Looks like pugixml handles encodings, so that might be the lightest lift. I didn't realize it also has a wchar implementation that does the encoding and decoding for you, or I might have used that in the JPZ parser originally.
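Not how XWord would actually wire it up, but as a rough illustration of the pugixml helpers involved (pugi::as_wide / pugi::as_utf8):

```cpp
#include <string>
#include <pugixml.hpp>

int main()
{
    // UTF-8 bytes for U+1F600 (grinning face emoji), e.g. as read from a JPZ file.
    std::string utf8_clue = "\xF0\x9F\x98\x80";

    // pugi::as_wide produces UTF-16 (with surrogate pairs) where wchar_t is
    // 2 bytes and UTF-32 where it is 4 bytes, so the clue survives on both platforms.
    std::wstring wide_clue = pugi::as_wide(utf8_clue);

    // And back to UTF-8 for writing out.
    std::string round_trip = pugi::as_utf8(wide_clue);
    return 0;
}
```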