PyYoshi/cChardet

LookupError: unknown encoding: EUC-TW

meshy opened this issue · 3 comments

meshy commented

This seems similar in nature to #8, but unfortunately, I do not know what to recommend as an alternative to EUC-TW.

One can see that there is nothing that quite matches in Python's list of standard encodings.

I also thought that I should look through the other encodings mentioned in the readme, and found that there are a number of other codecs that did not come up in the list:

Do you have any recommendations for how I could decode strings that are detected as these types in Python?

meshy commented

Further investigation has revealed that Python has marked the EUC-TW and ISO-2022-CN encodings as "won't fix".

Just ran into this myself.

  • X-ISO-10646-UCS-4-2143

  • X-ISO-10646-UCS-4-3412

See https://stackoverflow.com/q/18518730
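
For anyone landing here with the same LookupError, below is a rough workaround sketch, not something cChardet or Python provides. The helper names (`is_known_codec`, `decode_detected`, `decode_ucs4_unusual`) are made up for illustration. It assumes the "2143" and "3412" suffixes describe the octet order relative to big-endian UCS-4 ("1234"), as in the XML specification's encoding autodetection appendix, and it makes no attempt to handle EUC-TW or ISO-2022-CN beyond falling back to a lossy decode.

```python
import codecs


def is_known_codec(name):
    """Return True if Python's codec registry recognises this encoding name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def decode_ucs4_unusual(data, order):
    """Decode UCS-4 bytes stored in the unusual 2143 or 3412 octet order.

    Assumption: "1234" means big-endian, so each 4-byte group is reordered
    to big-endian and then decoded as UTF-32-BE.
    """
    # For each output position, the index of the input byte that belongs there.
    source_index = {"2143": (1, 0, 3, 2), "3412": (2, 3, 0, 1)}[order]
    if len(data) % 4 != 0:
        raise ValueError("UCS-4 data must be a multiple of 4 bytes long")
    reordered = bytearray(len(data))
    for out_pos, in_pos in enumerate(source_index):
        reordered[out_pos::4] = data[in_pos::4]
    return bytes(reordered).decode("utf-32-be")


def decode_detected(data, detected, fallback="utf-8"):
    """Decode bytes using the detected name when Python can handle it.

    `detected` is whatever name the detector reported (it may be None).
    Names Python does not know fall back to `fallback`, replacing
    undecodable bytes, so the result may be lossy.
    """
    if detected == "X-ISO-10646-UCS-4-2143":
        return decode_ucs4_unusual(data, "2143")
    if detected == "X-ISO-10646-UCS-4-3412":
        return decode_ucs4_unusual(data, "3412")
    if detected and is_known_codec(detected):
        return data.decode(detected)
    return data.decode(fallback, errors="replace")
```

Checking the name with `codecs.lookup` before calling `bytes.decode` at least turns the exception into an explicit fallback path. Whether a lossy fallback is acceptable depends on what you do with the text afterwards.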