User info appears to be encoded with GBK, not UTF-8, on RC5

Question

aaugustin opened this issue 8 years ago · 5 comments

msg.py contains this code:

        try:
            self.user_info = user_info.decode("utf-8").rstrip("\x00")
        except UnicodeDecodeError as e:
            self.user_info = ""
        try:
            self.dev_num = dev_num.decode("utf-8").rstrip('\0x00')
        except UnicodeDecodeError as e:
            self.dev_num = ""

user_info always came out empty for my RC-5, although it was set to "Salle à manger" on the device.

Changing .decode("utf-8") to .decode("gbk") fixed the issue. I found this charset by trial and error, trying various Chinese charsets.
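The failure can be reproduced without a device. This is a minimal sketch, assuming the RC-5 returns the text as GBK bytes padded with NUL bytes. The helper name `decode_user_info` and the 100-byte field width are illustrative and not taken from msg.py.

```python
def decode_user_info(raw: bytes, encoding: str) -> str:
    """Mimic the msg.py fallback: return "" when the bytes cannot be decoded."""
    try:
        return raw.decode(encoding).rstrip("\x00")
    except UnicodeDecodeError:
        return ""


# Assumption: the device holds GBK bytes in a NUL-padded field (width is hypothetical).
raw = "Salle à manger".encode("gbk").ljust(100, b"\x00")

# UTF-8 decoding raises UnicodeDecodeError, so the fallback "" is returned.
print(decode_user_info(raw, "utf-8") == "")
# GBK decoding recovers the original text.
print(decode_user_info(raw, "gbk") == "Salle à manger")
```

Because the exception is swallowed, the wrong charset shows up only as an empty string, which matches the symptom described above.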

I don't own an RC-4 so I can't check if the behavior is the same there.

Answer 1 · 2017-02-05T20:10:12.000Z

If this change is made, for consistency, the two calls to .encode("utf8") must be changed to .encode("gbk") as well.

This will provide interoperability with the Rc Logger software provided by Elitech -- which may or may not be a goal of this project :-)

Using UTF-8 both ways works as long as Rc Logger isn't used because the RC-5 merely saves the bytes and sends them back.
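A small simulation of that reasoning, assuming (as stated above) that the device stores the bytes and returns them unchanged. `FakeDevice` is a hypothetical stand-in, not part of the library.

```python
class FakeDevice:
    """Hypothetical stand-in for an RC-5 that stores bytes verbatim."""

    def __init__(self) -> None:
        self._user_info = b""

    def write_user_info(self, data: bytes) -> None:
        self._user_info = data

    def read_user_info(self) -> bytes:
        return self._user_info


device = FakeDevice()
text = "Salle à manger"

# Same charset in both directions: the text survives the round trip.
device.write_user_info(text.encode("utf-8"))
same_charset = device.read_user_info().decode("utf-8")

# A tool that reads the same bytes as GBK sees different text.
other_charset = device.read_user_info().decode("gbk", errors="replace")

print(same_charset == text)   # True
print(other_charset == text)  # False
```

The mismatch only appears when a second tool with a different charset reads or writes the same field.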

Answer 2 · 2017-02-06T05:52:25.000Z

Thank you! I'll check it on both the RC-4 and the RC-5.

Answer 3 · 2017-02-07T14:56:05.000Z

When I tried Logger Data Management Software (V2.0), the charset seemed to be 'MS932', which is my Windows default charset.
I'll add an optional parameter --encode (default: utf8).

        elitech-datareader -c set --user_info "データ" --encode MS932 /dev/tty.SLAB_USBtoUAR
        elitech-datareader -c devinfo --encode MS932 /dev/tty.SLAB_USBtoUAR
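One way such an option could be wired up, as a sketch only. The parser below is hypothetical and is not the elitech-datareader implementation. It assumes the option value is handed to Python's codec lookup, where `MS932` is an alias of `cp932`.

```python
import argparse
import codecs


def codec_name(value: str) -> str:
    """Reject unknown charset names at parse time."""
    try:
        codecs.lookup(value)
    except LookupError:
        raise argparse.ArgumentTypeError(f"unknown encoding: {value}")
    return value


# Hypothetical parser; only --user_info and --encode mirror the commands above.
parser = argparse.ArgumentParser()
parser.add_argument("--user_info", default="")
parser.add_argument("--encode", type=codec_name, default="utf8")

args = parser.parse_args(["--user_info", "データ", "--encode", "MS932"])

# Bytes that would be sent to the device, encoded with the chosen charset.
payload = args.user_info.encode(args.encode)

# Reading back with the same charset restores the text.
print(payload.decode(args.encode) == args.user_info)
```

Validating the name with `codecs.lookup` means a typo in the charset fails immediately instead of surfacing later as an empty or garbled field.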

Answer 4 · 2017-02-07T15:19:32.000Z

I'm running Rc Logger on a Mac. It looks like the charset is OS-dependent :-/

At this point it may be best to stick to UTF-8 and to document this as a known limitation.

Answer 5 · 2017-02-07T15:25:41.000Z

Oh, I did not know there was a Mac version of Rc Logger!