civic/elitech-datareader

User info appears to be encoded with GBK, not UTF-8, on RC5

aaugustin opened this issue · 5 comments

msg.py contains this code:

        try:
            self.user_info = user_info.decode("utf-8").rstrip("\x00")
        except UnicodeDecodeError as e:
            self.user_info = ""
        try:
            self.dev_num = dev_num.decode("utf-8").rstrip('\0x00')
        except UnicodeDecodeError as e:
            self.dev_num = ""

user_info was always coming out empty for my RC-5, while it was set to Salle à manger on the device.

Changing .decode("utf-8") to .decode("gbk") fixed the issue. I found this charset by trial and error, trying various Chinese charsets.
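To illustrate the symptom, here is a minimal sketch that assumes the device returns the label as GBK bytes padded with NUL characters. The helper `decode_field` and the padding length are hypothetical and not part of msg.py; the helper only mirrors the try/except pattern quoted above.

```python
# Hypothetical helper (not part of msg.py): decode a NUL-padded field,
# falling back to an empty string the same way the quoted code does.
def decode_field(raw: bytes, encoding: str) -> str:
    try:
        return raw.decode(encoding).rstrip("\x00")
    except UnicodeDecodeError:
        return ""


# Assumed payload: the label encoded as GBK, padded with NUL bytes.
raw_user_info = "Salle à manger".encode("gbk") + b"\x00" * 4

as_utf8 = decode_field(raw_user_info, "utf-8")  # GBK bytes for "à" are invalid UTF-8
as_gbk = decode_field(raw_user_info, "gbk")

print(repr(as_utf8))
print(repr(as_gbk))
```

Under that assumption, the UTF-8 path hits the `except` branch and yields an empty string, which matches the behavior described above, while the GBK path recovers the label.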

I don't own an RC-4, so I can't check whether the behavior is the same there.

If this change is made, for consistency, the two calls to .encode("utf8") must be changed to .encode("gbk") as well.

This will provide interoperability with the Rc Logger software provided by Elitech -- which may or may not be a goal of this project :-)

Using UTF-8 both ways works as long as Rc Logger isn't used because the RC-5 merely saves the bytes and sends them back.
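The round trip can be sketched as follows, assuming (as described above) that the device stores the bytes unchanged and returns them as-is. The variable names are illustrative only, and `errors="replace"` is used so the sketch never raises on bytes the second codec cannot map.

```python
label = "Salle à manger"

# Assumption: the device keeps exactly the bytes it was sent.
stored = label.encode("utf-8")

# Reading back with the same codec recovers the label.
same_codec = stored.decode("utf-8")

# Reading back with a different codec (as another tool might) garbles
# the non-ASCII character.
other_codec = stored.decode("gbk", errors="replace")

print(same_codec == label)
print(other_codec == label)
```

The ASCII part of the label survives either way; only the non-ASCII character differs, so the mismatch is easy to miss with plain English labels.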

civic commented

Thank you! I'll check it on both the RC-4 and the RC-5.

civic commented

When I tried it in Logger Data Management Software (V2.0), the charset seemed to be 'MS932', which is my Windows default charset.
I'll add an optional parameter --encode (default: utf8).

elitech-datareader -c set --user_info "データ" --encode MS932 /dev/tty.SLAB_USBtoUAR
elitech-datareader -c devinfo --encode MS932 /dev/tty.SLAB_USBtoUAR
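One way such an option could be wired up is sketched below. This is not the project's actual implementation: `check_encoding`, `build_parser`, and the `encode-sketch` program name are hypothetical, and only the `--encode` flag name and its `utf8` default come from the comment above.

```python
import argparse
import codecs


def check_encoding(name: str) -> str:
    """Validate a codec name and return Python's canonical name for it."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        raise argparse.ArgumentTypeError(f"unknown encoding: {name}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real CLI has more options.
    parser = argparse.ArgumentParser(prog="encode-sketch")
    parser.add_argument("--encode", default="utf8", type=check_encoding)
    return parser


args = build_parser().parse_args(["--encode", "MS932"])

# Use the same codec in both directions so the round trip is consistent.
raw = "データ".encode(args.encode) + b"\x00\x00"
text = raw.decode(args.encode).rstrip("\x00")

print(args.encode, text)
```

Validating the name up front means a typo in the charset fails at argument parsing rather than silently producing an empty field later.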

[Screenshots: 1__tmux (terminal), datalogger_software]

I'm running Rc Logger on a Mac. It looks like the charset is OS-dependent :-/

At this point it may be best to stick to UTF-8 and to document this as a known limitation.

civic commented

Oh, I did not know about the Mac version of Rc Logger!