User info appears to be encoded with GBK, not UTF-8, on RC5
aaugustin opened this issue · 5 comments
msg.py contains this code:
    try:
        self.user_info = user_info.decode("utf-8").rstrip("\x00")
    except UnicodeDecodeError as e:
        self.user_info = ""
    try:
        self.dev_num = dev_num.decode("utf-8").rstrip('\0x00')
    except UnicodeDecodeError as e:
        self.dev_num = ""
user_info was always coming out empty for my RC-5, while it was set to "Salle à manger" on the device.
Changing .decode("utf-8") to .decode("gbk") fixed the issue. I found this charset by trial and error, trying various Chinese charsets.
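For anyone who wants to reproduce this without a device, here is a minimal sketch of what I believe is happening. The 32-byte field width and the helper names pad and decode_field are assumptions made up for the illustration; they are not taken from msg.py.

```python
# Hypothetical reproduction of the symptom, no device required.
# FIELD_WIDTH, pad() and decode_field() are illustrative assumptions,
# not names or values from msg.py.
FIELD_WIDTH = 32


def pad(raw: bytes, width: int = FIELD_WIDTH) -> bytes:
    """Right-pad a field with NUL bytes, as a fixed-width field would be."""
    return raw.ljust(width, b"\x00")


def decode_field(raw: bytes, encoding: str) -> str:
    """Mirror the try/except pattern quoted above for one field."""
    try:
        return raw.decode(encoding).rstrip("\x00")
    except UnicodeDecodeError:
        return ""


# Bytes as a GBK-based writer would store them.
stored = pad("Salle à manger".encode("gbk"))

as_utf8 = decode_field(stored, "utf-8")  # decoding fails, falls back to ""
as_gbk = decode_field(stored, "gbk")     # decodes cleanly

print(repr(as_utf8))
print(repr(as_gbk))
```

The GBK encoding of "à" is a two-byte sequence that is not valid UTF-8, so the UTF-8 decode raises UnicodeDecodeError and the except branch silently returns an empty string.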
I don't own an RC-4, so I can't check whether the behavior is the same there.
If this change is made, for consistency, the two calls to .encode("utf8") must be changed to .encode("gbk") as well.
This will provide interoperability with the Rc Logger software provided by Elitech -- which may or may not be a goal of this project :-)
Using UTF-8 both ways works as long as Rc Logger isn't used because the RC-5 merely saves the bytes and sends them back.
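A small sketch of that last point, assuming (as described above) that the device hands back exactly the bytes it was given. The variable names are illustrative only.

```python
# Sketch of the round trip, assuming the device stores and returns the
# bytes unchanged. All names here are illustrative.
label = "Salle à manger"

written = label.encode("utf-8")  # what a UTF-8 based tool would send
echoed = written                 # the device returns the same bytes

# The same tool reading the value back sees the original text.
round_trip = echoed.decode("utf-8")

# A reader that assumes GBK interprets the same bytes differently.
# errors="replace" keeps the sketch from raising on invalid sequences.
seen_by_gbk_reader = echoed.decode("gbk", errors="replace")

print(round_trip == label)
print(seen_by_gbk_reader == label)
```

So the mismatch only shows up when the writer and the reader disagree on the charset, which is the Rc Logger interoperability case.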
Thank you! I'll check it on both the RC-4 and the RC-5.
When I tried it in Logger Data Management Software (V2.0), the charset seemed to be 'MS932'. This is my Windows default charset.
I'll add an optional parameter --encode (default: utf8).
elitech-datareader -c set --user_info "データ" --encode MS932 /dev/tty.SLAB_USBtoUAR
elitech-datareader -c devinfo --encode MS932 /dev/tty.SLAB_USBtoUAR
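Roughly, the option could be wired up like this. This is only a sketch and not the actual elitech-datareader code: build_parser, decode_field and the prog name are hypothetical, and only the --encode flag and its utf8 default come from the proposal above.

```python
import argparse
import codecs


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only --encode and its default are from the thread."""
    parser = argparse.ArgumentParser(prog="encode-sketch")
    parser.add_argument(
        "--encode",
        default="utf8",
        help="charset used for text fields such as user_info (default: utf8)",
    )
    return parser


def decode_field(raw: bytes, encoding: str) -> str:
    """Decode a NUL-padded field, falling back to "" on decode errors."""
    try:
        return raw.decode(encoding).rstrip("\x00")
    except UnicodeDecodeError:
        return ""


args = build_parser().parse_args(["--encode", "MS932"])

# Fail early on an unknown charset name (raises LookupError).
codecs.lookup(args.encode)

# Simulated field content; the 4 bytes of NUL padding are an assumption.
raw = "データ".encode(args.encode) + b"\x00" * 4
text = decode_field(raw, args.encode)
print(text)
```

Validating the name with codecs.lookup up front means a typo in --encode is reported immediately instead of showing up later as an empty user_info.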
I'm running Rc Logger on a Mac. It looks like the charset is OS-dependent :-/
At this point it may be best to stick to UTF-8 and to document this as a known limitation.
Oh, I did not know there was a Mac version of Rc Logger!