Skyscanner/whispers

UnicodeDecodeError when parsing some unicode files.


File "/usr/local/lib/python3.9/dist-packages/whispers/utils.py", line 104, in find_line_number
for line_number, line in enumerate(filepath.open().readlines(), 1):
File "/usr/lib/python3.9/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 0: invalid start byte

File "/usr/local/lib/python3.9/dist-packages/whispers/plugins/__init__.py", line 60, in load_plugin
if self.filepath.open("r").readline().startswith("<?xml "):
File "/usr/lib/python3.9/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 21: invalid start byte

Fix:
utils.py:
for line_number, line in enumerate(filepath.open(errors='ignore').readlines(), 1):

whispers/plugins/__init__.py:
if self.filepath.open("r", errors='ignore').readline().startswith("<?xml "):
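For anyone who wants to try the workaround outside of whispers, here is a minimal standalone sketch of the same idea. The helper names `read_lines_tolerant` and `looks_like_xml` are hypothetical and not part of whispers; the sample file is made up for illustration. I also pin `encoding="utf-8"` and use a `with` block so the handle is closed, neither of which is in the two-line fix above.

```python
import tempfile
from pathlib import Path


def read_lines_tolerant(filepath: Path) -> list[str]:
    """Hypothetical helper: read lines, dropping bytes that are not valid UTF-8."""
    # errors="ignore" skips undecodable bytes instead of raising UnicodeDecodeError.
    with filepath.open("r", encoding="utf-8", errors="ignore") as handle:
        return handle.readlines()


def looks_like_xml(filepath: Path) -> bool:
    """Hypothetical helper: check the first line for an XML declaration."""
    with filepath.open("r", encoding="utf-8", errors="ignore") as handle:
        return handle.readline().startswith("<?xml ")


with tempfile.TemporaryDirectory() as tmp:
    # Made-up sample: valid XML header plus one stray 0xbc byte on line 2.
    sample = Path(tmp) / "sample.xml"
    sample.write_bytes(b'<?xml version="1.0"?>\n<key>caf\xbc</key>\n')

    is_xml = looks_like_xml(sample)
    lines = read_lines_tolerant(sample)

    # A strict read of the same file still fails, as in the tracebacks above.
    try:
        sample.read_text(encoding="utf-8")
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True
```

Note that `errors='ignore'` silently drops the offending bytes, so it only avoids the crash. A file in another encoding (for example UTF-16, which a leading 0xfe byte may indicate) would not be decoded correctly this way.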

Hey @kyocooro, thanks for raising this. It is a known bug and is addressed in the v2 release that is coming up shortly. Regards