Patches cannot be decoded using utf-8
Closed this issue · 2 comments
Thomsch commented
I'm not able to read patches for Jsoup 52, Compress 7, and Lang 25 using utf-8. I get the following error:
File "/home/tschweiz/.pyenv/versions/3.8.15/lib/python3.8/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 13981: invalid continuation byte
This only happens with these three patches; no other bug patch has this issue. Is this intended behavior because these patches contain special characters, or is the encoding incorrect?
Replication
- Install the Python package [unidiff](https://pypi.org/project/unidiff/)
- Read patch with unidiff:
PatchSet.from_filename("path/to/patch")
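For anyone hitting the same error, here is a minimal sketch of a fallback read using only the standard library. Byte 0xe4 is "ä" in ISO-8859-1 (Latin-1), so the sketch assumes the affected patches are Latin-1 encoded; that is a guess, not something confirmed in this thread. The helper name `read_patch_text` and the list of encodings to try are hypothetical.

```python
def read_patch_text(path, encodings=("utf-8", "latin-1")):
    """Decode a patch file with the first encoding that works.

    Hypothetical helper: the encoding list is an assumption, not
    something stated by the dataset maintainers.
    Returns a (text, encoding) tuple.
    """
    with open(path, "rb") as fh:
        raw = fh.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # Wrong guess for this file; try the next encoding.
            continue
    raise ValueError(f"none of {encodings} could decode {path}")
```

If the fallback encoding turns out to be right, `PatchSet.from_filename` also takes an `encoding` argument, so the same value can probably be passed there directly instead of decoding by hand.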
jose commented