text encoding error
batrlatom opened this issue · 2 comments
batrlatom commented
Hi,
I am getting this error:
Traceback (most recent call last):
  File "train/train_mlm.py", line 113, in <module>
    main(parser.parse_args())
  File "train/train_mlm.py", line 69, in main
    data_module.setup()
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/core/datamodule.py", line 428, in wrapped_fn
    fn(*args, **kwargs)
  File "/opt/perceiver-io/data/imdb.py", line 131, in setup
    self.ds_train = IMDBDataset(root=self.root, split='train')
  File "/opt/perceiver-io/data/imdb.py", line 42, in __init__
    self.raw_x, self.raw_y = load_split(root, split)
  File "/opt/perceiver-io/data/imdb.py", line 34, in load_split
    raw_x.append(f.read())
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 449: ordinal not in range(128)
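It is probably related to Unicode encoding: judging by the traceback, the review files are being decoded with the default ASCII codec, which fails on the first non-ASCII byte.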
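For anyone hitting the same traceback, here is a minimal sketch of one likely workaround: passing an explicit encoding to `open()` instead of relying on the locale default. It assumes the IMDB review files are UTF-8 encoded. The helper name `read_text` and the sample file are hypothetical, used only for illustration. They are not part of perceiver-io, and this is not necessarily the change that was made in the repository.

```python
import tempfile
from pathlib import Path


def read_text(path, encoding="utf-8"):
    # Hypothetical helper, not part of perceiver-io.
    # An explicit encoding avoids depending on the locale default,
    # which can be ASCII (e.g. in a container with the POSIX/C locale).
    with open(path, encoding=encoding) as f:
        return f.read()


# Demonstration on a throwaway file that contains non-ASCII characters.
# U+00A0 (no-break space) is encoded in UTF-8 as the bytes 0xc2 0xa0,
# so the file contains the same 0xc2 byte as in the traceback.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "review.txt"
    sample.write_bytes("caf\u00e9 \u00a0great".encode("utf-8"))

    # Decoding as ASCII reproduces the UnicodeDecodeError.
    try:
        read_text(sample, encoding="ascii")
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

    # Decoding as UTF-8 reads the file correctly.
    text = read_text(sample)
```

On Python 3.6, `open()` without an `encoding` argument uses the locale's preferred encoding, so setting a UTF-8 locale in the environment (for example `LC_ALL=C.UTF-8`, if that locale is available on the system) may also avoid the error without any code change.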
krasserm commented
@batrlatom thanks for reporting, this should be fixed now. Please re-open this ticket if the problem persists.
batrlatom commented
It works now. Thanks.