taolei87/rcnn

UnicodeDecodeError

Mahhos opened this issue · 0 comments

Hi @taolei87. I am running your sentiment classification code with the dataset specified in the description, but I am getting the error below. I first ran the code on my own train/dev/test sets and got the error; then I tried your dataset and got the same error. Do you have any idea what is wrong? I have not changed anything in the code.

Traceback (most recent call last):
  File "main.py", line 539, in <module>
    main(args)
  File "main.py", line 394, in main
    embs = load_embedding_iterator(args.embedding)
  File "\rcnn-master\rcnn-master\code1\nn\basic.py", line 220, in __init__
    for word, vector in embs:
  File "\rcnn-master\rcnn-master\code1\utils\__init__.py", line 14, in load_embedding_iterator
    for line in fin:
  File "\anaconda3\envs\py3\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2776: character maps to <undefined>

The command that I ran:
python main.py --embedding glove.6B.100d.txt --train stsa.binary.phrases.train --dev stsa.binary.dev --test stsa.binary.test --save output_model
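
A note on what the traceback suggests: the failing frame is in `cp1252.py`, so the embedding file appears to be read with the Windows default encoding, and byte 0x9d has no mapping in cp1252. That byte occurs in the UTF-8 encoding of characters such as the right double quotation mark (U+201D), so one plausible cause is a UTF-8 file being decoded as cp1252. The sketch below reproduces the error on a tiny sample file and shows that passing an explicit encoding avoids it. `iter_embeddings` is a hypothetical stand-in written for this illustration, not the repository's `load_embedding_iterator`, and the "word v1 v2 ..." line layout is an assumption.

```python
import io
import os
import tempfile


def iter_embeddings(path, encoding="utf-8"):
    """Yield (word, vector) pairs from a text file of 'word v1 v2 ...' lines.

    Hypothetical helper for illustration only; the encoding is passed
    explicitly instead of relying on the platform default.
    """
    with io.open(path, "r", encoding=encoding) as fin:
        for line in fin:
            parts = line.rstrip("\r\n").split(" ")
            if len(parts) > 1:
                yield parts[0], [float(x) for x in parts[1:]]


# Build a one-line sample whose word is U+201D; its UTF-8 bytes are E2 80 9D.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(u"\u201d 0.1 0.2\n".encode("utf-8"))

try:
    # Decoding as UTF-8 succeeds.
    print(list(iter_embeddings(demo_path)))

    # Decoding as cp1252 fails on byte 0x9d, as in the traceback above.
    try:
        list(iter_embeddings(demo_path, encoding="cp1252"))
    except UnicodeDecodeError as err:
        print("cp1252 failed:", err)
finally:
    os.remove(demo_path)
```

If this is indeed the cause, opening the embedding file with `encoding="utf-8"` where it is read in `utils/__init__.py` would be the corresponding change, though I have not verified that against the repository code.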