_pickle.UnpicklingError: pickle data was truncated on bio_embedding.pk
SeekPoint opened this issue · 3 comments
mldl@ub1604:~/ub16_prj/AutoNER$ md5sum models/BC5CDR/bio_embedding.pk
dd549629b7ea9cf97d7df62cd16c0e9f models/BC5CDR/bio_embedding.pk
mldl@ub1604:~/ub16_prj/AutoNER$ python3.6 preprocess_partial_ner/encode_folder.py --input_train models/BC5CDR/annotations.ck --input_testa data/BC5CDR/truth_dev.ck --input_testb data/BC5CDR/truth_test.ck --pre_word_emb models/BC5CDR/embedding.pk --output_folder models/BC5CDR/encoded_data
args.pre_word_emb is models/BC5CDR/embedding.pk
Traceback (most recent call last):
  File "preprocess_partial_ner/encode_folder.py", line 263, in <module>
    w_emb = pickle.load(f)
_pickle.UnpicklingError: pickle data was truncated
mldl@ub1604:~/ub16_prj/AutoNER$
It seems to me that the filename is wrong: it should be models/BC5CDR/embedding.pk instead of models/BC5CDR/bio_embedding.pk.
Note that the output printed just before the traceback shows that args.pre_word_emb is models/BC5CDR/embedding.pk.
The MD5 sum looks good to me.
~/AutoNER$ md5sum models/BC5CDR/embedding.pk
dd549629b7ea9cf97d7df62cd16c0e9f models/BC5CDR/embedding.pk
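If it helps to run the same check from Python, here is a minimal sketch, not part of AutoNER: it computes the MD5 of a file and then tries to unpickle it, reporting a truncation-style error instead of crashing. The helper names `md5_of_file` and `try_load_pickle` are hypothetical and were made up for this illustration.

```python
import hashlib
import pickle


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def try_load_pickle(path):
    """Try to unpickle `path`; return (obj, None) on success or (None, message)."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f), None
    except (pickle.UnpicklingError, EOFError) as exc:
        # An incomplete (partially downloaded or partially written) file
        # usually fails with one of these two exceptions.
        return None, f"{type(exc).__name__}: {exc}"
```

For example, `md5_of_file("models/BC5CDR/embedding.pk")` can be compared against the md5sum value above, and `try_load_pickle` on the same path shows whether the file itself is readable. Note that `pickle.load` still reads the whole object into memory, and pickle files should only be loaded from a trusted source.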
Sure, I made a stupid mistake.
mldl@ub1604:~/ub16_prj/AutoNER$ python3.6 preprocess_partial_ner/encode_folder.py --input_train models/BC5CDR/annotations.ck --input_testa data/BC5CDR/truth_dev.ck --input_testb data/BC5CDR/truth_test.ck --pre_word_emb models/BC5CDR/embedding.pk --output_folder models/BC5CDR/encoded_data
args.pre_word_emb is models/BC5CDR/embedding.pk
Killed
Looks like it is still running out of memory (OOM).