shangjingbo1226/AutoNER

Loading dataset error

Rock-L opened this issue · 4 comments

When I train the model (./autoner_train.sh), an error occurs like:
Traceback (most recent call last):
File "train_partial_ner.py", line 66, in
dataset = pickle.load(open(args.eval_dataset, 'rb'))
FileNotFoundError: [Errno 2] No such file or directory: './models/BC5CDR/encoded_data/test.pk'

Where can I find the test.pk?

Also, I find that the directory './models/BC5CDR/encoded_data/' is empty, so train_0.pk is missing as well.

Hmm, that looks odd to me. There is a dataset encoding step in autoner_train.sh. Did you see that step complete successfully?

No, it did not complete because it ran out of memory when pickle loaded "embedding.pk". The file is too big, so the process was automatically killed.
with open(args.pre_word_emb, 'rb') as f:
w_emb = pickle.load(f)
I found where this error happens but have no solution.

I see. If that's the case, one solution is to find a machine with more memory. Another solution is to do one round of word embedding filtering, i.e., remove the embeddings of all words that never appear in the corpus.
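A minimal sketch of such a filtering pass, assuming embedding.pk stores a plain word-to-vector dict (the actual AutoNER format may differ) and that the corpus files are whitespace-tokenized text; the file paths and the helper name here are hypothetical, not from the AutoNER code base:

```python
import pickle

def filter_embeddings(emb_path, corpus_paths, out_path):
    """Keep only the embeddings of words that actually appear in the corpus.

    Hypothetical helper: assumes emb_path is a pickled dict mapping
    word -> vector, and corpus_paths are plain text files.
    """
    # Collect the corpus vocabulary (whitespace tokenization assumed).
    vocab = set()
    for path in corpus_paths:
        with open(path, encoding='utf-8') as f:
            for line in f:
                vocab.update(line.split())

    # Load the full pre-trained embedding table once.
    with open(emb_path, 'rb') as f:
        w_emb = pickle.load(f)

    # Drop every word that never occurs in the corpus.
    filtered = {w: v for w, v in w_emb.items() if w in vocab}

    with open(out_path, 'wb') as f:
        pickle.dump(filtered, f)
    return len(w_emb), len(filtered)
```

Note that this filtering step itself still has to load embedding.pk once, so it should be run on a machine (or swap configuration) that can hold the full table in memory one time; afterwards, training only needs the much smaller filtered pickle.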

Yes, we try to keep all pre-trained word embeddings (to ensure the resulting model is as powerful as possible), but this is not necessary for training.