bloomsburyai/question-generation

error when running eval.py

Closed this issue · 6 comments

Used the recommended commit (be13417) and tried to run python eval.py --data_path ./data/
Got the following error:
sh-4.2$ python eval.py --data_path ./data/
WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Traceback (most recent call last):
File "eval.py", line 265, in
tf.app.run()
File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "eval.py", line 41, in main
dev_data = loader.load_squad_triples(FLAGS.data_path, dev=FLAGS.eval_on_dev, test=FLAGS.eval_on_test)
File "/home/ec2-user/SageMaker/question-generation-be134175652204f3bf51cb194454d7b72c8b8105/src/helpers/loader.py", line 32, in load_squad_triples
raw_data = load_squad_dataset(path, dev=dev, test=test, v2=v2)
File "/home/ec2-user/SageMaker/question-generation-be134175652204f3bf51cb194454d7b72c8b8105/src/helpers/loader.py", line 23, in load_squad_dataset
with open(path+filename) as dataset_file:
FileNotFoundError: [Errno 2] No such file or directory: './data/dev-v1.1.json'
sh-4.2$ python eval.py --data_path ../data/
WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
ERROR Eval dataset is smaller than the num_eval_samples flag!
sh-4.2$

How can I run the demo without using the Flask app?
Thank you

FileNotFoundError: [Errno 2] No such file or directory: './data/dev-v1.1.json'

This is the important part. Make sure you have a dataset in the data directory. Either use the official SQuAD split, or this repo containing the split from Du et al. (2017). Then use --noeval_on_dev --eval_on_test to evaluate on the test set.
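As a rough sketch (the download URL below is the usual location of the official SQuAD v1.1 dev set and is an assumption on my part; the file name the loader looks for, dev-v1.1.json, is taken from the traceback above):

mkdir -p ./data
# official SQuAD v1.1 dev split
wget -P ./data https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json
# evaluate on the dev split
python eval.py --data_path ./data/
# or, with the Du et al. test split in place, evaluate on the test set
python eval.py --data_path ./data/ --noeval_on_dev --eval_on_test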

The demo is built in Flask, so you need Flask to run the full demo.

Sorry, I copied two errors instead of only the last one. The data error was resolved: I used the dev-v1.1.json file from the repo at commit be13417, but then got the following error: ERROR Eval dataset is smaller than the num_eval_samples flag!

The files from that repo are based on my own split - they contain 4691 samples for dev and 5609 for test, so you'll need to set the --num_dev_samples or --num_eval_samples flag accordingly.
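For example, something along these lines should work (the flag names are the ones mentioned above; I'm assuming they simply cap the number of samples read from each split):

# dev split from the linked repo (4691 samples)
python eval.py --data_path ./data/ --num_dev_samples 4691
# test split from the linked repo (5609 samples)
python eval.py --data_path ./data/ --noeval_on_dev --eval_on_test --num_eval_samples 5609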

Looks like that repo is missing some files (the folder models/saved... doesn't exist):
File "/home/ec2-user/SageMaker/question-generation-be134175652204f3bf51cb194454d7b72c8b8105/src/langmodel/lm.py", line 101,in load_from_chkpt
with open(path+'/vocab.json') as f:
FileNotFoundError: [Errno 2] No such file or directory: '../models/saved/lmtest/vocab.json'
Exception ignored in: <bound method LstmLmInstance.__del__ of <langmodel.lm.LstmLmInstance object at 0x7f9d8b8cb550>>
Traceback (most recent call last):
File "/home/ec2-user/SageMaker/question-generation-be134175652204f3bf51cb194454d7b72c8b8105/src/langmodel/lm.py", line 98, in del
self.sess.close()
AttributeError: 'LstmLmInstance' object has no attribute 'sess'
Exception ignored in: <bound method QANetInstance.__del__ of <qa.qanet.instance.QANetInstance object at 0x7f9d8b8cb518>>
Traceback (most recent call last):
File "/home/ec2-user/SageMaker/question-generation-be134175652204f3bf51cb194454d7b72c8b8105/src/qa/qanet/instance.py", line53, in del
self.sess.close()
AttributeError: 'QANetInstance' object has no attribute 'sess'

That repo contains a complete dataset, but eval.py uses a language model and a QA model to score the generated questions, which you're missing (and which I haven't released - I forgot they were required). The demo doesn't need these.

Could you upload the missing files? I'm trying to evaluate by running eval.py and found that I need these files.
Thanks!