lajanugen/zeshel

KeyError when running data processing script

peluz opened this issue · 0 comments

Greetings!

When running the scripts/create_training_data.sh script, I get the following KeyErrors:

Traceback (most recent call last):
  File "create_training_data.py", line 430, in <module>
    tf.app.run()
  File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "create_training_data.py", line 410, in main
    rng, is_training=FLAGS.is_training)
  File "create_training_data.py", line 210, in create_training_instances
    vocab_words, rng, is_training=is_training)
  File "create_training_data.py", line 295, in create_instances_from_document
    context_document = all_documents[context_document_id]['text']
KeyError: 'A1DB81433C6D9C29'

Traceback (most recent call last):
  File "create_training_data.py", line 430, in <module>
    tf.app.run()
  File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "create_training_data.py", line 410, in main
    rng, is_training=FLAGS.is_training)
  File "create_training_data.py", line 210, in create_training_instances
    vocab_words, rng, is_training=is_training)
  File "create_training_data.py", line 295, in create_instances_from_document
    context_document = all_documents[context_document_id]['text']
KeyError: '1F92A2A6993DB564'

Traceback (most recent call last):
  File "create_training_data.py", line 430, in <module>
    tf.app.run()
  File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "create_training_data.py", line 410, in main
    rng, is_training=FLAGS.is_training)
  File "create_training_data.py", line 210, in create_training_instances
    vocab_words, rng, is_training=is_training)
  File "create_training_data.py", line 295, in create_instances_from_document
    context_document = all_documents[context_document_id]['text']
KeyError: '804387C91894D89D'

As a result, I cannot prepare the training data; only the validation and test data can be processed.
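One way to narrow this down might be to list which context document IDs referenced by the training mentions have no entry in the loaded documents. The snippet below is only a minimal sketch of that check, not code from the repository: the helper name `find_missing_document_ids` and the sample data are hypothetical, and it assumes each mention is a dict with a `context_document_id` field and that `all_documents` is a dict keyed by document ID, as the failing line in the traceback suggests.

```python
def find_missing_document_ids(mentions, all_documents):
    """Return the sorted context_document_id values that have no entry in all_documents.

    Hypothetical helper: assumes each mention is a dict carrying a
    'context_document_id' key and all_documents maps document IDs to records.
    """
    missing = set()
    for mention in mentions:
        doc_id = mention["context_document_id"]
        if doc_id not in all_documents:
            missing.add(doc_id)
    return sorted(missing)


# Made-up sample data for illustration; only the second ID is taken from the traceback.
all_documents = {"0000000000000001": {"text": "example document text"}}
mentions = [
    {"context_document_id": "0000000000000001"},
    {"context_document_id": "A1DB81433C6D9C29"},
]

missing_ids = find_missing_document_ids(mentions, all_documents)
print(missing_ids)
```

If the returned list is non-empty for the real training mentions, the documents passed to the script probably do not cover every domain that the training mentions refer to.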

Any idea about why this is happening and how to correct it? Thanks in advance.