How can I get "vocab.txt"?
danlutan opened this issue · 1 comments
How can I get "vocab.txt"? When I run the code, "make_dictionary()" creates a "vocab.txt", but the file contains only two words, "begin" and "end". As a result, the "map" call in "parse_one_file()" fails with "KeyError: '@entity133'".
Traceback (most recent call last):
  File "run.py", line 42, in <module>
    train.main(save_path, params)
  File "/home/root1/tld/ga-reader-master/ga-reader-master/train.py", line 27, in main
    data = dp.preprocess(dataset, no_training_set=False, use_chars=use_chars)
  File "/home/root1/tld/ga-reader-master/ga-reader-master/utils/DataPreprocessor.py", line 43, in preprocess
    training = self.parse_all_files(question_dir + "/training", dictionary, use_chars)
  File "/home/root1/tld/ga-reader-master/ga-reader-master/utils/DataPreprocessor.py", line 162, in parse_all_files
    questions =[ self.parse_one_file(f, dictionary, use_chars) + (f,) for f in all_files]
  File "/home/root1/tld/ga-reader-master/ga-reader-master/utils/DataPreprocessor.py", line 142, in parse_one_file
    qry_words = map(lambda w:w_dict[w], qry_raw)
  File "/home/root1/tld/ga-reader-master/ga-reader-master/utils/DataPreprocessor.py", line 142, in <lambda>
    qry_words = map(lambda w:w_dict[w], qry_raw)
KeyError: '@entity133'
make_dictionary() should add all the types in the dataset to the vocabulary. It is strange that you only get "begin" and "end" in it. Maybe you can try deleting vocab.txt and rerunning, so that it creates a new vocabulary file?
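As a rough sketch of that first step, something like the following could detect a suspiciously small vocabulary file and remove it so that the next run regenerates it. The helper name `remove_if_stale` and the `min_entries` threshold are hypothetical and not part of the repository. It assumes vocab.txt stores one entry per line, and it is written for Python 3.

```python
import os


def remove_if_stale(vocab_path, min_entries=3):
    """Delete vocab_path if it has fewer than min_entries non-empty lines.

    Hypothetical helper, not part of the repository.
    Returns True if the file was removed, False otherwise.
    """
    if not os.path.exists(vocab_path):
        return False
    # Assumption: one vocabulary entry per non-empty line.
    with open(vocab_path, encoding="utf-8") as f:
        n_entries = sum(1 for line in f if line.strip())
    if n_entries < min_entries:
        os.remove(vocab_path)
        return True
    return False
```

Calling `remove_if_stale("vocab.txt")` from the directory that holds the file, before rerunning, would then force a rebuild whenever only a couple of entries are present.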
If that doesn't work, I would try putting a breakpoint in make_dictionary() and checking whether the question files are being read properly.
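If a debugger is inconvenient, a standalone check along these lines might help narrow it down. This is only a sketch under assumptions: the function names are made up, the `*.question` file pattern is a guess at how the dataset is laid out, tokens are taken to be whitespace-separated, and vocab.txt is assumed to hold one entry per line with the token in the first field. It does not reproduce what make_dictionary() actually does.

```python
import glob
import os


def collect_tokens(question_dir, pattern="*.question"):
    """Return (files, tokens) found under question_dir.

    Assumption: question files match `pattern` and are whitespace-tokenised.
    """
    files = sorted(glob.glob(os.path.join(question_dir, pattern)))
    tokens = set()
    for path in files:
        with open(path, encoding="utf-8") as f:
            tokens.update(f.read().split())
    return files, tokens


def load_vocab(vocab_path):
    """Read the first whitespace-separated field of each non-empty line."""
    with open(vocab_path, encoding="utf-8") as f:
        return {line.split()[0] for line in f if line.strip()}


def missing_tokens(question_dir, vocab_path):
    """Report how many files were read and which tokens the vocab lacks."""
    files, tokens = collect_tokens(question_dir)
    vocab = load_vocab(vocab_path)
    print("files read:", len(files))
    print("distinct tokens:", len(tokens))
    print("vocab entries:", len(vocab))
    return tokens - vocab
```

If "files read" comes back as 0, the directory or file pattern is probably wrong, which would explain a nearly empty vocabulary. If files are found but the returned set still contains tokens such as @entity133, the vocabulary file on disk is likely left over from an earlier run.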