argdict
riosempre opened this issue · 1 comments
I get an error while preprocessing the data with the script wmt14en-de.sh. Upon investigation, the error appears to be caused by the srcdict argument not being passed to the python command.
Traceback (most recent call last):
  File "preprocess.py", line 359, in <module>
    cli_main()
  File "preprocess.py", line 355, in cli_main
    main(args)
  File "preprocess.py", line 64, in main
    raise FileExistsError(dict_path(args.source_lang))
FileExistsError: ../data-bin/wmt14_en_de_joined_dict/dict.en.txt
The check in preprocess.py that raises it:
if not args.srcdict and os.path.exists(dict_path(args.source_lang)):
    raise FileExistsError(dict_path(args.source_lang))
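For reference, the guard quoted above can be reproduced in isolation. This is a minimal sketch, not the actual fairseq code: `check_source_dict` is a hypothetical helper, and the `dict.<lang>.txt` naming under the destination directory is only inferred from the path in the traceback. It suggests that the error fires only when no srcdict is given and a dictionary file from an earlier run is already present.

```python
import os
import tempfile


def dict_path(destdir, lang):
    # Assumption: dictionary location inferred from the traceback path
    # (<destdir>/dict.<lang>.txt); the real helper may differ.
    return os.path.join(destdir, "dict.{}.txt".format(lang))


def check_source_dict(destdir, source_lang, srcdict=None):
    # Mirrors the quoted check: with no explicit source dictionary,
    # an existing dictionary file in destdir is treated as an error.
    path = dict_path(destdir, source_lang)
    if not srcdict and os.path.exists(path):
        raise FileExistsError(path)
    return path


def demo():
    with tempfile.TemporaryDirectory() as destdir:
        # First run: destdir is empty, so the check passes.
        path = check_source_dict(destdir, "en")
        # Simulate the dictionary file left behind by an earlier run.
        with open(path, "w") as f:
            f.write("hello 1\n")
        # Second run without srcdict now fails.
        try:
            check_source_dict(destdir, "en")
            second_run_failed = False
        except FileExistsError:
            second_run_failed = True
        # Passing srcdict explicitly skips the check.
        check_source_dict(destdir, "en", srcdict=path)
        return second_run_failed
```

Calling `demo()` walks through a first run, a failing rerun, and a rerun with an explicit source dictionary.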
Where is srcdict? Is it necessary? Should I change the code for it to work?
EDIT: I changed the code from:
python preprocess.py --source-lang en --target-lang de \
    --trainpref $prep/train --validpref $prep/valid --testpref $prep/test \
    --destdir ../data-bin/wmt14_en_de_joined_dict \
    --joined-dictionary
to
fairseq-preprocess --source-lang en --target-lang de \
    --trainpref $prep/train --validpref $prep/valid --testpref $prep/test \
    --destdir ../data-bin/wmt14_en_de_joined_dict \
    --srcdict ../data-bin/wmt14_en_de_joined_dict/dict.en.txt --tgtdict ../data-bin/wmt14_en_de_joined_dict/dict.de.txt
Was that right?
Thanks for reaching out. The error indicates that the file ../data-bin/wmt14_en_de_joined_dict/dict.en.txt already exists.
You can fix this by: 1) deleting the folder ../data-bin/wmt14_en_de_joined_dict, and 2) ensuring that the folder ../data-bin exists.
BTW, the preprocess bash script is designed to be run from the pre-process folder.
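The two steps above can be sketched in Python. This is only an illustration under stated assumptions: `reset_destdir` is a hypothetical helper, not part of fairseq, and it deletes the whole destination folder, so any previously binarized data in it is lost.

```python
import os
import shutil


def reset_destdir(destdir):
    # Step 1: remove the stale output folder, including old dict.*.txt files.
    # ignore_errors=True makes this a no-op if the folder does not exist.
    shutil.rmtree(destdir, ignore_errors=True)
    # Step 2: make sure the parent folder (e.g. ../data-bin) exists.
    parent = os.path.dirname(os.path.normpath(destdir))
    if parent:
        os.makedirs(parent, exist_ok=True)
    return parent
```

For the paths in this thread, that would be `reset_destdir("../data-bin/wmt14_en_de_joined_dict")`, run from the pre-process folder before rerunning the script.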