LUMIA-Group/rasat

Eval process issue


Thanks for reading this issue. When I run "CUDA_VISIBLE_DEVICES="2" python3 seq2seq/eval_run_seq2seq.py configs/cosql/eval_cosql_rasat_576.json" inside the docker container, I get this error:

Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 6.38ba/s]
01/07/2024 08:31:51 - WARNING - stanza - Can not find mwt: default from official model list. Ignoring it.
Traceback (most recent call last):
File "seq2seq/eval_run_seq2seq.py", line 310, in <module>
main()
File "seq2seq/eval_run_seq2seq.py", line 177, in main
tokenizer=tokenizer,
File "/app/seq2seq/utils/dataset_loader.py", line 123, in load_dataset
**_prepare_splits_kwargs,
File "/app/seq2seq/utils/dataset.py", line 360, in prepare_splits
pre_process_function=pre_process_function,
File "/app/seq2seq/utils/dataset.py", line 324, in _prepare_eval_split
use_dependency=data_training_args.use_dependency
File "/app/seq2seq/preprocess/choose_dataset.py", line 12, in preprocess_by_dataset
preprocessing_generate_lgerels(data_base_dir, dataset_name, mode, use_coref, use_dependency)
File "/app/seq2seq/preprocess/process_dataset.py", line 81, in preprocessing_generate_lgerels
processor = Preprocessor(dataset_name, db_dir=db_dir, db_content=True)
File "/app/seq2seq/preprocess/common_utils.py", line 146, in __init__
self.nlp_tokenize = stanza.Pipeline('en', processors='tokenize,mwt,pos,lemma,depparse', tokenize_pretokenized = False, use_gpu=True)#, use_gpu=False)
File "/home/toolkit/.local/lib/python3.7/site-packages/stanza/pipeline/core.py", line 107, in __init__
self.load_list = add_dependencies(resources, lang, self.load_list) if lang in resources else []
File "/home/toolkit/.local/lib/python3.7/site-packages/stanza/resources/common.py", line 245, in add_dependencies
default_dependencies = resources[lang]['default_dependencies']
KeyError: 'default_dependencies'

Thanks for any solution; this is really important for me.
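For anyone hitting the same traceback: the KeyError comes from stanza looking up resources[lang]['default_dependencies'], which suggests the cached resources.json for the English models is missing or incomplete (an assumption from the traceback, not confirmed by the repo). A minimal sketch to inspect the cache before building the Pipeline — the ~/stanza_resources path is stanza's default download location, and the helper name is mine:

```python
import json
import os

def stanza_resources_ok(resources: dict, lang: str = "en") -> bool:
    """Mimic the lookup that fails in stanza's add_dependencies: the Pipeline
    needs resources[lang]['default_dependencies'] to exist."""
    return "default_dependencies" in resources.get(lang, {})

# Stanza caches models under ~/stanza_resources by default; whether the
# docker image uses that path is an assumption.
path = os.path.expanduser("~/stanza_resources/resources.json")

if os.path.exists(path):
    with open(path) as f:
        ok = stanza_resources_ok(json.load(f))
    print("resources usable:", ok)
else:
    print("resources.json missing; try stanza.download('en') before eval")
```

If the check fails, re-downloading the models with stanza.download('en') (a real stanza API) and rerunning the eval script may resolve it, assuming the container has network access.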