The value for "model_path" in the "Generate Regional Answers" step
Hi, thanks for sharing the code for your great work.
I am trying to reproduce the results following inference.txt.
In the "Generate Regional Answers" step, what should the "model_path" argument be?
The example command line is:

```shell
python -m triviaqa \
    --train_dataset path/to/output/squad-wikipedia-dev-chunk-8000.json \  # loaded but not used
    --dev_dataset path/to/output/squad-wikipedia-dev-chunk-8000.json \
    --gpus 0 --num_workers 4 \
    --max_seq_len 4096 --doc_stride -1 \
    --save_prefix 'triviaqa-longformer-large' \
    --model_path path/to/pretrained/longformer-large-4096 \  # loaded but not used
    --resume_ckpt path/to/pretrained/triviaqa-longformer-large-4096 \
    --prediction_file 'regional.answer.json' \
    --test
```
I first tried setting model_path = 'allenai/longformer-large-4096-finetuned-triviaqa', the identifier of the pretrained Longformer model on the Hugging Face model hub, and it raised an AttributeError:
```
Traceback (most recent call last):
  File "~/anaconda3/envs/ror/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "~/anaconda3/envs/ror/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "~/RoR/triviaqa.py", line 817, in <module>
    main(args)
  File "~/RoR/triviaqa.py", line 768, in main
    model = TriviaQA(args)
  File "~/RoR/triviaqa.py", line 293, in __init__
    self.model = self.load_model()
  File "~/RoR/triviaqa.py", line 303, in load_model
    model = Longformer.from_pretrained(self.args.model_path)
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/modeling_utils.py", line 655, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "~/RoR/longformer/longformer.py", line 19, in __init__
    layer.attention.self = LongformerSelfAttention(config, layer_id=i)
  File "~/RoR/longformer/longformer.py", line 78, in __init__
    self.attention_dilation = config.attention_dilation[self.layer_id]
AttributeError: 'RobertaConfig' object has no attribute 'attention_dilation'
```
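The failure mode here is that the repo's custom LongformerSelfAttention reads per-layer attributes (attention_dilation, attention_window) from the config, attributes that the hub model's RobertaConfig does not define. A minimal sketch of that mismatch, using hypothetical SimpleNamespace stand-ins rather than the real transformers config classes:

```python
from types import SimpleNamespace

# Hypothetical stand-ins: a hub-style config without the custom attribute,
# and a repo-style config that carries one dilation value per layer.
hub_config = SimpleNamespace(hidden_size=1024, num_attention_heads=16)
repo_config = SimpleNamespace(hidden_size=1024, num_attention_heads=16,
                              attention_dilation=[1] * 24)

def read_dilation(config, layer_id):
    # Mirrors the failing line in longformer/longformer.py:
    #   self.attention_dilation = config.attention_dilation[self.layer_id]
    return config.attention_dilation[layer_id]

print(read_dilation(repo_config, 0))   # works: 1
try:
    read_dilation(hub_config, 0)       # raises AttributeError, as in the traceback
except AttributeError as e:
    print("AttributeError:", e)
```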
Then I set model_path = 'models/triviaqa-longformer-large/checkpoints/epoch_4_v2.ckpt', pointing at the checkpoint you provide, and it raised an OSError:
```
Traceback (most recent call last):
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/configuration_utils.py", line 243, in get_config_dict
    raise EnvironmentError
OSError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "~/anaconda3/envs/ror/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "~/anaconda3/envs/ror/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "~/RoR/triviaqa.py", line 817, in <module>
    main(args)
  File "~/RoR/triviaqa.py", line 768, in main
    model = TriviaQA(args)
  File "~/RoR/triviaqa.py", line 293, in __init__
    self.model = self.load_model()
  File "~/RoR/triviaqa.py", line 303, in load_model
    model = Longformer.from_pretrained(self.args.model_path)
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/modeling_utils.py", line 587, in from_pretrained
    **kwargs,
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/configuration_utils.py", line 201, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/configuration_utils.py", line 252, in get_config_dict
    raise EnvironmentError(msg)
OSError: Can't load config for 'models/triviaqa-longformer-large/checkpoints/epoch_4_v2.ckpt'. Make sure that:

- 'models/triviaqa-longformer-large/checkpoints/epoch_4_v2.ckpt' is a correct model identifier listed on 'https://huggingface.co/models'

- or 'models/triviaqa-longformer-large/checkpoints/epoch_4_v2.ckpt' is the correct path to a directory containing a config.json file
Hi yu,
model_path should point to the location of the pre-trained longformer-large model (both the weights and the configuration). The error "AttributeError: 'RobertaConfig' object has no attribute 'attention_dilation'" likely means that the Longformer model from the Hugging Face model hub is not compatible with our code.
I suggest downloading the Longformer model from
https://ai2-s2-research.s3-us-west-2.amazonaws.com/longformer/longformer-large-4096.tar.gz.
Hi Zhao,
Thank you for your quick reply. It works now.