JD-AI-Research-NLP/RoR

The value for "model_path" in the "Generate Regional Answers" step


Hi, thanks for sharing the code for your great work.

I am trying to reproduce the results by following inference.txt.
In the "Generate Regional Answers" step, what should the "model_path" argument be?

The example command line is:

# note: --train_dataset and --model_path are annotated "loaded but not used"
python -m triviaqa \
    --train_dataset path/to/output/squad-wikipedia-dev-chunk-8000.json \
    --dev_dataset path/to/output/squad-wikipedia-dev-chunk-8000.json \
    --gpus 0 --num_workers 4 \
    --max_seq_len 4096 --doc_stride -1 \
    --save_prefix 'triviaqa-longformer-large' \
    --model_path path/to/pretrained/longformer-large-4096 \
    --resume_ckpt path/to/pretrained/triviaqa-longformer-large-4096 \
    --prediction_file 'regional.answer.json' \
    --test

I first tried setting model_path = 'allenai/longformer-large-4096-finetuned-triviaqa', the name of the pretrained Longformer model on the Hugging Face model hub, and it raised an AttributeError:

Traceback (most recent call last):
  File "~/anaconda3/envs/ror/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "~/anaconda3/envs/ror/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "~/RoR/triviaqa.py", line 817, in <module>
    main(args)
  File "~/RoR/triviaqa.py", line 768, in main
    model = TriviaQA(args)
  File "~/RoR/triviaqa.py", line 293, in __init__
    self.model = self.load_model()
  File "~/RoR/triviaqa.py", line 303, in load_model
    model = Longformer.from_pretrained(self.args.model_path)
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/modeling_utils.py", line 655, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "~/RoR/longformer/longformer.py", line 19, in __init__
    layer.attention.self = LongformerSelfAttention(config, layer_id=i)
  File "~/RoR/longformer/longformer.py", line 78, in __init__
    self.attention_dilation = config.attention_dilation[self.layer_id]
AttributeError: 'RobertaConfig' object has no attribute 'attention_dilation'
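
For what it's worth, the mismatch can be confirmed from the hub config itself. A minimal check (my own sketch, not from the repo), assuming a transformers version that can fetch this config:

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("allenai/longformer-large-4096-finetuned-triviaqa")
# The hub config defines attention_window but (as of this writing) no
# attention_dilation, while RoR's longformer/longformer.py reads
# config.attention_dilation[layer_id] for every layer.
print(hasattr(cfg, "attention_dilation"))  # expected: False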

Then I set model_path = 'models/triviaqa-longformer-large/checkpoints/epoch_4_v2.ckpt', the checkpoint you provide, and it raised an OSError:

Traceback (most recent call last):
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/configuration_utils.py", line 243, in get_config_dict
    raise EnvironmentError
OSError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "~/anaconda3/envs/ror/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "~/anaconda3/envs/ror/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "~/RoR/triviaqa.py", line 817, in <module>
    main(args)
  File "~/RoR/triviaqa.py", line 768, in main
    model = TriviaQA(args)
  File "~/RoR/triviaqa.py", line 293, in __init__
    self.model = self.load_model()
  File "~/RoR/triviaqa.py", line 303, in load_model
    model = Longformer.from_pretrained(self.args.model_path)
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/modeling_utils.py", line 587, in from_pretrained
    **kwargs,
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/configuration_utils.py", line 201, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "~/anaconda3/envs/ror/lib/python3.7/site-packages/transformers/configuration_utils.py", line 252, in get_config_dict
    raise EnvironmentError(msg)
OSError: Can't load config for 'models/triviaqa-longformer-large/checkpoints/epoch_4_v2.ckpt'. Make sure that:

- 'models/triviaqa-longformer-large/checkpoints/epoch_4_v2.ckpt' is a correct model identifier listed on 'https://huggingface.co/models'

- or 'models/triviaqa-longformer-large/checkpoints/epoch_4_v2.ckpt' is the correct path to a directory containing a config.json file
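
As the error message itself says, from_pretrained() resolves its argument as either a model identifier on the hub or a local directory containing config.json, so a Lightning .ckpt file path cannot work there. A minimal sketch of the distinction, assuming RoR's longformer package is importable (paths are placeholders):

from longformer.longformer import Longformer

# --model_path: a directory holding config.json plus the pretrained weights,
# consumed by from_pretrained()
model = Longformer.from_pretrained("path/to/longformer-large-4096")

# --resume_ckpt: the fine-tuned PyTorch Lightning checkpoint (epoch_4_v2.ckpt),
# restored separately by the script; it is not a from_pretrained() argument.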

Hi yu,
model_path should point to the location of the pre-trained longformer-large model (including both the weights and the configuration). The error "AttributeError: 'RobertaConfig' object has no attribute 'attention_dilation'" likely means that the Longformer model from the Hugging Face model hub is not compatible with our code.
I suggest downloading the Longformer model from
https://ai2-s2-research.s3-us-west-2.amazonaws.com/longformer/longformer-large-4096.tar.gz
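
For reference, a minimal sketch for fetching and unpacking that archive (a plain wget and tar would do the same; the extracted directory name is assumed from the archive name):

import tarfile
import urllib.request

URL = ("https://ai2-s2-research.s3-us-west-2.amazonaws.com/"
       "longformer/longformer-large-4096.tar.gz")
archive = "longformer-large-4096.tar.gz"

urllib.request.urlretrieve(URL, archive)  # download the tarball
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall(".")  # unpacks a longformer-large-4096/ directory

The extracted directory (config.json plus weights) is then what --model_path should point to.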

Hi Zhao,
Thank you for your quick reply. It works now.