tag-based-multi-span-extraction

The official implementation of the EMNLP 2020 paper "A Simple and Effective Model for Answering Multi-span Questions".

A Simple and Effective Model for Answering Multi-span Questions

This repository contains the official implementation of the following paper:
Elad Segal, Avia Efrat, Mor Shoham, Amir Globerson, Jonathan Berant. "A Simple and Effective Model for Answering Multi-span Questions". In EMNLP, 2020.

Citation

@inproceedings{Segal2020TASE,
  title={A Simple and Effective Model for Answering Multi-span Questions},
  author={Segal, Elad and Efrat, Avia and Shoham, Mor and Globerson, Amir and Berant, Jonathan},
  booktitle={EMNLP},
  year={2020},
}

Use DROP Explorer to better understand DROP, Quoref and the models' predictions.

Usage

Run all commands in this section from the root directory of the repository.

First, install the prerequisites:
pip install -r requirements.txt

Commands

  • Train a model:
allennlp train configs/[config file] -s [training_directory] --include-package src
  • Output predictions by a model:
allennlp predict model.tar.gz drop_data/drop_dataset_dev.json --predictor machine-comprehension \
--cuda-device 0 --output-file predictions.jsonl --use-dataset-reader --include-package src \
-o "{'validation_dataset_reader.pickle.action': 'None'}"
  • Evaluate a model (unofficial evaluation code, faster):
allennlp evaluate model.tar.gz drop_data/drop_dataset_dev.json --cuda-device 0 --output-file eval.json \
--include-package src -o "{'validation_dataset_reader.pickle.action': 'None'}"
  • Evaluate a model (official evaluation code, slower):

    python tools/generate_submission_predictions.py --archive_file model.tar.gz \
    --input_file drop_data/drop_dataset_dev.json --cuda-device 0 --output_file predictions.json \
    --include-package src
    
    python -m allennlp.tools.drop_eval --gold_path drop_data/drop_dataset_dev.json \
    --prediction_path predictions.json --output_path metrics.json
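
The commands above write `predictions.jsonl`, `eval.json` and `metrics.json`. As a minimal sketch for inspecting these files, the Python helpers below may be useful. They assume that `predictions.jsonl` holds one JSON object per line and that the metrics files are flat JSON objects. Neither layout is specified in this README, so check your own output files first. The helper names `read_jsonl` and `numeric_metrics` are illustrative and not part of this repository.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return one parsed JSON value per non-empty line of a JSON Lines file."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def numeric_metrics(path):
    """Return the top-level numeric entries of a JSON metrics file.

    Assumption: the file contains a single JSON object.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return {
        key: value
        for key, value in data.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


if __name__ == "__main__":
    # File names follow the commands above; missing files are skipped.
    if Path("predictions.jsonl").exists():
        predictions = read_jsonl("predictions.jsonl")
        print(f"{len(predictions)} predictions")
        if predictions and isinstance(predictions[0], dict):
            print("fields of the first record:", sorted(predictions[0]))
    for name in ("eval.json", "metrics.json"):
        if Path(name).exists():
            for key, value in sorted(numeric_metrics(name).items()):
                print(f"{name}: {key} = {value}")
```

Run the script from the directory that contains the output files. It only reads them and prints a short summary, so it does not replace either evaluation command.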
    

Trained Models