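Code for our Findings of EMNLP 2023 paper, "From Simple to Complex: A Progressive Framework for Document-level Informative Argument Extraction".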
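The figure below illustrates our simple-to-complex progressive framework for document-level informative argument extraction. First, we calculate the difficulty of each event in a document. Then we conduct the extraction in simple-to-complex order, so that the memory stores the most certain results and the model can use them to help predict more difficult events.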
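The simple-to-complex ordering can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in this repository: the `Event` class, its `difficulty` field, `extract_simple_to_complex`, and the toy extractor are all hypothetical names, and how the difficulty score is actually computed is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    event_id: str
    trigger: str
    difficulty: float  # hypothetical score; lower means easier to extract


def extract_simple_to_complex(
    events: List[Event],
    extract_fn: Callable[[Event, List[dict]], dict],
) -> Dict[str, dict]:
    """Run extract_fn on events from easiest to hardest.

    Each call receives a copy of the predictions made so far (the "memory"),
    so harder events can condition on results from easier ones.
    """
    memory: List[dict] = []
    results: Dict[str, dict] = {}
    for event in sorted(events, key=lambda e: e.difficulty):
        prediction = extract_fn(event, list(memory))
        results[event.event_id] = prediction
        memory.append(prediction)
    return results


def toy_extractor(event: Event, memory: List[dict]) -> dict:
    """Placeholder extractor: records how many earlier predictions it saw."""
    return {"trigger": event.trigger, "context_size": len(memory)}


if __name__ == "__main__":
    document_events = [
        Event("e1", "attack", 0.9),
        Event("e2", "arrest", 0.2),
        Event("e3", "trial", 0.5),
    ]
    predictions = extract_simple_to_complex(document_events, toy_extractor)
    # Events are processed in the order e2, e3, e1 (ascending difficulty).
    for event_id, prediction in predictions.items():
        print(event_id, prediction)
```

Passing a copy of the memory to each call keeps the extractor from mutating earlier predictions. A real model would replace `toy_extractor` and would likely retrieve only the relevant memory entries rather than all of them.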
Please create a new virtual environment and install the dependencies below.
- pytorch=1.8.0
- transformers=3.1.0
- pytorch-lightning=1.0.6
- spacy=3.0
- sentence-transformers=2.1.0
- WikiEvents (download from this repo)
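To confirm that an environment matches the pins above, a small check like the following may help. It is a sketch, not part of this repository: the helper names are hypothetical, it assumes Python 3.8+ for `importlib.metadata`, and it assumes the pip distribution name `torch` for pytorch. A pin such as `3.0` is treated as matching any `3.0.x` release.

```python
from importlib import metadata

# Pins copied from the dependency list; "torch" is assumed to be the
# pip distribution name for pytorch.
REQUIRED = {
    "torch": "1.8.0",
    "transformers": "3.1.0",
    "pytorch-lightning": "1.0.6",
    "spacy": "3.0",
    "sentence-transformers": "2.1.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned):
    """True if `installed` equals the pin or extends it (3.0.6 matches 3.0)."""
    if installed is None:
        return False
    # Drop local build tags such as "+cu111" before comparing.
    base = installed.split("+")[0]
    return base == pinned or base.startswith(pinned + ".")


def check_environment(lookup=installed_version):
    """Map each required package to a (found_version, ok) pair."""
    report = {}
    for package, pinned in REQUIRED.items():
        found = lookup(package)
        report[package] = (found, matches_pin(found, pinned))
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{package}: required {REQUIRED[package]}, found {found} [{status}]")
```

The `lookup` parameter exists only so the check can be exercised without the packages installed. Running the file directly reports each package's status.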
- Training: `bash scripts/train_kairos.sh`
- Confidence calibration: `bash scripts/calibrate.sh`
- Testing: `bash scripts/test_kairos.sh`
If you find our work useful, please cite as follows:
@inproceedings{huang-etal-2023-simple,
title = "From Simple to Complex: A Progressive Framework for Document-level Informative Argument Extraction",
author = "Huang, Quzhe and
Zhang, Yanxi and
Zhao, Dongyan",
editor = "Bouamor, Houda and
Pino, Juan and
Bali, Kalika",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.findings-emnlp.408",
doi = "10.18653/v1/2023.findings-emnlp.408",
pages = "6129--6140",
abstract = "Document-level Event Argument Extraction (EAE) requires the model to extract arguments of multiple events from a single document. Considering the underlying dependencies between these events, recent efforts leverage the idea of {``}memory{''}, where the results of already predicted events are cached and can be retrieved to help the prediction of upcoming events. These methods extract events according to their appearance order in the document, however, the event that appears in the first sentence does not mean that it is the easiest to extract. Existing methods might introduce noise to the extraction of upcoming events if they rely on an incorrect prediction of previous events. In order to provide more reliable memory, we propose a simple-to-complex progressive framework for document-level EAE. Specifically, we first calculate the difficulty of each event and then, we conduct the extraction following a simple-to-complex order. In this way, the memory will store the most certain results, and the model could use these reliable sources to help the prediction of more difficult events. Experiments on WikiEvents show that our model outperforms SOTA by 1.4{\%} in F1, indicating the proposed simple-to-complex framework is useful in the EAE task.",
}