zhangyx0417/simple_to_complex

Code for the EMNLP 2023 paper "From Simple to Complex: A Progressive Framework for Document-level Informative Argument Extraction"


From Simple to Complex: A Progressive Framework for Document-level Informative Argument Extraction

Code for our EMNLP 2023 paper.

Model Overview

The figure below illustrates our simple-to-complex progressive framework for document-level informative argument extraction. First, we calculate the difficulty of each event in a document $D$ and use these difficulties to determine a new prediction order. Second, we reorder the events in $D$ from simple to complex and predict them in that order. S2C denotes Simple-to-Complex, and F2B denotes Front-to-Back. The figure plots the process of predicting the arguments of $E_2$.
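The two steps above can be sketched in a few lines of Python. This is only an illustration of the ordering idea, not the code in this repository: the `Event` class, the `difficulty` field, and the `predict_fn` callback are hypothetical names, and the difficulty scores are assumed to be given (lower means simpler).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    event_id: str      # e.g. "E1" (hypothetical identifier)
    position: int      # index of the event in document order
    difficulty: float  # assumed precomputed score; lower means simpler


def f2b_order(events: List[Event]) -> List[Event]:
    """Front-to-Back: keep the order in which events appear in the document."""
    return sorted(events, key=lambda e: e.position)


def s2c_order(events: List[Event]) -> List[Event]:
    """Simple-to-Complex: sort by difficulty, breaking ties by document order."""
    return sorted(events, key=lambda e: (e.difficulty, e.position))


def predict_document(
    events: List[Event],
    predict_fn: Callable[[Event, Dict[str, List[str]]], List[str]],
) -> Dict[str, List[str]]:
    """Predict arguments event by event in S2C order.

    `memory` caches the results of already predicted events, so that
    harder events can use the results of simpler ones.
    """
    memory: Dict[str, List[str]] = {}
    for event in s2c_order(events):
        # Pass a copy so predict_fn cannot modify the cache in place.
        memory[event.event_id] = predict_fn(event, dict(memory))
    return memory


# Toy document with made-up difficulty scores.
events = [
    Event("E1", position=0, difficulty=0.9),
    Event("E2", position=1, difficulty=0.5),
    Event("E3", position=2, difficulty=0.1),
]

seen_by = {}  # which cached events were visible when each event was predicted


def dummy_predict(event, memory):
    seen_by[event.event_id] = sorted(memory)
    return [f"arg_of_{event.event_id}"]


results = predict_document(events, dummy_predict)
```

With these made-up scores, the S2C order is E3, E2, E1, so the prediction for `E2` can consult the cached result of `E3`, whereas under F2B it would see `E1` instead.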

Dependencies

Please create a new virtual environment and install the dependencies below.

pytorch=1.8.0
transformers=3.1.0
pytorch-lightning=1.0.6
spacy=3.0
sentence-transformers=2.1.0
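To verify that an environment matches the versions listed above, a small check such as the following may help. It is a sketch under assumptions: it maps `pytorch` to the pip distribution name `torch`, assumes the other names match their pip distributions, treats a pin like `3.0` as a prefix (so `3.0.6` is accepted), and requires Python 3.8+ for `importlib.metadata`.

```python
from importlib import metadata  # available in Python 3.8+

# Versions from the list above; "torch" is the assumed pip name for pytorch.
PINNED = {
    "torch": "1.8.0",
    "transformers": "3.1.0",
    "pytorch-lightning": "1.0.6",
    "spacy": "3.0",
    "sentence-transformers": "2.1.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches(pinned, installed):
    """True if `installed` starts with the components of `pinned`.

    A local build suffix such as "+cu111" is ignored.
    """
    if installed is None:
        return False
    want = pinned.split(".")
    have = installed.split("+")[0].split(".")
    return have[: len(want)] == want


def check_environment(lookup=installed_version):
    """Return {name: (pinned, found)} for every dependency that does not match."""
    mismatches = {}
    for name, pin in PINNED.items():
        found = lookup(name)
        if not matches(pin, found):
            mismatches[name] = (pin, found)
    return mismatches


if __name__ == "__main__":
    for name, (pin, found) in check_environment().items():
        print(f"{name}: expected {pin}, found {found}")
```

An empty result from `check_environment()` means every listed package was found at the expected version.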

Datasets

WikiEvents (download from this repo)

Running

Training
```
bash scripts/train_kairos.sh
```
Confidence calibration
```
bash scripts/calibrate.sh
```
Testing
```
bash scripts/test_kairos.sh
```

Citation

If you find our work useful, please cite as follows:

@inproceedings{huang-etal-2023-simple,
    title = "From Simple to Complex: A Progressive Framework for Document-level Informative Argument Extraction",
    author = "Huang, Quzhe  and
      Zhang, Yanxi  and
      Zhao, Dongyan",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-emnlp.408",
    doi = "10.18653/v1/2023.findings-emnlp.408",
    pages = "6129--6140",
    abstract = "Document-level Event Argument Extraction (EAE) requires the model to extract arguments of multiple events from a single document. Considering the underlying dependencies between these events, recent efforts leverage the idea of {``}memory{''}, where the results of already predicted events are cached and can be retrieved to help the prediction of upcoming events. These methods extract events according to their appearance order in the document, however, the event that appears in the first sentence does not mean that it is the easiest to extract. Existing methods might introduce noise to the extraction of upcoming events if they rely on an incorrect prediction of previous events. In order to provide more reliable memory, we propose a simple-to-complex progressive framework for document-level EAE. Specifically, we first calculate the difficulty of each event and then, we conduct the extraction following a simple-to-complex order. In this way, the memory will store the most certain results, and the model could use these reliable sources to help the prediction of more difficult events. Experiments on WikiEvents show that our model outperforms SOTA by 1.4{\%} in F1, indicating the proposed simple-to-complex framework is useful in the EAE task.",
}