EDeR: A Dataset for Exploring Dependency Relations Between Events
EDeR is a human-annotated dataset that captures dependency relations between pairs of events and provides refined semantic-role-labelled event representations based on these relations. We also provide the code of the related baseline models for further research.
split | argument: required | argument: optional | non-argument: condition | non-argument: independent | overall
--- | --- | --- | --- | --- | ---
train | 4096 | 2837 | 335 | 1861 | 9129
dev | 635 | 421 | 41 | 355 | 1452
test | 594 | 368 | 70 | 239 | 1271
overall | 5325 | 3626 | 446 | 2455 | 11852
data/train.json, data/dev.json and data/test.json are the training, development and test sets, respectively. Loading each file gives a list of dictionaries, one per event pair. Each dictionary has the format shown in the following example:
{'Event 1': "We {V: know} you teach the truth about God 's way",
'Event 2': "you {V: teach} the truth about God 's way",
'refined Event 1': NAN,
'label': 'required argument',
'Event 1 SRL': '{'ARG0': ['We'], 'V': ['know'], 'ARG1': ['you', 'teach', 'the', 'truth', 'about', 'God', "'s", 'way']}',
'Event 2 SRL': '{'ARG0': ['you'], 'V': ['teach'], 'ARG1': ['the', 'truth', 'about', 'God', "'s", 'way']}',
'sentence': '['We', 'know', 'you', 'teach', 'the', 'truth', 'about', 'God', "'s", 'way', '.']',
'Event-Event span': "We {V: know} you teach the truth about God 's way[SEP]you {V: teach} the truth about God 's way",
'Event-Event-SRL': "We {V: know} you teach the truth about God 's way[SEP]you {V: teach} the truth about God 's way[SRL]ARG1",
'Event-Event-SRL-DEP': "We {V: know} you teach the truth about God 's way[SEP]you {V: teach} the truth about God 's way[SRL]ARG1[DEP]parataxis",
'Marked-predicate sentence': "We [V1] know [\V1] you [V2] teach [\V2] the truth about God 's way ."}
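The loading step can be sketched as follows. This is a minimal sketch, assuming the files are standard JSON arrays readable with Python's json module and that the SRL fields are stored as strings holding Python-style dictionaries, as the example suggests. The helper names load_split, parse_srl and label_counts are illustrative and are not part of the released code.

```python
import ast
import json
from collections import Counter


def load_split(path):
    """Load one split (e.g. data/train.json) as a list of dictionaries."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def parse_srl(value):
    """Turn an SRL field into a dict mapping role names to token lists.

    Assumption: the field is either already a dict or a string such as
    "{'ARG0': ['We'], 'V': ['know']}" that ast.literal_eval can read.
    """
    if isinstance(value, dict):
        return value
    return ast.literal_eval(value)


def label_counts(records):
    """Count how many event pairs carry each label."""
    return Counter(record["label"] for record in records)
```

Calling label_counts(load_split("data/train.json")) would then give the per-label counts for the training split, which can be compared against the statistics table.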
Event 1 and Event 2 are the containing event and the contained event of the pair, respectively.
refined Event 1 is the refined version of Event 1 when label is condition or independent; otherwise it is NAN.
Event 1 SRL and Event 2 SRL are the semantic role labels of the two events, respectively.
sentence is the tokenized sentence that contains the two events.
The four input types (Event-Event span, Event-Event-SRL, Event-Event-SRL-DEP and Marked-predicate sentence) are also included; details can be found in the paper.
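The layout of the four input strings can be read off the example record: the two events are joined with [SEP], the SRL role is appended after [SRL], the dependency relation after [DEP], and the two predicates are wrapped in [V1] ... [\V1] and [V2] ... [\V2]. The sketch below rebuilds those strings. It is inferred from that single example and is not the authors' preprocessing code; the function names and the predicate-index arguments are hypothetical.

```python
def build_inputs(event1, event2, srl_role, dep_relation):
    """Rebuild the three concatenated input strings for one event pair."""
    span = f"{event1}[SEP]{event2}"
    with_srl = f"{span}[SRL]{srl_role}"
    with_dep = f"{with_srl}[DEP]{dep_relation}"
    return {
        "Event-Event span": span,
        "Event-Event-SRL": with_srl,
        "Event-Event-SRL-DEP": with_dep,
    }


def mark_predicates(tokens, index1, index2):
    """Wrap the two predicate tokens in [V1]/[V2] markers.

    index1 and index2 are the token positions of the predicates of
    Event 1 and Event 2 (assumed to be known, e.g. from the SRL output).
    """
    marked = []
    for i, token in enumerate(tokens):
        if i == index1:
            marked.extend(["[V1]", token, "[\\V1]"])
        elif i == index2:
            marked.extend(["[V2]", token, "[\\V2]"])
        else:
            marked.append(token)
    return " ".join(marked)
```

With the tokens of the example sentence and predicate positions 1 (know) and 3 (teach), mark_predicates returns the Marked-predicate sentence string shown in the example record.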
Python 3.7+
transformers==4.16.2
scikit-learn==1.0.1
pytorch-lightning==1.5.10
pandas==1.3.5
pycorenlp==0.3.0
You can find the command lines to train and test baseline models on the data in run_sample.sh.
Here are some important parameters:
--m: name of the selected model, e.g., roberta.
--i: input type, e.g., Event-Event-SRL-DEP.
--t: task type, binary or three.
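To illustrate how these three flags fit together, here is a hypothetical argparse sketch. It is not the parser used by the baseline scripts: the assumption that --i accepts exactly the four input-type names from the data format, and the choice to make every flag required, are illustrative only. See run_sample.sh for the actual invocations.

```python
import argparse

# Assumption: the accepted input types match the dictionary keys in the data.
INPUT_TYPES = [
    "Event-Event span",
    "Event-Event-SRL",
    "Event-Event-SRL-DEP",
    "Marked-predicate sentence",
]


def build_parser():
    """Build a hypothetical parser mirroring the --m, --i and --t flags."""
    parser = argparse.ArgumentParser(description="Hypothetical EDeR baseline runner")
    parser.add_argument("--m", required=True,
                        help="name of the selected model, e.g. roberta")
    parser.add_argument("--i", required=True, choices=INPUT_TYPES,
                        help="input type, e.g. Event-Event-SRL-DEP")
    parser.add_argument("--t", required=True, choices=["binary", "three"],
                        help="task type")
    return parser
```

Parsing ["--m", "roberta", "--i", "Event-Event-SRL-DEP", "--t", "binary"] with this parser yields a namespace with attributes m, i and t holding those three values.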
If you find the dataset helpful, please cite:
@misc{li2023eder,
title={EDeR: A Dataset for Exploring Dependency Relations Between Events},
author={Ruiqi Li and Patrik Haslum and Leyang Cui},
year={2023},
eprint={2304.01612},
archivePrefix={arXiv},
primaryClass={cs.CL}
}