EDeR

EDeR: A Dataset for Exploring Dependency Relations Between Events.

EDeR is a human-annotated dataset that captures dependency information between events and provides refined semantic role-labelled event representations based on this information. The code for the baseline models is also provided to support further research.

Dataset statistics

           argument               non-argument
           required   optional    condition   independent   overall
train      4096       2837        335         1861          9129
dev        635        421         41          355           1452
test       594        368         70          239           1271
overall    5325       3626        446         2455          11852

Data format

data/train.json, data/dev.json and data/test.json are the training, development and test sets, respectively. Loading each file yields a list of dictionaries. Each dictionary has the format shown in the following example:

{'Event 1': "We {V: know} you teach the truth about God 's way",
 'Event 2': "you {V: teach} the truth about God 's way",
 'refined Event 1': NAN,
 'label': 'required argument',
 'Event 1 SRL': '{'ARG0': ['We'], 'V': ['know'], 'ARG1': ['you', 'teach', 'the', 'truth', 'about', 'God', "'s", 'way']}',
 'Event 2 SRL': '{'ARG0': ['you'], 'V': ['teach'], 'ARG1': ['the', 'truth', 'about', 'God', "'s", 'way']}',
 'sentence': '['We', 'know', 'you', 'teach', 'the', 'truth', 'about', 'God', "'s", 'way', '.']',
 'Event-Event span': "We {V: know} you teach the truth about God 's way[SEP]you {V: teach} the truth about God 's way",
 'Event-Event-SRL': "We {V: know} you teach the truth about God 's way[SEP]you {V: teach} the truth about God 's way[SRL]ARG1",
 'Event-Event-SRL-DEP': "We {V: know} you teach the truth about God 's way[SEP]you {V: teach} the truth about God 's way[SRL]ARG1[DEP]parataxis",
 'Marked-predicate sentence': "We [V1] know [\V1] you [V2] teach [\V2] the truth about God 's way ."}

Event 1 and Event 2 are the containing and contained events of the pair, respectively. refined Event 1 holds the refined version of Event 1 when label is condition or independent; otherwise, it is NAN. Event 1 SRL and Event 2 SRL are the semantic role labels of the two events. sentence is the tokenized sentence that contains both events. The four input types (Event-Event span, Event-Event-SRL, Event-Event-SRL-DEP and Marked-predicate sentence) are also included; details can be found in the paper.
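
As a rough illustration of the format described above, the following sketch loads one split, counts its labels and splits a composed input string back into its parts. It assumes the files are standard JSON that `json.load` can read. The helper names `load_split`, `label_counts` and `split_composed_input` are hypothetical and not part of the released code.

```python
import json
import re
from collections import Counter


def load_split(path):
    """Load one split; assumes the file is plain JSON holding a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def label_counts(records):
    """Count how often each value of the 'label' field occurs."""
    return Counter(record["label"] for record in records)


def split_composed_input(text):
    """Split a composed input string on the [SEP], [SRL] and [DEP] markers.

    Returns a dict with the first event span and, when the markers are
    present, the second event span, the SRL role and the dependency relation.
    """
    # With a capturing group, re.split returns:
    # [event1, marker, value, marker, value, ...]
    parts = re.split(r"\[(SEP|SRL|DEP)\]", text)
    names = {"SEP": "event2", "SRL": "srl", "DEP": "dep"}
    result = {"event1": parts[0]}
    for marker, value in zip(parts[1::2], parts[2::2]):
        result[names[marker]] = value
    return result


# The 'Event-Event-SRL-DEP' value from the example record above.
example = (
    "We {V: know} you teach the truth about God 's way"
    "[SEP]you {V: teach} the truth about God 's way"
    "[SRL]ARG1[DEP]parataxis"
)
fields = split_composed_input(example)
print(fields["event2"])
print(fields["srl"], fields["dep"])
```

With the repository checked out, `label_counts(load_split("data/train.json"))` should give a per-label count that can be compared against the statistics table, provided the files load as assumed.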

Baseline models

Requirements

Python 3.7+

transformers==4.16.2

scikit-learn==1.0.1

pytorch-lightning==1.5.10

pandas==1.3.5

pycorenlp==0.3.0

Stanford CoreNLP toolkit

Train and test baseline models

The commands for training and testing the baseline models on the data are in run_sample.sh.

Here are some important parameters:

--m: name of the selected model, e.g., roberta.
--i: input type, e.g., Event-Event-SRL-DEP.
--t: task type, binary or three.
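
The grouping in the statistics table suggests how the two task types could relate to the fine-grained labels. The sketch below is one possible mapping, not the repository's implementation. It assumes that `binary` separates argument from non-argument, that `three` separates required argument, optional argument and non-argument, and that the label strings other than 'required argument' are spelled as shown. `to_task_label` is a hypothetical helper.

```python
# Assumed label strings: only 'required argument' appears verbatim
# in the example record; the other three spellings are guesses.
ARGUMENT_LABELS = {"required argument", "optional argument"}
NON_ARGUMENT_LABELS = {"condition", "independent"}


def to_task_label(label, task):
    """Map a fine-grained label to the target of the chosen task type.

    task is 'binary' or 'three', mirroring the values listed for --t.
    """
    if label not in ARGUMENT_LABELS | NON_ARGUMENT_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    if task == "binary":
        # Assumed: argument vs. non-argument.
        return "argument" if label in ARGUMENT_LABELS else "non-argument"
    if task == "three":
        # Assumed: keep the two argument types, merge the rest.
        return label if label in ARGUMENT_LABELS else "non-argument"
    raise ValueError(f"unknown task type: {task!r}")


print(to_task_label("required argument", "binary"))
print(to_task_label("condition", "three"))
```

The actual label encoding used by the baseline scripts may differ; run_sample.sh and the paper are the authoritative sources.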

Citing us

If you find the dataset helpful, please cite:

@misc{li2023eder,
      title={EDeR: A Dataset for Exploring Dependency Relations Between Events}, 
      author={Ruiqi Li and Patrik Haslum and Leyang Cui},
      year={2023},
      eprint={2304.01612},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}