Scripts for the analysis of reentrancies in AMR discussed in [1].
Given an AMR corpus (<amr_file>
), follow the instructions to quantify the main causes of reentrancies in your data:
-
Preprocess the corpus with the preprocessing scripts of AMREager (https://github.com/mdtux89/amr-eager-multilingual):
./preprocessing.sh -f <amr_file> python preprocessing.py --amrs -f <amr_file>
-
Compute the coreferences in the data with
coref.py
. It requires python3 and neuralcoref (https://github.com/huggingface/neuralcoref):python3 coref.py <amr_file>
-
Run the
reen_stats.py
script:python reen_stats.py -prefix <amr_file>
-
When using the option
-verbose
the script also outputs the examples where one of more reentrancies could not be classified. See [1] for details.
Given an AMR test set (<test_set>
) and the output of a parser (<test_set_parsed>
):
-
To get the results for all actions in isolation:
./oracle_script.sh <test_set> <test_set_parsed> (--silent)
-
To get the results for the combination of all actions:
./all_actions_byscore.sh <test_set> <test_set_parsed> (--silent)
[1] "The Role of Reentrancies in Abstract Meaning Representation Parsing", Ida Szubert, Marco Damonte, Shay B. Cohen and Mark Steedman. Findings of EMNLP (2020).