An implementation of the ACL 2022 paper "Graph Pre-training for AMR Parsing and Generation". You can find the paper here (arXiv).
Requirements:
- python 3.8
- pytorch 1.8
- transformers 4.8.2
- pytorch-lightning 1.5.0
- NVIDIA Tesla V100 or A100 GPU
We recommend using conda to manage virtual environments:
conda env update --name <env> --file requirements.yml
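For example (the environment name amrbart is just an illustration):
# create an environment with the pinned Python version, install the dependencies, then activate it
conda create -n amrbart python=3.8
conda env update --name amrbart --file requirements.yml
conda activate amrbart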
We also provide a Docker image here.
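A typical way to start a container (substitute the tag of the linked image; the mount point is an assumption):
# run with GPU access and the repository mounted inside the container
docker run --gpus all -it -v $(pwd):/workspace <image-tag> bash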
You can download the AMR corpora (AMR 2.0: LDC2017T10; AMR 3.0: LDC2020T02) from LDC.
We follow SPRING to preprocess AMR graphs:
# 1. install SPRING
cd spring && pip install -e .
# 2. preprocess the data
bash run-preprocess.sh
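For reference, each training instance pairs a sentence with its PENMAN-serialized AMR graph. A minimal hand-written example (the file name is hypothetical):
# a sample instance in the standard AMR release format
cat > sample.amr <<'EOF'
# ::snt The boy wants to go.
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
            :ARG0 b))
EOF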
To pre-train AMRBART (starting from a BART checkpoint), run
bash run-posttrain-bart-textinf-joint-denoising-6task-large-unified-V100.sh /path/to/BART/
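For example, assuming BART-large weights have been downloaded to a local directory (the path is hypothetical):
bash run-posttrain-bart-textinf-joint-denoising-6task-large-unified-V100.sh ../pretrained-model/bart-large/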
To fine-tune for AMR Parsing, run
bash finetune_AMRbart_amrparsing.sh /path/to/pre-trained/AMRBART/ gpu_id
To fine-tune for AMR-to-text Generation, run
bash finetune_AMRbart_amr2text.sh /path/to/pre-trained/AMRBART/ gpu_id
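For example, to fine-tune the parser on GPU 0 (the checkpoint path is hypothetical):
bash finetune_AMRbart_amrparsing.sh checkpoints/AMRBART-large 0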
To evaluate on AMR Parsing, run
bash eval_AMRbart_amrparsing.sh /path/to/fine-tuned/AMRBART/ gpu_id
To evaluate on AMR-to-text Generation, run
bash eval_AMRbart_amr2text.sh /path/to/fine-tuned/AMRBART/ gpu_id
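Parsing is scored with Smatch and generation with BLEU. If you want to re-score a parser's output yourself, a minimal sketch using the standalone smatch package (file names are hypothetical):
# compare predicted AMRs against the gold annotations
pip install smatch
smatch.py -f predictions.amr gold.amr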
To run our code on your own data, first convert the data into the format shown here, then run the corresponding inference script.
For AMR Parsing, run
bash inference_amr.sh /path/to/fine-tuned/AMRBART/ gpu_id
For AMR-to-text Generation, run
bash inference_text.sh /path/to/fine-tuned/AMRBART/ gpu_id
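As a sketch, the parsing input is one JSON object per line holding the raw sentence; the field names and file name below are assumptions, so check the linked example file for the exact schema:
# a one-line input file for parsing (AMR field left empty to be filled by the model)
cat > data4parsing.jsonl <<'EOF'
{"sent": "The boy wants to go.", "amr": ""}
EOF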
Pre-trained AMRBART checkpoints:

Setting | Params | Checkpoint |
---|---|---|
AMRBART-base | 142M | model |
AMRBART-large | 409M | model |
Results on AMR-to-text Generation:

Setting | BLEU (tokenized) | BLEU (detokenized) | Checkpoint | Output |
---|---|---|---|---|
AMRBART-large (AMR2.0) | 49.8 | 45.7 | model | output |
AMRBART-large (AMR3.0) | 49.2 | 45.0 | model | output |
Results on AMR Parsing:

Setting | Smatch | Checkpoint | Output |
---|---|---|---|
AMRBART-large (AMR2.0) | 85.4 | model | output |
AMRBART-large (AMR3.0) | 84.2 | model | output |
Todo:
- clean code

If you find our work helpful, please cite:
@inproceedings{bai-etal-2022-graph,
title = "Graph Pre-training for {AMR} Parsing and Generation",
author = "Bai, Xuefeng and
Chen, Yulong and
Zhang, Yue",
booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = may,
year = "2022",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "todo",
doi = "todo",
pages = "todo"
}