/trans_am

Code for paper: A Neural Transition-based Model for Argumentation Mining.

Code for our paper:

Jianzhu Bao, Chuang Fan, Jipeng Wu, Yixue Dang, Jiachen Du, and Ruifeng Xu. 2021. A Neural Transition-based Model for Argumentation Mining. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6354–6364, Online. Association for Computational Linguistics.

Prerequisites

python 3.6
pytorch 1.7.1
cuda 10.2
transformers 4.2.1
json 2.0.9 (Python standard library)
numpy 1.19.4
pandas 1.1.5
scikit-learn 0.24.0
tqdm 4.54.1
argparse 1.1 (Python standard library)
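
As a quick sanity check before training, the pinned versions above can be compared against what is installed. The snippet below is only a sketch and not part of this repository: the function name check_versions and the PINNED mapping are hypothetical, and the import names (torch for pytorch, sklearn for scikit-learn) are assumptions. json and argparse ship with Python, so they are left out.

```python
import importlib

# Pinned versions copied from the list above. The import names are assumptions:
# pytorch is imported as "torch" and scikit-learn as "sklearn".
PINNED = {
    "torch": "1.7.1",
    "transformers": "4.2.1",
    "numpy": "1.19.4",
    "pandas": "1.1.5",
    "sklearn": "0.24.0",
    "tqdm": "4.54.1",
}


def check_versions(pinned):
    """Return {module: (expected, found)} for every missing or mismatched module."""
    problems = {}
    for name, expected in pinned.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is not installed at all.
            problems[name] = (expected, None)
            continue
        found = getattr(module, "__version__", None)
        # Ignore local build tags such as "+cu102" when comparing.
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems
```

Calling check_versions(PINNED) returns an empty dict when every package matches its pinned version; otherwise each entry maps a module name to the expected and the installed version (None if the module cannot be imported).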

Descriptions

data - contains the datasets.

saved_models - contains saved models, training logs and results.

utils - utility code.

  • config.py: parameter setting.
  • evaluation.py: evaluation procedure.
  • dataloader.py: load train, dev, and test data.
  • trans_module.py: proposed transition-based model.
  • transitions.py: transform text into transitions.

preprocess_cdcp - utilities for processing the CDCP dataset, adapted from https://github.com/vene/marseille.

load_text_essays.py - load the PE dataset.
prepare_data.py - preprocess the input data.
run.py - train and evaluate the proposed transition-based model.
preprocess.sh - prepare data and models.

Usage

For PE dataset

bash process_pe.sh
python run.py --dataset-name PE --data-path ./data/PE_data_df.csv --split-test-file-path ./data/test_paragraph_index.json --vocab-path ./data/bow_vocab.json --batch-size 32 --showtime 2 --early-num 5 --max-AC-num 50 --AC-type-label-num 3

For CDCP dataset

bash process_cdcp.sh
python run.py --dataset-name CDCP --data-path ./data/cdcp_data_bert_df.csv --split-test-file-path ./data/cdcp_test_index.json --vocab-path ./data/bow_vocab_cdcp.json --batch-size 16 --showtime 10 --early-num 10 --max-AC-num 12 --AC-type-label-num 5
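
For readers who want to see how the flags in the two commands above map onto Python attributes, here is a minimal argparse sketch. It is an illustration only, not the actual parser in this repository (the real parameter settings live in config.py): the function name build_parser, the integer types, and the required/choices constraints are all assumptions inferred from the example commands.

```python
import argparse


def build_parser():
    """Hypothetical parser that mirrors the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Sketch of the run.py command line")
    # Flag names are taken from the usage commands; types and constraints are assumed.
    parser.add_argument("--dataset-name", choices=["PE", "CDCP"], required=True)
    parser.add_argument("--data-path", required=True)
    parser.add_argument("--split-test-file-path", required=True)
    parser.add_argument("--vocab-path", required=True)
    parser.add_argument("--batch-size", type=int, required=True)
    parser.add_argument("--showtime", type=int, required=True)
    parser.add_argument("--early-num", type=int, required=True)
    parser.add_argument("--max-AC-num", type=int, required=True)
    parser.add_argument("--AC-type-label-num", type=int, required=True)
    return parser


# Parse the CDCP command from above as an explicit argument list.
args = build_parser().parse_args([
    "--dataset-name", "CDCP",
    "--data-path", "./data/cdcp_data_bert_df.csv",
    "--split-test-file-path", "./data/cdcp_test_index.json",
    "--vocab-path", "./data/bow_vocab_cdcp.json",
    "--batch-size", "16",
    "--showtime", "10",
    "--early-num", "10",
    "--max-AC-num", "12",
    "--AC-type-label-num", "5",
])

# argparse turns dashes into underscores and keeps the original casing.
print(args.dataset_name, args.batch_size, args.max_AC_num, args.AC_type_label_num)
```

Note that argparse replaces dashes with underscores when building attribute names, so --max-AC-num is read as args.max_AC_num and --AC-type-label-num as args.AC_type_label_num.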