This is the official implementation of our paper accepted at WIESP 2022: Full-Text Argumentation Mining on Scientific Publications (preprint on arXiv).

Scholarly Argumentation Mining (SAM) has recently gained attention due to its potential to help scholars with the rapid growth of published scientific literature. It comprises two subtasks: argumentative discourse unit recognition (ADUR) and argumentative relation extraction (ARE), both of which are challenging since they require e.g. the integration of domain knowledge, the detection of implicit statements, and the disambiguation of argument structure. While previous work focused on dataset construction and baseline methods for specific document sections, such as abstract or results, full-text scholarly argumentation mining has seen little progress. In this work, we introduce a sequential pipeline model combining ADUR and ARE for full-text SAM, and provide a first analysis of the performance of pretrained language models (PLMs) on both subtasks. We establish a new SotA for ADUR on the Sci-Arg corpus, outperforming the previous best reported result by a large margin (+7% F1). We also present the first results for ARE, and thus for the full AM pipeline, on this benchmark dataset.


pip install -r requirements.txt
pip install -r requirements_analysis.txt


To run the experiments, follow the steps below. Scripts to reproduce the published results are in the experiments folder: see experiments/prediction for generating the predictions and experiments/evaluation for calculating the evaluation scores.


Training

NOTE: To train with cross-validation, refer to the cross_validation readme.

ADU Recognition

allennlp \
train \
-s experiments/training/adu/adu_best \
-f allennlp_configs/adu_best.jsonnet

Argumentative Relation Extraction

allennlp \
train \
-s experiments/training/rel/rel_best \
-f allennlp_configs/rel_best.jsonnet \
-o "{\"dataset_reader.add_negative_relations_portion\":-1.0}"
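
The `-o` argument takes a JSON object, which is why the command above escapes every inner double quote. As a minimal sketch (not part of the repository), the override string can be generated in Python instead of being escaped by hand. The variable names are illustrative; only the override key and value are taken from the command above.

```python
import json
import shlex

# Override taken from the relation extraction training command above.
overrides = {"dataset_reader.add_negative_relations_portion": -1.0}

# Serialize compactly, then shell-quote the result so it can follow `-o`.
overrides_json = json.dumps(overrides, separators=(",", ":"))
print("-o " + shlex.quote(overrides_json))
# prints: -o '{"dataset_reader.add_negative_relations_portion":-1.0}'
```

Single-quoting the JSON is equivalent to the escaped double-quoted form in a POSIX shell and is easier to read when several overrides are combined.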

NOTE: To perform hyperparameter tuning, follow the guide in the hpt readme.


Prediction

Scripts to reproduce the published results, together with their actual output, can be found in experiments/prediction.

ADU Recognition

  1. Predicting ADUs and saving only GOLD ADUs
allennlp \
predagg \
--predictor brat-store \
-o "{\"dataset_reader.show_gold\":true,\"dataset_reader.show_prediction\":false,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits\":{\"test\":\"30:\"}}" \
--use-dataset-reader \
--cuda-device 0 \
--output-file experiments/prediction/adu/goldonly \
--batch-size 8 \
--silent \

Replace PATH/TO/ADU/MODEL with the location where the ADU model is saved. For instance, if you ran the ADU training command above, the model is saved in experiments/training/adu/adu_best.

  2. Predicting ADUs and saving only predicted ADUs

allennlp \
predagg \
--predictor brat-store \
-o "{\"dataset_reader.show_gold\":false,\"dataset_reader.show_prediction\":true,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits\":{\"test\":\"30:\"}}" \
--use-dataset-reader \
--cuda-device 0 \
--output-file experiments/prediction/adu/predictiononly \
--batch-size 8 \
--silent  \
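
The dataset_splits override in the prediction commands uses values such as "30:" and ":", which look like Python slice expressions. The source does not state how they are interpreted, so the following is only a sketch under the assumption that the value is applied as a slice over the ordered list of documents. The function name and the document ids are hypothetical.

```python
from typing import List


def parse_split(spec: str) -> slice:
    """Convert a spec such as "30:" or ":" into a slice (assumed semantics)."""
    start_text, separator, stop_text = spec.partition(":")
    if not separator:
        raise ValueError(f"expected 'start:stop', got {spec!r}")
    start = int(start_text) if start_text.strip() else None
    stop = int(stop_text) if stop_text.strip() else None
    return slice(start, stop)


# Hypothetical, ordered list of document ids; the real corpus ids differ.
documents: List[str] = [f"doc{i:02d}" for i in range(40)]

test_documents = documents[parse_split("30:")]  # everything from index 30 on
all_documents = documents[parse_split(":")]     # the full list

print(len(test_documents), test_documents[0], len(all_documents))
# prints: 10 doc30 40
```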

Argumentative Relation Extraction

  1. Predicting relations and saving only GOLD relations from GOLD ADUs
allennlp \
predagg \
--predictor brat-store \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.show_gold\":true,\"dataset_reader.show_prediction\":false,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits.test\":\"30:\"}" \
--use-dataset-reader \
--cuda-device 0 \
--batch-size 128 \
--silent \
--output-file experiments/prediction/rel@gold_adus/goldonly \

Replace PATH/TO/REL/MODEL with the location where the relation extraction (REL) model is saved. For instance, if you ran the relation extraction training command above, the model is saved in experiments/training/rel/rel_best.

  2. Predicting relations and saving only predicted relations from GOLD ADUs

allennlp \
predagg \
--predictor brat-store \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.show_gold\":false,\"dataset_reader.show_prediction\":true,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits.test\":\"30:\"}" \
--use-dataset-reader \
--cuda-device 0 \
--batch-size 128 \
--silent \
--output-file experiments/prediction/rel@gold_adus/predictiononly \

  3. Predicting relations and saving GOLD and predicted relations from predicted ADUs
allennlp \
predagg \
--predictor brat-store \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.show_gold\":true,\"dataset_reader.show_prediction\":true,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits.test\":\":\"}" \
--use-dataset-reader \
--cuda-device 0 \
--batch-size 128 \
--silent \
--output-file experiments/prediction/rel@predicted_adus/gold_and_prediction \

Replace PATH/TO/PREDICTED/ADUS/WITH/PREDICTION_ONLY with the location where the prediction-only ADUs are saved. For instance, if you predicted ADUs with the command above, they are saved in experiments/prediction/adu/predictiononly.

  4. Predicting relations and saving only predicted relations from predicted ADUs
allennlp \
predagg \
--predictor brat-store \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.show_gold\":false,\"dataset_reader.show_prediction\":true,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits.test\":\":\"}" \
--use-dataset-reader \
--cuda-device 0 \
--batch-size 128 \
--silent \
--output-file experiments/prediction/rel@predicted_adus/predictiononly \
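
The predictor name brat-store suggests that predictions are stored in brat standoff format. Assuming that is the case (the source does not say so explicitly), the sketch below parses text-bound annotations (T lines) and binary relations (R lines) from the content of one .ann file. The function name, the sample text, and the sample labels are made up for illustration and are not taken from the repository.

```python
from typing import Dict, List, Tuple

Span = Tuple[str, Tuple[Tuple[int, int], ...]]  # (label, ((start, end), ...))
Relation = Tuple[str, str, str]                 # (label, arg1 id, arg2 id)


def parse_brat(ann_text: str) -> Tuple[Dict[str, Span], List[Relation]]:
    """Parse T (span) and R (relation) lines of a brat standoff file."""
    spans: Dict[str, Span] = {}
    relations: List[Relation] = []
    for line in ann_text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        ident, body = line.split("\t", 1)
        if ident.startswith("T"):
            # "label start end" or, for discontinuous spans, "label s1 e1;s2 e2"
            label, offsets = body.split("\t", 1)[0].split(" ", 1)
            fragments = tuple(
                (int(start), int(end))
                for start, end in (frag.split() for frag in offsets.split(";"))
            )
            spans[ident] = (label, fragments)
        elif ident.startswith("R"):
            label, arg1, arg2 = body.split()
            relations.append((label, arg1.split(":", 1)[1], arg2.split(":", 1)[1]))
    return spans, relations


# Invented sample annotations, for illustration only.
sample = (
    "T1\town_claim 0 20\tOur method is faster\n"
    "T2\tdata 21 32\tsee Table 2\n"
    "R1\tsupports Arg1:T2 Arg2:T1\n"
)
spans, relations = parse_brat(sample)
print(spans["T1"], relations)
```

Lines that are neither T nor R annotations (for example annotator notes) are ignored by this sketch.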


Evaluation

Scripts to reproduce the published results, together with their actual output, can be found in experiments/evaluation.

  1. Using AllenNLP evaluate

ADU Recognition

allennlp \
evaluate \
-o "{\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits\":{\"test\":\"30:\"}, \"model.calculate_weak_span_f1\":true}" \
--cuda-device 0 \
--batch-size 8 \
--output-file experiments/evaluation/using_allennlp/adu/metrics.json \

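The evaluation results are written to the file given by --output-file (experiments/evaluation/using_allennlp/adu/metrics.json in the command above).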
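
To inspect a metrics file after evaluation, something like the following may be enough. It is a sketch that assumes the file is a flat JSON object mapping metric names to values; the helper name is hypothetical, and the metric names and numbers in the demo are invented rather than taken from the repository.

```python
import json
import tempfile
from pathlib import Path


def load_numeric_metrics(path):
    """Return the numeric entries of a metrics JSON file, sorted by name."""
    with open(path, encoding="utf-8") as handle:
        metrics = json.load(handle)
    return {
        name: value
        for name, value in sorted(metrics.items())
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


# Demo with a temporary file; the metric names and values are invented.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "metrics.json"
    demo_path.write_text(json.dumps({"f1": 0.5, "loss": 1.25, "note": "demo"}))
    demo_metrics = load_numeric_metrics(demo_path)

for name, value in demo_metrics.items():
    print(f"{name}: {value:.4f}")
```

For a real run, pass the path used for --output-file instead of the temporary demo file.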

Argumentative Relation Extraction

allennlp \
evaluate \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null, \"dataset_reader.dataset_splits\":{\"test\":\"30:\"}}" \
--cuda-device 0 \
--batch-size 128 \
--output-file experiments/evaluation/using_allennlp/rel@gold_adus/metrics.json \

  2. Using our evaluation pipeline (calculate_metric.py)

ADU Recognition

python analysis/calculate_metric.py \
--out_dir experiments/evaluation/using_pipeline/adu/metrics

Replace PATH/TO/PREDICTED/ADUS/WITH/GOLD_ONLY with the location where the ADU output containing only gold labels is saved. For instance, if you ran the prediction command above, it is saved in experiments/prediction/adu/goldonly.

To replicate the metrics calculated from our best model, replace PATH/TO/PREDICTED/ADUS/WITH/GOLD_ONLY with experiments/prediction/adu/best_uncased_10r5ge6a_goldonly and PATH/TO/PREDICTED/ADUS/WITH/PREDICTION_ONLY with experiments/prediction/adu/best_uncased_10r5ge6a_predictiononly.
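
The evaluation pipeline compares a gold-only directory with a prediction-only directory. How calculate_metric.py scores the two is not described here, so the following is only an illustrative sketch of one common choice: exact-match micro precision, recall, and F1 over labeled spans. The function name and all spans are hypothetical.

```python
from typing import Set, Tuple

LabeledSpan = Tuple[str, str, int, int]  # (document id, label, start, end)


def precision_recall_f1(gold: Set[LabeledSpan], predicted: Set[LabeledSpan]):
    """Exact-match scores: a prediction counts only if every field matches."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Invented spans for a single hypothetical document "A01".
gold = {
    ("A01", "own_claim", 0, 20),
    ("A01", "data", 21, 32),
    ("A01", "background_claim", 40, 60),
}
predicted = {
    ("A01", "own_claim", 0, 20),
    ("A01", "data", 21, 32),
    ("A01", "own_claim", 40, 60),  # right boundaries, wrong label
    ("A01", "data", 70, 80),       # spurious span
}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
# prints: P=0.500 R=0.667 F1=0.571
```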

Argumentative Relation Extraction

  1. Evaluating relation extraction using GOLD ADUs
python analysis/calculate_metric.py \
--out_dir experiments/evaluation/using_pipeline/rel@gold_adus/best_uncased_257eyrv1

To replicate the metrics calculated from our best model, replace PATH/TO/PREDICTED/REL@GOLD_ADU/GOLD_ONLY with experiments/prediction/rel@gold_adus/best_uncased_257eyrv1_goldonly and PATH/TO/PREDICTED/REL@GOLD_ADU/PREDICTION_ONLY with experiments/prediction/rel@gold_adus/best_uncased_257eyrv1_predictiononly.

  2. Evaluating relation extraction using predicted ADUs
python analysis/calculate_metric.py \
--out_dir experiments/evaluation/using_pipeline/rel@predicted_adus/best_uncased_257eyrv1

To replicate the metrics calculated from our best model, replace PATH/TO/PREDICTED/REL@GOLD_ADU/GOLD_ONLY with experiments/prediction/rel@gold_adus/best_uncased_257eyrv1_goldonly and PATH/TO/PREDICTED/REL@PREDICTED_ADU/PREDICTION_ONLY with experiments/prediction/rel@predicted_adus/best_uncased_257eyrv1_predictiononly.
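
When relations are predicted on top of predicted ADUs, annotation ids in the gold and predicted files need not correspond, and an ADU boundary error can invalidate an otherwise correct relation. The sketch below illustrates this under the assumption of exact-match scoring, which may differ from what calculate_metric.py does. It resolves argument ids to (label, start, end) triples before comparing; all names and values are hypothetical.

```python
def resolve_relations(spans, relations):
    """Replace argument ids by their (label, start, end) span triples."""
    return {(label, spans[arg1], spans[arg2]) for label, arg1, arg2 in relations}


# Invented gold annotations: the data span T2 supports the claim T1.
gold_spans = {"T1": ("own_claim", 0, 20), "T2": ("data", 21, 32)}
gold_relations = [("supports", "T2", "T1")]

# Invented predictions: ids are assigned differently, claim boundary is off by 2.
predicted_spans = {"T1": ("data", 21, 32), "T2": ("own_claim", 0, 18)}
predicted_relations = [("supports", "T1", "T2")]

gold_set = resolve_relations(gold_spans, gold_relations)
predicted_set = resolve_relations(predicted_spans, predicted_relations)
print(len(gold_set & predicted_set))  # prints: 0 (the ADU error propagates)

# With the claim boundary corrected, the same relation matches.
corrected_spans = {"T1": ("data", 21, 32), "T2": ("own_claim", 0, 20)}
corrected_set = resolve_relations(corrected_spans, predicted_relations)
print(len(gold_set & corrected_set))  # prints: 1
```

Comparing by resolved spans rather than by raw ids is what makes gold and predicted files comparable in this sketch.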