We investigate four Transformer-based architectures for various discourse coherence tasks: vanilla Transformer, hierarchical Transformer, a multi-task learning based model, and a model with a fact-based input representation. More details, analyses, and baseline results can be found in our paper.
This implementation is based on PyTorch 1.6.0 with Python 3.6.10.
Install the required packages:
pip install -r requirements.txt
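If you want to match the versions above exactly, one option is a fresh conda environment (an illustrative sketch; any environment manager works, and the environment name is arbitrary):

conda create -n coherence python=3.6.10
conda activate coherence
pip install -r requirements.txt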
Before training or evaluating a model, download the dataset from here and unzip it within the project directory. It includes the Grammarly Corpus of Discourse Coherence (GCDC), Wall Street Journal (WSJ) and Recognizing Textual Entailment (RTE) datasets.
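For example, assuming the downloaded archive is named dataset.zip (the actual file name may differ):

unzip dataset.zip -d .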
The following discourse coherence tasks are defined for the datasets mentioned above:
- GCDC
  - 3-way-classification : Given a document, the task is to classify it into one of three labels (high, medium or low) denoting the textual coherence level of the document.
  - minority-classification : A binary classification task in which the dataset is created by labeling a text as low coherence through majority voting (if at least 2 out of 3 experts agree) and as not low coherence otherwise.
  - sentence-ordering : Documents labelled with the high coherence tag (in GCDC) are taken and 20 random permutations are generated for each document. The dataset is created by pairing each original document with its permuted versions, and the task is to choose the original document from each original-permuted pair (see the sketch after this list).
  - sentence-score-prediction : The coherence scores from 3 different experts are averaged to create a gold score, and the task is to predict this score (a regression task).
- WSJ
  - sentence-ordering : All documents in WSJ except those containing a single sentence are taken and 20 random permutations are generated for each document. As for GCDC, the dataset is created by pairing each original document with its permuted versions, and the task is to choose the original document from each original-permuted pair.
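To make the dataset construction above concrete, here is a minimal Python sketch of the minority-vote labeling and the original-permuted pairing (an illustration under simplified assumptions, not the repository's actual preprocessing code; documents are assumed to be lists of sentences, and the function names are hypothetical):

import random

def minority_label(expert_labels):
    # Label 'low' if at least 2 of the 3 expert labels are low, 'not-low' otherwise.
    low_votes = sum(1 for label in expert_labels if label == "low")
    return "low" if low_votes >= 2 else "not-low"

def make_ordering_pairs(sentences, num_permutations=20, seed=0):
    # Pair the original sentence order with up to 20 random permutations of it.
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_permutations):
        permuted = sentences[:]
        rng.shuffle(permuted)
        if permuted != sentences:  # skip the identity permutation
            pairs.append((sentences, permuted))
    return pairs

doc = ["The sun rose.", "Birds began to sing.", "The city slowly woke up."]
print(minority_label(["low", "low", "medium"]))  # -> low
print(len(make_ordering_pairs(doc)))             # up to 20 original-permuted pairs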
To train the different Transformer-based architecture models:
python main.py --arch <ARCH> --epochs <EPOCHS> --gpus <GPUS> --batch_size <BATCH_SIZE> --learning_rate <LR> --corpus <CORPUS> --sub_corpus <SUB_CORPUS> --model_name <MODEL_NAME> --freeze_emb_layer --online_mode <ONLINE_MODE> --task <TASK>
where,
- corpus can take one of 'gcdc' or 'wsj'.
- sub_corpus can take any one value from 'All', 'Clinton', 'Enron', 'Yelp' or 'Yahoo', given that corpus is 'gcdc'.
- arch can take one of 'vanilla', 'hierarchical', 'mtl' or 'fact-aware'.
- task can take one of '3-way-classification', 'minority-classification', 'sentence-ordering' or 'sentence-score-prediction' for the GCDC dataset, and only 'sentence-ordering' for the WSJ dataset.
- model_name defines the transformer model to use (by default it is roberta-base).
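For example, to train a vanilla Transformer on the GCDC Clinton subset for the 3-way classification task (the hyperparameter values and the --online_mode setting below are illustrative assumptions, not tuned recommendations):

python main.py --arch vanilla --epochs 5 --gpus 1 --batch_size 16 --learning_rate 2e-5 --corpus gcdc --sub_corpus Clinton --model_name roberta-base --freeze_emb_layer --online_mode 0 --task 3-way-classification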
To evaluate a trained model on the test dataset:
python main.py --inference --arch <ARCH> --gpus <GPUS> --batch_size <BATCH_SIZE> --corpus <CORPUS> --sub_corpus <SUB_CORPUS> --model_name <MODEL_NAME> --freeze_emb_layer --online_mode <ONLINE_MODE> --task <TASK>
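For example, to evaluate the model trained above on the GCDC Clinton test set (again with an assumed --online_mode value):

python main.py --inference --arch vanilla --gpus 1 --batch_size 16 --corpus gcdc --sub_corpus Clinton --model_name roberta-base --freeze_emb_layer --online_mode 0 --task 3-way-classification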
The whole pipeline can be executed using the following bash script. Change the variables in it accordingly, as mentioned in the script.
./run.sh
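The variables at the top of run.sh would then be set to match; for instance (the variable names here are assumptions based on the flags above, so check them against the actual script):

ARCH=vanilla
EPOCHS=5
CORPUS=gcdc
SUB_CORPUS=Clinton
TASK=3-way-classification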
One can cite the paper as follows:
@article{abhishek2021transformer,
title={Transformer models for text coherence assessment},
author={Abhishek, Tushar and Rawat, Daksh and Gupta, Manish and Varma, Vasudeva},
journal={arXiv preprint arXiv:2109.02176},
year={2021}
}