Code, models, and datasets for "Self-Explaining Structures Improve NLP Models".
Install the required packages:
pip install -r requirements.txt
- Download the IMDB dataset; the official corpus can be found HERE. We provide processed raw text, which you can download HERE. Save the processed raw text dataset at [IMDB_DATA_PATH].
- Download the SST-5 dataset; the official corpus can be found HERE. We provide processed raw text, which you can download HERE. Save the processed raw text dataset at [SST_DATA_PATH].
- Download the SNLI dataset; the official corpus can be found HERE. Save the SNLI dataset at [SNLI_DATA_PATH].
- Download the vanilla RoBERTa-base model released by HuggingFace; it can be found HERE. Save the model at [ROBERTA_BASE_PATH].
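For reference, the processed datasets are plain-text split files; the commands later in this README reference train.txt and test.txt directly. A sketch of the assumed layout (dev.txt is our assumption, not confirmed by the commands below):

[SST_DATA_PATH]/
    train.txt
    dev.txt
    test.txt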
In this paper, we utilize self-explaining structures in different NLP tasks. This repo contains all training and evaluation code, but here we only provide commands for the SST-5 task as an example. For other tasks, you can reproduce the results simply by modifying the commands.
SST-5 is a five-class task, so we need to modify the RoBERTa-base config file: open [ROBERTA_BASE_PATH]/config.json and set num_labels to 5, for example with the short script below. Then run the following training commands.
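A minimal sketch of this config edit, assuming the standard HuggingFace config.json format ([ROBERTA_BASE_PATH] is the placeholder used throughout this README):

import json

config_path = "[ROBERTA_BASE_PATH]/config.json"  # substitute your local path

# Load the vanilla RoBERTa-base config, set the label count for SST-5, and write it back.
with open(config_path) as f:
    config = json.load(f)
config["num_labels"] = 5
with open(config_path, "w") as f:
    json.dump(config, f, indent=2)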
cd explain
python trainer.py \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [SST_DATA_PATH] \
--task sst5 \
--save_path [SELF_EXPLAINING_MODEL_CHECKPOINTS] \
--gpus=0,1,2,3 \
--precision 16 \
--lr=2e-5 \
--batch_size=10 \
--lamb=1.0 \
--workers=4 \
--max_epoch=20
After training, the checkpoints and training log will be saved at [SELF_EXPLAINING_MODEL_CHECKPOINTS].
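The evaluation command below takes one of these checkpoints via --checkpoint_path; to see which checkpoint files were written (the exact filenames depend on your run):

ls [SELF_EXPLAINING_MODEL_CHECKPOINTS]/*.ckpt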
Run the following evaluation command to get the performance on the test dataset. After evaluation, you will get two output files at [SPAN_SAVE_PATH]: output.txt and test.txt. output.txt records the extracted spans and prediction results for visualization; test.txt records only the top-ranked span per example, which serves as the span-based test data for the next stage.
cd explain
python trainer.py \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [SST_DATA_PATH] \
--task sst5 \
--checkpoint_path [SELF_EXPLAINING_MODEL_CHECKPOINTS]/***.ckpt \
--save_path [SPAN_SAVE_PATH] \
--gpus=0, \
--mode eval
In the previous stage, we obtained span-based test data. You can use the same method to get span-based train data.
To check the quality of the extracted spans, we set up four experiments: full-full, full-span, span-full, and span-span mode. For example, full-span mode means we use the original SST-5 train data as train data and the span-based test data as test data. You should save the original SST-5 train data and the span-based test data at [FULL_SPAN_PATH]:
cp [SST_DATA_PATH]/train.txt [FULL_SPAN_PATH]
cp [SPAN_SAVE_PATH]/test.txt [FULL_SPAN_PATH]
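The other three modes are assembled the same way. A sketch, where [FULL_FULL_PATH], [SPAN_FULL_PATH], and [SPAN_SPAN_PATH] are hypothetical placeholders following the naming above, and we assume the span-based train data produced earlier is saved as train.txt under [SPAN_SAVE_PATH]:

# full-full: original train and test data
cp [SST_DATA_PATH]/train.txt [FULL_FULL_PATH]
cp [SST_DATA_PATH]/test.txt [FULL_FULL_PATH]
# span-full: span-based train data, original test data
cp [SPAN_SAVE_PATH]/train.txt [SPAN_FULL_PATH]
cp [SST_DATA_PATH]/test.txt [SPAN_FULL_PATH]
# span-span: span-based train and test data
cp [SPAN_SAVE_PATH]/train.txt [SPAN_SPAN_PATH]
cp [SPAN_SAVE_PATH]/test.txt [SPAN_SPAN_PATH]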
cd check
python trainer.py \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [FULL_SPAN_PATH] \
--task sst5 \
--save_path [CHECK_MODEL_CHECKPOINTS] \
--gpus=0,1,2,3 \
--precision 16 \
--lr=2e-5 \
--batch_size=10 \
--lamb=1.0 \
--workers=4 \
--max_epoch=20