
Code, models, and datasets for "Self-Explaining Structures Improve NLP Models".


pip install -r requirements.txt

Prepare Datasets and Models

  • Download the IMDB dataset. The official corpus can be found HERE, and we provide processed raw text that you can download HERE. Save the processed raw text at [IMDB_PATA_PATH].
  • Download the SST-5 dataset. The official corpus can be found HERE, and we provide processed raw text that you can download HERE. Save the processed raw text at [SST_PATA_PATH].
  • Download the SNLI dataset. The official corpus can be found HERE. Save the SNLI dataset at [SNLI_PATA_PATH].
  • Download the vanilla RoBERTa-base model released by HuggingFace, which can be found HERE. Save the model at [ROBERTA_BASE_PATH].

Reproduce paper results step by step

In this paper, we apply self-explaining structures to different NLP tasks. This repo contains all training and evaluation code, but here we provide commands only for the SST-5 task as an example. For other tasks, you can reproduce the results by modifying the commands accordingly.

1. Train the self-explaining model

SST-5 is a five-class task, so the RoBERTa-base config file must be modified. Open [ROBERTA_BASE_PATH]/config.json and set num_labels=5. Then run the following commands.
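The config edit can also be scripted instead of done by hand. The following is a minimal sketch, assuming config.json is a flat JSON object as in HuggingFace model directories; the helper name `set_num_labels` is hypothetical and not part of this repo.

```python
import json
from pathlib import Path


def set_num_labels(config_path, num_labels):
    """Set `num_labels` in a JSON config file in place and return the new config.

    Hypothetical helper: it only rewrites one key and leaves every other
    field of the config untouched.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["num_labels"] = num_labels
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For SST-5 this would be called as `set_num_labels("[ROBERTA_BASE_PATH]/config.json", 5)`, with the placeholder replaced by the real directory, before the training commands below are run.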

cd explain
python \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [SST_PATA_PATH] \
--task sst5 \
--gpus=0,1,2,3  \
--precision 16 \
--lr=2e-5 \
--batch_size=10 \
--lamb=1.0 \
--workers=4 \

After training, the checkpoints and training log will be saved at [SELF_EXPLAINING_MODEL_CHECKPOINTS].

2. Evaluate the self-explaining model

Run the following evaluation command to get the performance on the test set. After evaluation, you will get two output files at [SPAN_SAVE_PATH]: output.txt and test.txt. output.txt records the extracted spans for visualization, along with the prediction results. test.txt records only the top-ranked span, which serves as span-based test data for the next stage.

cd explain
python \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [SST_PATA_PATH] \
--task sst5 \
--checkpoint_path [SELF_EXPLAINING_MODEL_CHECKPOINTS]/***.ckpt \
--save_path [SPAN_SAVE_PATH] \
--gpus=0, \
--mode eval
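
As noted above, test.txt keeps only the top-ranked span of each example. The sketch below illustrates that selection step in isolation. It assumes each candidate span is an inclusive (start, end) pair of token indices with a single score; the function name and the data layout are hypothetical, and this is not the repo's actual evaluation code.

```python
def top_ranked_span(tokens, span_scores):
    """Return (start, end, text) for the highest-scoring span.

    `span_scores` maps inclusive (start, end) token indices to a score.
    Hypothetical layout, used here for illustration only.
    """
    (start, end), _score = max(span_scores.items(), key=lambda item: item[1])
    return start, end, " ".join(tokens[start:end + 1])


if __name__ == "__main__":
    tokens = "the film is a charming and often affecting journey".split()
    scores = {(0, 1): 0.05, (4, 4): 0.40, (4, 7): 0.35, (6, 8): 0.20}
    print(top_ranked_span(tokens, scores))  # (4, 4, 'charming')
```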

3. Check the extracted span

In the previous stage, we obtained span-based test data. You can use the same method to get span-based train data.
To check the extracted spans, we set up four experiments: full-full, full-span, span-full, and span-span mode. For example, full-span mode means we train on the original SST-5 train data and test on the span-based test data.
For full-span mode, save the original SST-5 train data and the span-based test data at [FULL_SPAN_PATH].
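
The four modes differ only in which source supplies the train file and which supplies the test file. A minimal sketch of assembling the data directory for any mode follows. It assumes both the original data directory and the span-based directory hold files named train.txt and test.txt; `build_mode_dir` and its arguments are hypothetical names, not part of this repo.

```python
import shutil
from pathlib import Path

MODES = ("full-full", "full-span", "span-full", "span-span")


def build_mode_dir(mode, full_dir, span_dir, out_dir):
    """Copy train.txt and test.txt into `out_dir` according to `mode`.

    The part before the hyphen selects the train source and the part
    after it selects the test source ("full" = original data,
    "span" = span-based data).
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    train_kind, test_kind = mode.split("-")
    sources = {"full": Path(full_dir), "span": Path(span_dir)}
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    shutil.copy(sources[train_kind] / "train.txt", out / "train.txt")
    shutil.copy(sources[test_kind] / "test.txt", out / "test.txt")
    return out
```

For full-span mode, the call would take the original SST-5 directory as `full_dir`, the span output directory as `span_dir`, and [FULL_SPAN_PATH] as `out_dir`. The same function covers the other three modes by changing the `mode` string.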

scp [SST_PATA_PATH]/train.txt  [FULL_SPAN_PATH]
cd check
python \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [FULL_SPAN_PATH] \
--task sst5 \
--gpus=0,1,2,3  \
--precision 16 \
--lr=2e-5 \
--batch_size=10 \
--lamb=1.0 \
--workers=4 \