Automatic Corpus-level and Concept-based Explanation for Text Classification Models.
This repository is a PyTorch implementation of the following arXiv paper:
A Concept-based Abstraction-Aggregation Deep Neural Network for Interpretable Document Classification
Tian Shi, Xuchao Zhang, Ping Wang, Chandan K. Reddy
- Python 3.6.9
- argparse=1.1
- torch=1.4.0
- sklearn=0.22.2.post1
- numpy=1.18.2
Please download the processed dataset from here and place it alongside the ACCE directory, so that the layout looks like this:
|--- ACCE
|--- Data
| |--- imdb_data
| |--- newsroom_data
| | |--- dev
| | |--- glove_42B_300d.npy
| | |--- test
| | |--- train
| | |--- vocab
| | |--- vocab_glove_42B_300d
|--- nats_results (results, built automatically)
|
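Before training, it can help to confirm that the data directory matches the tree above. The following is a minimal sketch, not part of this repository: the helper name `missing_entries` and the relative path `../Data` (which assumes the script is run from inside ACCE) are assumptions, and the entry list is copied from the `newsroom_data` branch of the tree only.

```python
from pathlib import Path

# Entries listed under Data/newsroom_data in the tree above.
EXPECTED_ENTRIES = [
    "dev",
    "glove_42B_300d.npy",
    "test",
    "train",
    "vocab",
    "vocab_glove_42B_300d",
]


def missing_entries(data_root, dataset="newsroom_data"):
    """Return the expected entries that are absent from data_root/dataset.

    Hypothetical helper: it only checks that each name exists and does not
    care whether an entry is a file or a directory.
    """
    dataset_dir = Path(data_root) / dataset
    return [name for name in EXPECTED_ENTRIES if not (dataset_dir / name).exists()]


if __name__ == "__main__":
    # Assumes the current working directory is ACCE, with Data next to it.
    missing = missing_entries("../Data")
    if missing:
        print("Missing from ../Data/newsroom_data:", ", ".join(missing))
    else:
        print("Data layout looks complete.")
```

The tree does not list the contents of `imdb_data`, so the same check may or may not apply to it unchanged.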
Training, Validation, and Testing
python3 run.py --task train
Testing only
python3 run.py --task test
Evaluation
python3 run.py --task evaluate
Keyword Extraction (self-attention)
python3 run.py --task keywords_attnself
Keyword Extraction (abstraction attention)
python3 run.py --task keywords_attn_abstraction
Attention Weight Visualization
python3 run.py --task visualization
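The commands above can also be chained from a small wrapper. This is a sketch under stated assumptions rather than a script shipped with the repository: the names `build_command` and `run_pipeline` are hypothetical, the task names are copied from the commands above, and the order in which to run them is a guess. It defaults to a dry run that only prints each command.

```python
import subprocess

# Task names taken from the commands listed above.
TASKS = [
    "train",
    "test",
    "evaluate",
    "keywords_attnself",
    "keywords_attn_abstraction",
    "visualization",
]


def build_command(task, python="python3"):
    """Build the argument list for one run.py invocation."""
    if task not in TASKS:
        raise ValueError("unknown task: " + task)
    return [python, "run.py", "--task", task]


def run_pipeline(tasks, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for task in tasks:
        cmd = build_command(task)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing task.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(["train", "evaluate", "visualization"])
```

Calling `run_pipeline([...], dry_run=False)` from the directory that contains run.py executes the tasks for real.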
If you want to run the baselines, you may need to uncomment the corresponding line in run.py.
Model | Brief |
---|---|
CNN | Convolutional Neural Network |
RNNAttn | Bi-LSTM + Self-Attention |
RNNAttnWE | RNNAttn + Pretrained Word Embedding |
RNNAttnWECPT | RNNAttnWE + Concept Based |
RNNAttnWECPTDrop | RNNAttnWECPT + Attention Weights Dropout |
Bert* | Replace RNN with BERT |
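To give a rough idea of what "attention weights dropout" can mean, here is a dependency-free sketch of softmax attention pooling with inverted dropout applied to the attention weights. It is an illustration only and is not taken from this repository or the paper: the function names are hypothetical, the scores are passed in rather than learned, and the weights are not renormalized after dropout, which is a design assumption.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(weights, p, rng):
    """Inverted dropout: zero each weight with probability p, rescale the rest."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if p == 0.0:
        return list(weights)
    keep = 1.0 - p
    return [w / keep if rng.random() >= p else 0.0 for w in weights]


def attention_pool(hidden_states, scores, p=0.0, rng=None):
    """Weighted sum of hidden states using (optionally dropped-out) weights.

    hidden_states: list of equal-length vectors, one per token.
    scores: one unnormalized attention score per token (assumed given).
    """
    rng = rng or random.Random(0)
    weights = dropout(softmax(scores), p, rng)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, hidden_states)) for i in range(dim)]
    return pooled, weights
```

With `p=0.0` this reduces to plain attention pooling. With `p > 0`, some tokens are randomly ignored during training, which may discourage the model from relying on a single token.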
Coming Soon.
@article{shi2020concept,
title={A Concept-based Abstraction-Aggregation Deep Neural Network for Interpretable Document Classification},
author={Shi, Tian and Zhang, Xuchao and Wang, Ping and Reddy, Chandan K},
journal={arXiv preprint arXiv:2004.13003},
year={2020}
}