This is the official code repository for iSEA: An Interactive Pipeline for Semantic Error Analysis of NLP Models, by Jun Yuan, Jesse Vig, Nazneen Rajani.
This repository contains the following two parts:
- pre-process/: This folder contains the code for pre-processing the text documents. We use the pre-trained DistilBERT as an example to demonstrate how we process the data in several Jupyter Notebook files. These notebooks include code for the following content:
  - preprocessing of documents (tokenization, lemmatization, document embedding, etc.);
  - model performance;
  - high-level feature generation;
  - rule generation;
  - instance-level model explanation (SHAP values).
- ui/: This folder contains the code and processed data for running the front-end.
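To give a rough idea of what a rule-generation step can look like, here is a minimal sketch that flags tokens whose presence marks a subpopulation with an unusually high error rate. It is not the code from the notebooks: the function name `mine_token_rules`, the thresholds `min_support` and `min_lift`, and the record fields `tokens`, `label`, and `pred` are all assumptions made for illustration.

```python
from collections import defaultdict

def mine_token_rules(docs, min_support=2, min_lift=1.5):
    """Hypothetical helper: find tokens whose documents are misclassified
    at least `min_lift` times more often than the overall error rate."""
    if not docs:
        return 0.0, []
    baseline = sum(d["label"] != d["pred"] for d in docs) / len(docs)
    if baseline == 0:
        return baseline, []  # no errors, so there is nothing to explain

    support = defaultdict(int)  # number of documents containing the token
    errors = defaultdict(int)   # number of those documents that are misclassified
    for d in docs:
        is_error = d["label"] != d["pred"]
        for tok in set(d["tokens"]):  # count each token once per document
            support[tok] += 1
            errors[tok] += is_error

    rules = []
    for tok, n in support.items():
        rate = errors[tok] / n
        if n >= min_support and rate >= min_lift * baseline:
            rules.append((tok, n, rate))
    # Highest error rate first, then larger support, then alphabetical.
    rules.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return baseline, rules

# Toy records (made up for this example).
docs = [
    {"tokens": ["the", "hotel", "was", "not", "bad"], "label": "pos", "pred": "neg"},
    {"tokens": ["not", "a", "great", "tour"], "label": "neg", "pred": "pos"},
    {"tokens": ["great", "hotel"], "label": "pos", "pred": "pos"},
    {"tokens": ["the", "tour", "was", "great"], "label": "pos", "pred": "pos"},
]
baseline, rules = mine_token_rules(docs)
print(baseline, rules)
```

On this toy input only the token "not" passes both thresholds. A real pipeline would likely also consider the high-level features listed above, not just single tokens.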
We first pre-compute all the necessary information, such as model output, analysis information, and error rules, on the server. We then present this information in the user interface. Based on the user input, the server calculates subpopulation-level information (errors, document statistics, aggregated SHAP values, etc.) and returns it to the UI.
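The server-side aggregation described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the repository's implementation: the function `subpopulation_stats`, the record fields (`tokens`, `label`, `pred`, `shap`), and the choice of mean absolute SHAP value as the aggregate are all hypothetical.

```python
from collections import defaultdict

def subpopulation_stats(docs, rule):
    """Hypothetical helper: summarize the documents selected by `rule`,
    a predicate that takes one document record and returns True or False."""
    subset = [d for d in docs if rule(d)]
    if not subset:
        return {"size": 0, "error_rate": None, "mean_length": None, "mean_abs_shap": {}}

    n = len(subset)
    n_errors = sum(d["label"] != d["pred"] for d in subset)

    # Aggregate SHAP values per token (assumed aggregate: mean absolute value).
    shap_sum = defaultdict(float)
    for d in subset:
        for tok, val in d["shap"].items():
            shap_sum[tok] += abs(val)

    return {
        "size": n,
        "error_rate": n_errors / n,
        "mean_length": sum(len(d["tokens"]) for d in subset) / n,
        "mean_abs_shap": {tok: s / n for tok, s in shap_sum.items()},
    }

# Toy pre-computed records (made up for this example).
docs = [
    {"tokens": ["not", "bad"], "label": "pos", "pred": "neg",
     "shap": {"not": -0.5, "bad": -0.25}},
    {"tokens": ["not", "great", "tour"], "label": "neg", "pred": "pos",
     "shap": {"not": -0.25, "great": 0.5, "tour": 0.0}},
    {"tokens": ["great", "hotel"], "label": "pos", "pred": "pos",
     "shap": {"great": 0.5, "hotel": 0.25}},
]

def contains_not(doc):
    """Example user-defined rule: the document contains the token 'not'."""
    return "not" in doc["tokens"]

stats = subpopulation_stats(docs, contains_not)
print(stats)
```

Keeping the per-document values pre-computed means that each user interaction only needs to filter and average, which is cheap enough to do per request.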
In the paper, we present two use cases with the following data and models:
- For the MultiNLI dataset, we first train a DistilBERT model on the government genre. We then analyze the model performance on the travel genre. The checkpoint can be found here.
- For the sentiment analysis task on the Twitter dataset, we analyze the errors of the open-sourced twitter-roberta-base-sentiment model on test data via our pipeline.
To apply iSEA to your own data/model, please follow the instructions in the pre-process/ folder for data preprocessing and the instructions in the ui/ folder for running the front-end.
When referencing this repository, please cite this paper:
@misc{yuan22isea,
title={iSEA: An Interactive Pipeline for Semantic Error Analysis of NLP Models},
author={Yuan, Jun and Vig, Jesse and Rajani, Nazneen},
year={2022},
eprint={2203.04408},
archivePrefix={arXiv},
primaryClass={cs.HC},
url={https://arxiv.org/abs/2203.04408}
}
This repository is released under the BSD-3 License.