CSE-472-Project-2: Understanding Text Classifiers with Counterfactual Explanation

Mitigate Impact of Spurious Correlation for Text Classification with Counterfactual Methods

| Name | Paper | Code |
| --- | --- | --- |
| CLP | [AAAI 2019] Counterfactual Fairness in Text Classification through Robustness | PyTorch |
| Corsair | [ACL 2021] Counterfactual Inference for Text Classification Debiasing | PyTorch |
| AGC | [AAAI 2021] Robustness to Spurious Correlations in Text Classification via Automatically Generated Counterfactuals | PyTorch |
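
As a rough illustration of the counterfactual idea these methods share, the sketch below swaps an identity term in a sentence and measures how far a toy classifier's score moves. The lexicon-based `toy_score` function, the `WEIGHTS` and `SWAPS` tables, and the example sentence are all hypothetical and are not taken from any of the repositories above. CLP, for instance, penalizes the analogous gap in logits during training rather than only measuring it.

```python
import re

# Hypothetical word weights standing in for a trained toxicity classifier.
# The non-zero weight on "gay" mimics a spurious correlation picked up
# from biased training data.
WEIGHTS = {"terrible": 1.5, "great": -1.0, "gay": 0.8}

# Hypothetical identity-term substitutions used to build counterfactuals.
SWAPS = {"gay": "straight", "straight": "gay"}


def toy_score(text):
    """Sum the weights of known words (higher means 'more toxic')."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(WEIGHTS.get(tok, 0.0) for tok in tokens)


def counterfactual(text):
    """Swap each identity term for its counterpart; other tokens are unchanged.

    Tokens are split on whitespace only, so punctuation attached to a
    word (for example "gay.") is not handled in this sketch.
    """
    return " ".join(SWAPS.get(tok.lower(), tok) for tok in text.split())


def counterfactual_gap(text):
    """Absolute score difference between a text and its counterfactual."""
    return abs(toy_score(text) - toy_score(counterfactual(text)))
```

A gap of zero means the prediction is insensitive to the swapped term; a large gap suggests the classifier relies on the identity term itself.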

Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations

| Name | Paper | Code |
| --- | --- | --- |
| DiCE | [FAT* 2020] Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations | TensorFlow/PyTorch/sklearn |
| Randomized sampling | Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR | DiCE |
| Genetic algorithm | A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II | DiCE |
| KD-Tree | Interpretable Counterfactual Explanations Guided by Prototypes | DiCE |
| Explicit loss-based method | [FAT* 2020] Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations | TensorFlow/PyTorch |
| VAE-based method | Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers | TensorFlow/PyTorch |
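
To make the simplest strategy in the table concrete, here is a minimal pure-Python sketch of counterfactual search by random sampling: perturb some features of a query at random, keep the candidates whose prediction flips to the desired class, and return those closest to the query. The `predict` rule, the `RANGES` table, and all function names are hypothetical, and this is not DiCE's implementation.

```python
import random

# Hypothetical feature ranges for a toy loan-approval model.
RANGES = {"income": (20, 120), "debt": (0, 60)}


def predict(x):
    """Toy black-box classifier: approve (1) when income outweighs debt."""
    return 1 if x["income"] - 1.5 * x["debt"] >= 40 else 0


def distance(a, b):
    """L1 distance with each feature scaled by the width of its range."""
    return sum(abs(a[k] - b[k]) / (hi - lo) for k, (lo, hi) in RANGES.items())


def random_counterfactuals(query, desired, n_samples=2000, k=3, seed=0):
    """Return up to k sampled points with the desired class, nearest first."""
    rng = random.Random(seed)
    found = []
    for _ in range(n_samples):
        cand = dict(query)
        # Resample a random non-empty subset of features uniformly in range.
        for name in rng.sample(list(RANGES), rng.randint(1, len(RANGES))):
            lo, hi = RANGES[name]
            cand[name] = rng.uniform(lo, hi)
        if predict(cand) == desired:
            found.append(cand)
    # Keep the k valid candidates closest to the original query.
    return sorted(found, key=lambda c: distance(query, c))[:k]
```

For example, `random_counterfactuals({"income": 50, "debt": 20}, desired=1)` returns nearby feature settings that the toy model would approve. This sketch ranks only by proximity; it does not optimize for diversity among the returned counterfactuals.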