Official code for "Rashomon Capacity: A Metric for Predictive Multiplicity in Classification" (NeurIPS 2022) [arXiv: https://arxiv.org/abs/2206.01295]
$ conda create --name <your_env_name> --file requirements.txt
$ source activate <your_env_name>
UCI-Adult/
: raw data adult.data, adult.names, adult.test [1].
COMPAS/
: raw data compas-scores-two-years.csv [2].
HSLS/
: k-NN imputed HSLS dataset [3] (Raw data and pre-processing).
./data
/UCI-Adult/
/adult.csv
/adult.data
/adult.names
/adult.test
/COMPAS/
/compas-scores-two-years.csv
/HSLS/
/hsls_df_knn_impute_past_v2.pkl
/cifar10/
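Before running the scripts, it can help to confirm that the data folder matches the directory tree above. The following is a minimal sketch, not part of the repository: the helper name missing_data_paths and the constants EXPECTED_FILES and EXPECTED_DIRS are hypothetical, and the paths are copied from the tree.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
# The constant and function names are hypothetical, not part of the repository.
EXPECTED_FILES = [
    "UCI-Adult/adult.csv",
    "UCI-Adult/adult.data",
    "UCI-Adult/adult.names",
    "UCI-Adult/adult.test",
    "COMPAS/compas-scores-two-years.csv",
    "HSLS/hsls_df_knn_impute_past_v2.pkl",
]
EXPECTED_DIRS = ["cifar10"]


def missing_data_paths(root="./data"):
    """Return the expected files and folders that are absent under root."""
    root = Path(root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    missing += [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    return missing


if __name__ == "__main__":
    for path in missing_data_paths():
        print("missing:", path)
```

An empty result means every file and folder listed in the tree was found under the given root.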
adult-compas-hsls/
: - command to run: python3 sample-all.py
cifar/
: - command to run: python3 sample-all.py
utils/
: - capacity.py: implementation of the Blahut-Arimoto (BA) algorithm for computing channel capacity.
- training.py:
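For readers unfamiliar with the Blahut-Arimoto algorithm mentioned above, here is a minimal NumPy sketch of the textbook iteration for the capacity of a discrete channel. It is an illustration only and is not taken from capacity.py: the function name blahut_arimoto and the parameters tol and max_iter are assumptions. Each row of the transition matrix is a conditional output distribution; one possible use, assumed here, is to take the class-probability vectors that different models assign to a single sample as the rows.

```python
import numpy as np


def blahut_arimoto(P, tol=1e-9, max_iter=10_000):
    """Capacity in bits of a discrete channel, plus the maximizing input distribution.

    P[i, j] is the probability of output j given input i; each row sums to 1.
    Hypothetical helper for illustration, not the repository's implementation.
    """
    P = np.asarray(P, dtype=float)
    m = P.shape[0]
    r = np.full(m, 1.0 / m)  # input distribution, initialized to uniform
    lower = 0.0
    for _ in range(max_iter):
        q = r @ P  # output marginal induced by r
        # D[i] = KL(P[i] || q) in nats, using the convention 0 * log 0 = 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(P > 0, np.log(P / q), 0.0)
        D = np.sum(P * log_ratio, axis=1)
        lower = np.log(r @ np.exp(D))  # lower bound on the capacity
        upper = D.max()                # upper bound on the capacity
        if upper - lower < tol:
            break
        r = r * np.exp(D)  # multiplicative update of the input distribution
        r /= r.sum()
    return lower / np.log(2), r


if __name__ == "__main__":
    # Binary symmetric channel with crossover probability 0.1.
    capacity, r = blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])
    print(f"{capacity:.3f} bits")  # prints "0.531 bits"
```

The loop stops once the gap between the two standard bounds falls below tol, so the returned value is within tol nats of the true capacity whenever the iteration converges before max_iter. Identical rows give a capacity of 0 bits, and rows with disjoint supports give log2 of the number of rows.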
adult-compas-hsls/
: - command to run: python3 perturb-all.py
cifar/
: - command to run: python3 perturb-all.py
utils/
: - Python function load_data loads the UCI-Adult and COMPAS datasets into pandas DataFrames.
- Python function load_hsls_imputed loads the HSLS dataset into a pandas DataFrame.
- Python function load_cifar10 loads CIFAR-10 [5] into the data/cifar10/ directory.
- Python function perturb_all_weights3 performs adversarial weight perturbation (AWP) on multi-layer perceptrons (MLPs) with the UCI-Adult, COMPAS, and HSLS datasets.
- Python function perturb_all_weights_cv3 performs AWP on convolutional neural networks with the CIFAR-10 dataset.
@inproceedings{
hsu2022rashomon,
title={Rashomon Capacity: A Metric for Predictive Multiplicity in Classification},
author={Hsiang Hsu and Flavio P. Calmon},
booktitle={Advances in Neural Information Processing Systems},
year={2022},
url={https://arxiv.org/abs/2206.01295}
}
[1] Lichman, M. (2013). UCI machine learning repository.
[2] Angwin, J., Larson, J., Mattu, S., and Kirchner, L. (2016). Machine bias. ProPublica.
[3] Ingels, S. J., Pratt, D. J., Herget, D. R., Burns, L. J., Dever, J. A., Ottem, R., Rogers, J. E., Jin, Y., and Leinwand, S. (2011). High School Longitudinal Study of 2009 (HSLS:09): Base-year data file documentation. NCES 2011-328. National Center for Education Statistics.
[4] Semenova, L., Rudin, C., and Parr, R. (2019). A study in Rashomon curves and volumes: A new perspective on generalization and model simplicity in machine learning. arXiv preprint arXiv:1908.01755.
[5] Krizhevsky, A., Hinton, G., et al. (2009). Learning multiple layers of features from tiny images (technical report). University of Toronto.