Sample Selection for Fair and Robust Training

Authors: Yuji Roh, Kangwook Lee, Steven Euijong Whang, and Changho Suh

In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS), 2021

Introduction

This project performs model selection and evaluation for the research work cited above, as part of the Model Selection for Large Scale Learning course at Grenoble INP - Ensimag (academic year 2021/2022).

This directory simulates fair and robust sample selection on a synthetic dataset. Running it requires PyTorch, Jupyter Notebook, and CUDA.
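Before running anything, it can help to confirm that the requirements are present. The following is a minimal sketch, not part of the project: the helper names `check_requirements` and `cuda_available` are hypothetical, and it assumes the Jupyter Notebook package is importable under the module name `notebook`.

```python
import importlib.util


def check_requirements(modules=("torch", "notebook")):
    """Map each module name to True if it can be located, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


def cuda_available():
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check still runs without PyTorch

    return torch.cuda.is_available()


status = check_requirements()
status["cuda"] = cuda_available()
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

Each missing entry points to a package or driver that needs to be installed before running the notebook or main.py.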

Project organization

The directory contains a total of 6 files and 2 child directories:

- this README
- 4 Python files:
  - FairRobustSampler.py defines the FairRobust sampler and a PyTorch dataset for sensitive data
  - models.py contains the logistic regression and SVM architectures, a test function, and a plotting function
  - utils.py contains utility functions for data generation
  - main.py runs the simulation as a script, as an alternative to the notebook
- a report as a Jupyter notebook
- synthetic_data contains 11 NumPy files for the synthetic data, which is composed of a training set, a validation set, and a test set
- datasets contains an xls file related to the real credit card clients dataset

Description

To simulate the algorithm, run the Jupyter notebook, which contains detailed instructions, or run main.py.

The Jupyter notebook loads the data and trains the models under two different fairness metrics: equalized odds and demographic parity.
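To make the two metrics concrete, here is a minimal sketch of how the disparity under each one could be measured for binary predictions. It is not the code used in the notebook: the function names and the toy data are hypothetical, and it uses the standard definitions, where demographic parity compares P(pred = 1 | z) across sensitive groups z, and equalized odds compares P(pred = 1 | y, z) across groups for each true label y.

```python
def positive_rate(preds, mask):
    """Fraction of positive predictions among samples where mask is True.

    Returns 0.0 for an empty selection (a simplifying assumption).
    """
    selected = [p for p, m in zip(preds, mask) if m]
    return sum(selected) / len(selected) if selected else 0.0


def demographic_parity_gap(preds, groups):
    """Largest gap in P(pred = 1 | z) between any two sensitive groups z."""
    rates = [
        positive_rate(preds, [g == z for g in groups])
        for z in sorted(set(groups))
    ]
    return max(rates) - min(rates)


def equalized_odds_gap(preds, labels, groups):
    """Largest gap in P(pred = 1 | y, z) across groups z, maximized over labels y."""
    gaps = []
    for y in sorted(set(labels)):
        rates = [
            positive_rate(
                preds, [g == z and lab == y for g, lab in zip(groups, labels)]
            )
            for z in sorted(set(groups))
        ]
        gaps.append(max(rates) - min(rates))
    return max(gaps)


# Toy data, invented for illustration only: a perfect predictor on two
# groups whose base rates of the positive label differ.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
preds = list(labels)
groups = [0, 0, 0, 0, 1, 1, 1, 1]

dp_gap = demographic_parity_gap(preds, groups)
eo_gap = equalized_odds_gap(preds, labels, groups)
print(f"demographic parity gap: {dp_gap:.2f}")  # 0.50
print(f"equalized odds gap: {eo_gap:.2f}")      # 0.00
```

In this toy case the predictions match the labels exactly, so equalized odds is satisfied (gap 0), while demographic parity is not (gap 0.5) because the two groups have different base rates. This illustrates why the two metrics can lead to different trained models.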