Dealing with Bias and Fairness in Data Science Systems: A Practical Hands-on Tutorial

Presenters

Pedro Saleiro, Feedzai
Kit T. Rodolfa, Carnegie Mellon University
Rayid Ghani, Carnegie Mellon University

Why this tutorial?

Tackling issues of bias and fairness when building and deploying data science systems has received increased attention from the research community in recent years, yet most of the research has focused on theoretical aspects with a very limited set of application areas and data sets. Today, we have a lack of:

Practical training materials
Methodologies to follow when building data science systems that are fair and equitable for the people affected by them
Tools for researchers and developers working on real-world, ML-based decision-making systems to deal with issues of bias and fairness

Today, it is not standard practice for data scientists to treat bias and fairness as primary metrics of interest, or to build, select, and validate models using these metrics. This tutorial is a step towards changing that.

What will we cover?

In this hands-on tutorial we will bridge the gap between research and practice by exploring fairness at the systems and outcomes level, from metrics and definitions to practical case studies, including bias audits (using the Aequitas toolkit) and the impact of various bias reduction strategies. By the end, the audience will be familiar with bias audit and reduction frameworks and tools that will help them make informed design choices guided by the contexts in which their systems will be deployed and used.
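To give a flavour of what a bias audit measures, here is a minimal sketch in plain Python. It is not the Aequitas API and not taken from the tutorial notebooks; the function names, the toy records, and the choice of false positive rate as the metric are all assumptions made for illustration. It computes a metric per group and then compares each group to a reference group.

```python
from collections import defaultdict

def group_false_positive_rates(records):
    """Return {group: FPR}, where FPR = false positives / all true negatives.

    `records` is an iterable of (group, label, prediction) tuples with
    binary labels and predictions (hypothetical input format).
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, label, prediction in records:
        if label == 0:
            negatives[group] += 1
            if prediction == 1:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items()}

def disparities(rates, reference_group):
    """Ratio of each group's rate to the reference group's rate."""
    reference_rate = rates[reference_group]
    if reference_rate == 0:
        raise ValueError("reference group rate is zero; ratio is undefined")
    return {g: rate / reference_rate for g, rate in rates.items()}

# Toy data, invented for this example: (group, true label, model prediction)
records = [
    ("A", 0, 0), ("A", 0, 0), ("A", 0, 0), ("A", 0, 1), ("A", 1, 1), ("A", 1, 0),
    ("B", 0, 1), ("B", 0, 1), ("B", 0, 0), ("B", 0, 0), ("B", 1, 1), ("B", 1, 1),
]

rates = group_false_positive_rates(records)
ratios = disparities(rates, reference_group="A")
print(rates)
print(ratios)
```

Which metric to audit (false positive rate, recall, or something else) depends on the deployment context, which is the kind of choice the tutorial is meant to help with.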

Schedule and Structure

Slides

Colab notebooks

Overall fairness and equity when building Data Science/ML systems
- Fairness in Systems and Outcomes
- (Existing) Baselines
- Sources of Bias in ML Systems
- Hands-on Session: Exploring Sources of Bias in Case Studies
From societal goals to fairness goals to ML fairness metrics
- ML Fairness Metrics
- The Fairness Tree: Mapping from Societal and Fairness Goals to ML Fairness Metrics
- Hands-on Session: Choosing Relevant Fairness Metric(s) in Case Studies
Audit bias and fairness of an ML-based decision-making system
- Overview of Aequitas, an open-source bias and fairness audit tool
- What is needed to audit models and predictions?
- Hands-on Session: Auditing the predictions of an ML model
  - Static Python notebook on GitHub
  - Interactive Colab notebook
Explore bias reduction strategies
- Overview of bias reduction strategies
  - Reducing bias through 'fairness-aware' Model Selection
  - 'Fixing' the data: Sampling approaches
  - 'Fixing' the model: Regularization approaches
  - Post-hoc adjustments to improve fairness
- Case Studies
- Hands-on Session: Try bias reduction strategies
  - 'Fixing' the data: Sampling approaches (static notebook on GitHub, interactive Colab version)
  - 'Fixing' the model: Regularization approaches (static notebook on GitHub, interactive Colab version)
  - Post-hoc adjustments to improve fairness (static notebook on GitHub, interactive Colab version)
Wrap-Up
- Tools
- Resources
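
As a rough illustration of the post-hoc adjustments listed in the schedule above, the sketch below picks a separate score threshold for each group so that every group reaches the same target recall. This is only one possible form of post-hoc adjustment and may differ from what the tutorial notebooks do; the function name, the toy rows, and the 0.75 target are hypothetical.

```python
import math
from collections import defaultdict

def threshold_for_recall(scores, labels, target_recall):
    """Highest threshold t such that predicting positive when score >= t
    captures at least `target_recall` of the true positives."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("group has no positive labels")
    # Number of true positives that must score at or above the threshold.
    k = max(1, math.ceil(target_recall * len(positives)))
    return positives[k - 1]

# Toy data, invented for this example: (group, model score, true label)
rows = [
    ("A", 0.9, 1), ("A", 0.8, 1), ("A", 0.6, 1), ("A", 0.3, 1), ("A", 0.7, 0), ("A", 0.2, 0),
    ("B", 0.7, 1), ("B", 0.5, 1), ("B", 0.4, 1), ("B", 0.2, 1), ("B", 0.6, 0), ("B", 0.1, 0),
]

# Collect scores and labels per group.
by_group = defaultdict(lambda: ([], []))
for group, score, label in rows:
    by_group[group][0].append(score)
    by_group[group][1].append(label)

# One threshold per group, each chosen to reach the same recall target.
thresholds = {
    group: threshold_for_recall(scores, labels, target_recall=0.75)
    for group, (scores, labels) in by_group.items()
}
print(thresholds)
```

With a single shared threshold of 0.6, group B's recall in this toy data would be lower than group A's; the per-group thresholds trade that gap for different cut-offs per group.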

Pre-Requisites

Programming (in Python).
Machine Learning background (understanding of and experience building ML models).
Caring about the world, fairness, and equity.

Resources

Aequitas: Bias Audit Toolkit
Bias and Fairness in ML (book chapter)