This repository provides resources, tools, and notebooks for Fairness, Ethics, and Explainability in AI and ML.
- Introduction
- Classification
- Legal background and normative questions
- Causality
- Testing discrimination in practice
- A broader view of discrimination
- Measurement
- Algorithmic interventions
- Appendix: Technical background
- Berkeley CS 294 (2017): Fairness in machine learning
- Cornell INFO 4270 (2017): Ethics and policy in data science
- Princeton COS 597E (2017): Fairness in machine learning
"Interpretability techniques are normally studied in isolation. We explore the powerful interfaces that arise when you combine them — and the rich structure of this combinatorial space."
OpenAI introduced Microscope – a collection of visualizations of layers and neurons of several common deep learning models that are often studied in interpretability. Microscope makes it easier to analyze the features that form inside these neural networks.
Source: OpenAI
Lime is about explaining what machine learning models are doing. It supports explaining individual predictions for text classifiers, classifiers that act on tables (NumPy arrays of numerical or categorical data), or images, via a package called lime (short for local interpretable model-agnostic explanations). Lime is based on the work presented in the paper "'Why Should I Trust You?': Explaining the Predictions of Any Classifier" (2016) by Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin.
Source: Lime, GitHub
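The snippet below is a minimal sketch of explaining a single text-classifier prediction with lime. The 20 newsgroups data, the scikit-learn pipeline, and the two category names are illustrative choices, not part of the library; any model exposing a `predict_proba`-style function works.

```python
# Minimal sketch: explain one prediction of a scikit-learn text classifier with LIME.
# The dataset, pipeline, and class names are illustrative assumptions.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

categories = ["alt.atheism", "sci.space"]
train = fetch_20newsgroups(subset="train", categories=categories)

# Any black-box model works, as long as it exposes a predict_proba-style function.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
pipeline.fit(train.data, train.target)

explainer = LimeTextExplainer(class_names=categories)
explanation = explainer.explain_instance(
    train.data[0],              # the document to explain
    pipeline.predict_proba,     # black-box prediction function
    num_features=6,             # number of words to include in the explanation
)
print(explanation.as_list())    # [(word, weight), ...] local explanation
```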
Model Interpretability for PyTorch. [Tutorials]
- Supports interpretability of models across modalities including vision, text, and more
- Supports most types of PyTorch models and can be used with minimal modification to the original neural network
- Open source, generic library for interpretability research. Easily implement and benchmark new algorithms
Source: Captum
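Below is a minimal sketch of attributing a prediction to input features with Captum's Integrated Gradients; the toy two-layer model, the random input, and the all-zeros baseline are stand-ins for whatever PyTorch model and data you already have.

```python
# Minimal sketch: feature attribution with Captum's Integrated Gradients.
# The toy model, random input, and zero baseline are illustrative assumptions.
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

    def forward(self, x):
        return self.net(x)

model = ToyModel().eval()
inputs = torch.randn(1, 4, requires_grad=True)   # one example with 4 features
baseline = torch.zeros(1, 4)                     # reference point for attribution

ig = IntegratedGradients(model)
attributions, delta = ig.attribute(
    inputs, baselines=baseline, target=1, return_convergence_delta=True
)
print(attributions)   # per-feature contribution to the class-1 score
print(delta)          # approximation error of the integral
```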
A Framework for Explaining Predictions of NLP Models by Eric Wallace, Jens Tuyls, Junlin Wang, Sanjay Subramanian, Matt Gardner, and Sameer Singh
AllenNLP Interpret is a toolkit built on top of AllenNLP for interactive model interpretations. The toolkit makes it easy to apply gradient-based saliency maps and adversarial attacks to new models, as well as develop new interpretation methods. It contains three components: a suite of interpretation techniques applicable to most models, APIs for developing new interpretation methods (e.g., APIs to obtain input gradients), and reusable front-end components for visualizing the interpretation results.
Source: AllenNLP Interpret
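A rough sketch of the gradient-based saliency workflow follows. The model archive URL is a hypothetical placeholder, and the predictor name and `"sentence"` input key depend on the model you actually load.

```python
# Rough sketch: gradient-based saliency with AllenNLP Interpret.
# The archive URL is a hypothetical placeholder; the predictor name and the
# "sentence" input key depend on the model you actually load.
from allennlp.predictors import Predictor
from allennlp.interpret.saliency_interpreters import SimpleGradient

predictor = Predictor.from_path(
    "https://example.org/sentiment-model.tar.gz",   # placeholder model archive
    predictor_name="text_classifier",
)

interpreter = SimpleGradient(predictor)
saliency = interpreter.saliency_interpret_from_json(
    {"sentence": "a visually stunning but emotionally hollow film"}
)
print(saliency)   # per-token gradient magnitudes, grouped by instance
```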
ELI5 is a Python library which allows you to visualize and debug various machine learning models using a unified API. It has built-in support for several ML frameworks and provides a way to explain black-box models.
- Source: ELI5 Documentation
- Tutorials: Notebook
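As a minimal sketch, the snippet below inspects the learned weights of a scikit-learn linear classifier with ELI5; the iris dataset and logistic-regression model are illustrative choices.

```python
# Minimal sketch: inspect a scikit-learn classifier's weights with ELI5.
# The iris dataset and logistic regression model are illustrative assumptions.
import eli5
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
clf = LogisticRegression(max_iter=1000).fit(iris.data, iris.target)

# explain_weights returns an Explanation object; format_as_text renders it.
explanation = eli5.explain_weights(
    clf,
    feature_names=list(iris.feature_names),
    target_names=list(iris.target_names),
)
print(eli5.format_as_text(explanation))
```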
Google released the ML Fairness Gym (2020), a set of components for building simple simulations that explore potential long-run impacts of deploying machine learning-based decision systems in social environments.
- "Fairness is not Static: Deeper Understanding of Long Term Fairness via Simulation Studies" [Paper]
- [ML Fairness Gym GitHub repository]
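The repository's environments follow the OpenAI Gym reset/step interface, so a simulation reduces to a standard interaction loop. The sketch below is an assumption-laden illustration: the `environments.lending.DelayedImpactEnv` module and class names are taken from the repo's lending example and may differ by version, and it assumes the cloned repository is on `PYTHONPATH`.

```python
# Hedged sketch of an ML Fairness Gym interaction loop.
# ASSUMPTIONS: the ml-fairness-gym repo is cloned and on PYTHONPATH, and the
# environments.lending.DelayedImpactEnv name matches the repo's lending example;
# both may differ across versions.
from environments import lending

env = lending.DelayedImpactEnv()          # simulated lending environment
observation = env.reset()
for _ in range(10):
    action = env.action_space.sample()    # placeholder policy: random accept/reject
    observation, reward, done, info = env.step(action)
    if done:
        observation = env.reset()
```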