Want to make your AI safer? Wondering how to avoid ethical biases, errors, privacy leaks, or robustness issues in your AI models?
This repository contains a curated list of papers & technical articles on AI Quality & Safety that should help.
You can browse papers by Machine Learning task category, and use hashtags like #Robustness to explore AI risk types.
- Tabular Machine Learning
- Natural Language Processing
- Computer Vision
- Recommendation System
- Time Series
- General ML Testing
- Machine Learning Model Drift Detection Via Weak Data Slices (Ackerman et al., 2021)
#DataSlice
#Debugging
#Drift
- Automated Data Slicing for Model Validation: A Big Data - AI Integration Approach (Chung et al., 2020)
#DataSlice
- Interacting with Predictions: Visual Inspection of Black-box Machine Learning Models (Krause et al., 2016)
#Explainability
- Beyond Accuracy: Behavioral Testing of NLP Models with CheckList (Ribeiro et al., 2020)
#Robustness
- Pipelines for Social Bias Testing of Large Language Models (Nozza et al., 2022)
#Bias
#Ethics
- "Why Should I Trust You?": Explaining the Predictions of Any Classifier (Ribeiro et al., 2016)
#Explainability
- A Unified Approach to Interpreting Model Predictions (Lundberg et al., 2017)
#Explainability
- Anchors: High-Precision Model-Agnostic Explanations (Ribeiro et al., 2018)
#Explainability
- Explanation-Based Human Debugging of NLP Models: A Survey (Lertvittayakumjorn et al., 2021)
#Debugging
- SEAL: Interactive Tool for Systematic Error Analysis and Labeling (Rajani et al., 2022)
#DataSlice
#Explainability
- Holistic Evaluation of Language Models (Liang et al., 2022)
#General
- Learning to summarize from human feedback (Stiennon et al., 2020)
#HumanFeedback
- Domino: Discovering Systematic Errors with Cross-Modal Embeddings (Eyuboglu et al., 2022)
#DataSlice
- Explaining in Style: Training a GAN to explain a classifier in StyleSpace (Lang et al., 2022)
#Robustness
- Model Assertions for Debugging Machine Learning (Kang et al., 2018)
#Debugging
Contributions are welcome.
- Machine learning testing: Survey, landscapes and horizons (Zhang et al., 2020)
#General
- Quality Assurance for AI-based Systems: Overview and Challenges (Felderer et al., 2021)
#General
- Metamorphic testing of decision support systems: A case study (Kuo et al., 2010)
#Robustness
- A Survey on Metamorphic Testing (Segura et al., 2016)
#Robustness
- Testing and validating machine learning classifiers by metamorphic testing (Xie et al., 2011)
#Robustness
- The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction (Breck et al., 2017)
#General
- The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective (Krishna et al., 2022)
#Explainability
- InterpretML: A Unified Framework for Machine Learning Interpretability (Nori et al., 2019)
#Explainability
#General