
Interpretable_ML_tabular_and_image

Employ two popular model-agnostic interpretable machine learning methods, LIME and SHAP, to analyze and interpret model behavior in tabular data classification and image data classification tasks.

Problem: How to enhance the interpretability of powerful black-box models?

🔑 Importance: Bridge the gap between complex machine learning models and human understanding, enhancing transparency, trust, and accountability in the decision-making process.

🎉 Achievements: Apply LIME and SHAP to shed light on the black-box nature of machine learning models, providing insights into how they make predictions for both tabular and image data classification problems.

💪 Skills: Python, PyTorch, Deep learning, XAI

📚 Insights: Understanding the inner workings of complex machine learning models and identifying the important features and patterns that drive their predictions, which facilitates model debugging, feature engineering, and the detection of potential biases or limitations.

Problem statement:

What is Interpretability? "Interpretability is the measure of how well a user can correctly and efficiently predict the model's results"

Models

Intrinsically Interpretable Models

  • Regression: the learned coefficients can be read directly as feature effects
  • Decision Tree: the fitted decision rules can be inspected path by path (see the sketch below)
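
As a point of reference, here is a minimal sketch of reading an interpretable model's structure directly. It is not from this repository; scikit-learn's toy breast-cancer dataset is used as a stand-in.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy stand-in dataset (an assumption, not this repo's data).
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Logistic regression: each coefficient is the change in log-odds per unit of the feature.
lr = LogisticRegression(max_iter=5000).fit(X, y)
for name, coef in sorted(zip(X.columns, lr.coef_[0]), key=lambda t: -abs(t[1]))[:5]:
    print(f"{name:>25s}: {coef:+.3f}")

# Shallow decision tree: the learned decision rules can be printed and read directly.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```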

Model-Agnostic Methods

The benefit of using Model-Agnostic Methods:

The great advantage of model-agnostic interpretation methods over model-specific ones is their flexibility: developers are free to use any machine learning model, because the interpretation method can be applied to any of them. Anything built on top of such an interpretation, such as a graphic or a user interface, also becomes independent of the underlying machine learning model.

Global

  • Partial Dependence Plot (PDP): shows the average (marginal) effect of one or two features on the model's prediction
  • Accumulated Local Effects (ALE) Plot: an alternative to PDP that averages local changes in the prediction and remains reliable when features are correlated (a PDP sketch follows below)
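
A minimal PDP sketch using scikit-learn's PartialDependenceDisplay; the dataset and model below are toy stand-ins, not this repository's. ALE plots are not part of scikit-learn and would need a separate library (e.g. alibi or PyALE).

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import PartialDependenceDisplay

# Toy stand-in for the black-box model and data (assumptions, not this repo's code).
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# PDP: sweep each chosen feature over a grid and average the model's
# predictions over the rest of the data at every grid point.
PartialDependenceDisplay.from_estimator(model, X, features=["mean radius", "worst texture"])
plt.tight_layout()
plt.show()
```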

Local

  • Local Interpretable Model-agnostic Explanations (LIME): fits a simple, interpretable surrogate model in the neighborhood of a single prediction
  • SHapley Additive exPlanations (SHAP): attributes a prediction to individual features using Shapley values from cooperative game theory

Applications:

Tabular data prediction:

Dataset

Stroke Prediction

Pipeline

[Figure] Pipeline for tabular data
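
As a rough illustration of the LIME step in this pipeline (a sketch only, not the repository's code: scikit-learn's breast-cancer data stands in for the stroke dataset, and a random forest stands in for the trained black box):

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in data and black-box model (assumptions, not this repo's stroke setup).
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

explainer = LimeTabularExplainer(
    training_data=X_train,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# LIME perturbs one row, queries the black box on the perturbations,
# and fits a locally weighted linear model to explain that prediction.
exp = explainer.explain_instance(X_test[0], model.predict_proba, num_features=8)
print(exp.as_list())  # (feature condition, local weight) pairs
```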

Result

[Figure] LIME explanation for tabular data
[Figure] SHAP explanation for tabular data
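
A corresponding SHAP sketch on the same stand-in setup (again an assumption, not the repository's code). TreeExplainer applies here because the stand-in model is a tree ensemble; KernelExplainer or the Permutation explainer would cover an arbitrary black box.

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Same stand-in setup as the LIME sketch above (assumptions, not this repo's code).
data = load_breast_cancer(as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Exact SHAP values for tree ensembles; the result has shape (rows, features, classes).
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)

# Beeswarm summary of per-feature contributions toward the positive class.
shap.plots.beeswarm(shap_values[:, :, 1])
```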

Image data classification

Dataset

Kaggle-Dog Breed Identification

Pipeline

[Figure] Pipeline for image data
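
A rough sketch of the LIME step for an image model (not the repository's code): a pretrained torchvision ResNet-18 stands in for the dog-breed classifier, and `img` is assumed to be an RGB numpy array of shape (224, 224, 3) scaled to [0, 1]; input normalization is omitted for brevity.

```python
import numpy as np
import torch
import torchvision
from lime.lime_image import LimeImageExplainer
from skimage.segmentation import mark_boundaries

# Stand-in classifier (an assumption, not this repo's dog-breed model).
model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()

def batch_predict(images):
    """LIME passes a batch of HxWxC numpy images; return class probabilities."""
    x = torch.tensor(images, dtype=torch.float32).permute(0, 3, 1, 2)
    with torch.no_grad():
        return torch.softmax(model(x), dim=1).numpy()

explainer = LimeImageExplainer()
explanation = explainer.explain_instance(
    img.astype(np.double),   # `img` is an assumed input image, not defined in this repo
    batch_predict,
    top_labels=5,            # explain the 5 highest-scoring classes
    hide_color=0,
    num_samples=1000,        # number of superpixel-masked perturbations
)

# Highlight the superpixels that most support the top predicted class.
image, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
overlay = mark_boundaries(image, mask)
```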

Result

[Figure] LIME explanation for image data
[Figure] SHAP explanation for image data
