
Python Library for Model Agnostic Interpretation


Skater

Skater is a Python package for model-agnostic interpretation of predictive models. With Skater, you can unpack the internal mechanics of arbitrary models: as long as you can obtain inputs and a function that returns outputs, you can use Skater to learn about a model's internal decision policies.

The package was originally developed by Aaron Kramer, Pramit Choudhary, and the internal data science team at DataScience.com to help practitioners explain and interpret predictive "black boxes" in a human-interpretable way.


📖 Documentation

Overview: Introduction to the Skater library
Installing: How to install the Skater library
Tutorial: Steps to use Skater effectively
API Reference: The detailed reference for Skater's API
Contributing: Guide to contributing to the Skater project

💬 Feedback/Questions

Feature requests/bugs: GitHub issue tracker
Usage questions: Gitter chat
General discussion: Gitter chat

Install Skater

Dependencies

Skater relies on numpy, pandas, scikit-learn, and the DataScience.com fork of the LIME package. Plotting functionality requires matplotlib, but matplotlib is not needed to install the package. Currently we only distribute to PyPI, though adding a conda distribution is on the roadmap.

pip

When using pip, it is recommended that you install into a virtual environment (virtualenv or a conda environment) so that the installation does not modify your system Python.

pip install -U Skater
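
For example, you can create and activate a fresh environment with virtualenv before installing (the environment name skater-env below is only an illustration):

virtualenv skater-env
source skater-env/bin/activate
pip install -U Skater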

Usage

The code below illustrates a typical workflow with the Skater package.

import numpy as np
from scipy.stats import norm

# Generate some synthetic regression data: three features,
# linear coefficients B, and normally distributed noise on the target.
B = np.random.normal(0, 10, size=3)
X = np.random.normal(0, 10, size=(1000, 3))
feature_names = ["feature_{}".format(i) for i in range(3)]
e = norm(0, 5)
y = np.dot(X, B) + e.rvs(1000)
example = X[0]

# Fit a model to the data.
from sklearn.ensemble import RandomForestRegressor
regressor = RandomForestRegressor()
regressor.fit(X, y)

# Partial dependence: how predictions change as selected features vary.
from skater.core.explanations import Interpretation
from skater.model import InMemoryModel
interpreter = Interpretation()
interpreter.load_data(X, feature_names=feature_names)
model = InMemoryModel(regressor.predict, examples=X)
interpreter.partial_dependence.plot_partial_dependence([feature_names[0], feature_names[1]],
                                                        model)

# Local interpretation: explain a single prediction with LIME.
from skater.core.local_interpretation.lime.lime_tabular import LimeTabularExplainer
explainer = LimeTabularExplainer(X, feature_names=feature_names)
explainer.explain_instance(example, regressor.predict).show_in_notebook()
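
Beyond partial dependence and local explanations, the Interpretation object also exposes model-agnostic feature importance. The snippet below is a minimal sketch that reuses the interpreter and model objects created above; the feature_importance sub-module and its method name are assumed to match this version of the Skater API.

# Feature importance: rank features by their estimated effect on predictions
# (assumes the interpreter and model objects defined above).
importances = interpreter.feature_importance.feature_importance(model)
print(importances)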

Testing

1. If the repo is cloned:

python skater/tests/all_tests.py

2. If installed via pip:

python -c "from skater.tests.all_tests import run_tests; run_tests()"