mlinspect

Inspect ML Pipelines in Python in the form of a DAG

Run mlinspect locally

Prerequisite: Python 3.9

  1. Clone this repository

  2. Set up the environment

    cd mlinspect
    python -m venv venv
    source venv/bin/activate

  3. If you want to use the visualisation functions we provide, install graphviz, which cannot be installed via pip

    Linux: apt-get install graphviz
    macOS: brew install graphviz

  4. Install pip dependencies

    pip install -e ".[dev]"

  5. To ensure everything works, you can run the tests (without graphviz, the visualisation test will fail)

    python setup.py test
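Before running the steps above, it can help to confirm that the interpreter and graphviz are in place. The following is a minimal sketch, not part of mlinspect: the function name `check_environment` is hypothetical, and it assumes that graphviz exposes its usual `dot` executable on the PATH. It only reports what it finds; whether Python versions other than 3.9 work is not stated here.

```python
import shutil
import sys

# Taken from the "Prerequisite: Python 3.9" line of this README.
REQUIRED_PYTHON = (3, 9)


def check_environment():
    """Report the interpreter version and whether graphviz's `dot` is on PATH.

    Hypothetical helper for illustration; mlinspect does not ship it.
    """
    python_version = sys.version_info[:2]
    return {
        "python_version": python_version,
        "python_matches_prerequisite": python_version == REQUIRED_PYTHON,
        # shutil.which returns the executable's path, or None if it is not found.
        "graphviz_dot_path": shutil.which("dot"),
    }


report = check_environment()
print("Python version:", ".".join(str(part) for part in report["python_version"]))
print("Matches prerequisite:", report["python_matches_prerequisite"])
print("graphviz dot:", report["graphviz_dot_path"] or "not found (visualisation test will fail)")
```

If `dot` is reported as not found, the tests in step 5 can still be run, but the visualisation test is expected to fail as noted above.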

How to use mlinspect

mlinspect makes it easy to analyze your pipeline and automatically check for common issues.

    from mlinspect import PipelineInspector
    from mlinspect.inspections import MaterializeFirstOutputRows
    from mlinspect.checks import NoBiasIntroducedFor

    # Path to the notebook that contains the pipeline to inspect
    IPYNB_PATH = ...

    # Configure the inspector with one inspection and one check, then execute it
    inspector_result = PipelineInspector\
            .on_pipeline_from_ipynb_file(IPYNB_PATH)\
            .add_required_inspection(MaterializeFirstOutputRows(5))\
            .add_check(NoBiasIntroducedFor(['race']))\
            .execute()

    # The extracted DAG, and the results keyed by DAG node and by check
    extracted_dag = inspector_result.dag
    dag_node_to_inspection_results = inspector_result.dag_node_to_inspection_results
    check_to_check_results = inspector_result.check_to_check_results
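The three result attributes can then be walked like ordinary mappings. The sketch below is a hedged illustration only: it assumes, as the attribute names suggest, that `dag_node_to_inspection_results` maps each DAG node to a mapping of inspection to result, and that `check_to_check_results` maps each check to its result. The helper `summarize_results` and the placeholder data are hypothetical and are not part of mlinspect.

```python
from typing import Any, List, Mapping


def summarize_results(
    dag_node_to_inspection_results: Mapping[Any, Mapping[Any, Any]],
    check_to_check_results: Mapping[Any, Any],
) -> List[str]:
    """Return one readable line per (node, inspection) pair and per check.

    Assumes dict-like inputs; hypothetical helper for illustration.
    """
    lines = []
    for dag_node, inspection_results in dag_node_to_inspection_results.items():
        for inspection, result in inspection_results.items():
            lines.append(f"node {dag_node} | {inspection}: {result!r}")
    for check, check_result in check_to_check_results.items():
        lines.append(f"check {check}: {check_result!r}")
    return lines


# Placeholder stand-ins so the sketch runs without mlinspect installed.
# These are NOT real mlinspect outputs.
placeholder_node_results = {"node_0": {"inspection_a": ["row_1", "row_2"]}}
placeholder_check_results = {"check_a": "result_a"}

for line in summarize_results(placeholder_node_results, placeholder_check_results):
    print(line)
```

With a real run, the same function could be called as `summarize_results(dag_node_to_inspection_results, check_to_check_results)`, provided the assumption about the mapping shapes holds.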

Detailed Example

We prepared a demo notebook to showcase mlinspect and its features.

Supported libraries and API functions

mlinspect already supports a selection of API functions from pandas and scikit-learn. Extending it to cover more API functions and libraries is an ongoing effort. However, mlinspect does not simply crash when it encounters functions it does not recognize yet. For more information, please see here.

Notes

  • For debugging in PyCharm, set the pytest flag --no-cov

Publications

License

This library is licensed under the Apache 2.0 License.