Data Drift Monitor: Comprehensive Analysis Extended

This ModelOp Center monitor runs the Kolmogorov-Smirnov, Epps-Singleton, Jensen-Shannon, and Kullback-Leibler data drift tests, along with a Pandas summary, on the input data so that their results can be compared side by side. This "extended" version also provides detailed metrics for each unique value of each categorical feature.
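
To illustrate what per-value metrics for a categorical feature can look like, the following is a minimal sketch that assumes the metric of interest is the relative frequency of each unique value in the two datasets. The function name `categorical_value_frequencies` and the example data are hypothetical; the monitor's actual per-value metrics may differ.

```python
import pandas as pd


def categorical_value_frequencies(baseline: pd.Series, sample: pd.Series) -> pd.DataFrame:
    """Relative frequency of every unique value in baseline vs. sample (hypothetical helper)."""
    b = baseline.value_counts(normalize=True)
    s = sample.value_counts(normalize=True)
    # Align on the union of values; a value missing from one dataset gets frequency 0.
    table = pd.concat([b, s], axis=1, keys=["baseline", "sample"]).fillna(0.0)
    table["difference"] = table["sample"] - table["baseline"]
    return table.sort_index()


# Made-up data for illustration only.
baseline_col = pd.Series(["a", "a", "b", "c"])
sample_col = pd.Series(["a", "b", "b", "d"])
table = categorical_value_frequencies(baseline_col, sample_col)
print(table)
```

Values that appear in only one dataset (here `c` and `d`) show up with a frequency of 0 on the other side, which is often the first sign of categorical drift.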

Input Assets

Type	Number	Description
Baseline Data	1	A dataset corresponding to training/reference data
Sample Data	1	A dataset corresponding to a slice of production data

Assumptions & Requirements

The underlying BUSINESS_MODEL being monitored has a job json asset.
Input data contains at least one numerical column or one categorical column.

Execution

The init function accepts the job json asset and validates the input schema (corresponding to the BUSINESS_MODEL being monitored).
The metrics function instantiates the Data Drift Monitor class and uses the job json asset to determine the numerical_columns and/or categorical_columns.
The Epps-Singleton, Jensen-Shannon, Kullback-Leibler, Kolmogorov-Smirnov, and Pandas.describe() data drift tests are run on the baseline and sample data.
Each test result is appended to the list of data_drift tests to be returned by the model.
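
The statistical tests named above can be approximated outside of ModelOp Center with SciPy. The sketch below is not the monitor's implementation: the function name `drift_metrics`, the 10-bin histogram used for the two divergences, the small smoothing constant, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import jensenshannon
from scipy.stats import entropy, epps_singleton_2samp, ks_2samp


def drift_metrics(baseline: pd.Series, sample: pd.Series, bins: int = 10) -> dict:
    """Compare one numerical column across two datasets (hypothetical helper)."""
    b = baseline.dropna().to_numpy(dtype=float)
    s = sample.dropna().to_numpy(dtype=float)

    # The two-sample tests operate directly on the raw values.
    ks = ks_2samp(b, s)
    es = epps_singleton_2samp(b, s)

    # The divergences need discrete distributions, so both datasets are
    # histogrammed on shared bin edges (binning choice is an assumption).
    edges = np.histogram_bin_edges(np.concatenate([b, s]), bins=bins)
    eps = 1e-9  # avoids empty bins, which would make KL infinite
    p = np.histogram(b, bins=edges)[0] + eps
    q = np.histogram(s, bins=edges)[0] + eps
    p, q = p / p.sum(), q / q.sum()

    return {
        "kolmogorov_smirnov_pvalue": float(ks.pvalue),
        "epps_singleton_pvalue": float(es.pvalue),
        # SciPy returns the Jensen-Shannon distance (square root of the divergence).
        "jensen_shannon_distance": float(jensenshannon(p, q)),
        "kullback_leibler_divergence": float(entropy(p, q)),
        "pandas_summary": {
            "baseline": baseline.describe().to_dict(),
            "sample": sample.describe().to_dict(),
        },
    }


# Synthetic data: the sample is shifted by one standard deviation.
rng = np.random.default_rng(0)
baseline_col = pd.Series(rng.normal(0.0, 1.0, 500))
sample_col = pd.Series(rng.normal(1.0, 1.0, 500))
result = drift_metrics(baseline_col, sample_col)
print(result["kolmogorov_smirnov_pvalue"], result["jensen_shannon_distance"])
```

Small p-values from the Kolmogorov-Smirnov and Epps-Singleton tests suggest the two distributions differ, while larger Jensen-Shannon and Kullback-Leibler values indicate a greater divergence between the binned distributions.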

Monitor Output

{
    "data_drift": [
        <epps_singleton_test_result>,
        <jensen_shannon_test_result>,
        <kullback_leibler_test_result>,
        <kolmogorov_smirnov_test_result>,
        <pandas_summary_test_result>
    ]
}
