mikekeith52/scalecast

Feature Imps not storing with Forecaster.tune_test_forecast

John-Miller12 opened this issue · 3 comments

Hello,

Thank you for your development and support of this valuable package.

I cannot get feature_importance=True and summary_stats=True to behave as expected. All models seem to be affected.

I upgraded scalecast with --upgrade yesterday.

Environment:

  • Mac with M2 Pro and Sonoma 14.0
  • JupyterLab running under AMD64 emulation on Docker, using the up-to-date :latest tag with no other package installs (that I can think of, at least)

My forecaster objects are generated by:

import pandas as pd
import numpy as np
from scalecast.Forecaster import Forecaster
from scalecast import GridGenerator
from scalecast.multiseries import export_model_summaries
from datetime import datetime
import matplotlib.pyplot as plt
import seaborn as sns

# dfsd is defined elsewhere (not shown); each value has a GROSS_COUNT column
forecasters = {}
for k, v in dfsd.items():
    f = Forecaster(y=v['GROSS_COUNT'], current_dates=v.index)
    f.generate_future_dates(360)
    f.set_test_length(.1)
    f.set_validation_length(28)
    f.add_seasonal_regressors(
        'dayofyear',
        'week',
        'month',
        'quarter',
        raw=False,
        sincos=True,
    )
    f.add_seasonal_regressors('year')
    f.add_time_trend()
    f.add_covid19_regressor(end=datetime(2022,2,1,0,0))
    forecasters[k] = f

tune_test_forecast is looped with:

for f in forecasters.values():
    f.tune_test_forecast(models, feature_importance=True, summary_stats=True)

The warnings and errors I get are below (Prophet's verbose INFO output has been removed):

/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2729: Warning: rf does not have summary stats.
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2705: Warning: Cannot set pfi feature importance on rf. Here is the error: cannot import name 'if_delegate_has_method' from 'sklearn.utils.metaestimators' (/opt/conda/lib/python3.11/site-packages/sklearn/utils/metaestimators.py)
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2729: Warning: gbt does not have summary stats.
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2705: Warning: Cannot set pfi feature importance on gbt. Here is the error: cannot import name 'if_delegate_has_method' from 'sklearn.utils.metaestimators' (/opt/conda/lib/python3.11/site-packages/sklearn/utils/metaestimators.py)
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2729: Warning: xgboost does not have summary stats.
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2705: Warning: Cannot set pfi feature importance on xgboost. Here is the error: cannot import name 'if_delegate_has_method' from 'sklearn.utils.metaestimators' (/opt/conda/lib/python3.11/site-packages/sklearn/utils/metaestimators.py)
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2729: Warning: prophet does not have summary stats.
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2705: Warning: Cannot set pfi feature importance on prophet. Here is the error: 'Forecaster' object has no attribute 'X'
  warnings.warn(

I would expect feature importance to be available for most of these sklearn models. Can you please help me understand what is going wrong?

That is not working as expected. Give me a little bit to look into it. Thanks for raising the issue!

After investigating, I am sure that the root of the problem is the eli5 library. See this issue. eli5 still tries to import if_delegate_has_method from sklearn.utils.metaestimators, which the installed scikit-learn no longer provides, hence the import error in the warnings above. I can't say for sure whether the developers of that package will ever update it to work with newer versions of scikit-learn, so a workaround may be needed in scalecast. I'm not sure what that would be, since scikit-learn 1.3.1 is needed for some things in scalecast. If you need feature importance while I try to figure something out, you can try setting method = 'shap' when using feature importance, but I believe it only works for tree-based models right now.
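For anyone who wants to confirm that their environment is affected before changing anything, the check below is a minimal sketch. The helper name has_symbol is hypothetical (it is not part of scalecast, eli5, or scikit-learn); it only tests whether the name from the warning can still be found in the module the warning points at.

```python
import importlib


def has_symbol(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`.

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed, or it fails to import here.
        return False
    return hasattr(module, attr)


# The module and name are taken from the warning text in this issue.
available = has_symbol("sklearn.utils.metaestimators", "if_delegate_has_method")
print("if_delegate_has_method available:", available)
```

If this prints False while scikit-learn is installed, the pfi warnings shown in this issue are the expected result of the missing import rather than a problem with the models or the data.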

The fix for this is in 0.19.4. See the new save_feature_importance() documentation. Feature importance has been expanded in the package to include the two types previously offered, plus five additional methods, all through the shap package.
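To check whether an environment already has the release mentioned above, something like the following may help. It is a sketch using only the standard library; the helper names (parse_version, meets_minimum, installed_version) are hypothetical, and the version parsing is deliberately simplified: it compares only the leading numeric parts and stops at suffixes such as rc1.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Convert 'X.Y.Z' into a tuple of ints, stopping at the first non-numeric part."""
    parts = []
    for part in text.split("."):
        if not part.isdigit():
            break
        parts.append(int(part))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum` (simplified comparison)."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


current = installed_version("scalecast")
if current is None:
    print("scalecast is not installed in this environment")
else:
    # 0.19.4 is the release named in this thread as containing the fix.
    print(f"scalecast {current}; includes the fix: {meets_minimum(current, '0.19.4')}")
```

If the reported version is older than 0.19.4, upgrading inside the same environment (for example, inside the Docker container that runs JupyterLab) should pick up the reworked feature importance.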

If you agree with the fix, we will close the issue.