Nixtla/hierarchicalforecast

Encountering LinAlgError despite validating data with isfinite() and isnan()

DennisToma opened this issue · 3 comments

I am working on a school project for which I wrote a wrapper around the hierarchicalforecast library. The wrapper lets team members use the library without preprocessing the data themselves, and lets them enable or disable aggregation, since we forecast at all levels. However, the .reconcile() call fails with the following output:

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/statsmodels/stats/moment_helpers.py:252: RuntimeWarning: invalid value encountered in true_divide
  corr = cov / np.outer(std_, std_)
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/hierarchicalforecast/methods.py:413: RuntimeWarning: invalid value encountered in true_divide
  shape residuals_insample (n_hiers, obs)
397 self.Y_rec = self.hrec.reconcile(
    398     Y_hat_df=self.Y_hat_reshaped,
    399     Y_df=self.Y_fitted_df,
    400     S=self.summation_matrix,
    401     tags=self.tags)
    402 if self.Y_rec[self.forecasting_algorithm_used].equals(self.Y_rec.columns[-1]):
    403     print('Reconciliation did not change the prediction.')
...
    206 for a in arrays:
    207     if not isfinite(a).all():
--> 208         raise LinAlgError("Array must not contain infs or NaNs")

LinAlgError: Array must not contain infs or NaNs

I have already confirmed that the data is valid by using the following code:

# Check that no value is infinite or NaN.
# np.isfinite is False for NaN, inf and -inf.
assert self.Y_fitted_df['y'].apply(
    np.isfinite).all(), 'Values in Y_insample are not finite'
assert self.Y_fitted_df[self.forecasting_algorithm_used].apply(
    np.isfinite).all(), 'Values in Y_hat_insample are not finite'
assert self.Y_hat_reshaped[self.forecasting_algorithm_used].apply(
    np.isfinite).all(), 'Values in Y_hat are not finite'
# Fail if any value is NaN.
assert not self.Y_fitted_df['y'].apply(
    np.isnan).any(), 'Some values in Y_insample are NaN'
assert not self.Y_fitted_df[self.forecasting_algorithm_used].apply(
    np.isnan).any(), 'Some values in Y_hat_insample are NaN'
assert not self.Y_hat_reshaped[self.forecasting_algorithm_used].apply(
    np.isnan).any(), 'Some values in Y_hat are NaN'

As this is my first issue report, I kindly request your understanding and patience with my writing. Thank you in advance for any assistance you may provide in resolving this issue.

Hi @DennisToma,

From the logs, it looks like this line is where things go wrong:

corr = cov / np.outer(std_, std_)

This suggests that some of your series may have std = 0, i.e. they are constant.
Some of the reconciliation methods have trouble with series that have no variance.
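
A minimal sketch of how a zero standard deviation turns into NaNs, using made-up numbers rather than the data from this issue. Only the `corr = cov / np.outer(std_, std_)` expression comes from the warning above; the `residuals` array is an assumption for illustration.

```python
import numpy as np

# Made-up residuals for three series; the second row is constant (zero variance).
residuals = np.array([
    [0.5, -1.0, 0.3, 0.2],
    [2.0, 2.0, 2.0, 2.0],
    [-0.4, 0.9, -0.1, 0.6],
])

cov = np.cov(residuals)        # (3, 3) covariance matrix, one row per series
std_ = np.sqrt(np.diag(cov))   # the constant row has a standard deviation of 0

# Same expression as in the RuntimeWarning; 0 / 0 evaluates to NaN.
with np.errstate(divide="ignore", invalid="ignore"):
    corr = cov / np.outer(std_, std_)

has_nan = bool(np.isnan(corr).any())
print(std_)
print(has_nan)
```

Any linear algebra routine that requires finite input would then reject `corr`, which would be consistent with the LinAlgError in the traceback.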
Some possible solutions:

  • Add a small amount of random Gaussian noise to them, which acts almost like a ridge regression regularizer.
  • On our side, add checks that detect incompatibilities between constant series and reconciliation methods in advance.
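
The noise idea could look roughly like the sketch below. `add_jitter`, its `scale` and `seed` parameters, and the `model` column name are hypothetical and not part of hierarchicalforecast; the scale should be chosen small relative to the magnitude of your data.

```python
import numpy as np
import pandas as pd

def add_jitter(df, columns, scale=1e-6, seed=0):
    """Return a copy of df with small Gaussian noise added to the given columns.

    Hypothetical helper for illustration; not part of hierarchicalforecast.
    """
    rng = np.random.default_rng(seed)
    out = df.copy()
    for col in columns:
        # Independent noise per column, so y - fitted is no longer constant.
        out[col] = out[col] + rng.normal(0.0, scale, size=len(out))
    return out

# Toy frame with a constant series and a constant fitted column.
toy = pd.DataFrame({
    "unique_id": ["a"] * 4,
    "y": [5.0] * 4,
    "model": [5.0] * 4,
})
jittered = add_jitter(toy, ["y", "model"])
print(jittered["y"].std(), (jittered["y"] - jittered["model"]).std())
```

The noise is applied to a copy, so the original frame stays available for comparison with the reconciled output.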

Would you be able to confirm the diagnosis?
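
One way to check this is sketched below. It assumes a long-format frame with a `unique_id` column (or index), a `y` column and one fitted-values column; `zero_variance_series` and the `model` column name are hypothetical. Since the log mentions `residuals_insample`, the sketch also checks the in-sample residuals (`y` minus fitted values), on the assumption that those are what the correlation is computed from: a series can vary while its residuals do not.

```python
import numpy as np
import pandas as pd

def zero_variance_series(fitted_df, model_col, id_col="unique_id",
                         target_col="y", tol=1e-12):
    """List series ids whose target or in-sample residuals have (near-)zero std.

    Hypothetical helper for illustration; not part of hierarchicalforecast.
    """
    # Accept the id either as a column or as the index.
    df = fitted_df if id_col in fitted_df.columns else fitted_df.reset_index()
    resid = df[target_col] - df[model_col]
    stats = pd.DataFrame({
        "y_std": df.groupby(id_col)[target_col].std(ddof=0),
        "resid_std": resid.groupby(df[id_col]).std(ddof=0),
    })
    flagged = stats[(stats["y_std"] <= tol) | (stats["resid_std"] <= tol)]
    return flagged.index.tolist()

# Toy data: series "a" varies, but its fitted values match it exactly,
# so its residuals are all zero.
toy = pd.DataFrame({
    "unique_id": ["total"] * 3 + ["a"] * 3 + ["b"] * 3,
    "y": [10, 12, 11, 4, 5, 6, 6, 7, 5],
    "model": [9.5, 12.5, 11.2, 4.0, 5.0, 6.0, 6.1, 6.8, 5.3],
})
flagged = zero_variance_series(toy, "model")
print(flagged)
```

An empty list would argue against the constant-series explanation, at least for the columns checked here.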

Hi @kdgutier,

Thank you for your fast reply. Unfortunately, I cannot confirm the diagnosis. The standard deviation of all three series is greater than 0, none of the values are zero or negative, and none of the series is constant.

I also tried splitting the series into subsets and running each subset separately, and this ran without any errors.

Thank you for your assistance in this matter.

What was the solution @DennisToma?