[enhancement]: HierarchicalForecast runs very slowly on big data

Question

Opened this issue a year ago · 1 comment

What happened + What you expected to happen

On a Databricks cluster, the code runs very slowly. How can I use a multiprocessing pool-based mechanism to speed up the training process?

Versions / Dependencies

Latest (YM: 202306)

Reproduction script

Take any dataset with:
- 5 levels
- Y_test_df.shape: (500, 3)
- Y_train_df.shape: (6000, 3)

The following default code runs slowly:

Y_hat_df = fcst.forecast(h=4, fitted=True)
Y_fitted_df = fcst.forecast_fitted_values()

Issue Severity

High: It blocks me from completing my task.

Answer 1 · 2023-06-12T13:51:42.000Z

Hi @vinaybridge,

Thanks for using HierarchicalForecast.
For the moment, I don't recommend using a Databricks cluster for anything except creating the base predictions with StatsForecast.

The HierarchicalForecast library does not yet include parallelized reconciliation methods. Additionally, HierarchicalForecast requires you to load the entire set of series into a single machine's memory, because the reconciliation methods need access to all the base forecasts simultaneously.
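
To illustrate why reconciliation needs every base forecast at once, here is a minimal pure-Python sketch of OLS-style reconciliation for a toy two-level hierarchy (one total and its bottom-level series). This is not HierarchicalForecast's implementation; the function name `ols_reconcile_two_level` and the numbers are hypothetical and chosen only for illustration.

```python
def ols_reconcile_two_level(total_hat, bottom_hat):
    """Reconcile one total forecast with its bottom-level forecasts.

    Illustrative only: orthogonally projects the base forecasts onto the
    coherent set {total == sum(bottom)}, which corresponds to OLS
    reconciliation for a hierarchy with a single aggregation constraint.
    """
    n = len(bottom_hat)
    # Incoherence: how far the total forecast is from the sum of the bottom forecasts.
    gap = total_hat - sum(bottom_hat)
    # The projection spreads the gap equally over all n + 1 series.
    share = gap / (n + 1)
    total_tilde = total_hat - share
    bottom_tilde = [b + share for b in bottom_hat]
    return total_tilde, bottom_tilde


# Hypothetical base forecasts for one horizon step: the total (100) disagrees
# with the sum of the bottom-level forecasts (60 + 30 = 90).
total, bottom = ols_reconcile_two_level(100.0, [60.0, 30.0])
print(total, bottom)
```

Each reconciled value depends on the gap, which is computed from all base forecasts together, so the step cannot be split per series the way base forecasting can. Fitting the base models is independent across series, which is why that part is the natural candidate for a cluster or a worker pool.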