[enhancement]: HierarchicalForecast runs very slowly on large datasets
What happened + What you expected to happen
On a Databricks cluster, the code runs very slowly. How can I use a multiprocessing pool-based mechanism to speed up the training process?
Versions / Dependencies
latest YM : 202306
Reproduction script
Take any dataset with:
-- 5 levels
-- Y_test_df.shape (500,3)
-- Y_train_df.shape (6000,3)
The following default code runs slowly:
Y_hat_df = fcst.forecast(h=4, fitted=True)
Y_fitted_df = fcst.forecast_fitted_values()
Issue Severity
High: It blocks me from completing my task.
Hi @vinaybridge,
Thanks for using HierarchicalForecast.
For the moment, I don't recommend using a Databricks cluster for anything other than creating the StatsForecast base predictions.
The HierarchicalForecast library does not yet include parallelized reconciliation methods. In addition, HierarchicalForecast requires you to load all of the series into a single machine's memory, because the reconciliation methods need access to all the base forecasts simultaneously.
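To illustrate why reconciliation needs every base forecast at once, here is a minimal NumPy sketch of OLS reconciliation on a toy hierarchy. It is not the HierarchicalForecast implementation; the three-series hierarchy (`total = A + B`), the summing matrix `S`, and the forecast values are all made up for illustration.

```python
import numpy as np

# Hypothetical toy hierarchy: total = A + B.
# Rows of S are all series (total, A, B); columns are the bottom series (A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Made-up base forecasts for h=4 steps, one row per series.
# They are deliberately incoherent: total != A + B.
y_hat = np.array([[100.0, 102.0, 104.0, 106.0],   # total
                  [ 60.0,  61.0,  62.0,  63.0],   # A
                  [ 30.0,  31.0,  32.0,  33.0]])  # B

# OLS reconciliation: P = (S'S)^{-1} S' maps all base forecasts to
# reconciled bottom-level forecasts; S then aggregates them back up.
P = np.linalg.solve(S.T @ S, S.T)
y_tilde = S @ (P @ y_hat)

print(P)        # every entry is nonzero
print(y_tilde)  # reconciled forecasts: row 0 equals row 1 + row 2
```

Every entry of `P` is nonzero here, so each reconciled value depends on the base forecasts of all three series. That dependency is why the base forecasts can be produced in parallel (for example on a cluster), while this kind of reconciliation step is applied to the full set of forecasts on one machine.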