An end-to-end tutorial on using a global forecasting model (LightGBM) on a retail sales dataset (from the M5 competition) with multi-step recursive forecasting.
This tutorial emulates a batch forecasting workflow, breaking the process into multiple steps:
- Obtain the raw data.
- Create the base dataset containing sales, price, promos etc. by product id and date.
- Create a feature engineering pipeline, create the training data, and store the pipeline and training data.
- Train the model and store it.
- Create a forecast using recursive forecasting.
- Plot the forecast, feature importance, and other model diagnostics.
For more on feature engineering for time series forecasting, check out this course.
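To make the recursive forecasting step in the workflow above concrete, here is a minimal sketch of the idea: a one-step-ahead model is fitted on lag features, and each prediction is fed back in as the newest lag to produce the next step. This is not the tutorial's code. The helper names (make_lag_matrix, fit_linear, predict_linear, recursive_forecast), the toy weekly series, and the least-squares model standing in for LightGBM are all assumptions made so the example runs with numpy alone.

```python
import numpy as np


def make_lag_matrix(y, n_lags):
    """Return (X, target): each row of X holds the n_lags values preceding its target."""
    X = np.column_stack([y[i : len(y) - n_lags + i] for i in range(n_lags)])
    return X, y[n_lags:]


def fit_linear(X, target):
    """Stand-in one-step model: ordinary least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef


def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]


def recursive_forecast(predict, history, n_lags, horizon):
    """Forecast `horizon` steps, feeding each prediction back in as the newest lag."""
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        x = np.asarray(window[-n_lags:], dtype=float).reshape(1, -1)
        y_hat = float(predict(x)[0])
        forecasts.append(y_hat)
        window.append(y_hat)  # the forecast becomes an input for the next step
    return np.array(forecasts)


# Toy series (an assumption for illustration): a weekly cycle around a level of 10.
t = np.arange(60)
y = 10 + np.sin(2 * np.pi * t / 7)

n_lags = 2  # two lags are enough for this toy sinusoid; real sales data needs more features
X, target = make_lag_matrix(y, n_lags)
coef = fit_linear(X, target)

forecast = recursive_forecast(lambda X_new: predict_linear(coef, X_new), y, n_lags, horizon=7)
print(forecast)
```

Any regressor with a predict method could take the place of the stand-in model here. With a real pipeline, the features computed at each step would typically include more than raw lags (for example prices, promotions, and calendar features), and those would need to be recomputed or supplied for each future date.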
This tutorial requires:
- numpy
- pandas
- scikit-learn
- joblib
- matplotlib
- sktime
- jupyter
- pyarrow
- lightgbm
These can be installed from the requirements.txt file:
pip install -r requirements.txt
The notebooks were run on Python 3.10.2.
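After installing, a quick check along the following lines can confirm that the listed packages are present in the active environment. This is a sketch using only the standard library; the helper name missing_packages is hypothetical, and the package list is copied from the requirements above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in this README.
REQUIRED = [
    "numpy", "pandas", "scikit-learn", "joblib", "matplotlib",
    "sktime", "jupyter", "pyarrow", "lightgbm",
]


def missing_packages(names):
    """Return the names that have no installed distribution in this environment."""
    missing = []
    for name in names:
        try:
            version(name)
        except PackageNotFoundError:
            missing.append(name)
    return missing


print("Missing packages:", missing_packages(REQUIRED) or "none")
```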