
Forecasting Toolkit: A collection of notebooks and scripts for handling and forecasting time series


Forecasting Toolkit

Author: Christoph Schauer
Uploaded: 2019/11/16
Last update: 2020/08/01

Introduction

This repository is a collection of notebooks and a small package for handling and forecasting time series in Python. It currently contains notebooks on preparing time series data, exploratory analysis, and forecasting with a number of classic statistical and machine learning models, including gradient boosting regression, ARMA and VAR models, and Fourier Transforms.

Table of Contents

Notebooks

  • 01-data-prepatation.ipynb: A collection of common data preparation and transformation operations for time series analysis.
  • 02-exploratory-analysis.ipynb: A collection of commonly used exploratory analysis methods and visualizations for time series analysis.
  • 03-model-evaluation.ipynb: A collection of commonly used metrics and visualizations for evaluating the performance of time series forecasting models.
  • 11-linear-polynomial-trends.ipynb: Showcases the custom LinearTrend class for modeling and forecasting time series with linear and polynomial regression models.
  • 13-arma-models.ipynb: Showcases models of the ARMA family (ARIMA, SARIMA, and SARIMAX) using statsmodels.tsa.statespace.sarimax.SARIMAX.
  • 14-var-models.ipynb: Showcases models of the VAR family (VAR, VARMA, and VARMAX) using statsmodels.tsa.statespace.varmax.VARMAX.
  • 15-gradient-boosting-models.ipynb: Showcases the custom TimeSeriesGBR class for modeling and forecasting time series with gradient boosting regression models.
  • 16-fourier-models.ipynb: Showcases the custom FourierWave class for modeling and forecasting time series with Fourier Transforms.
  • 31-example-trend-fourier-sarima.ipynb: Showcases several exploratory techniques and how to combine three univariate models (linear trend, Fourier Transform, SARIMA), each capturing a different type of pattern in a time series, by adding their predictions into a single forecast.
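
The additive combination described for the last notebook can be illustrated with a minimal sketch. It is not the notebook's code: the function names are hypothetical, it uses only NumPy, and a simple AR(1) fit on the residuals stands in for the SARIMA step. Each component is fitted on what the previous one left unexplained, and the three forecasts are summed.

```python
import numpy as np

def fit_dominant_cosine(y):
    """Amplitude, frequency (cycles per step), and phase of the strongest FFT bin."""
    n = len(y)
    spectrum = np.fft.rfft(y)
    freqs = np.fft.rfftfreq(n)
    k = np.argmax(np.abs(spectrum[1:])) + 1  # skip the zero-frequency bin
    # Factor 2 assumes the strongest bin is not the Nyquist bin.
    return 2.0 * np.abs(spectrum[k]) / n, freqs[k], np.angle(spectrum[k])

def fit_ar1(residuals):
    """AR(1) coefficient by least squares without intercept (stand-in for SARIMA)."""
    x, z = residuals[:-1], residuals[1:]
    denom = x @ x
    return float(x @ z / denom) if denom > 0 else 0.0

def additive_forecast(y, horizon):
    """Sum the forecasts of a linear trend, one cosine wave, and an AR(1) model."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    t_all = np.arange(n + horizon)

    # 1. Linear trend fitted on the raw series.
    slope, intercept = np.polyfit(np.arange(n), y, deg=1)
    trend = slope * t_all + intercept

    # 2. Dominant cosine wave fitted on the detrended series.
    amplitude, freq, phase = fit_dominant_cosine(y - trend[:n])
    wave = amplitude * np.cos(2 * np.pi * freq * t_all + phase)

    # 3. AR(1) on whatever trend and wave left unexplained.
    residuals = y - trend[:n] - wave[:n]
    phi = fit_ar1(residuals)
    ar_part = residuals[-1] * phi ** np.arange(1, horizon + 1)

    return trend[n:] + wave[n:] + ar_part

# Synthetic monthly-style series: trend plus a 12-step cycle plus noise.
rng = np.random.default_rng(0)
t = np.arange(120)
series = 10 + 0.5 * t + 3 * np.cos(2 * np.pi * t / 12) + rng.normal(0, 0.2, 120)
print(additive_forecast(series, horizon=12))
```

Fitting the components sequentially keeps each model simple, at the cost of letting errors in an earlier step leak into the later ones.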

Package forecasttk

The forecasttk package currently includes the following modules:

  • visualize.py: Contains functions for visualizing time series, seasonal decomposition, autocorrelation functions, model forecasts, and residuals. All plotting functions include an argument for saving the plot as a JPEG file.
  • evaluate.py: Contains functions for quickly printing out performance metrics.
  • lineartrend.py: Contains the custom class LinearTrend, a child class of sklearn.linear_model.LinearRegression, inheriting everything from this class. It extends this class with several attributes and methods for easy-to-use modeling and forecasting of time series with linear and polynomial regression models.
  • tsgbr.py: Contains the custom class SeasonalGBRX, a child class of sklearn.ensemble.GradientBoostingRegressor, inheriting everything from this class. It extends this class with several attributes and methods for easy-to-use modeling and forecasting of time series with gradient boosting regression models.
  • fourierwave.py: Contains the custom class FourierWave which encapsulates several attributes and methods for applying a Fourier Transform to a time series, visualizing its main frequencies, fitting a combination of cosine waves for selected frequencies, and generating a forecast with it.
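
To make the Fourier approach described for fourierwave.py more concrete, here is a minimal NumPy sketch of the general idea: transform the series, keep the strongest frequencies, rebuild them as cosine waves, and extend those waves past the end of the data. The function name and signature are hypothetical and do not reflect the actual FourierWave interface.

```python
import numpy as np

def fourier_forecast(y, n_waves, horizon):
    """Fit the n_waves strongest cosine waves and extrapolate them.

    Returns (in_sample_fit, forecast). Hypothetical helper, not part of forecasttk.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    spectrum = np.fft.rfft(y)
    freqs = np.fft.rfftfreq(n)  # frequencies in cycles per time step

    # Rank the non-constant bins by magnitude and keep the strongest ones.
    strongest = np.argsort(np.abs(spectrum[1:]))[::-1][:n_waves] + 1

    t = np.arange(n + horizon)
    fitted = np.full(n + horizon, spectrum[0].real / n)  # start from the series mean
    for k in strongest:
        # Every bin except the Nyquist bin (even n only) represents two
        # conjugate coefficients, hence the factor of 2.
        scale = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        amplitude = scale * np.abs(spectrum[k]) / n
        phase = np.angle(spectrum[k])
        fitted += amplitude * np.cos(2 * np.pi * freqs[k] * t + phase)

    return fitted[:n], fitted[n:]

# Two superimposed cycles with periods of 12 and 6 steps.
t = np.arange(144)
series = 5 + 2 * np.cos(2 * np.pi * t / 12 + 0.3) + np.cos(2 * np.pi * t / 6)
fit, forecast = fourier_forecast(series, n_waves=2, horizon=12)
print(forecast.round(3))
```

This works cleanly only when the series is roughly stationary and the sample spans whole cycles; a trend should be removed first, otherwise it leaks into the low-frequency bins.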

References / a few good books, guides, and tutorials for further studies

Next Steps

Additions

  • Module for a "custom" SARIMA algorithm based on linear regression plus lagged variables to circumvent the periodicity/number of lags limits of the statsmodels module
  • Exponential smoothing notebook
  • Time series cross-validation / forecast robustness notebook
  • Model blending notebook
  • Notebook for forecasting many targets in parallel
  • Facebook Prophet notebook
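
As a rough illustration of the first planned addition, a seasonal autoregression can be written as an ordinary linear regression on lagged values of the series, which allows arbitrary lags such as 1 and 168 for hourly data with a weekly cycle. The sketch below is an assumption about what such a module might look like, not existing forecasttk code: the function names are hypothetical, and it covers only the autoregressive part (no differencing, no moving-average terms).

```python
import numpy as np

def fit_lagged_regression(y, lags):
    """Least-squares fit of y[t] on an intercept and y[t - lag] for each lag."""
    y = np.asarray(y, dtype=float)
    max_lag = max(lags)
    # One row per usable time step: intercept followed by the lagged values.
    X = np.array([[1.0] + [y[t - lag] for lag in lags]
                  for t in range(max_lag, len(y))])
    coef, *_ = np.linalg.lstsq(X, y[max_lag:], rcond=None)
    return coef

def forecast_lagged_regression(y, lags, coef, horizon):
    """Recursive multi-step forecast: each prediction feeds the later ones."""
    history = list(np.asarray(y, dtype=float))
    for _ in range(horizon):
        features = [1.0] + [history[-lag] for lag in lags]
        history.append(float(np.dot(coef, features)))
    return np.array(history[-horizon:])

# Series with a 12-step seasonal cycle; lag 12 carries the seasonality.
t = np.arange(96)
series = 10 + 3 * np.sin(2 * np.pi * t / 12)
lags = [1, 12]
coef = fit_lagged_regression(series, lags)
print(forecast_lagged_regression(series, lags, coef, horizon=12).round(3))
```

Because the forecast is recursive, errors accumulate over the horizon, so long horizons are best checked against held-out data.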

Updates

  • Data preparation: Expand datetime parts
  • Data preparation: Add section for doing maths with dates
  • Exploratory analysis: Add more methods
  • Model evaluation: Add more methods
  • Add more explanations for everything
  • Add more links to useful tutorials / guides from others
  • Update gradient boosting regression class to accept exogenous variables
  • Update Fourier Transform class to accept weekly data
  • Update all classes to accept data with datetime indices