M4 Statistical Methods
Applying various statistical methods to the M4 dataset:
- Theoretical background
- Statistical methods, i.e. benchmark methods of M4
The Forecasting: Principles and Practice (FPP) files contain my notes on the respective chapters of the book, covering theoretical and practical aspects of the benchmark methods as well as general forecasting topics:
Hyndman, R. J., & Athanasopoulos, G. (2018) Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia.
Folder Structure
├── Jupyter Notebooks <- Jupyter Notebooks using Python
│ └─── src <- Source code for use in notebooks
├── RMarkdown <- .Rmd files (i.e. forecasting methods, benchmarks)
│ └─── html_outputs <- Outputs of .Rmd files as html
├── R_benchmarks <- R scripts for creating benchmarks + ETSARIMA
│ └─── archive <- Discarded R scripts for benchmarks (too slow, unmodularized)
├── data <- Data used in this project (i.e. M4 by domain/period, example data)
├── Images <- Images used in Rmd and notebook files
├── results <- Results and tables
│ ├─── M4_ETSARIMA
│ ├─── M4_benchmarks8
│ └─── Reg_benchmarks8
├── src <- Source code for use in this project
│ └─── data <- Scripts to download
├── .gitattributes
├── .gitignore
├── MT TM1.Rproj <- .Rproj file for project structure
└── README.md <- Top-level README
Benchmarks
Traditional statistical methods
The 8 basic statistical benchmark methods used in the M4 Competition, namely:
- Naïve method: Forecast equals the last observed value
- Seasonal Naïve: Forecast equals the value from the same period of the last season
- Naive2: Naïve method applied after deseasonalizing the series
- Simple Exponential Smoothing (SES)
- Holt
- Damped trend exponential smoothing
- Theta method: Winner of M3
- Comb: Combination benchmark of SES, Holt and Damped trend (average)
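As a rough illustration of how several of these benchmarks produce forecasts, here is a minimal pure-Python sketch. It is not the code used in this repository or in the M4 competition: the function names and the fixed smoothing parameters (`alpha`, `beta`, `phi`) are assumptions chosen for readability, whereas the competition benchmarks estimate such parameters from the data. Naive2 and Theta are left out to keep the sketch short.

```python
def naive(y, h):
    """Repeat the last observed value for all h horizons."""
    return [y[-1]] * h


def seasonal_naive(y, h, m):
    """Repeat the last observed season of length m."""
    last_season = y[-m:]
    return [last_season[i % m] for i in range(h)]


def ses(y, h, alpha=0.5):
    """Simple exponential smoothing with a fixed (assumed) alpha."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return [level] * h  # SES forecasts are flat


def damped_holt(y, h, alpha=0.5, beta=0.1, phi=1.0):
    """Holt's linear trend method; phi < 1 gives the damped trend variant."""
    level, trend = y[0], y[1] - y[0]  # simple initialisation (assumption)
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts, damp_sum = [], 0.0
    for k in range(1, h + 1):
        damp_sum += phi ** k  # phi + phi^2 + ... + phi^k
        forecasts.append(level + damp_sum * trend)
    return forecasts


def comb(y, h):
    """Average of SES, Holt and damped trend forecasts."""
    parts = [
        ses(y, h),
        damped_holt(y, h, phi=1.0),  # Holt
        damped_holt(y, h, phi=0.9),  # damped trend, phi assumed
    ]
    return [sum(values) / len(parts) for values in zip(*parts)]
```

Each function takes the history `y` as a list and the horizon `h`, for example `seasonal_naive(y, h=8, m=4)` for quarterly data.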
Other widely used methods
I also added the following forecasting methods:
- Automatic ARIMA: Automated procedure to select an ARIMA model
- ETS: Automatic procedure to select an exponential smoothing model
- ETSARIMA: Average of the ETS and auto.arima forecasts
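The ETSARIMA combination is an equal-weight average of two forecast vectors. The sketch below shows only that averaging step; `average_forecasts` is a hypothetical helper and the forecast values are made-up numbers for illustration, not results from this project.

```python
def average_forecasts(*forecasts):
    """Element-wise mean of several forecast vectors of equal length."""
    if not forecasts:
        raise ValueError("at least one forecast is required")
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("all forecasts must cover the same horizon")
    return [sum(values) / len(forecasts) for values in zip(*forecasts)]


# Hypothetical 3-step-ahead forecasts from the two models
ets_forecast = [102.0, 104.0, 106.0]
arima_forecast = [100.0, 103.0, 108.0]

etsarima = average_forecasts(ets_forecast, arima_forecast)
print(etsarima)  # [101.0, 103.5, 107.0]
```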
ML Methods
Two basic ML methods were used in the competition:
- RNN Bench: Sequential recurrent neural network (RNN) using a `SimpleRNN` layer with 6 nodes in Keras.
- MLP Bench: Multilayer perceptron (MLP) network using `MLPRegressor` with 6 units from scikit-learn.
In the given formulation, both methods performed poorly in the M4 competition.
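To make the MLP benchmark more concrete, the following sketch fits an `MLPRegressor` with a single hidden layer of 6 units on sliding windows of a series and forecasts recursively. Only the 6 hidden units come from the description above. The window length (`n_inputs=3`), `max_iter`, the seed, the helper names, and the absence of any scaling, detrending or deseasonalizing are assumptions, so this should not be read as the competition's implementation.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor


def make_windows(y, n_inputs):
    """Turn a series into (lagged inputs, next value) training pairs."""
    X = np.array([y[i:i + n_inputs] for i in range(len(y) - n_inputs)])
    target = np.array(y[n_inputs:])
    return X, target


def mlp_forecast(y, h, n_inputs=3, seed=0):
    """Fit a 6-unit MLP on lagged windows and forecast h steps recursively."""
    X, target = make_windows(y, n_inputs)
    model = MLPRegressor(hidden_layer_sizes=(6,), max_iter=2000,
                         random_state=seed)
    model.fit(X, target)

    history = list(y)
    forecasts = []
    for _ in range(h):
        # Use the most recent n_inputs values (including earlier forecasts)
        x = np.array(history[-n_inputs:]).reshape(1, -1)
        next_value = float(model.predict(x)[0])
        forecasts.append(next_value)
        history.append(next_value)
    return forecasts


# Synthetic example series: linear trend plus a period-12 pattern
series = [float(i % 12) + 0.1 * i for i in range(60)]
forecast = mlp_forecast(series, h=4)
```

The recursive loop feeds each forecast back in as an input for the next step, which is one common way to obtain multi-step forecasts from a one-step-ahead model.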