In this project, several novel techniques were tested on several datasets to forecast the next values of time series.
- Variant A: Train on 2600 rows and test on the last 70 days, for TRMF (novel method from Sk) and Theta (M4)
- Variant B (Cloud Computing Provider/CDnow):
  - Take the given clusters (around 2000 clusters) -> split the dataset by cluster -> train on all but the last 70 days (all-70), test on the last 70
  - Train a classifier to predict clusters, using the given 2000 clusters as training data.
  - Predict clusters for additional 600/2000/5000/... rows
  - Train and test again, using the predicted clusters for the additional rows
- Clusterwise Regression (Electronics manufacturer/M4):
  - Take the given clusters.
  - Fit a model on each cluster
  - Calculate the training error per cluster (each cluster has its own training error)
  - Take each observation x from cluster a where error_x > error_a
  - Move observation x to the cluster with the lowest training error
  - Retrain all clusters whose membership changed
  - Stop when the error has not improved for several iterations
  - Run the test on the final cluster assignment
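
The clusterwise regression loop above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation. It makes several assumptions: a least-squares linear model stands in for the per-cluster forecaster, "the cluster with the lowest training error" is read as the cluster whose model gives the smallest error on x, and `patience` is a hypothetical name for "several times". The function names are invented for this example.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept (assumed stand-in for the per-cluster model)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def squared_errors(coef, X, y):
    """Squared error of every observation under one cluster's model."""
    A = np.column_stack([X, np.ones(len(X))])
    return (A @ coef - y) ** 2

def clusterwise_regression(X, y, labels, patience=3, max_iter=100):
    """Assumes every initial cluster 0..k-1 is non-empty."""
    labels = np.asarray(labels).copy()
    n, k = len(y), int(labels.max()) + 1
    models = [fit_linear(X[labels == c], y[labels == c]) for c in range(k)]
    best_err, best_labels, best_models, stale = np.inf, labels.copy(), list(models), 0
    for _ in range(max_iter):
        # errors[i, c] = error of observation i under the model of cluster c
        errors = np.stack([squared_errors(m, X, y) for m in models], axis=1)
        own = errors[np.arange(n), labels]
        total = own.mean()
        if total < best_err - 1e-12:
            best_err, best_labels, best_models, stale = total, labels.copy(), list(models), 0
        else:
            stale += 1
            if stale >= patience:  # error did not improve several times
                break
        # each cluster has its own training error
        cluster_err = np.array([own[labels == c].mean() if np.any(labels == c) else np.inf
                                for c in range(k)])
        # observations with error_x > error_a move to the model that fits them best
        movers = own > cluster_err[labels]
        new_labels = labels.copy()
        new_labels[movers] = errors[movers].argmin(axis=1)
        moved = new_labels != labels
        if not moved.any():
            break
        changed = np.unique(np.concatenate([labels[moved], new_labels[moved]]))
        labels = new_labels
        for c in changed:  # retrain only the clusters whose membership changed
            mask = labels == c
            if mask.sum() >= X.shape[1] + 1:  # keep the old model if too few rows remain
                models[c] = fit_linear(X[mask], y[mask])
    return best_labels, best_models, best_err

# Synthetic example: two linear regimes, starting from random cluster labels.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))
regime = rng.integers(0, 2, 200)
y = np.where(regime == 0, 2 * X[:, 0] + 1, -3 * X[:, 0] + 5) + rng.normal(0, 0.05, 200)
init_labels = rng.integers(0, 2, 200)
final_labels, final_models, final_err = clusterwise_regression(X, y, init_labels)
```

The function returns the best assignment seen, so the reported training error is never worse than that of the initial clusters. The final test on the held-out 70 days would then use `final_labels` and `final_models`.
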
https://docs.google.com/document/d/1WbWhuo2dEpGf9a7NiyHRcz_wTnI-WXKYDAXUj65VxII/edit
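
The Variant B steps listed earlier (hold out the last 70 days, train a classifier on the given clusters, assign clusters to additional rows, then regroup for retraining) can be sketched as below. This is an illustration only, under stated assumptions: a nearest-centroid classifier stands in for whichever classifier the project used, the training portion of each series is used as the feature vector, and the data is synthetic. All function names are hypothetical.

```python
import numpy as np

HORIZON = 70  # days held out for testing, as in the text

def holdout_split(series, horizon=HORIZON):
    """series has shape (n_rows, n_days); the last `horizon` days form the test set."""
    return series[:, :-horizon], series[:, -horizon:]

def fit_centroids(features, labels):
    """Nearest-centroid classifier (assumed stand-in): one mean vector per cluster."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_clusters(features, classes, centroids):
    """Assign each row to the cluster with the closest centroid."""
    # Squared Euclidean distances via ||a||^2 - 2 a.b + ||b||^2, which avoids
    # building an (n_rows, n_clusters, n_features) array.
    d = ((features ** 2).sum(axis=1)[:, None]
         - 2 * features @ centroids.T
         + (centroids ** 2).sum(axis=1)[None, :])
    return classes[d.argmin(axis=1)]

def group_by_cluster(series, labels):
    """Split the rows by cluster so one model can be trained per cluster."""
    return {int(c): series[labels == c] for c in np.unique(labels)}

# Synthetic stand-in data: 3 well-separated clusters, 100 days per row.
rng = np.random.default_rng(0)
levels = np.array([0.0, 10.0, 20.0])
given_labels = np.repeat([0, 1, 2], 20)   # rows with given clusters
given = levels[given_labels][:, None] + rng.normal(0, 0.1, (60, 100))
extra_true = np.repeat([0, 1, 2], 5)      # additional rows; clusters unknown in practice
extra = levels[extra_true][:, None] + rng.normal(0, 0.1, (15, 100))

given_train, given_test = holdout_split(given)
extra_train, extra_test = holdout_split(extra)

# Train the classifier on the given clusters, then predict clusters for the new rows.
classes, centroids = fit_centroids(given_train, given_labels)
extra_labels = predict_clusters(extra_train, classes, centroids)

# Regroup all rows by cluster for the second round of training.
all_train = np.vstack([given_train, extra_train])
all_labels = np.concatenate([given_labels, extra_labels])
train_groups = group_by_cluster(all_train, all_labels)
```

Each entry of `train_groups` would then be passed to the forecasting model (for example TRMF or Theta), with the matching held-out rows used for evaluation.
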
The clusters were generated using methods from the corresponding papers:
- GMM_Pair
- GMM_Vote
- Topological Clusterwise Ensemble with Matrix Completion Methods for Demand Forecasting
- Topology-based Clusterwise Regression for User Segmentation and Demand Forecasting