Review of Kaggle time series competitions

Inspired by "Learnings from Kaggle’s Forecasting Competitions" by Casper Solheim Bojer & Jens Peder Meldgaard (2020), I surveyed the top 3 solutions of past Kaggle time series competitions held from 2014 to 2023.

If you find a time series competition that is missing from this list, please let me know by opening an issue.

List of competitions

# Year Title Data size
1 2014 Walmart Recruiting - Store Sales Forecasting 3.22MB
2 2015 Walmart Recruiting II: Sales in Stormy Weather 9MB
3 2015 Rossmann Store Sales 39.85MB
4 2016 Predicting Red Hat Business Value 26.74MB
5 2017 Web Traffic Time Series Forecasting 611.85MB
6 2018 TalkingData AdTracking Fraud Detection Challenge 11.27GB
7 2018 Corporación Favorita Grocery Sales Forecasting 479.88MB
8 2018 Recruit Restaurant Visitor Forecasting 27.3MB
9 2018 Google Analytics Customer Revenue Prediction 35.9GB
10 2019 LANL Earthquake Prediction 10.42GB
11 2019 Two Sigma: Using News to Predict Stock Movements Not available
12 2019 ASHRAE - Great Energy Predictor III 2.61GB
13 2020 University of Liverpool - Ion Switching 146.08MB
14 2020 M5 Forecasting - Accuracy 450.47MB
15 2020-2021 Jane Street Market Prediction Not available
16 2020-2021 Acea Smart Water Analytics 3.45MB
17 2021 Google Brain - Ventilator Pressure Prediction 698.79MB
18 2022 Optiver Realized Volatility Prediction 2.73GB
19 2022 G-Research Crypto Forecasting 3.12GB
20 2022 Ubiquant Market Prediction 18.55GB
21 2022 American Express - Default Prediction 50.31GB
22 2022-2023 GoDaddy - Microbusiness Density Forecasting 10.93MB

Top 3 most voted EDAs

EDA is one of the best ways to learn the characteristics of the data given in each competition.
The top 3 most voted EDA notebooks for each competition are therefore listed below, in the same order as the list of competitions.

> Go to the top

  1. EDA and Store Sales Predictions using XGB
  2. Walmart prediction - (1) EDA with time and space
  3. Wallmart Sales - EDA - feat eng [Future Update]

> Go to the top

NA

> Go to the top

  1. Time Series Analysis and Forecasts with Prophet
  2. EDA and forecasting with RFRegressor_FINAL_UPDATED
  3. How Does New Competition Affect Sales?

> Go to the top

  1. Time Travel (EDA)
  2. Redhat EDA
  3. RedHat Hack in plain English (EDA)

> Go to the top

  1. Wiki Traffic Forecast Exploration - WTF EDA
  2. Web Traffic Time Series Forecasting (EDA)
  3. Wikipedia Web traffic EDA

> Go to the top

  1. TalkingData EDA plus time patterns
  2. TalkingData EDA and Class Imbalance
  3. TalkingData: EDA to Model Evaluation | LB: 0.9683

> Go to the top

  1. Shopping for Insights - Favorita EDA
  2. Memory optimization and EDA on entire dataset
  3. Grocery EDA Dirty XGBoost, Arima,ETS,Prophet

> Go to the top

  1. Be my guest - Recruit Restaurant EDA
  2. Exhaustive Weather EDA/File Overview
  3. Recruit Restaurant EDA

> Go to the top

  1. R EDA for GStore + GLM + KERAS + XGB
  2. Google Analytics EDA + LightGBM + Screenshots
  3. A Very Extensive GStore Exploratory Analysis

> Go to the top

  1. Earthquakes FE. More features and samples
  2. LANL Earthquake EDA and Prediction
  3. Masters Final Project: EDA

> Go to the top

  1. EDA, feature engineering and everything
  2. 👨‍🔬 Bird Eye 👀 view of Two Sigma + NN Approach
  3. Simple EDA - Two Sigma

> Go to the top

  1. 🔌⚡ASHRAE -Start Here: A GENTLE Introduction
  2. EDA for ASHRAE
  3. A deep dive EDA into ALL variables

> Go to the top

  1. Ion Switching Competition : Signal EDA 🧪
  2. EDA - Ion Switching
  3. Simple EDA-Model

> Go to the top

  1. Back to (predict) the future - Interactive M5 EDA
  2. M5 Competition : EDA + Models 📈
  3. Time Series Forecasting-EDA, FE & Modelling📈

> Go to the top

  1. Jane Street: EDA of day 0 and feature importance
  2. Jane_street_Extensive_EDA & PCA starter 📊⚡
  3. EDA / A Quant's Prespective

> Go to the top

  1. Acea Smart Water: Full EDA & Prediction
  2. EDA: Quenching the Thirst for Insights
  3. Quick EDA | Reporting & Data Understanding

> Go to the top

  1. Ventilator Pressure Prediction: EDA, FE and models
  2. 🔥EDA +FE+TabNet 🧠🧠[Weights and Biases]
  3. Ventilator Pressure: EDA and simple submission

> Go to the top

  1. Optiver Realized: EDA for starter(English version)
  2. Optiver Realized Volatility Prediction - EDA
  3. Optiver; EDA XGBoost starter(日本語,Japanese)

> Go to the top

  1. 📊 G-Research Plots + EDA 📊
  2. To The Moon 🚀 [G-Research Crypto Forecasting EDA]
  3. 📈📊[G-crypto] Interactive Dashboard + Indicators

> Go to the top

  1. EDA- target analysis
  2. Ubiquant EDA and Baseline
  3. 🔥The most advanced analytics🔥

> Go to the top

  1. AMEX EDA which makes sense ⭐️⭐️⭐️⭐️⭐️
  2. AMEX Default Prediction EDA & LGBM Baseline
  3. American Express EDA

> Go to the top

  1. TBD
  2. TBD
  3. TBD

Top 3 solutions

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 💻 🔊
2 NA 🔊
3 💻 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 Lasso - - 💻 🔊
2 - - - NA NA
3 - - - NA NA

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 NA 🔊
2 NA NA
3 💻 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 NA 🔊
2 NA 🔊
3 NA 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 💻 🔊
2 💻 🔊
3 💻 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 NA 🔊
2 NA 🔊
3 NA 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 💻 💻 🔊
2 NA 🔊
3 NA 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 LightGBM - - 💻 💻 NA
2 - - - NA NA
3 - - - NA NA

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 💻 🔊
2 NA 🔊
3 - - - NA NA

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 💻 💻 🔊
2 NA 🔊
3 NA 🔊

> Go to the top

NA

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 CatBoost, LightGBM, MLP NA 🔊
2 XGBoost, LightGBM, CatBoost, Feed-forward Neural Network NA 🔊
3 CNN, LightGBM, CatBoost NA 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 NA 🔊
2 💻 🔊 🔊
3 NA 🔊 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 LightGBM NA 💻
2 LightGBM NA 💻
3 DeepAR NA 💻

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 XGBoost, NN 💻 NA
3 49-layer MLPs No 15 ensembles of NN NA 🔊

NA for Pos #2

> Go to the top

NA

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 LSTM, Transformer single architecture KFold 💻 🔊
2 Stacked LSTM ensembled by 7 models KFold NA 🔊
3 Conv1d, Stacked LSTM random seed average Stratified K-Folds NA 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 LightGBM, MLP, CNN equally weighted average GroupKFold 💻 🔊
3 LightGBM, MLP, TabNet equally weighted average KFold 💻 🔊

NA for Pos #2
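
The equal-weight averaging listed in the Ensemble column above can be illustrated with a minimal sketch. It is not taken from any of the linked solutions: the function name `equal_weight_average` and the toy prediction values are assumptions made purely for illustration.

```python
def equal_weight_average(predictions):
    """Average several models' predictions element-wise with equal weights.

    `predictions` is a list of equally long lists, one per model
    (hypothetical input layout chosen for this sketch).
    """
    if not predictions:
        raise ValueError("need at least one prediction list")
    n_rows = len(predictions[0])
    if any(len(p) != n_rows for p in predictions):
        raise ValueError("all prediction lists must have the same length")
    n_models = len(predictions)
    # zip(*predictions) groups the i-th prediction of every model together.
    return [sum(row) / n_models for row in zip(*predictions)]


# Toy values standing in for three models' outputs on two rows.
model_a = [1.0, 2.0]
model_b = [3.0, 4.0]
model_c = [5.0, 6.0]
blended = equal_weight_average([model_a, model_b, model_c])
```

Equal weights need no tuning on validation data, which is one plausible reason this kind of blend shows up among top solutions.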

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 - - - - NA NA
2 LightGBM Single model NA 🔊
3 LightGBM Single model 💻 💻 🔊

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 LightGBM, TabNet Average of (LGBM x 5 folds) + (TabNet x 5 folds) PurgedGroupTimeSeries, TimeSeriesSplit, KFold NA 🔊
2 LightGBM - Purged K-fold cross-validation with embargo NA 🔊
3 6-layer transformer 5-seed ensemble - NA 🔊
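
As a rough illustration of the purged K-fold split with an embargo named in the table above, the sketch below generates train/test indices for time-ordered rows. It is a simplified variant, not the code of any listed solution: it assumes rows are already sorted by time, and it drops a fixed number of rows around each test fold instead of checking label overlap. The function name `purged_kfold` and the parameters `purge` and `embargo` are hypothetical.

```python
def purged_kfold(n_samples, n_splits=5, purge=0, embargo=0):
    """Yield (train_idx, test_idx) lists for contiguous, time-ordered folds.

    purge:   rows removed from training on both sides of the test fold
    embargo: extra rows removed from training right after the test fold
    """
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for k in range(n_splits):
        # The first `extra` folds get one additional row each.
        stop = start + base + (1 if k < extra else 0)
        lo = max(0, start - purge)                   # first excluded row
        hi = min(n_samples, stop + purge + embargo)  # first row allowed again
        train_idx = list(range(0, lo)) + list(range(hi, n_samples))
        test_idx = list(range(start, stop))
        yield train_idx, test_idx
        start = stop


# Example: 20 time steps, 4 folds, 1 purged row, 2 embargoed rows.
for train_idx, test_idx in purged_kfold(20, n_splits=4, purge=1, embargo=2):
    print(len(train_idx), test_idx[0], test_idx[-1])
```

The gap around each test fold is meant to reduce leakage from targets or features that span several time steps, at the cost of slightly fewer training rows per fold.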

> Go to the top

Pos Methods    FE       Ensemble       Split       Code Discussion
1 LightGBM, GRU Ensembled by 4 models 💻 🔊
2 LGB/XGB/CTB, NN NA 🔊
3 LGB/CTB Ensembled by 3 models NA 🔊

> Go to the top

Ongoing

Pos Methods    FE       Ensemble       Split       Code Discussion
1
2
3