Data Science Hackathon: Kickstarter

The following content is the final presentation for the Data Science Hackathon in Hong Kong. The dataset contains details of Kickstarter projects from 2009 to 2017.

Global Distribution of Kickstarter Projects (2009 - 2017)

image.png

Global Kickstarter Goal & Pledged Amount Trend (2009 - 2016, 2017)

image.png

Global Kickstarter Goal & Pledged Amount Trend (2009 - 2017)

image.png

image.png

image.png

Logistic Regression

  • Target: successful or not

  • Features: category, country, goal (USD), length of campaign
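A minimal sketch of how such a model could be set up with scikit-learn. This is not the code used in the hackathon: the column names (`goal_usd`, `campaign_days`, `successful`) and the toy rows are assumptions made purely for illustration. Categorical features are one-hot encoded and numeric features are standardized before fitting.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows standing in for the real dataset; all names and values are hypothetical.
projects = pd.DataFrame({
    "category": ["Games", "Music", "Film", "Games", "Music", "Film", "Games", "Music"],
    "country": ["US", "GB", "US", "HK", "US", "GB", "US", "HK"],
    "goal_usd": [5000.0, 1500.0, 80000.0, 120000.0, 2000.0, 3000.0, 250000.0, 1000.0],
    "campaign_days": [30, 20, 60, 45, 30, 25, 60, 15],
    "successful": [1, 1, 0, 0, 1, 1, 0, 1],
})

categorical = ["category", "country"]
numeric = ["goal_usd", "campaign_days"]

# One-hot encode the categorical columns and standardize the numeric ones.
preprocess = ColumnTransformer([
    ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("scale", StandardScaler(), numeric),
])
model = make_pipeline(preprocess, LogisticRegression(max_iter=1000))
model.fit(projects[categorical + numeric], projects["successful"])

# Score a new, made-up project.
new_project = pd.DataFrame([
    {"category": "Games", "country": "US", "goal_usd": 10000.0, "campaign_days": 30}
])
prob_success = model.predict_proba(new_project)[0, 1]
print(f"Estimated probability of success: {prob_success:.2f}")
```

Wrapping the preprocessing and the classifier in one pipeline keeps the encoding learned on the training data and reapplies it unchanged at prediction time.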

image.png

Decision Tree

  • Target: successful or not
  • Features: Time to get funded, goal real (USD)
  • Methodology: Split the data into train (75%) and test (25%) sets. Limit the tree to a maximum depth of 4.
  • Accuracy Score: 66%
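The setup described above can be sketched with scikit-learn as follows. The data here is synthetic and the labelling rule is invented, so the printed accuracy is not comparable to the 66% reported above; the variable names (`days_to_funding`, `goal_usd`) are also assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-ins for the two features (hypothetical values).
days_to_funding = rng.integers(1, 61, size=n)
goal_usd = rng.lognormal(mean=8.5, sigma=1.2, size=n)

# Invented labelling rule, used only so the example has something to learn.
successful = ((goal_usd < 8000) & (days_to_funding < 45)).astype(int)

X = np.column_stack([days_to_funding, goal_usd])

# 75% train / 25% test, as in the methodology above.
X_train, X_test, y_train, y_test = train_test_split(
    X, successful, test_size=0.25, random_state=0
)

# Cap the tree at four levels of splits.
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(X_train, y_train)

accuracy = accuracy_score(y_test, tree.predict(X_test))
print(f"Test accuracy on synthetic data: {accuracy:.2f}")
```

A shallow depth limit keeps the tree small enough to plot and read, at the cost of some predictive power.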

image.png

LightGBM

  • Target: successful or not
  • Features: category, main_category, currency, country, goal (USD), length of campaign, deadline month, deadline day, launch month, launch day
  • Methodology: Split the pre-2017 data into train (70%) and test (30%) sets and model with LightGBM. The training data is randomly split further during model development, since LightGBM uses cross-validation.

image.png

Feature Importance

image.png

Model accuracy by iteration

After the first few iterations, the cross-validation results stop improving.
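One common way to act on a plateau like this is a patience-based early-stopping rule: stop once the score has failed to improve for a fixed number of consecutive iterations. The sketch below is library-independent; the function name `best_iteration`, the `patience` value and the score series are all hypothetical and are not taken from the hackathon results. LightGBM ships its own early-stopping support that applies a similar idea.

```python
def best_iteration(scores, patience=3, min_delta=0.0):
    """Return (iteration, score) for the best score seen before stopping.

    Iterations are 1-based. The scan stops once `patience` consecutive
    iterations have failed to beat the best score by more than `min_delta`.
    """
    best_score = float("-inf")
    best_iter = 0
    for i, score in enumerate(scores, start=1):
        if score > best_score + min_delta:
            best_score, best_iter = score, i
        elif i - best_iter >= patience:
            break  # no improvement for `patience` iterations in a row
    return best_iter, best_score


# Hypothetical cross-validation accuracy per boosting iteration.
cv_accuracy = [0.61, 0.64, 0.66, 0.67, 0.67, 0.668, 0.669, 0.67, 0.671]

stop_at, score = best_iteration(cv_accuracy, patience=3)
print(stop_at, score)  # 4 0.67
```

With this made-up series, the rule settles on iteration 4 and never evaluates the last two iterations, which is the trade-off of early stopping: a small late gain can be missed in exchange for fewer training rounds.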

image.png

CatBoost

image.png