
March Madness Kaggle Challenge Solution


March Madness Prediction Challenge - Kaggle

https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge

Overview and Work Pipeline

  • Preprocessing
  • Feature Extraction
  • Model Selection
  • Ensemble
  • Repeat and Win

Structure of Repo

code - contains all the preprocessing, features, models, and ensembles we have done. Only commit and push to the master branch's code if your model will be good for ensembling. All other models and code should stay in your own branch.

input - has all the inputs

notes - useful (and not so useful) notes about the contest; also has images and graphs that might help with feature extraction

submissions - output files go here

Work Flow

Please update this README.md file continually so we know which models have already been tested. That way we won't repeat ourselves. Post the link and a general description of each model you have tested below (accuracy, results, findings, and anything else you think would be useful).

Model 1: Pomeroy Submission (Chuck). A model built from Ken Pomeroy's rankings, using his offensive and defensive ratings.

Future Work

- https://www.kaggle.com/captcalculator/a-very-extensive-ncaa-exploratory-analysis (great visualization)

  1. Trained on data from hundreds of games played each week.
  2. Used features like the Las Vegas spread, Ken Pomeroy rating, and offensive vs. defensive rating (the three most important, core features; they give you a shot). How you calculate Elo matters: think about rankings ... and other stuff.
  3. Loss function -> logistic regression, e^(0.17x) / (1 + e^(0.17x)) -> really simple.
  4. Ensembled with a simple weighted average of the three features. (Try two things: features + ensemble, and features only.)
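
A minimal Python sketch of steps 3 and 4 above, for reference. The 0.17 coefficient is taken from the notes; what `x` stands for (for example a rating or point-spread difference between the two teams) is an assumption, and the function names, inputs, and weights are hypothetical placeholders rather than anything already in this repo.

```python
import math

# Coefficient from the notes above. The meaning of x (e.g. a rating or
# spread difference between two teams) is an assumption for illustration.
K = 0.17


def win_probability(x, k=K):
    """Logistic function e^(k*x) / (1 + e^(k*x)), in the equivalent form 1 / (1 + e^(-k*x))."""
    return 1.0 / (1.0 + math.exp(-k * x))


def weighted_average(probs, weights):
    """Simple weighted average of per-feature win probabilities."""
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(p * w for p, w in zip(probs, weights)) / total


if __name__ == "__main__":
    # Hypothetical per-feature differences for one matchup (team A minus team B).
    feature_diffs = {"vegas_spread": 4.5, "pomeroy": 6.0, "off_vs_def": 3.0}
    # Hypothetical ensemble weights; these would need tuning on past seasons.
    weights = [0.5, 0.3, 0.2]

    per_feature = [win_probability(d) for d in feature_diffs.values()]
    print("per-feature probabilities:", per_feature)
    print("ensembled probability:", weighted_average(per_feature, weights))
```

With equal weights this reduces to a plain mean, which makes it easy to compare the "features only" and "features + ensemble" variants mentioned in step 4.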

Repeat with other, less important features. At the end, try different models.

Notes and ideas