Use Python 3.6.
Required libraries:
- pandas
- numpy
- scikit-learn
Place the input data in the input folder.
Season | Data Type | Stats | Log Loss |
---|---|---|---|
2003-2013 | Regular Season | ORtg, DRtg, Seed Diff | 0.542679214062 |
2003-2013 | Regular Season | ORtg, DRtg, NetRtg, FourFactors, Seed Diff | 0.542104885388 |
2010-2013 | Regular Season | ORtg, DRtg, Seed Diff | 0.549534941948 |
2010-2013 | Regular Season | ORtg, DRtg, NetRtg, FourFactors, Seed Diff | 0.549105422882 |
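
For reference, the log loss values in the table are the mean negative log-likelihood of the actual game outcomes under the predicted win probabilities (lower is better). The sketch below shows that computation in plain Python. The outcomes and probabilities are made-up illustrative values, and `binary_log_loss` is a hypothetical helper, not a function from this repository.

```python
import math

def binary_log_loss(y_true, y_pred, eps=1e-15):
    """Mean negative log-likelihood of binary outcomes under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Hypothetical outcomes (1 = first team won) and predicted win probabilities.
outcomes = [1, 0, 1, 1, 0]
predictions = [0.8, 0.3, 0.6, 0.55, 0.4]

model_loss = binary_log_loss(outcomes, predictions)
# Predicting 0.5 for every game gives ln(2), roughly 0.693, whatever the outcomes are.
coin_flip_loss = binary_log_loss(outcomes, [0.5] * len(outcomes))
```

A model only adds value if its log loss is below the 0.5-everywhere baseline of about 0.693. `sklearn.metrics.log_loss` from scikit-learn computes the same quantity for binary labels.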
- Train on single-season data only, since the NCAA has a one-and-done rule. Because team rosters change a lot from season to season, there is little point in training on several seasons together.
- Include advanced statistics for the best players on each team. As basketball is heavily influenced by "superstars", it makes sense to add their advanced statistics to the training features.
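
One possible way to turn the best-player idea into a feature is sketched below, in plain Python to keep it dependency-free. Everything here is an assumption for illustration: the row layout, the team and player names, the choice of a single per-player rating, and the helpers `best_player_rating` and `best_player_diff` are all hypothetical and not part of this repository.

```python
# Hypothetical player rows: (season, team, player, rating). Not real data.
players = [
    (2013, "Team A", "Player 1", 24.1),
    (2013, "Team A", "Player 2", 18.7),
    (2013, "Team B", "Player 3", 21.3),
    (2013, "Team B", "Player 4", 26.5),
]

def best_player_rating(rows):
    """Map each (season, team) pair to the highest player rating on that roster."""
    best = {}
    for season, team, _player, rating in rows:
        key = (season, team)
        if key not in best or rating > best[key]:
            best[key] = rating
    return best

def best_player_diff(best, season, team_a, team_b):
    """Matchup feature: best-player rating of team_a minus that of team_b."""
    return best[(season, team_a)] - best[(season, team_b)]

best = best_player_rating(players)
diff = best_player_diff(best, 2013, "Team A", "Team B")
```

Keying the lookup by season as well as team keeps each roster separate, which fits the single-season idea above. The resulting difference could sit alongside Seed Diff as one more matchup feature.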