Predicting-Baseball-Statistics

This repository contains models that predict baseball statistics from MLB Statcast metrics.

Goals

Classification

Build and train models to predict home runs and extra-base hits using the following approaches:
- Logistic Regression
- k-Nearest Neighbors
- Decision Tree Classification
- Random Forest Classification
- Support Vector Machine Classification
- XGBoost Classification

Implement over-sampling of the minority class to address class imbalance and improve how well the models generalize.
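
As an illustration of the over-sampling idea, here is a minimal sketch of random over-sampling using only the Python standard library. The function name `random_oversample`, the feature pairs, and the labels are all hypothetical and are not taken from this repository; the repository may use a different over-sampling method.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw (with replacement) enough copies to reach the target count.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: (launch speed in mph, launch angle in degrees),
# with label 1 = home run and label 0 = anything else.
X_train = [(88.0, 12.0), (95.5, 5.0), (101.2, 18.0),
           (79.3, 40.0), (105.1, 27.0), (92.4, -3.0)]
y_train = [0, 0, 0, 0, 1, 0]

X_bal, y_bal = random_oversample(X_train, y_train)
print(Counter(y_bal))
```

Over-sampling is normally applied to the training split only, so that duplicated rows never leak into the data used for evaluation.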
Apply regularization and cross-validation techniques for model evaluation, selection, and optimization.
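
A rough sketch of how regularization and cross-validation can be combined for one of the classifiers above, assuming scikit-learn and NumPy are installed. The data is synthetic, the feature names (`launch_speed`, `launch_angle`) and the labeling rule are made up for illustration, and the grid of `C` values is an arbitrary choice rather than the one used in this repository.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for Statcast features (assumption, not real data).
rng = np.random.default_rng(0)
n = 600
launch_speed = rng.normal(90, 12, n)
launch_angle = rng.normal(15, 18, n)
X = np.column_stack([launch_speed, launch_angle])

# Made-up labeling rule: hard-hit balls in a mid-range launch angle band.
is_home_run = ((launch_speed > 98)
               & (launch_angle > 18)
               & (launch_angle < 40)).astype(int)

# L2-regularized logistic regression; a smaller C means a stronger penalty.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# Stratified folds keep the class ratio similar in every fold,
# which matters when the positive class is rare.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="roc_auc")
search.fit(X, is_home_run)

print(search.best_params_)
print(round(search.best_score_, 3))
```

The same `GridSearchCV` pattern applies to the other classifiers by swapping the estimator and its parameter grid, for example the tree depth for a decision tree.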

Regression

Build and train models to predict hit distance using the following approaches:
- Linear Regression
- Decision Tree Regression
- Random Forest Regression

Apply regularization (Ridge, Lasso, Elastic Net) and cross-validation (k-fold) techniques for model evaluation, selection, and optimization.
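
The comparison described above could look roughly like the following sketch, assuming scikit-learn and NumPy are installed. The data is synthetic, the formula generating `hit_distance` is invented for illustration, and the `alpha` and `l1_ratio` values are arbitrary placeholders that would normally be tuned.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for Statcast features (assumption, not real data).
rng = np.random.default_rng(0)
n = 300
launch_speed = rng.normal(90, 12, n)
launch_angle = rng.normal(15, 18, n)
X = np.column_stack([launch_speed, launch_angle])

# Made-up linear relationship plus noise, for illustration only.
hit_distance = 3.0 * launch_speed + 2.0 * launch_angle + rng.normal(0, 15, n)

models = {
    "linear": LinearRegression(),                          # no penalty
    "ridge": Ridge(alpha=1.0),                             # L2 penalty
    "lasso": Lasso(alpha=0.1),                             # L1 penalty
    "elastic_net": ElasticNet(alpha=0.1, l1_ratio=0.5),    # mix of L1 and L2
}

# 5-fold cross-validation: each model is fit on 4 folds and scored on the 5th.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_rmse = {}
for name, model in models.items():
    # Scaling inside the pipeline keeps the penalty comparable across features.
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X, hit_distance, cv=cv,
                             scoring="neg_root_mean_squared_error")
    cv_rmse[name] = -scores.mean()  # flip sign back to a positive RMSE

for name, rmse in cv_rmse.items():
    print(f"{name}: {rmse:.2f}")
```

The model with the lowest cross-validated RMSE is the natural candidate for selection; the penalty strength itself can be tuned by repeating the loop over several `alpha` values.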

tweichle/Predicting-Baseball-Statistics