MLB-Analytics-Pipeline

This Python project scrapes and cleans MLB data, then uses pivot tables to compare scenarios such as home vs. away games or specific opponents. It models and visualizes statistical trends, including regression analysis and league leaders, for any specified statistic and scenario, supporting detailed analysis of individual and team performance.

Primary Language: Jupyter Notebook

MLB-Analytics-Pipeline

This Python project is a web scraper that retrieves MLB data for a specified season, team, or player. The data is cleaned and loaded into pandas DataFrames, which are then split into pivot tables by scenario: home vs. away, right-handed vs. left-handed pitchers, opponent, and month. Helper functions visualize each of these splits. Further functions model and visualize statistical trends for any specified statistic and scenario, including league leaders, R-squared comparisons, regression analysis and visualization, and distribution plots. Together, these tools support detailed examination of individual and team performance.
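To illustrate the scenario splits described above, here is a minimal sketch of a home vs. away pivot table in pandas. The game log, the column names (`player`, `venue`, `at_bats`, `hits`), and the helper `split_by_scenario` are all hypothetical and are not taken from the project's notebooks; the values are made up rather than real MLB data.

```python
import pandas as pd

# Hypothetical game log: column names and values are illustrative only,
# not the project's actual schema or real MLB data.
games = pd.DataFrame({
    "player": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "venue": ["home", "away", "home", "away", "home", "away", "home", "away"],
    "at_bats": [4, 5, 3, 4, 4, 4, 5, 3],
    "hits": [2, 1, 1, 0, 1, 2, 3, 1],
})


def split_by_scenario(df, scenario):
    """Return batting average per player for each value of `scenario`.

    Hypothetical helper: sums hits and at-bats per (player, scenario)
    cell first, then divides, so the average is weighted by at-bats.
    """
    totals = df.pivot_table(
        index="player",
        columns=scenario,
        values=["hits", "at_bats"],
        aggfunc="sum",
    )
    # totals has two column levels: (statistic, scenario value).
    return totals["hits"] / totals["at_bats"]


home_away = split_by_scenario(games, "venue")
print(home_away.round(3))
```

Passing a different column name (for example a hypothetical `opponent` or `month` column) would produce the corresponding split with the same function, which is one way the scenarios listed above could share a single code path.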

Libraries Utilized: pandas, NumPy, Matplotlib, scikit-learn, statsmodels
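As a rough sketch of how an R-squared comparison could be built from these libraries, the example below fits a one-variable linear regression per candidate predictor with scikit-learn and compares the resulting scores. The data is synthetic and generated only for illustration, and the variable names (`obp`, `steals`, `runs`) and the helper `r_squared` are assumptions, not the project's own functions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic team-level data, generated for illustration only.
# By construction, runs depend on obp and not on steals.
rng = np.random.default_rng(0)
n_teams = 30
obp = rng.uniform(0.28, 0.36, n_teams)
steals = rng.uniform(20, 150, n_teams)
runs = 2500 * obp - 100 + rng.normal(0, 15, n_teams)


def r_squared(x, y):
    """Fit y ~ x with ordinary least squares and return in-sample R-squared.

    Hypothetical helper, not part of the project.
    """
    features = x.reshape(-1, 1)  # scikit-learn expects a 2-D feature array
    model = LinearRegression().fit(features, y)
    return model.score(features, y)


scores = {
    "obp": r_squared(obp, runs),
    "steals": r_squared(steals, runs),
}

# List predictors from strongest to weakest linear fit.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: R^2 = {score:.3f}")
```

Because the scores are computed in-sample, they only rank how well each statistic tracks the target linearly on the data at hand; a held-out split would be needed to judge predictive value.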