Wake Forest University Senior Year Thesis Work

Primary Language: R

Thesis Work

Noah Edwards-Thro

This repository contains the work for my Honors Thesis in Mathematical Business at Wake Forest University. The thesis clusters NBA players using Gaussian mixture modeling, implemented with the mclust package in R. It was inspired by “NBA Lineup Analysis on Clustered Player Tendencies: A new approach to the positions of basketball & modeling lineup efficiency of soft lineup aggregates”, a paper presented by Samuel Kalman and Jonathan Bosch at the 2020 MIT Sloan Sports Analytics Conference. My work extends their paper in two ways: it evaluates how their clustering model performs on the most recent three years of NBA data, and it investigates ways to reduce the number of variables while achieving similar results.
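For readers unfamiliar with mclust, the following is a minimal sketch of Gaussian mixture clustering with that package. It is not the thesis code: the data are simulated, and the names `toy_players`, `tendency_a`, and `tendency_b` are hypothetical placeholders rather than variables from this repository.

```r
library(mclust)

set.seed(42)

# Simulated stand-in for per-player features (NOT real NBA data):
# two well-separated groups measured on two hypothetical variables.
toy_players <- rbind(
  cbind(rnorm(50, mean = 0, sd = 1), rnorm(50, mean = 0, sd = 1)),
  cbind(rnorm(50, mean = 6, sd = 1), rnorm(50, mean = 6, sd = 1))
)
colnames(toy_players) <- c("tendency_a", "tendency_b")

# Fit Gaussian mixtures with 1 to 5 components. Mclust keeps the
# covariance structure and component count with the best BIC.
fit <- Mclust(toy_players, G = 1:5)

hard_labels  <- fit$classification  # most likely cluster for each row
soft_weights <- fit$z               # posterior membership probabilities

summary(fit)
head(round(soft_weights, 3))
```

Each row of `soft_weights` sums to 1, so a player can belong partly to several clusters instead of being forced into a single position-like label.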

The data are in the Data folder, which is divided into Raw Data, Cleaned Data, and Predictions subfolders.

The code is in the Code folder, which contains R files for the Data Scraper, Data Cleaner, Models, Predictions, Z Value Analysis, and Figures.

The RDAs folder holds saved R objects that are loaded into the Final Report RMD: some figures, along with intermediate data that is no longer raw but not yet fully cleaned.
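As a rough illustration of how an .rda file carries an object from one script into a report, here is a sketch using base R's `save()` and `load()`. The object `intermediate`, its columns, and the file name are hypothetical and do not correspond to files in this repository.

```r
# Hypothetical intermediate object; name and contents are illustrative only.
intermediate <- data.frame(player_id = 1:3, cluster = c(2L, 1L, 2L))

# Write the object to an .rda file (a temporary path is used here).
rda_path <- file.path(tempdir(), "intermediate_example.rda")
save(intermediate, file = rda_path)

# Later, for example in a report chunk: load() restores each object
# under the name it was saved with and returns those names invisibly.
rm(intermediate)
loaded_names <- load(rda_path)
print(loaded_names)
```

Because `load()` recreates objects under their original names, the report does not need to rerun the scraping or cleaning steps that produced them.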

The final paper is available as the Final Paper PDF.