This repository hosts the R code and data file for bootstrapping. With a finite number of samples, the bootstrap method comes in handy and provides valid inferences about the population distribution. However, there are several pitfalls that beginners often fail to account for.
Treating the NBA Game 6 between the Celtics and Heat as an example, I walk through the process of creating bootstrap samples along with the model interpretations. The entire write-up is available at https://towardsdatascience.com/a-practical-guide-to-bootstrap-with-r-examples-bd975ec6dcea.
This project is conducted in the R environment, and you have to pre-install the boot library.
This dataset is the last post-season game between two NBA teams -- Celtics and Heat -- at DraftKings. The dataset contains four variables: Player, Roster Position, %Drafted, and FPTS.
Player: NBA players' names
Roster Position: the position the player held in the DraftKings lineups, whether as Captain (CPT) or Utility (UTIL). DraftKings has a 1.5 multiplication power over the captain.
%Drafted: the percentage of people drafted this player
FPTS: Fantasy Points
This project focuses on the last two variables -- %Drafted and FPTS -- and tries to understand how strongly they correlate with repeated samplings.
Leihua Ye is a Ph.D. Researcher at the UC, Santa Barbara. He has received extensive training in Causal Inference, Research Design, Machine Learning, Big Data, and Machine Learning.
He receives his B.A. and M.A. from the Uni. of Nottingham.
Email: yeleihua@gmail.com
LinkedIn: www.linkedin.com/in/leihuaye
Tech Blog: https://leihua-ye.medium.com