PO_R_For_DataScience

Exploration of methods using R for Data Science

In this project, I go through methods described in R for Data Science by Hadley Wickham and Garrett Grolemund, as well as Advanced R by Hadley Wickham.

I will also explore some concepts more deeply as I go through the books.

Steps of data science:

  1. Data importation: bring data from other places into R or other analytic software
  2. Tidying: organize the data into a format usable by statistical packages
  3. Transformation: select and prepare the data needed for the analysis
  4. Visualisation: explore questions through visualisation
  5. Modelling: answer questions through modelling
  6. Communication: communicate the results

Tidying + Transformation = Wrangling
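
As a rough illustration of how the six steps fit together, here is a minimal sketch in base R. The plant-height data, the column names, and the temporary file are all made up for the example, and base R functions stand in for the tidyverse functions the books teach.

```r
# 1. Import: write a tiny, made-up CSV to a temp file and read it back.
#    (The empty field for plant b in week2 is read as NA.)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "plant,week1,week2,week3",
  "a,1.0,2.1,2.9",
  "b,1.5,,3.6"
), csv_path)
wide <- read.csv(csv_path)

# 2. Tidy: reshape to one row per (plant, week) observation.
tidy <- data.frame(
  plant  = rep(wide$plant, times = 3),
  week   = rep(1:3, each = nrow(wide)),
  height = c(wide$week1, wide$week2, wide$week3)
)

# 3. Transform: keep only the rows usable for the analysis.
analysis <- tidy[!is.na(tidy$height), ]

# 4. Visualise: plot height against week, one colour per plant.
#    The plot goes to a temporary PDF so the script also runs non-interactively.
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)
plot(height ~ week, data = analysis,
     col = as.integer(factor(analysis$plant)), pch = 19)
dev.off()

# 5. Model: fit a simple linear trend of height over time.
fit <- lm(height ~ week, data = analysis)

# 6. Communicate: report the estimated growth rate in plain language.
cat(sprintf("Estimated growth: %.2f units per week\n", coef(fit)[["week"]]))
```

Steps 2 and 3 together are the wrangling part: after them, `analysis` has one observation per row and no missing heights, which is the shape that `plot()` and `lm()` expect.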

Hypothesis generation vs. Hypothesis confirmation

  1. Hypothesis generation: explore the data to uncover patterns, ask questions about processes, and explain the patterns
  2. Hypothesis confirmation: develop a model that can reproduce the patterns in the data and confirm the theory
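
One way to keep the two activities separate is to explore one part of the data and confirm on another. The sketch below is only an illustration under that assumption: the half-and-half split, the seed, and the use of the built-in `mtcars` data set are my choices, not something prescribed in this project.

```r
# Split the built-in mtcars data into two halves (arbitrary seed for reproducibility).
set.seed(1)
idx <- sample(nrow(mtcars), size = nrow(mtcars) / 2)
explore <- mtcars[idx, ]
confirm <- mtcars[-idx, ]

# Hypothesis generation: scan the exploration half for the variable
# most strongly correlated with fuel efficiency (mpg).
cors <- cor(explore)["mpg", ]
cors <- cors[names(cors) != "mpg"]
candidate <- names(which.max(abs(cors)))

# Hypothesis confirmation: fit a model for that one relationship
# on the half of the data that was not used for exploration.
fit <- lm(reformulate(candidate, response = "mpg"), data = confirm)
p_value <- summary(fit)$coefficients[candidate, "Pr(>|t|)"]

cat(sprintf("Candidate predictor: %s, p-value on held-out half: %.4f\n",
            candidate, p_value))
```

Because the confirmation half played no part in choosing `candidate`, the p-value is not biased by the earlier search through the variables.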