/intro_to_data_science_R

Introduction to Data Science with R course

R corporate training course

This repository contains a corporate training course in R. The course starts with the basics and then moves on to more advanced aspects of R. It was developed by Adi Sarid of the Sarid Research Institute LTD.

See more at Adi's blog and at Sarid Research Institute. Get in touch: Twitter @SaridResearch, LinkedIn. The course was built in cooperation with Naya College.

License

The course as a whole is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0.

Syllabus

The course covers the following topics:

  • Introduction, RStudio IDE, syntax (base vs. tidyverse), functions, loops (base R), data types (Intermediate: + efficiency and benchmarking)
  • “Telling stories with charts”: visualizations with ggplot2 (theory and practice) (Intermediate, TBD: we might add one of gganimate, plotly, leaflet)
  • Basic data import, read/write (CSV, Excel)
  • Introduction to the tidyverse (dplyr, tidyr, tibbles) + Data preparation and transformation
  • Tidyverse continued: exercises + extensions (dates, strings, factors) + Exploratory data analysis, handling outliers
  • Solving business problems with R (Modelling, optimization, classification/regression, ROC)
  • purrr-ing functions (functional programming and iteration with map()) (Intermediate: + make your functions tidyverse-friendly with non-standard evaluation)
  • Solving business problems with R - part 2: Additional exercises + Modelling extended
  • TBD: Either time series OR RMarkdown OR Dashboards + Summary + Learning more.

How this repository is arranged

The repository contains presentations (slides) in pptx and pdf formats, exercises (in RMarkdown and in pdf), and accompanying datasets needed for the exercises.

├── presentations
│   ├── pptx
│   └── pdf
├── exercises
│   └── answers
└── datasets

Additional sources