/intro-tidyverse-2022

Course materials and website for the January 2022 course "Introduction to Data Science with R and Tidyverse"

Primary language: R. License: Creative Commons Attribution Share Alike 4.0 International (CC-BY-SA-4.0).

Introduction to Data Science with R and Tidyverse

This repository contains all materials for the course Introduction to Data Science with R and Tidyverse, offered for GRADE Brain and other GRADE Centers at Goethe University in January 2022. It also serves as the course website for students.

Course Objective

Most academic fields require proficiency in at least one data-centered analysis tool. For many, the R programming language has become the tool of choice.

However, the first steps in coding can be intimidating and discouraging, especially if you have never worked with a programming language before. This course aims to provide a results-oriented, applied, and hands-on introduction to the most critical parts of a Data Science project in R. We will introduce the libraries and frameworks necessary for your analysis and focus on teaching you how to implement and apply those tools with small examples that you can work on yourself.

Our goal is to show you the scope of possibilities within R and leave you confident that you can implement your own empirical projects in R. We will focus on the Tidyverse ecosystem, a consistent and intuitive framework for building your data analysis from start to finish. After completing this course, you will know how to apply the essential Tidyverse tools for everyday Data Science tasks in R, primarily data wrangling, data visualization, and communicating results.

Course Description

This course is aimed at beginners who are completely new to R as a programming language, as well as R users who want to learn about the Tidyverse ecosystem. The course is structured in the following way:

Introduction to the Tidyverse

  • Reading data into tibbles with readr and a short primer on data types
  • Plotting with ggplot2: aesthetics, geoms and the grammar of graphics
  • Data wrangling with dplyr: mutate(), select(), filter(), group_by(), summarize(), …_join(), pipe-operator
  • Communicating your analyses with RMarkdown in a reproducible way
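
As a rough illustration of how the first three topics fit together, here is a minimal sketch rather than actual course material. It assumes the readr, dplyr, and ggplot2 packages are installed, and it uses R's built-in mtcars data set (written to a temporary CSV file) as a stand-in for your own data. The column choices, the weight threshold, and the names csv_path, cars, cyl_summary, and km_per_litre are purely illustrative.

```r
library(readr)
library(dplyr)
library(ggplot2)

# Assumption: mtcars stands in for real data. Writing it to a temporary
# CSV keeps the example self-contained.
csv_path <- tempfile(fileext = ".csv")
write_csv(mtcars, csv_path)

# Read the file into a tibble; readr guesses the column types.
cars <- read_csv(csv_path, show_col_types = FALSE)

# Wrangle with the pipe operator: pick columns, filter rows,
# add a derived column, then summarize per group.
cyl_summary <- cars %>%
  select(mpg, cyl, wt) %>%
  filter(wt < 5) %>%                          # wt is in 1000 lbs
  mutate(km_per_litre = mpg * 0.425144) %>%   # US mpg to km per litre
  group_by(cyl) %>%
  summarize(mean_kpl = mean(km_per_litre), n_cars = n())

print(cyl_summary)

# Plot: map columns to aesthetics, then add a geom.
p <- ggplot(cars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", color = "Cylinders")

print(p)
```

Each step takes a tibble and returns a tibble, which is what lets the dplyr verbs chain with the pipe operator.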

Short primer on modeling with R

  • Univariate and multivariate linear regression with lm()
  • Visualizing regressions with ggplot2
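
The two modeling topics above might look roughly like the following sketch. It again uses the built-in mtcars data set as an assumed example; the choice of mpg as the outcome and wt and hp as predictors is illustrative only, as are the names fit_uni, fit_multi, and p. The second model shows a regression with more than one predictor.

```r
library(ggplot2)

# Regression with one predictor: fuel economy explained by weight.
fit_uni <- lm(mpg ~ wt, data = mtcars)

# Regression with several predictors: add horsepower.
fit_multi <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficients, standard errors, and R-squared.
print(summary(fit_multi))

# Visualize the one-predictor fit: raw points plus the fitted line
# with its confidence band.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)

print(p)
```

geom_smooth(method = "lm") refits the one-predictor model internally for plotting, so the line it draws corresponds to fit_uni, not to fit_multi.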

Next steps on your journey with R

We will not cover deeper statistical or theoretical concepts in this course, as the focus will lie on applied coding.

Methods

The course will alternate between short introductions to a concept or method and small do-it-yourself coding exercises. Between the three sessions, you are encouraged to work on the provided exercises to deepen your understanding.

Conditions

  • No prior coding experience needed. This is a beginner-friendly course. You are also more than welcome to participate if you have experience in R but want to learn more about the Tidyverse.
  • An RStudio Cloud account. To avoid spending course time on technical setup, we will use RStudio Cloud as a simple, preconfigured development environment. We will send out detailed instructions and an invitation link in advance.

Trainers

Over the last three years, your trainers have developed and taught TechAcademy’s Data Science with R program at Goethe University. They use Data Science methods and R daily in their academic and non-academic jobs.

  • Lukas Jürgensmeier, M.Sc., PhD Student in Quantitative Marketing and Member of the Executive Board at TechAcademy e.V.
  • Lara Zaremba, M.Sc., Trainee Data Science at the European Central Bank and R Teacher and Course Designer at TechAcademy e.V.
  • Karlo Lukic, M.Sc., PhD Student in Quantitative Marketing, R Teacher and Course Designer at TechAcademy e.V.