This repository contains all student materials (assignments, data, ...) for the programming data science course in summer 2019.
If you participate in the course, please email Jörn Grahl to request access.
Students are granted read-only access as collaborators. Please fetch updates regularly.
Here is the tentative schedule for summer 2019:

| # | Date | Topic |
|---|---|---|
1 | 04.04.2019 | Fundamentals: Organization, dates, groups, project leaders. Tools and accounts. Source control. Reproducibility. Coding and reporting. Method chaining. Folders. |
2 | 11.04.2019 | What can computers learn from data? Mapping questions to model classes, statements, and tests. The ladder of causality. |
3 | 25.04.2019 | Visualizations (plots): Grammar of graphics |
4 | 02.05.2019 | Coding 1: Data wrangling 1/2 & method chaining |
5 | 09.05.2019 | Coding 2: Data wrangling 2/2 |
6 | 16.05.2019 | Flexible tables for descriptive statistics (and everything else), regression results |
7 | 23.05.2019 | Comparisons: basic inference, tests, p-values, multiple comparisons |
8 | 06.06.2019 | Choosing the best model: Overfitting, bias-variance tradeoff, resampling schemes, choosing the best ML model, cross-validation, paired t-tests, training and evaluation errors |
9 | 27.06.2019 | In-depth linear regression |
10 | 04.07.2019 | In-depth decision trees |
11 | 11.07.2019 | Buffer |