Welcome to your community-sourced data science repo! The overarching goal here is to provide anyone interested in learning data science with a wealth of open-source, industry-best learning materials and learning tracks.
This repo is a work in progress. Please check back for updates. @momiji15, @Annu-07, and I are collaborating on the structure for this repo. If you would like to be involved in that process, please file an issue in this repo and we will add you to our Slack channel.
This repo is motivated by recent incidents. The data science community deserves better, and this repo is an attempt to provide a platform for the excellent learning resources available.
- Dataquest
- Software Carpentry Lessons
- Data Carpentry Lessons
- Chromebook Data Science
- Business Science University
Many instructors have admirably advocated against taking their own DataCamp courses. Often, these instructors have suggested other ways in which learners can access the same material. The suggested replacements for their courses are listed below:
- Also see here
Working with the RStudio IDE (Part 1)
Working with the RStudio IDE (Part 2)
- Also see here
Importing & Cleaning Data in R: Case Studies
- See Guided Project: NYC Schools Perceptions
Working with Dates and Times in R
Categorical Data in the Tidyverse
- Also see here
- Next, learn about iteration
Data Manipulation in R with dplyr
Data Analysis in R, the data.table Way
Building Processing Pipelines in data.table
- Also see the usethis documentation
Foundations of Probability in R
- See weeks 3 and 4
- Also see here
Dealing With Missing Data in R
Advanced Dimensionality Reduction in R
Fundamentals of Bayesian Data Analysis in R
Structural Equation Modeling with lavaan in R
Introduction to Machine Learning
Supervised Machine Learning: Case Studies in R
- Also see a book on the caret package
Differential Expression Analysis in R with limma
Bayesian Regression Modeling with rstanarm
- Also see a walkthrough article and a practical example
Introduction to Time Series Analysis
- Also see here
Forecasting Product Demand in R
- Also see here
Nonlinear Modeling in R with GAMs
Marketing Analytics in R: Choice Modeling
- Please see Chapter 13
- Also see the mlr package docs and the h2o package docs
Exploratory Data Analysis in R: Case Study
- Please see week 4
Visualization Best Practices in R
Data Visualization with ggplot2 (Part 1)
Data Visualization with ggplot2 (Part 2)
Building Dashboards with shinydashboard
Building Dashboards with flexdashboard
Interactive Data Visualization with rbokeh
Interactive Maps with leaflet in R
Working with Geospatial Data in R
Building Web Applications in R with Shiny
Building Web Applications in R with Shiny: Case Studies
Introduction to Text Analysis in R
Sentiment Analysis in R: The Tidy Way
Analyzing Election and Polling Data in R
Single-Cell RNA-Seq Workflows in R
Data science for the medical and biomedical sciences (ds4biomed)
- Also see here
Introduction to Data Science in Python
Intermediate Python for Data Science
Object-Oriented Programming in Python
Analyzing Police Activity with pandas
Interactive Data Visualization with Bokeh
- Also see here (requires subscription)
- Please see the Intermediate SQL section
- Also see here (requires subscription)
Introduction to Git for Data Science
- Also see here
- Also see git branching
Introduction to Shell for Data Science
Please feel free to submit a pull request. The full list of DataCamp courses can be found here.