- Overview
- Software & Programming
- Data Science Software Development
- Data Science Ethics
- Developing interactive applications
- Visualisation
- GIS and Spatial Data Science
- Time series analysis
- Generalised Additive Modelling (GAMs)
- Statistics
- Data Science community groups
- Natural language processing
- Machine Learning
- Special Topics
🚧 This page is a work in progress!
The goal of this page is to gather resources and learning materials across a broad range of popular data science topics and arrange them thematically. Resources have been selected because they:
- Are high quality
- Are free of charge
- Don’t require readers to sign up
Remember that material offered freely on the web is paid for with the author’s time. If you find a resource particularly useful, consider supporting its author in whatever way they prefer. If you find this page useful, please share it and spread the word! If you find a mistake or a broken link, please file an issue or submit a pull request.
Key to resource types
- ▶️ = Video or webinar
- ๐ = Course
- ๐ = Tutorial or blog post
- ๐ = Book or book chapter
- 👥 = Community or user forum
- ๐ = Journal or technical article
- 💡 = Cheat sheet
- ๐ Modern Dive: Getting Started by Chester Ismay and Albert Y. Kim. The very first of first steps. Install R & RStudio and what to do after that.
- ๐ Basic Basics by R Ladies Sydney. Tour of RStudio, installing and using packages and getting data into RStudio.
- ๐ Teacups, Statistics and Giraffes by Hasse Walum and Desirée de Leon
- ▶️ A Gentle Introduction to Tidy Statistics in R by Thomas Mock, RStudio. Webinar covering exploratory data analysis, tidyverse, statistical testing and plotting.
- ๐ The R Bootcamp by Ted Laderas and Jessica Minnier. A tidyverse-centric interactive course for data manipulation, graphics, data reshaping, and statistical modelling.
- ๐ Ready for R by Ted Laderas
- ๐ RStudio Primers by RStudio. Interactive tutorials from RStudio covering data manipulation, visualisation and programming with R.
- ๐ Swirl: Learn R, in R by Ismael Fernández, Nick Carchedi and Sean Kross
- ๐ Using R for Data Journalism by Andrew Ba Tran
- ๐ R for Data Science by Garrett Grolemund and Hadley Wickham
- 💡 Base R Cheat Sheet by Mhairi McNeill. Quick overview of basic R functionality.
- ๐ Tidynomicon - A Brief Introduction to R for People Who Count From Zero by Greg Wilson. An introduction to R for Python users.
- ๐ Hands-on Programming with R by Garrett Grolemund. A friendly introduction to the R language for non-programmers.
- ๐ R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics by James (JD) Long, Paul Teetor. Recipes and worked examples for performing core tasks in R.
- ๐ R package primer: a minimal tutorial by Karl Broman. Overview of R packages development.
- ๐ R Packages by Hadley Wickham and Jennifer Bryan. Comprehensive guide to how R packages work and how to write your own.
- ๐ Efficient R programming by Colin Gillespie and Robin Lovelace. Comprehensive introduction to writing faster and more efficient R code.
- ๐ Advanced R by Hadley Wickham
- ▶️ RStudio Webinars by RStudio
- ๐ An Introduction to R by W. N. Venables, D. M. Smith and the R Core Team
- ๐ / ๐ Data science for economists by Grant McDermott
- ๐ / ๐ Big Data in Economics by Grant McDermott
- ๐ Install Python and Anaconda by Anaconda.
- ๐ Free interactive introduction to Python and pandas.
- ๐ Quick reference to Python in a single script and notebook by Kevin Markham.
- ๐ An Introduction to Python and Programming by Alexander Hess.
- ๐ Learn Python by Ron Reiter.
- 💡 Pandas Cheat Sheet by the Pandas development team.
- ๐ 10 minutes to pandas by the Pandas development team
- ๐ Python Data Science Handbook by Jake VanderPlas.
- ๐ Python for Everybody: Exploring Data Using Python 3 by Charles R. Severance
- ๐ Learn Shell by Ron Reiter.
- ๐ The Unix Shell by Software Carpentry
- ๐ The Beginner’s Guide to Shell Scripting: The Basics by Yatri Trivedi
- ๐ Beginners/BashScripting by Ubuntu Documentation
- ▶️ How to Write a Shell Script using Bash Shell in Ubuntu by FS Tutorial
- ๐ RegexOne: Learn Regular Expressions with simple, interactive exercises. by RegexOne
- ๐ Regular Expressions 101: Online Regular Expression Tester and Debugger by Firas Dib
- 💡 Data Science Cheat Sheet: Python Regular Expressions by Dataquest
- 💡 Regular Expressions Cheat Sheet by Dave Child
- ๐ Getting started with Git and GitHub: the complete beginner’s guide by Anne Bonner
- ๐ An introduction to Git and how to use it with RStudio by François Michonneau
- 💡 Git Cheat Sheet by GitHub
- ๐ Pro Git by Scott Chacon and Ben Straub
- ๐ Happy Git and GitHub for the useR by Jenny Bryan, the STAT 545 TAs, Jim Hester
- 💡 PySpark Cheat Sheet by Kevin Schaich
- ๐ Mastering Spark with R by Javier Luraschi, Kevin Kuo and Edgar Ruiz
- ▶️ R & Spark: How to Analyze Data Using RStudio’s Sparklyr by Nathan Stephens
- ๐ A Gentle Introduction to Spark by Databricks
- ๐ Learn JS by Ron Reiter
- ๐ JavaScript for Data Science by Maya Gans, Toby Hodges, and Greg Wilson
- ๐ / ๐ The SQL Tutorial for Data Analysis by mode.com. Tutorials and interactive exercises teaching the fundamentals of SQL.
- ๐ SQLBolt: Learn SQL with simple, interactive exercises.
- ๐ / ๐ SQLZoo: SQL Tutorial. Wikibook with interactive exercises.
- ๐ Intro to SQL: Querying and managing data by Khan Academy
- ๐ LearnSQLOnline by Ron Reiter
- ๐ Software development skills for data scientists by Trey Causey
- ๐ Hidden Technical Debt in Machine Learning Systems
- ๐ How rOpenSci uses Code Review to Promote Reproducible Science by Noam Ross, Scott Chamberlain, Karthik Ram and Maëlle Salmon
- ๐ Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research by Victoria Stodden and Sheila Miguez
- ๐ Journalism as a Professional Model for Data Science by Brian C. Keegan
- ๐ Cookiecutter Data Science by drivendata
- ๐ A Code of Ethics for Data Science by DJ Patil
- ๐ The Ethical Data Scientist by Cathy O’Neil
- ๐ An ethics checklist for data scientists by drivendata
- ⏯ / ๐ Learn Shiny by RStudio
- ๐ A gRadual intRoduction to Shiny by Ted Laderas and Jessica Minnier
- ๐ Dashboards by Yihui Xie, J. J. Allaire, Garrett Grolemund. Chapter 5 from “R Markdown: The Definitive Guide”.
- ๐ Leaflet for R by RStudio
- ๐ Dash User Guide by Plotly
- ๐ Fundamentals of Data Visualization by Claus O. Wilke
- ๐ ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham
- ๐ 3D Mapping and Visualization with R and Rayshader by Tyler Morgan-Wall
- ๐ Forecasting: Principles and Practice by Rob J Hyndman and George Athanasopoulos
- ๐ 11 Classical Time Series Forecasting Methods in Python (Cheat Sheet) by Jason Brownlee
- ๐ GAMs in R by Noam Ross. Interactive course introducing Generalised Additive Models (GAMs).
- ๐ Resources for Learning About and Using GAMs in R by Noam Ross
- ๐ Statistical Inference via Data Science: A Modern Dive into R and the tidyverse by Chester Ismay and Albert Y. Kim
- ๐ Think Stats: Exploratory Data Analysis in Python by Allen B. Downey
- ๐ Learning statistics with R: A tutorial for psychology students and other beginners by Danielle Navarro
- ๐ Probabilistic Programming & Bayesian Methods for Hackers by Cameron Davidson-Pilon
- ๐ From Algorithms to Z-Scores: Probabilistic and Statistical Modeling in Computer Science by Norm Matloff
- ๐ Theory of Statistics by James E. Gentle
- ๐ Core Statistics by Simon Wood
- 👥 PyData Meetup Groups
- 👥 PyLadies by PyLadies
- 👥 Directory of R User Groups by Jumping Rivers
- 👥 Complete list of R-Ladies groups by R-Ladies Global
- 👥 R for Data Science Online Learning Community
- 👥 Tidy Tuesday. A weekly podcast and community activity brought to you by the R4DS Online Learning Community.
- 👥 SatRdays. SatRdays are R-focused conferences that are held on Saturdays.
- ๐ Text Mining with R: A Tidy Approach by Julia Silge and David Robinson
- ๐ Advanced NLP with spaCy by Ines Montani
- ๐ 100 Must read papers in NLP by Masato Hagiwara
- ๐ Stanford CS 124: From Languages to Information by Dan Jurafsky
- ๐ Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit by Steven Bird, Ewan Klein, and Edward Loper.
- ๐ A Code-First Intro to Natural Language Processing by fast.ai. The course is taught in Python with Jupyter Notebooks, using libraries such as sklearn, nltk, pytorch, and fastai.
- ๐ Speech and Language Processing by Dan Jurafsky and James H. Martin
- ▶️ BERT Research Series by Chris McCormick
- ๐ Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani and Jerome Friedman (2017)
- ๐ Computer Age Statistical Inference: Algorithms, Evidence and Data Science by Bradley Efron and Trevor Hastie (2017).
- ๐ Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong
- ๐ Distill
- ๐ Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeff Ullman
- ๐ Supervised Machine Learning Case Studies in R by Julia Silge.
- ๐ / ๐ฎ Introduction to machine learning with scikit-learn by Kevin Markham
- ๐ scikit-learn User Guide by scikit-learn
- ๐ Introduction to Machine Learning for Coders by Jeremy Howard.
- ๐ Interpretable Machine Learning: A Guide for Making Black Box Models Explainable by Christoph Molnar (2020)
- ๐ Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani