A list of data science resources.

Resource | Comments |
---|---|
R for Data Science | Introduction to modern R programming using the tidyverse |
Text Mining with R | Guide to a modern approach to text mining in R |
The caret package | Guide to performing machine learning in R using the caret package |
An Introduction to Statistical Learning | A book on machine learning with examples in R |
Shiny | Learn how to use the Shiny package to create interactive dashboards |
Advanced R | Advanced guide to R; the style guide is particularly good |
R packages | Guide to writing packages |
The reticulate package | Website of the reticulate package, which allows you to use Python functions within R. For example, see this blog post, where it is used to embed a Python model within a Shiny web app. (Similarly, look at the feather package for passing dataframes between R and Python.) |
R-bloggers | Blogs about the use of R in analytics |
knitr in a knutshell | A short introduction to the knitr package for reproducible research |

Resource | Comments |
---|---|
bl.ocks | Website showing popular D3 examples |
Observable | A notebook approach to writing D3 and JavaScript |
Search the Bl.ocks | Search D3 examples produced by others (great for inspiration!) |
D3 Tips and Tricks | A good book about D3 |
D3 in Depth | A good introduction to writing D3 |
D3 tutorial list | A list of D3 tutorials from the D3 website |
A better way to structure D3 code | Interesting blog post on how to structure D3 code |
Eloquent JavaScript | Knowing a bit of JavaScript is a prerequisite for mastering D3, and this book is a good introduction |
A Tour Through the Visualization Zoo | A good introduction to a wide range of visualisations you could do in D3 (though here they have been done in a precursor to D3) |
dc.js | A library that combines D3 and crossfilter to make it easier to create interactive dashboards |

Resource | Comments |
---|---|
scikit-learn | sklearn is the go-to Python package for machine learning, and the documentation is a wealth of information, not only on usage but also on the techniques themselves |
Modern Pandas | A guide to using pandas dataframes |
imbalanced-learn | A package, with excellent documentation, for classifying imbalanced data |
Natural Language Processing with Python | This is a book on NLP in Python from the team behind the NLTK package. For text mining you should also look into spaCy and gensim (for topic modelling) |
Requests: HTTP for Humans | Library for making HTTP requests from Python; a great way to make API calls |
Flask | Python framework for creating web apps |
Seven Strategies for Optimizing Numerical Code | Slides on different approaches to speeding up Python code |
Comparing Python Clustering Algorithms | Does what it says on the tin! |
Style Guide for Python Code | This is PEP 8, the official style guide for Python. One incentive for following its guidance is that your code will integrate better with IDEs |

Resource | Comments |
---|---|
Feature Engineering and Selection | Guide to feature engineering and feature selection |
Kaggle Ensembling Guide | A guide to combining models to improve performance |
Elements of Statistical Learning | The classic text on machine learning |

Resource | Comments |
---|---|
Neural Networks and Deep Learning | Simple introduction to neural networks |
Convolutional Neural Networks for Visual Recognition | Stanford course on convolutional neural networks |
Understanding Convolutional Neural Networks for NLP | Article explaining CNNs in the context of NLP |
On word embeddings | Introduction to word embeddings |
fast.ai | Online AI course |

Resource | Comments |
---|---|
Towards Data Science | Interesting articles about data science |
Data Science Weekly | Weekly newsletter that aggregates data science articles |
Why Use Make | Thoughts from Mike Bostock on using make for reproducible research |
Statistical Modeling: The Two Cultures | Leo Breiman's article on the difference between statistical models and algorithmic models |