Examples from the book and a few other hacks inspired by Joel Grus' Data Science from Scratch.
I had a lot of fun with this book. Data science lends itself to the hacker's approach of diving in and getting your hands dirty with a breadth of topics.
That said, the book gives a fly-over view of some fairly deep subjects, leaving the reader with a good lay of the land but also an intimidating sense of how much there is left to learn.
I had little trouble using Python 3 to work through the book even though it's written in Python 2. You'll have to add some parentheses here and there (print is a function in Python 3) and be aware that map returns a lazy iterator in Python 3 rather than a list. The 2nd Edition is fully Python 3 and in the works now.
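As a quick illustration of the map difference, here is a minimal sketch. The variable names are made up for this example and do not come from the book:

```python
# In Python 3, map returns a lazy iterator instead of a list.
squares = map(lambda x: x * x, [1, 2, 3])

# Materialize the values with list(); this consumes the iterator.
first_pass = list(squares)

# A second pass over the same map object yields nothing, because it is exhausted.
second_pass = list(squares)

# print is a function in Python 3, so the parentheses are required.
print(first_pass)   # [1, 4, 9]
print(second_pass)  # []
```

If book code indexes into the result of map or iterates over it twice, wrapping the call in list() is usually the smallest fix.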
- Introduction
- A Crash Course in Python
- Visualizing Data
- Linear Algebra
- Statistics
- Probability
- Hypothesis and Inference
- Gradient Descent
- Getting Data
- Working with Data
- Machine Learning
- k-Nearest Neighbors
- Naive Bayes
- Simple Linear Regression
- Multiple Regression
- Logistic Regression
- Decision Trees
- Neural Networks
- Clustering
- Natural Language Processing
- Network Analysis
- Recommender Systems
- Databases and SQL
- MapReduce
- Joel's JupyterCon talk "I don't like notebooks" and slides.
- 100+ Interesting Data Sets for Statistics by Robb Seaton
I divided things up by chapter, which makes imports difficult. You'll have to do some ridiculous thing like this:
export PYTHONPATH=./chapter_01:./chapter_03:./chapter_04:./chapter_05:./chapter_06:./chapter_07
or this:
import os.path
import sys
book_dir = '/Users/CBare/Documents/projects/data-science-from-scratch'
sys.path.extend(os.path.join(book_dir, 'chapter_{:02d}'.format(i)) for i in [3,4,5,6,7])
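
A slightly more portable variant of the same idea might avoid the hard-coded home directory. This is only a sketch: the helper names chapter_dirs and add_chapters_to_path are hypothetical, and it assumes you run it from the repository root so that the chapter_NN directories sit under the current working directory.

```python
import sys
from pathlib import Path


def chapter_dirs(book_dir, chapters):
    """Return the chapter_NN directory paths under book_dir as strings."""
    root = Path(book_dir)
    return [str(root / "chapter_{:02d}".format(i)) for i in chapters]


def add_chapters_to_path(book_dir, chapters):
    """Append chapter directories to sys.path, skipping any already present."""
    for d in chapter_dirs(book_dir, chapters):
        if d not in sys.path:
            sys.path.append(d)


# Assumption: the current working directory is the repository root.
add_chapters_to_path(Path.cwd(), [3, 4, 5, 6, 7])
```

Skipping directories that are already on sys.path keeps the list from growing when the snippet is re-run, for example in an interactive session.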