
Code for the book Data Science from Scratch

Primary language: Python. License: The Unlicense.

Data Science from Scratch

Here's all the code and examples from the book Data Science from Scratch by Joel Grus. The code directory contains Python 2.7 versions, and the code-python3 directory contains the Python 3 equivalents. (I tested them in 3.5, but they should work in any 3.x.)

Each can be imported as a module, for example (after you cd into the /code directory):

from linear_algebra import distance, vector_mean
v = [1, 2, 3]
w = [4, 5, 6]
print distance(v, w)
print vector_mean([v, w])
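
The snippet above uses the Python 2 print statement, which matches the /code directory. As a rough Python 3 sketch of the same calls, the block below defines its own stand-ins for distance and vector_mean so that it runs without the repository. The bodies assume Euclidean distance and a componentwise mean; they are illustrative and not necessarily the exact implementations in linear_algebra.py.

```python
import math
from typing import List

Vector = List[float]


def distance(v: Vector, w: Vector) -> float:
    """Euclidean distance between two equal-length vectors (assumed definition)."""
    assert len(v) == len(w), "vectors must be the same length"
    return math.sqrt(sum((v_i - w_i) ** 2 for v_i, w_i in zip(v, w)))


def vector_mean(vectors: List[Vector]) -> Vector:
    """Componentwise mean of a non-empty list of equal-length vectors (assumed definition)."""
    assert vectors, "no vectors provided"
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


v = [1, 2, 3]
w = [4, 5, 6]
print(distance(v, w))        # sqrt(27), roughly 5.196
print(vector_mean([v, w]))   # [2.5, 3.5, 4.5]
```

With the repository checked out, the two function definitions would be replaced by the import line shown earlier, run from inside the code-python3 directory.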

Each can also be run from the command line to get a demo of what it does (and to execute the examples from the book):

python recommender_systems.py
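
A file can be both an importable module and a runnable demo when its demo code sits behind Python's main guard. The sketch below shows that pattern in isolation. The names dot and run_demo are hypothetical and chosen for illustration; how each file in the repository structures its demo is an assumption here.

```python
def dot(v, w):
    """Sum of componentwise products of two equal-length vectors."""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


def run_demo():
    """Hypothetical demo routine: prints its output lines and returns them."""
    lines = ["dot([1, 2, 3], [4, 5, 6]) = {}".format(dot([1, 2, 3], [4, 5, 6]))]
    for line in lines:
        print(line)
    return lines


if __name__ == "__main__":
    # This branch runs when the file is executed directly,
    # and is skipped when the file is imported as a module.
    run_demo()
```

Saved under a hypothetical name such as demo_module.py, running python demo_module.py would print the demo line, while from demo_module import dot would make the function available without printing anything.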

All the links from the book are also provided.

By popular demand, I also made an index of the functions defined in the book, organized by chapter and page number. The data is in a spreadsheet, and there is also a toy (experimental) searchable webapp.

Table of Contents

  1. Introduction
  2. A Crash Course in Python
  3. Visualizing Data
  4. Linear Algebra
  5. Statistics
  6. Probability
  7. Hypothesis and Inference
  8. Gradient Descent
  9. Getting Data
  10. Working With Data
  11. Machine Learning
  12. k-Nearest Neighbors
  13. Naive Bayes
  14. Simple Linear Regression
  15. Multiple Regression
  16. Logistic Regression
  17. Decision Trees
  18. Neural Networks
  19. Deep Learning
  20. Clustering
  21. Natural Language Processing
  22. Network Analysis
  23. Recommender Systems
  24. Databases and SQL
  25. MapReduce
  26. Data Ethics
  27. Go Forth and Do Data Science!