python-machine-learning-book

Python Machine Learning code repository.

What you can expect are 400 pages rich in useful material: just about everything you need to know to get started with machine learning, from theory to code that you can put directly into action. This is not just another "this is how scikit-learn works" book. I aim to explain all the underlying concepts, tell you everything you need to know in terms of best practices and caveats, and we will put those concepts into action mainly using NumPy, scikit-learn, and Theano.

Not sure if this book is for you? Please check out the excerpts from the Foreword and Preface, or take a look at the FAQ section for further information.

1st edition, published September 23rd 2015

Paperback: 454 pages

Publisher: Packt Publishing

Language: English

ISBN-10: 1783555130

ISBN-13: 978-1783555130

Kindle ASIN: B00YSILNL0

Feedback & Reviews

Sebastian Raschka’s new book, Python Machine Learning, has just been released. I got a chance to read a review copy and it’s just as I expected - really great! It’s well organized, super easy to follow, and it not only offers a good foundation for smart, non-experts, practitioners will get some ideas and learn new tricks here as well.
– Lon Riesberg at Data Elixir

Superb job! Thus far, for me it seems to have hit the right balance of theory and practice…math and code!
– Brian Thomas

I've read (virtually) every Machine Learning title based around Scikit-learn and this is hands-down the best one out there.
– Jason Wolosonovich

Table of Contents and Code Notebooks

Simply click on the ipynb/nbviewer links next to the chapter headlines to view the code examples (currently, the internal document links are only supported by the NbViewer version). Please note that these are just the code examples accompanying the book, which I uploaded for your convenience; be aware that these notebooks may not be useful without the formulae and descriptive text.

Excerpts from the Foreword and Preface
Instructions for setting up Python and the Jupyter Notebook

Machine Learning - Giving Computers the Ability to Learn from Data [dir] [ipynb] [nbviewer]
Training Machine Learning Algorithms for Classification [dir] [ipynb] [nbviewer]
A Tour of Machine Learning Classifiers Using Scikit-Learn [dir] [ipynb] [nbviewer]
Building Good Training Sets – Data Pre-Processing [dir] [ipynb] [nbviewer]
Compressing Data via Dimensionality Reduction [dir] [ipynb] [nbviewer]
Learning Best Practices for Model Evaluation and Hyperparameter Optimization [dir] [ipynb] [nbviewer]
Combining Different Models for Ensemble Learning [dir] [ipynb] [nbviewer]
Applying Machine Learning to Sentiment Analysis [dir] [ipynb] [nbviewer]
Embedding a Machine Learning Model into a Web Application [dir] [ipynb] [nbviewer]
Predicting Continuous Target Variables with Regression Analysis [dir] [ipynb] [nbviewer]
Working with Unlabeled Data – Clustering Analysis [dir] [ipynb] [nbviewer]
Training Artificial Neural Networks for Image Recognition [dir] [ipynb] [nbviewer]
Parallelizing Neural Network Training via Theano [dir] [ipynb] [nbviewer]

Bonus Notebooks (not in the book)

Logistic Regression Implementation [dir] [ipynb] [nbviewer]
A Basic Pipeline and Grid Search Setup [dir] [ipynb] [nbviewer]
An Extended Nested Cross-Validation Example [dir] [ipynb] [nbviewer]
A Simple Barebones Flask Webapp Template [view directory][download as zip-file]
Reading handwritten digits from MNIST into NumPy arrays [GitHub ipynb] [nbviewer]
Scikit-learn Model Persistence using JSON [GitHub ipynb] [nbviewer]
Multinomial logistic regression / softmax regression [GitHub ipynb] [nbviewer]

Note

I have set up a separate library, mlxtend, containing additional implementations of machine learning (and general "data science") algorithms. I also added implementations from this book (for example, the decision region plot, the artificial neural network, and sequential feature selection algorithms) with additional functionality.

Dear readers,
first of all, I want to thank all of you for the great support! I am really happy about all the great feedback you sent me so far, and I am glad that the book has been so useful to a broad audience.

Over the last couple of months, I have received hundreds of emails, and I have tried to answer as many as possible in the time available. To make them useful to other readers as well, I collected many of my answers in the FAQ section (below).

In addition, some of you asked me about a platform for readers to discuss the contents of the book. I hope that the following board will give you an opportunity to discuss and share your knowledge with other readers:

Google Groups Discussion Board

(And I will try my best to answer questions myself if time allows! :))

The only thing to do with good advice is to pass it on. It is never of any use to oneself.
— Oscar Wilde

FAQ

General Questions

Questions about the Machine Learning Field

Questions about ML Concepts and Statistics

Cost Functions and Optimization

Regression Analysis

What is the difference between Pearson R and Simple Linear Regression?

Tree models

Model evaluation

Logistic Regression

Neural Networks and Deep Learning

Preprocessing, Feature Selection and Extraction

Naive Bayes

Other

Programming Languages and Libraries for Data Science and Machine Learning

Questions about the Book

Contact

I am happy to answer questions! Just write me an email or consider asking the question on the Google Groups Email List.

If you are interested in keeping in touch, I have quite a lively Twitter stream (@rasbt) all about data science and machine learning. I also maintain a blog where I post all of the things I am particularly excited about.

Links

Translations

Literature References & Further Reading Resources

Image Gallery

Errata