python-machine-learning-book

Python Machine Learning code repository.

What you can expect are 400 pages rich in useful material, covering just about everything you need to know to get started with machine learning, from theory to code that you can put directly into action! This is not just yet another "this is how scikit-learn works" book. I aim to explain all the underlying concepts, tell you everything you need to know in terms of best practices and caveats, and put those concepts into action mainly using NumPy, scikit-learn, and Theano.

Not sure if this book is for you? Please check out the excerpts from the Foreword and Preface, or take a look at the FAQ section for further information.

1st edition, published September 23rd 2015

Paperback: 454 pages

Publisher: Packt Publishing

Language: English

ISBN-10: 1783555130

ISBN-13: 978-1783555130

Kindle ASIN: B00YSILNL0

Sebastian Raschka’s new book, Python Machine Learning, has just been released. I got a chance to read a review copy and it’s just as I expected - really great! It’s well organized, super easy to follow, and it not only offers a good foundation for smart, non-experts, practitioners will get some ideas and learn new tricks here as well.
– Lon Riesberg at Data Elixir

Superb job! Thus far, for me it seems to have hit the right balance of theory and practice…math and code!
– Brian Thomas

I've read (virtually) every Machine Learning title based around Scikit-learn and this is hands-down the best one out there.
– Jason Wolosonovich

Feedback & Reviews

Table of Contents and Code Notebooks

Simply click on the ipynb/nbviewer links next to the chapter headlines to view the code examples (currently, the internal document links are only supported by the NbViewer version). Please note that these are just the code examples accompanying the book, which I uploaded for your convenience; be aware that these notebooks may not be useful without the formulae and descriptive text.

Excerpts from the Foreword and Preface

Machine Learning - Giving Computers the Ability to Learn from Data [dir] [ipynb] [nbviewer]
Training Machine Learning Algorithms for Classification [dir] [ipynb] [nbviewer]
A Tour of Machine Learning Classifiers Using Scikit-Learn [dir] [ipynb] [nbviewer]
Building Good Training Sets – Data Pre-Processing [dir] [ipynb] [nbviewer]
Compressing Data via Dimensionality Reduction [dir] [ipynb] [nbviewer]
Learning Best Practices for Model Evaluation and Hyperparameter Optimization [dir] [ipynb] [nbviewer]
Combining Different Models for Ensemble Learning [dir] [ipynb] [nbviewer]
Applying Machine Learning to Sentiment Analysis [dir] [ipynb] [nbviewer]
Embedding a Machine Learning Model into a Web Application [dir] [ipynb] [nbviewer]
Predicting Continuous Target Variables with Regression Analysis [dir] [ipynb] [nbviewer]
Working with Unlabeled Data – Clustering Analysis [dir] [ipynb] [nbviewer]
Training Artificial Neural Networks for Image Recognition [dir] [ipynb] [nbviewer]
Parallelizing Neural Network Training via Theano [dir] [ipynb] [nbviewer]

Bonus Notebooks (not in the book)

A Basic Pipeline and Grid Search Setup [dir] [ipynb] [nbviewer]
An Extended Nested Cross-Validation Example [dir] [ipynb] [nbviewer]

FAQ

General Questions

Questions about the Machine Learning Field

Questions about ML Concepts and Statistics

Cost Functions and Optimization

Regression Analysis

What is the difference between Pearson R and Simple Linear Regression?

Tree models

Model evaluation

Logistic Regression

Neural Networks

Unsupervised Learning

What are some of the issues with clustering?

Preprocessing

Naive Bayes

Other

Questions about the Book

Contact

I am happy to answer questions! Just write me an email or consider asking the question on the Google Groups Email List.

If you are interested in keeping in touch, I have quite a lively Twitter stream (@rasbt) all about data science and machine learning. I also maintain a blog where I post all of the things I am particularly excited about.
