python-machine-learning-book

Python Machine Learning code repository.

What you can expect are 400 pages rich in useful material: just about everything you need to know to get started with machine learning, from theory to actual code that you can directly put into action! This is not just yet another "this is how scikit-learn works" book. I aim to explain all the underlying concepts, tell you everything you need to know in terms of best practices and caveats, and we will put those concepts into action mainly using NumPy, scikit-learn, and Theano.

Not sure if this book is for you? Please check out the excerpts from the Foreword and Preface, or take a look at the FAQ section for further information.

1st edition, published September 23rd 2015

Paperback: 454 pages

Publisher: Packt Publishing

Language: English

ISBN-10: 1783555130

ISBN-13: 978-1783555130

Kindle ASIN: B00YSILNL0

Citing this Book

You are very welcome to re-use the code snippets or other contents from this book in scientific publications and other works; in this case, I would appreciate citations to the original source:

BibTeX:

@Book{raschka2015python,
 author = {Raschka, Sebastian},
 title = {Python Machine Learning},
 publisher = {Packt Publishing},
 year = {2015},
 address = {Birmingham, UK},
 isbn = {1783555130}
 }

MLA:

Raschka, Sebastian. Python machine learning. Birmingham, UK: Packt Publishing, 2015. Print.

Feedback & Reviews

Sebastian Raschka’s new book, Python Machine Learning, has just been released. I got a chance to read a review copy and it’s just as I expected - really great! It’s well organized, super easy to follow, and it not only offers a good foundation for smart, non-experts, practitioners will get some ideas and learn new tricks here as well.
– Lon Riesberg at Data Elixir

Superb job! Thus far, for me it seems to have hit the right balance of theory and practice…math and code!
– Brian Thomas

I've read (virtually) every Machine Learning title based around Scikit-learn and this is hands-down the best one out there.
– Jason Wolosonovich

Table of Contents and Code Notebooks

Simply click on the ipynb/nbviewer links next to the chapter headlines to view the code examples (currently, the internal document links are only supported by the NbViewer version). Please note that these are just the code examples accompanying the book, which I uploaded for your convenience; be aware that these notebooks may not be useful without the formulae and descriptive text.

Excerpts from the Foreword and Preface
Instructions for setting up Python and the Jupyter Notebook

Machine Learning - Giving Computers the Ability to Learn from Data [dir] [ipynb] [nbviewer]
Training Machine Learning Algorithms for Classification [dir] [ipynb] [nbviewer]
A Tour of Machine Learning Classifiers Using Scikit-Learn [dir] [ipynb] [nbviewer]
Building Good Training Sets – Data Pre-Processing [dir] [ipynb] [nbviewer]
Compressing Data via Dimensionality Reduction [dir] [ipynb] [nbviewer]
Learning Best Practices for Model Evaluation and Hyperparameter Optimization [dir] [ipynb] [nbviewer]
Combining Different Models for Ensemble Learning [dir] [ipynb] [nbviewer]
Applying Machine Learning to Sentiment Analysis [dir] [ipynb] [nbviewer]
Embedding a Machine Learning Model into a Web Application [dir] [ipynb] [nbviewer]
Predicting Continuous Target Variables with Regression Analysis [dir] [ipynb] [nbviewer]
Working with Unlabeled Data – Clustering Analysis [dir] [ipynb] [nbviewer]
Training Artificial Neural Networks for Image Recognition [dir] [ipynb] [nbviewer]
Parallelizing Neural Network Training via Theano [dir] [ipynb] [nbviewer]

Equation Reference [PDF] [TEX]

Bonus Notebooks (not in the book)

Logistic Regression Implementation [dir] [ipynb] [nbviewer]
A Basic Pipeline and Grid Search Setup [dir] [ipynb] [nbviewer]
An Extended Nested Cross-Validation Example [dir] [ipynb] [nbviewer]
A Simple Barebones Flask Webapp Template [view directory][download as zip-file]
Reading handwritten digits from MNIST into NumPy arrays [GitHub ipynb] [nbviewer]
Scikit-learn Model Persistence using JSON [GitHub ipynb] [nbviewer]
Multinomial logistic regression / softmax regression [GitHub ipynb] [nbviewer]

"Bonus Content" (not in the book)

Model evaluation, model selection, and algorithm selection in machine learning - Part I

We had such a great time at SciPy 2016 in Austin! It was a real pleasure to meet and chat with so many readers of my book. Thanks so much for all the nice words and feedback! And in case you missed it, Andreas Mueller and I gave an Introduction to Machine Learning with Scikit-learn; if you are interested, the video recordings of Part I and Part II are now online!

Note

I have set up a separate library, mlxtend, containing additional implementations of machine learning (and general "data science") algorithms. I also added implementations from this book (for example, the decision region plot, the artificial neural network, and sequential feature selection algorithms) with additional functionality.

Translations

Dear readers,
first of all, I want to thank all of you for the great support! I am really happy about all the great feedback you sent me so far, and I am glad that the book has been so useful to a broad audience.

Over the last couple of months, I received hundreds of emails, and I tried to answer as many as possible in the time available. To make them useful to other readers as well, I collected many of my answers in the FAQ section (below).

In addition, some of you asked me about a platform for readers to discuss the contents of the book. I hope that this will provide an opportunity for you to discuss and share your knowledge with other readers:

Google Groups Discussion Board

(And I will try my best to answer questions myself if time allows! :))

The only thing to do with good advice is to pass it on. It is never of any use to oneself.
— Oscar Wilde

Examples and Applications by Readers

Once again, I have to say (big!) THANKS for all the nice feedback about the book. I've received many emails from readers who have put the concepts and examples from this book out into the real world and made good use of them in their projects. In this section, I am starting to gather some of these great applications, and I'd be more than happy to add your project to this list -- just shoot me a quick mail!

FAQ

General Questions

Questions about the Machine Learning Field

Questions about ML Concepts and Statistics

Cost Functions and Optimization

Regression Analysis

What is the difference between Pearson R and Simple Linear Regression?

Tree models

Model evaluation

Logistic Regression

Neural Networks and Deep Learning

Preprocessing, Feature Selection and Extraction

Naive Bayes

Other

Programming Languages and Libraries for Data Science and Machine Learning

Questions about the Book

Contact

I am happy to answer questions! Just write me an email or consider asking the question on the Google Groups Email List.

If you are interested in keeping in touch, I have quite a lively Twitter stream (@rasbt) all about data science and machine learning. I also maintain a blog where I post all of the things I am particularly excited about.

What's Next

SciPy 2016 in Austin, Texas is coming up soon. I am really excited to teach the scikit-learn tutorial session with Andreas Mueller on July 12th this year! I am looking forward to seeing & meeting you all there!
I have received a bunch of emails lately, and YES, I am really looking forward to writing a new book! Deep learning is the topic that excites me most at the moment, but I think that building up the math background in an interesting, engaging way may be time well spent?! I've been brainstorming lately, and a "Think Machine Learning" series may be a cool idea. I am planning to write about calculus, linear algebra, probability theory, statistics, and all the other puzzle pieces to assemble the big picture: Deep learning. So, stay tuned while I am getting started! :)
