This project contains solutions to the Stanford Machine Learning course exercises, implemented with Python and scikit-learn. The scikit-learn machine learning library provides optimized implementations of all algorithms presented in the course and needed for its exercises. Instead of writing low-level Octave code, as the course requires, the solutions presented here demonstrate how to use scikit-learn to solve these exercises at a much higher level, one that is closer to real-world machine learning projects. This project respects the Coursera Honor Code because the presented solutions cannot be used to derive the lower-level Octave code that must be written to complete the assignments.
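As a rough illustration of what "higher level" means here, the sketch below fits a one-variable linear regression with scikit-learn. It is not taken from the notebooks: the synthetic data, the variable names and the noise level are assumptions made for this example. Note also that `LinearRegression` solves the least-squares problem directly rather than running the gradient descent loop written by hand in the Octave exercise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data, assumed for illustration only (not the course data set):
# y = 3x + 2 plus a small amount of Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))  # feature matrix, shape (n_samples, 1)
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=100)

# A single fit() call stands in for the hand-written cost function
# and parameter update loop of the low-level solution.
model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]
intercept = model.intercept_
prediction = model.predict(np.array([[5.0]]))[0]

print(f"slope={slope:.3f}, intercept={intercept:.3f}, f(5)={prediction:.3f}")
```

The fitted slope and intercept should land close to the values used to generate the data. The same `fit` and `predict` pattern applies to the other scikit-learn estimators.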
I developed these solutions while learning Python and its scientific programming libraries, such as NumPy, SciPy, pandas and matplotlib, in a machine learning context. The solutions are provided as Jupyter notebooks. Developers new to scikit-learn may find them useful for seeing how the machine learning topics covered in the course relate to the scikit-learn API. In their current state, the notebooks neither explain machine learning basics nor introduce the libraries they use. To learn the basics of machine learning, I highly recommend attending the course lectures. For an introduction to the libraries, the following tutorials are a good starting point:
- Python tutorial
- NumPy tutorial
- SciPy tutorial
- Pandas tutorial
- Pyplot tutorial
- scikit-learn tutorials
The solutions are organized into one notebook per exercise:
- Exercise 1 notebook: Linear regression (ex1.pdf)
- Exercise 2 notebook: Logistic regression (ex2.pdf)
- Exercise 3 notebook: Multi-class classification and neural networks (ex3.pdf)
- Exercise 4 notebook: Neural networks learning (ex4.pdf)
- Exercise 5 notebook: Regularized linear regression and bias vs. variance (ex5.pdf)
- Exercise 6 notebook: Support vector machines
- Exercise 7 notebook: K-means clustering and principal component analysis
- Exercise 8 notebook: Anomaly detection and recommender systems