This repository contains several exercises for practicing machine learning algorithms with the scikit-learn framework. All exercises come from the Linux magazine HS n°94.
This project uses virtualenvwrapper to create a virtual environment for Python.
$ sudo -H pip install virtualenvwrapper
$ mkdir ~/.virtualenvs
$ echo "export WORKON_HOME=~/.virtualenvs" >> ~/.bashrc
$ echo "source /usr/local/bin/" >> ~/.bashrc
$ bash
$ mkvirtualenv machine_learning --python=/usr/bin/python3
$ workon machine_learning
$ pip install -r requirements.txt
In this section, we learned how to use linear regression, how to determine the "a" and "b" values of the linear equation y = ax + b, and how to use splines to represent more complex relationships.
First, download the following data sets.
- Download "players_stats.csv" =>
- Download "yellow_tripdata_2017_0*" =>
- Download "04cars.dat.txt" =>
In this example, we will see a linear correlation between the height of an NBA player and his weight.
$ python
$ python
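A minimal sketch of the idea, assuming made-up height and weight values rather than the real "players_stats.csv" file: scikit-learn's LinearRegression exposes the slope "a" as coef_ and the intercept "b" as intercept_.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data (not from players_stats.csv): heights in cm and
# weights in kg that lie exactly on weight = 1.0 * height - 100.
heights = np.array([[180.0], [190.0], [200.0], [210.0]])  # shape (n_samples, 1)
weights = np.array([80.0, 90.0, 100.0, 110.0])

model = LinearRegression().fit(heights, weights)

a = model.coef_[0]    # slope of the fitted line
b = model.intercept_  # intercept of the fitted line
print(f"weight = {a:.2f} * height + {b:.2f}")
```

With real data the points will not sit exactly on a line, so "a" and "b" describe the best least-squares fit rather than an exact relation.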
A spline is a way to model complex relationships that do not follow the pattern ax + b.
$ python
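As a rough illustration, here is a cubic spline fitted to a sine curve. It assumes SciPy's UnivariateSpline, which may not be the tool used in the exercise scripts, and the data is synthetic.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical data: a sine curve, which no single line ax + b can follow.
x = np.linspace(0.0, 2.0 * np.pi, 50)
y = np.sin(x)

# k=3 requests a cubic spline; s=0 makes it pass through every sample.
spline = UnivariateSpline(x, y, k=3, s=0)

# Evaluate between the samples and compare with the true function.
x_new = np.linspace(0.0, 2.0 * np.pi, 200)
max_error = np.max(np.abs(spline(x_new) - np.sin(x_new)))
print(f"largest deviation from sin(x): {max_error:.6f}")
```

On noisy data, a positive value of s lets the spline smooth the samples instead of passing through each one.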
In this exercise, we will use a spline to model the different times needed to reach JFK airport for the same taxi trip.
$ python
$ python
In this section, we learned how to use PCA, normalize data and reduce the number of dimensions.
In this exercise, we will use brute force to show all combinations of the variables in the Iris dataset.
$ python
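One way to enumerate the combinations, assuming "all combinations" means every pair of the four Iris variables (an interpretation, not confirmed by the exercise script), is a sketch like this one using the copy of the dataset bundled with scikit-learn:

```python
from itertools import combinations

from sklearn.datasets import load_iris

iris = load_iris()

# Every unordered pair of feature indices: 4 features give 6 pairs.
pairs = list(combinations(range(len(iris.feature_names)), 2))

for i, j in pairs:
    # Each pair could be drawn as one scatter plot (plotting omitted here).
    print(f"{iris.feature_names[i]} vs {iris.feature_names[j]}")
```

The number of pairs grows quadratically with the number of variables, which is one motivation for reducing dimensions with PCA.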
In this exercise, we will use a basic linear example to see how to reduce a 2D representation to a 1D representation.
$ python
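A minimal sketch of this reduction, assuming synthetic points placed exactly on the line y = 2x (not the data used by the exercise script):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical points lying exactly on the line y = 2x.
x = np.arange(1.0, 6.0)
points = np.column_stack([x, 2.0 * x])  # shape (5, 2)

# Keep only the first principal component.
pca = PCA(n_components=1)
projected = pca.fit_transform(points)   # shape (5, 1)

print(projected.shape)
print(pca.explained_variance_ratio_)
```

Because the points are perfectly aligned, a single component captures essentially all of the variance. With scattered data, the ratio shows how much information the 1D representation keeps.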
In this exercise, we will use PCA and the biplot method to represent the Iris dataset on a single chart.
$ python
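A biplot overlays two things on the same axes: the samples projected onto the first two principal components, and one arrow per original variable. The sketch below only computes those two ingredients and leaves the plotting out. Standardizing the data first is an assumption made here, not necessarily what the exercise script does.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

iris = load_iris()
scaled = StandardScaler().fit_transform(iris.data)

pca = PCA(n_components=2)
scores = pca.fit_transform(scaled)  # one 2D point per flower, shape (150, 2)
loadings = pca.components_.T        # one 2D arrow per variable, shape (4, 2)

for name, (pc1, pc2) in zip(iris.feature_names, loadings):
    print(f"{name}: PC1={pc1:+.2f}, PC2={pc2:+.2f}")
```

Drawing the scores as a scatter plot and the loadings as arrows from the origin would give the biplot itself.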
In this exercise, we will see that unnormalized data can distort the results of a PCA.
$ python
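The effect can be reproduced with a small sketch, assuming two synthetic, independent variables on very different scales (the data and variable names are hypothetical):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: two independent variables, one about 1000 times larger.
small = rng.normal(0.0, 1.0, 200)
large = rng.normal(0.0, 1000.0, 200)
data = np.column_stack([small, large])

# Without normalization, the large-scale variable dominates the first component.
raw_ratio = PCA(n_components=2).fit(data).explained_variance_ratio_

# After standardization, both variables contribute on an equal footing.
scaled = StandardScaler().fit_transform(data)
scaled_ratio = PCA(n_components=2).fit(scaled).explained_variance_ratio_

print("raw:   ", raw_ratio)
print("scaled:", scaled_ratio)
```

On the raw data the first component explains almost all of the variance simply because of the units, while on the standardized data the variance is split much more evenly.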
This project is distributed under the MIT licence.
To check the code quality, run these commands:
$ pip install flake8 prospector
$ flake8
$ prospector -F -i dataset/
To fix a bug, open an issue on GitHub and submit a pull request.