Practical Machine Learning

Instructor: Alejandro Correa Bahnsen

Requirements

  • Python version 3.5;
  • NumPy, the core numerical extension for linear algebra and multidimensional arrays;
  • SciPy, additional libraries for scientific programming;
  • Matplotlib, excellent plotting and graphing libraries;
  • IPython, with the additional libraries required for the notebook interface;
  • Pandas, the Python counterpart of the R data frame;
  • Seaborn, used mainly for plot styling;
  • scikit-learn, the machine learning library!

A good, easy-to-install option that supports Mac, Windows, and Linux, and that includes all of these packages (and much more), is the Anaconda distribution.
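Whichever installation route is used, it can help to confirm that the packages listed above are importable before the first session. The following is a minimal sketch, not part of the course materials: the `REQUIRED` list and the `check_packages` helper are hypothetical names introduced here, and the import names (for example `sklearn` for scikit-learn) are the usual ones for these libraries. It avoids f-strings so that it runs on Python 3.5.

```python
import importlib
import sys

# Import names for the packages listed above (assumed; scikit-learn is
# imported as "sklearn"). This list is illustrative, not an official one.
REQUIRED = ["numpy", "scipy", "matplotlib", "IPython", "pandas", "seaborn", "sklearn"]


def check_packages(names):
    """Return {name: version string, or None if the import fails}."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing from the current environment.
            found[name] = None
        else:
            # Most scientific packages expose __version__; fall back if not.
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    print("Python {}.{}.{}".format(*sys.version_info[:3]))
    for name, version in check_packages(REQUIRED).items():
        print("{:<12} {}".format(name, version if version else "NOT INSTALLED"))
```

Any package reported as NOT INSTALLED can then be added through Anaconda or another package manager before opening the notebooks.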

Git! Unfortunately it is out of the scope of this class, but please take a look at these tutorials.

Sessions

Session | Notebook link                               | Exercises
--------|---------------------------------------------|--------------------------------------------------
1       | Introduction to Python                      | 01.1 - Finding digits of Pi, 01.2 - OLS and Numpy
2       | Introduction to Machine Learning            | 02 - Churn Model
3       | Pandas Data Frame                           | 03 - Baby names
4       | Linear Regression                           | 04 - Bikes Rent
5       | Logistic Regression                         | 05 - Titanic
6       | Data preparation and Model Evaluation       | 06 - Titanic V2
7       | Kaggle Competitions                         | 07 - Titanic Kaggle Competition
8       | Feature Selection                           | 08 - Titanic V3
9       | Naive Bayes                                 | 09 - Yelp Reviews
10      | KNN                                         | 10 - NBA Stats
11      | Information Retrieval                       | 11 - Mashable
12      | Natural Language Processing                 |
13      | Decision Trees                              | 13 - Bike Sharing
14      |                                             | 14 - Kaggle Competition - Mashable
15      | Unbalanced Datasets                         | 15 - Fraud Detection
16      | Ensemble Methods - Bagging                  | 16 - MashableV1
17      | Ensemble Methods - Bagging cont & Boosting  | 17 - MashableV2
18      | Support Vector Machines                     | 18 - Wine
19      | Regularization                              | 19 - Wine V2
20      | Cost-Sensitive Classification               | 20 - Churn
21      | Intro Deep Learning                         |
22      | Model Deployment                            |