**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.**
Sections
- Introduction to Machine Learning and Pattern Classification
- Pre-Processing
- Model Evaluation
- Parameter Estimation
- Machine Learning Algorithms
- Clustering
- Collecting Data
- Data Visualization
- Statistical Pattern Classification Examples
- Books
- Talks
- Applications
- Resources
[Download a PDF version] of this flowchart.
Introduction to Machine Learning and Pattern Classification
- Predictive modeling, supervised machine learning, and pattern classification - the big picture [Markdown]
- Entry Point: Data - Using Python's sci-packages to prepare data for Machine Learning tasks and other data analyses [IPython nb]
- An Introduction to simple linear supervised classification using scikit-learn [IPython nb]
Pre-Processing
- Feature Extraction
  - Tips and Tricks for Encoding Categorical Features in Classification Tasks [IPython nb]
- Scaling and Normalization
  - About Feature Scaling: Standardization and Min-Max-Scaling (Normalization) [IPython nb]
- Feature Selection
  - Sequential Feature Selection Algorithms [IPython nb]
- Dimensionality Reduction
  - Principal Component Analysis (PCA) [IPython nb]
  - The effect of scaling and mean centering of variables prior to a PCA [PDF] [HTML]
  - PCA based on the covariance vs. correlation matrix [IPython nb]
  - Linear Discriminant Analysis (LDA) [IPython nb]
  - Kernel tricks and nonlinear dimensionality reduction via PCA [IPython nb]
- Representing Text
  - Tf-idf Walkthrough for scikit-learn [IPython nb]
Model Evaluation
- An Overview of General Performance Metrics of Binary Classifier Systems [PDF]
- Cross-validation
- Streamline your cross-validation workflow - scikit-learn's Pipeline in action [IPython nb]
Parameter Estimation
- Parametric Techniques
  - Introduction to the Maximum Likelihood Estimate (MLE) [IPython nb]
  - How to calculate Maximum Likelihood Estimates (MLE) for different distributions [IPython nb]
- Non-Parametric Techniques
  - Kernel density estimation via the Parzen-window technique [IPython nb]
  - The K-Nearest Neighbor (KNN) technique
- Regression Analysis
  - Linear Regression
    - Least-Squares fit [IPython nb]
  - Non-Linear Regression
Machine Learning Algorithms
Bayes Classification
- Naive Bayes and Text Classification I - Introduction and Theory [View PDF] [Download PDF]
Logistic Regression
- Out-of-core Learning and Model Persistence using scikit-learn [IPython nb]
Neural Networks
- Artificial Neurons and Single-Layer Neural Networks - How Machine Learning Algorithms Work Part 1 [IPython nb]
- Activation Function Cheatsheet [IPython nb]
Ensemble Methods
- Implementing a Weighted Majority Rule Ensemble Classifier in scikit-learn [IPython nb]
Decision Trees
- Cheatsheet for Decision Tree Classification [IPython nb]
Clustering
- Prototype-based clustering
- Hierarchical clustering
- Complete-Linkage Clustering and Heatmaps in Python [IPython nb]
- Density-based clustering
- Graph-based clustering
- Probabilistic-based clustering
Collecting Data
- Collecting Fantasy Soccer Data with Python and Beautiful Soup [IPython nb]
- Download Your Twitter Timeline and Turn into a Word Cloud Using Python [IPython nb]
- Reading MNIST into NumPy arrays [IPython nb]
Data Visualization
- Exploratory Analysis of the Star Wars API [IPython nb]
- Matplotlib examples - Exploratory data analysis of the Iris dataset [IPython nb]
Statistical Pattern Classification Examples
- Supervised Learning
  - Parametric Techniques
    - Univariate Normal Density
      - Ex1: 2-classes, equal variances, equal priors [IPython nb]
      - Ex2: 2-classes, different variances, equal priors [IPython nb]
      - Ex3: 2-classes, equal variances, different priors [IPython nb]
      - Ex4: 2-classes, different variances, different priors, loss function [IPython nb]
      - Ex5: 2-classes, different variances, equal priors, loss function, Cauchy distr. [IPython nb]
    - Multivariate Normal Density
      - Ex5: 2-classes, different variances, equal priors, loss function [IPython nb]
      - Ex7: 2-classes, equal variances, equal priors [IPython nb]
  - Non-Parametric Techniques
Books
Python Machine Learning
Talks
An Introduction to Supervised Machine Learning and Pattern Classification: The Big Picture
MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics
Applications
MusicMood - Machine Learning in Automatic Music Mood Prediction Based on Song Lyrics
This project builds a music recommendation system for users who want to listen to happy songs. Such a system could do more than brighten one's mood on a rainy weekend: in hospitals, other medical clinics, or public places such as restaurants, the MusicMood classifier could be used to spread a positive mood among people.
mlxtend - A library of extension and helper modules for Python's data analysis and machine learning libraries.
Resources
- Copy-and-paste ready LaTeX equations [Markdown]
- Open-source datasets [Markdown]
- Free Machine Learning eBooks [Markdown]
- Terms in data science defined in less than 50 words [Markdown]
- Useful libraries for data science in Python [Markdown]
- General Tips and Advice [Markdown]
- A matrix cheatsheet for Python, R, Julia, and MATLAB [HTML]