Practice implementations of statistical machine learning techniques, each applied to a dataset.
For more details and examples on deep learning, check out my Deep Learning repository.
- Using Python 3
(most relative links are resolved from the repository root)
- numpy: For low-level math operations
- pandas: For data manipulation
- sklearn (Scikit Learn): For evaluation metrics, some data preprocessing
For comparison purposes:
- sklearn: For machine learning models
- cvxopt: For convex optimization problems (for SVM)
NLP related:
- gensim: Topic modelling
- hmmlearn: Hidden Markov Models in Python, with a scikit-learn-like API
- jieba: Chinese text segmentation library
- pyHanLP: Chinese NLP library (Python API)
- nltk: Natural Language Toolkit
- Supervised Learning
  - Classification - Discrete
  - Regression - Continuous
- Unsupervised Learning
  - Clustering - Discrete
  - Dimensionality Reduction - Continuous
  - Association Rule Learning
- Semi-supervised Learning
- Reinforcement Learning
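
To make the "Discrete" versus "Continuous" distinction concrete, here is a minimal sketch using k-Nearest Neighbors for both tasks. It uses only the Python standard library; the function names and the toy data are made up for illustration and are not taken from this repository.

```python
from collections import Counter
import math


def k_nearest(train_X, query, k):
    """Return indices of the k training points closest to `query` (Euclidean)."""
    distances = sorted((math.dist(x, query), i) for i, x in enumerate(train_X))
    return [i for _, i in distances[:k]]


def knn_classify(train_X, train_y, query, k=3):
    """Classification: the prediction is a discrete label (majority vote)."""
    neighbors = k_nearest(train_X, query, k)
    votes = Counter(train_y[i] for i in neighbors)
    return votes.most_common(1)[0][0]


def knn_regress(train_X, train_y, query, k=3):
    """Regression: the prediction is a continuous value (neighbor average)."""
    neighbors = k_nearest(train_X, query, k)
    return sum(train_y[i] for i in neighbors) / len(neighbors)


# Toy data, made up for illustration: two well-separated groups of points.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]    # discrete targets
targets = [1.0, 1.2, 0.8, 5.0, 5.4, 4.6]   # continuous targets

predicted_label = knn_classify(X, labels, (1.1, 1.0))
predicted_value = knn_regress(X, targets, (5.0, 5.0))
print(predicted_label, predicted_value)
```

The same neighbor search feeds both functions; only the final aggregation step (vote versus average) decides whether the output is a class or a number.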
- Classification
  - Logistic Regression
    - (optimization algo.)
  - k-Nearest Neighbors (kNN)
  - Support Vector Machine (SVM)
    - Deduction (optimization algo.)
  - Naive Bayes
  - Decision Tree (ID3, C4.5, CART)
- Regression
  - Linear Regression
    - (optimization algo.)
  - Tree (CART)
- Clustering
  - k-Means
  - Hierarchical Clustering
- Association Rule Learning
- Dimensionality Reduction
  - Principal Component Analysis (PCA)
  - Singular Value Decomposition (SVD)
    - LSA, LSI, Recommendation System
  - Linear Discriminant Analysis (LDA)
- Bagging
  - Random Forests
- Boosting
  - AdaBoost (with some basic boosting notes)
  - Gradient Boosting
    - Gradient Boosting Decision Tree (GBDT), aka Multiple Additive Regression Tree (MART)
  - XGBoost
- Hidden Markov Model (HMM)
- Probabilistic Latent Semantic Analysis (PLSA)
- Latent Dirichlet Allocation (LDA)
- Vector Space Model (VSM)
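
Several entries above are marked "(optimization algo.)" without naming the algorithm in this listing. As an illustration only, the sketch below assumes batch gradient descent, one common choice, and fits a one-variable linear regression with it. The function name, learning rate, epoch count, and data are hypothetical and not taken from this repository.

```python
def fit_linear_regression(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error.

    Assumption: gradient descent is used here as an example optimizer;
    `lr` and `epochs` are illustrative values, not tuned settings.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals of the current fit.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear_regression(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

On noise-free data like this the fitted parameters should approach the generating slope and intercept; on real data the learning rate usually needs adjusting to the feature scale.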
- Classification
  - Data Preprocessing
  - Real-world Problem
  - Evaluation Metrics
  - Binary to Multi-class Expansion
- Regression
  - Evaluation Metrics
- Clustering
  - Evaluation Metrics
- Recommendation System
  - Collaborative Filtering
- Information Retrieval - Topic Modelling
  - Latent Semantic Analysis (LSA/LSI/SVD)
  - Latent Dirichlet Allocation (LDA)
  - Random Projections (RP)
  - Hierarchical Dirichlet Process (HDP)
  - word2vec
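
As a small illustration of the classification evaluation metrics listed above, the sketch below computes accuracy, precision, recall, and F1 for a binary problem from raw predictions. It is a standard-library sketch with a hypothetical function name and made-up labels; in practice `sklearn.metrics` provides equivalent functions.

```python
def binary_classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

metrics = binary_classification_metrics(y_true, y_pred)
print(metrics)
```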
- Kernel Usages
- Convex Optimization
- Linear Algebra
- Orthogonality
- Eigenvalues
- Hessian Matrix
- Quadratic Form
- Markov Chain - HMM
- Calculus
- Multivariable Derivatives
- Quadratic Approximations
- Lagrange Multipliers and Constrained Optimization - SVM SMO
- Lagrange Duality
- Multivariable Derivatives
- Probability and Statistics
- Statistical Estimation
- Algebra
- Trigonometry
(from A to Z)
- Decision Tree
  - Entropy
- Naive Bayes
  - Bayes' Theorem
- PCA
  - Orthogonal Transformations
  - Eigenvalues
- SVD
  - Eigenvalues
- SVM
  - Convex Optimization
  - Constrained Optimization
  - Lagrange Multipliers
  - Kernel
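
To show why entropy is listed as a prerequisite for decision trees, here is a minimal sketch of Shannon entropy and the information gain of a candidate split, the quantity ID3-style trees use to pick a feature. The function names and the toy labels are hypothetical and not taken from this repository.

```python
from collections import Counter
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, groups):
    """Entropy reduction obtained by splitting `labels` into `groups`."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Made-up labels: a balanced two-class node.
labels = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # each child is pure
useless_split = [["yes", "no"], ["yes", "no"]]   # children mirror the parent

print(entropy(labels))
print(information_gain(labels, perfect_split))
print(information_gain(labels, useless_split))
```

A split that produces pure children removes all of the parent's entropy, while a split whose children have the same class mix as the parent gains nothing.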
- Machine Learning in Action
- 統計學習方法 (Statistical Learning Methods), by 李航 (Li Hang)
- 機器學習 (Machine Learning), by 周志華 (Zhou Zhihua)
- Linear Algebra with Applications (Steven Leon)
- Convex Optimization (Stephen Boyd & Lieven Vandenberghe)
- Numerical Linear Algebra (L. Trefethen & D. Bau III)
- Google - Machine Learning Recipes with Josh Gordon
- YouTube - Machine Learning Fun and Easy
- Siraj Raval - The Math of Intelligence
- bilibili - 機器學習 - 白板推導系列 (Machine Learning: Whiteboard Derivation Series)
- bilibili - 機器學習升級版 (Machine Learning, Upgraded Edition)
- ApacheCN (ML, DL, NLP)
- Machine learning 101 (infographics)
- Google Machine Learning Crash Course
- Kaggle Learn Machine Learning
- Microsoft Professional Program - Artificial Intelligence track
- Machine Learning from Scratch (eriklindernoren/ML-From-Scratch)
- Jack-Cherish/Machine-Learning
- Dod-o/Statistical-Learning-Method_Code