License: GNU General Public License v2.0 (GPL-2.0)

Introduction to Statistical Learning

Notes about the course:

Course textbooks:

  1. An Introduction to Statistical Learning: https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370
  2. Machine Learning: A Probabilistic Perspective: https://www.amazon.com/Machine-Learning-Probabilistic-Perspective-Computation/dp/0262018020

Instructor: Omid Safarzadeh

Chapter 1: Learning Algorithms

Supervised learning
Unsupervised learning
Semi-supervised learning
Online Learning
Reinforcement Learning
Graph Representation Learning

Chapter 2: Regression, Cross-Validation

The least squares approach
Multiple linear regression
Bias-variance tradeoff
Leave-one-out cross-validation
k-fold cross-validation

Chapter 3: Logistic, Ridge, Lasso Regression

Logistic regression
    MLE for simple logistic regression
Ridge regression
Bias-variance tradeoff
Pros and cons of ridge regression

Chapter 4: Bayesian Estimation, MAP

    Likelihood and posterior distribution
       Computing the posterior
       Maximum likelihood estimation (MLE)
    Maximum a posteriori (MAP) estimation
       Posterior mean
       MAP properties
    Bayesian linear regression

Chapter 5: Unsupervised Learning, PCA

 Unsupervised learning 
 Principal component analysis (PCA)

Chapter 6: Recommendation Systems

   Collaborative Filtering
   Matrix Factorization
   Funk SVD
   Alternating Least Squares

Chapter 7: EM Algorithm

 Expectation Maximization Algorithm

Chapter 8: Clustering

K-means
Gaussian Mixture Models

Chapter 9: Activation and Loss functions

Activation Functions
    Exponential Linear Unit (ELU)
    Exponential activation function
    Gaussian error linear unit (GELU)
    Hard sigmoid
    Rectified Linear Unit (ReLU)
    Scaled Exponential Linear Unit (SELU)
    Hyperbolic Tangent
Loss Functions
   Mean Absolute Error (MAE)
   Mean Absolute Percentage Error (MAPE)
   Mean Squared Error (MSE)
   Indicator function

Chapter 10: Neural Networks, RNN, LSTM, CNN

    Gradient Descent
    Backpropagation

Chapter 11: Transformers

   Sequence to sequence models
   Attention Mechanism
      Bottleneck Problem
      Attention Layer
      Categories of Attention Mechanism
       Self-attention mechanism
       Multi-head attention mechanism
       Encoder Architecture
       Decoder Architecture
       Full Architecture
   Positional Encoding
       Masked Language Modeling
       BERT Input
       BERT Output

Chapter 12: Automatic Feature Extraction

 Automatic FE with TensorFlow
 Deep & Cross Network Structure
 Preprocessing
 Cross Network
    Deep NN
    Deep & Cross Network V1
    Deep & Cross Network V2
    Model construction
 Model understanding for interpreting cross features
 Model Performance