Live Machine Learning Class:

Online Machine Learning Research Class in Chinese 中文机器学习研究线上课

2022年我坚持每周日晚上8:30直播机器学习研究课程系列 (微信二维码在这个链接) - Since 2022, I have held a regular Sunday Night Live (SNL) broadcast on machine learning theory every Sunday at 8:30pm (the WeChat QR code is at this link).

English version

Since April 2022, I have run a machine learning research seminar series in English via Zoom every 2-3 weeks, at 7pm Hong Kong Time. I will continue to explain machine learning using intermediate-level mathematics. The current topic is "Gradient Descent Research". You need a solid understanding of linear algebra, calculus, probability and statistics. You can register via Meetup: https://www.meetup.com/machine-learning-hong-kong/ (Back in Australia, I also ran research training for all machine learning PhD students at Australian universities, with over 100 students participating via Zoom.)

Learning Theory Classes

Video Tutorial to these notes 视频资料

I recorded about 20% of these notes as videos in Mandarin in 2015 (all my notes and writings are in English). You can find them on YouTube, bilibili and Youku.

我在2015年用中文录制了这些课件中约10％的内容 (我目前的课件都是英文的)。大家可以在YouTube、哔哩哔哩和优酷下载

Course on Foundational Mathematics in Machine Learning 机器学习基础数学课程

Class 1: Model Evaluation

Common concepts and techniques for evaluating classification models, including bootstrap sampling, confusion matrices, and receiver operating characteristic (ROC) curves. 分类模型评估的常见概念和技术，包括自举抽样、混淆矩阵、接收器操作特征 (ROC) 曲线

Class 2: Decision Tree

In addition to all the basics of decision trees, I've added a $\chi^2$ test section to this note. 除了决策树的所有基础知识之外，我还在此说明中添加了 $\chi^2$ 测试部分。

Class 3: Simple Bayes

This note gives an intuitive explanation of the basic concepts of probability, Bayes' theorem, and probabilistic graphical models. 本课件旨在对概率的基本概念、贝叶斯定理、概率图模型提供直观的解释

Class 4: Regression

This note explains the century-old, simplest regression models, linear and polynomial regression, along with some techniques for evaluating regression performance, especially the coefficient of determination (CoD). 这篇笔记是为了解释最简单的回归模型：线性回归和多项式回归，以及一些评估回归性能的技术，尤其是确定系数 (CoD) 方法
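As a small illustration of the coefficient of determination mentioned above, here is a minimal sketch in plain Python. It is not taken from the notes themselves: the helper names `fit_line` and `coefficient_of_determination` and the toy data are assumptions made for this example, which uses the standard definition $R^2 = 1 - SS_{res}/SS_{tot}$.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


def coefficient_of_determination(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot for paired observations."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data, invented for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
a, b = fit_line(xs, ys)
r2 = coefficient_of_determination(ys, [a + b * x for x in xs])
print(f"intercept={a:.3f} slope={b:.3f} R^2={r2:.4f}")
```

A value of $R^2$ close to 1 means the fitted line accounts for most of the variance in `ys`; a perfect fit gives exactly 1.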

Class 5: Neural Network

First I show three different models for the final output layer: logistic, multinomial, and linear regression. Then I introduce gradient descent. The main part presents a basic fully connected neural network, followed by a convolutional neural network. 首先，我展示了三个不同的最后输出层模型：逻辑回归、多项回归和线性回归。然后我展示了梯度下降的概念。主要部分是展示一个基本的全连接神经网络，最后是一个卷积神经网络。

Class 6: Unsupervised Learning

This note describes some common topics in unsupervised learning, from the most obvious methods like clustering to topic modeling (Latent Dirichlet Allocation) and traditional word embeddings like the word2vec algorithm. 本课件描述了无监督学习中的一些常见主题。从最明显的方法（如聚类）到主题建模和传统的词嵌入（如 word2vec 算法）。

Course on Intermediate Mathematics in Machine Learning 机器学习中级数学课程

I'm currently updating, validating and correcting notes I've written over the past decade and incorporating them into an introductory/intermediate/advanced machine learning course. I will gradually delete all my previous Beamer notes and replace them with technical report notes. 我目前正在更新/验证和更正我在过去十年中写的笔记，并将它们合并到入门/中级/高级机器学习课程中。我会逐渐删除之前写的 Beamer 笔记，并用技术报告笔记代替它们。

Expectation Maximization

Proof of convergence for E-M, examples of E-M through Gaussian Mixture Model, [gmm_demo.m] and [kmeans_demo.m] and [bilibili video] 最大期望E-M的收敛证明, E-M到高斯混合模型的例子, [gmm_demo.m] 和 [kmeans_demo.m] 和 [B站视频链接]

Markov Chain Monte Carlo

MCMC background, including stochastic matrices, power method convergence, detailed balance and the PageRank algorithm; some basic MCMC methods, including Metropolis-Hastings and Gibbs sampling, with LDA as an example MCMC背景，包括随机矩阵、幂法收敛、详细平衡和PageRank算法，一些基本的MCMC方法，包括Metropolis-Hastings、Gibbs和LDA为例
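The link between the power method and the stationary distribution of a Markov chain can be sketched in a few lines. This is an illustrative example only, not code from the notes: the function name `stationary_distribution` and the 2-state transition matrix are assumptions chosen so that the answer can be checked by hand.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Power iteration pi <- pi P for a row-stochastic matrix P (list of rows)."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical 2-state chain; each row sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
print(pi)

# Detailed balance for this chain: pi_0 * P[0][1] should equal pi_1 * P[1][0].
print(pi[0] * P[0][1], pi[1] * P[1][0])
```

For this matrix the balance equation $0.1\,\pi_0 = 0.5\,\pi_1$ gives $\pi = (5/6, 1/6)$, which the iteration should approach; PageRank applies the same iteration to a much larger stochastic matrix.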

Variational Inference

Explains Variational Bayes for both non-exponential and exponential family distributions [vb_normal_gamma.m] and [bilibili video] 解释变分贝叶斯非指数和指数族分布。[vb_normal_gamma.m] 和 [B站视频链接]

State Space Model (Dynamic model)

Detailed explanation of the Kalman Filter [bilibili video], [kalman_demo.m] and the Hidden Markov Model [bilibili video]

状态空间模型(动态模型) 详细解释了卡尔曼滤波器 [B站视频链接], [kalman_demo.m] 和隐马尔可夫模型 [B站视频链接]

Sinovation DeeCamp 创新工场DeeCAMP讲义

DeeCamp 2019：Story of Softmax

Properties of softmax, estimating softmax without computing the denominator, probability re-parameterization: Gumbel-Max trick and REBAR algorithm (softmax的故事) Softmax的属性, 估计softmax时不需计算分母, 概率重新参数化, Gumbel-Max技巧和REBAR算法
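For readers unfamiliar with the Gumbel-Max trick listed above, the following is a minimal sketch under stated assumptions, not the lecture's own code. The function names `softmax` and `gumbel_max_sample` and the example logits are hypothetical. The idea is that adding independent Gumbel(0, 1) noise to each logit and taking the argmax draws an index with the softmax probabilities, without ever computing the normalizing denominator.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax, used here only to check the sampler."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def gumbel_max_sample(logits, rng):
    """Return index i with probability softmax(logits)[i], with no normalization."""
    best_i, best_v = -1, -math.inf
    for i, x in enumerate(logits):
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() can return exactly 0.0
            u = rng.random()
        g = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        if x + g > best_v:
            best_i, best_v = i, x + g
    return best_i


rng = random.Random(0)
logits = [1.0, 0.0, -1.0]  # example values, chosen arbitrarily
n = 20_000
counts = [0] * len(logits)
for _ in range(n):
    counts[gumbel_max_sample(logits, rng)] += 1

print("empirical:", [c / n for c in counts])
print("softmax:  ", softmax(logits))
```

The empirical frequencies should be close to the softmax probabilities; relaxations such as REBAR build on this re-parameterization.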

DeeCamp 2018：When Probabilities meet Neural Networks

Expectation-Maximization & Matrix Capsule Networks; Determinantal Point Process & Neural Networks compression; Kalman Filter & LSTM; Model estimation & Binary classifier (当概率遇到神经网络) 主题包括：EM算法和矩阵胶囊网络; 行列式点过程和神经网络压缩; 卡尔曼滤波器和LSTM; 模型估计和二分类问题关系

Deep Learning Research Topics 深度学习研究

Variance Reduction

REBAR and RELAX algorithms, with a detailed explanation of the re-parameterization of Gumbel conditionals REBAR，RELAX算法以及对Gumbel条件概率重新参数化的一些详细说明

New Research on Softmax function

Out-of-distribution, Neural Network Calibration, Gumbel-Max trick, Stochastic Beam Search (some of these lectures overlap with DeeCamp2019) 分布外、神经网络校准、Gumbel-Max 技巧、随机集束(Beam)搜索（其中一些讲座与 DeeCamp2019 重叠）

Mathematics for Generative Adversarial Networks

How GAN works, Traditional GAN, Mathematics on W-GAN, Info-GAN, Bayesian GAN GAN如何工作，传统GAN，W-GAN数学，Info-GAN，贝叶斯GAN

Advanced Variational Autoencoder

How the Variational Autoencoder works, Importance Weighted Autoencoders, Normalizing Flow via ELBO, Adversarial Variational Bayes, Mixture Density VAE, stick-breaking VAE 变分自编码器的工作原理，重要性加权自编码器，通过ELBO的标准化流，对抗变分贝叶斯, 混合密度自编码器，stick-breaking 自编码器

Infinite Depth: NeuralODE and Adjoint Equation

Discusses Neural ODE, in particular the use of the adjoint equation in parameter training 讨论神经ODE，尤其是在参数训练中使用伴随方程

Bayesian Inference and Deep Learning (Seminar Talk)

This is a seminar talk I gave on some modern examples in which a Bayesian (or probabilistic) framework explains, assists, or is assisted by Deep Learning. 这是我的演讲稿件。归纳了一些最近研究例子中，贝叶斯（或概率）框架来解释，帮助(或被帮助于)深度学习。

Optimization Method 优化方法

Tutorial on Gradient Descent Research

This is a progressive research note on Implicit Bias and Implicit Regularization of Gradient Descent Algorithms (check out my biweekly seminars), Convergence Research for Stochastic Gradient Descent, etc. 这是关于梯度下降算法的隐式偏差和隐式正则化的渐进式研究笔记（查看我的双周研讨会）、随机梯度下降的收敛研究等。

Tutorial on Duality

Lagrangian duality, dual function, KKT condition, example on support vector machines and Farkas Lemma 拉格朗日对偶、对偶函数、KKT 条件、支持向量机示例和 Farkas 引理

Conjugate Gradient Descent

A quick explanation of Conjugate Gradient Descent 共轭梯度下降的快速解释

Deep Learning Basics 深度学习基础

Convolutional Neural Networks: from basics to recent research

Detailed explanation of CNNs, various loss functions, centre loss, contrastive loss, Residual Networks, Capsule Networks, YOLO, SSD 卷积神经网络：从基础到最近的研究：包括卷积神经网络的详细解释，各种损失函数，中心损失函数，对比损失函数，残差网络，胶囊网络, YOLO，SSD

Restricted Boltzmann Machine

Restricted Boltzmann Machine (RBM) and Contrastive Divergence (CD) Basics 受限玻尔兹曼机 (RBM) 和对比发散 (CD) 基础知识

3D Geometry Computer vision 3D几何计算机视觉

3D Geometry Fundamentals

Camera Models, Intrinsic and Extrinsic parameter estimation, Epipolar Geometry, 3D reconstruction, Depth Estimation 相机模型，内部和外部参数估计，对极几何，三维重建，图像深度估计

Recent Deep 3D Geometry based Research

Recent research on the following topics: Single image to Camera Model estimation, Multi-Person 3D pose estimation from multi-view, GAN-based 3D pose estimation, Deep Structure-from-Motion, Deep Learning based Depth Estimation 以下主题的最新研究：单图像到相机模型的估计，基于多视图的多人3D姿势估计，基于GAN的3D姿势估计，基于运动的深度结构，基于深度学习的深度估计

This section is co-authored with PhD student Yang Li 本部分与博士研究生李杨合写

Reinforcement Learning 强化学习

Reinforcement Learning Basics

Basic knowledge of reinforcement learning: Markov Decision Process, Bellman Equation, then moving on to Deep Q-Learning 深度增强学习: 强化学习的基础知识，马尔可夫决策过程，贝尔曼方程，深度Q学习

Monte Carlo Tree Search

Monte Carlo Tree Search, AlphaGo learning algorithm 蒙特卡洛树搜索，AlphaGo学习算法

Policy Gradient

Policy Gradient Theorem, mathematics of Trust Region Optimization in RL, natural gradients in TRPO, Proximal Policy Optimization (PPO), Conjugate Gradient algorithm 策略梯度定理, RL中信赖域优化的数学, TRPO自然梯度, 近似策略优化(PPO), 共轭梯度算法

Natural Language Processing 自然语言处理

Word Embeddings

A systematic introduction to word-representation techniques in natural language processing: GloVe, fastText, negative sampling 系统地介绍了自然语言处理中的“词表示”中的技巧

Deep Natural Language Processing

RNN, LSTM, Seq2Seq with Attention, Beam Search, "Attention is all you need", Convolutional Seq2Seq, Pointer Networks 深度自然语言处理：循环神经网络,LSTM,具有注意力机制的Seq2Seq，集束搜索，指针网络和 "Attention is all you need", 卷积Seq2Seq

Data Science PowerPoint and Source Code 数据科学 PowerPoint 和源代码

30 minutes introduction to AI and Machine Learning

An extremely gentle 30-minute introduction to AI and Machine Learning. Thanks to my PhD student Haodong Chang for assisting with editing 30分钟介绍人工智能和机器学习, 感谢我的学生常浩东进行协助编辑

[costFunction.m]
[soft_max.m]
[industry data science Jupyter notebook]

Recommendation system

Collaborative filtering, Factorization Machines, Non-Negative Matrix Factorisation, Multiplicative Update Rule 推荐系统: 协同过滤，分解机，非负矩阵分解，和其中“乘法更新规则”的介绍

Probabilistic Model 概率模型课件

Probabilistic Estimation

some useful distributions, conjugacy, MLE, MAP, Exponential family and natural parameters 一些常用的分布，共轭特性，最大似然估计, 最大后验估计, 指数族和自然参数

Monte-Carlo Inference 蒙特卡洛推理

Introduction to Monte Carlo

inverse CDF, rejection, adaptive rejection, importance sampling [adaptive_rejection_sampling.m] and [hybrid_gmm.m] 累积分布函数逆采样, 拒绝式采样, 自适应拒绝式采样, 重要性采样 [adaptive_rejection_sampling.m] 和 [hybrid_gmm.m]
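Two of the sampling schemes listed above can be sketched briefly. This is an illustrative example under stated assumptions rather than a port of the MATLAB demos: the function names and the target distributions (an exponential for inverse-CDF sampling, and the density $p(x) = 6x(1-x)$ on $[0, 1]$ for rejection sampling) were chosen for this sketch.

```python
import math
import random


def sample_exponential(rate, rng):
    """Inverse-CDF sampling: if U ~ Uniform(0, 1], then -ln(U) / rate ~ Exp(rate)."""
    u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is always defined
    return -math.log(u) / rate


def sample_beta22(rng):
    """Rejection sampling for p(x) = 6 x (1 - x) on [0, 1].

    Proposal q is Uniform(0, 1) and the envelope constant is M = 1.5,
    the maximum of p, so p(x) <= M * q(x) everywhere.
    """
    M = 1.5
    while True:
        x = rng.random()                      # draw from the proposal
        if rng.random() * M <= 6.0 * x * (1.0 - x):
            return x                          # accept with probability p(x) / (M q(x))


rng = random.Random(0)
n = 20_000
exp_mean = sum(sample_exponential(2.0, rng) for _ in range(n)) / n
beta_mean = sum(sample_beta22(rng) for _ in range(n)) / n
print(exp_mean, beta_mean)
```

Both sample means should be near 0.5, since Exp(2) has mean $1/2$ and the target density is symmetric about $1/2$. On average the rejection sampler accepts a fraction $1/M = 2/3$ of its proposals.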

Markov Chain Monte Carlo

M-H, Gibbs, Slice Sampling, Elliptical Slice Sampling, Swendsen-Wang, and a demonstration of collapsed Gibbs sampling using LDA [lda_gibbs_example.m] and [test_autocorrelation.m] and [gibbs.m] and [bilibili video] 马尔可夫链蒙特卡洛的各种方法 [lda_gibbs_example.m] 和 [test_autocorrelation.m] 和 [gibbs.m] 和 [B站视频链接]

Particle Filter (Sequential Monte-Carlo)

Sequential Monte-Carlo, Condensation algorithm, Auxiliary Particle Filter [bilibili video] 粒子滤波器（序列蒙特卡洛）[B站视频链接]

Advanced Probabilistic Model 高级概率模型课件

Bayesian Non Parametrics (BNP) and its inference basics

Dirichlet Process (DP), Chinese Restaurant Process insights, Slice sampling for DP [dirichlet_process.m] and [bilibili video] and [Jupyter Notebook]

非参贝叶斯及其推导基础: 狄利克雷过程,中国餐馆过程,狄利克雷过程Slice采样 [dirichlet_process.m] 和 [B站视频链接] 和 [Jupyter Notebook]
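The Chinese Restaurant Process mentioned above has a short generative description that is easy to simulate. The sketch below is illustrative only and is not the linked `dirichlet_process.m` demo: the function name `chinese_restaurant_process` and the choice of concentration parameter are assumptions. Customer $n+1$ sits at an existing table $k$ with probability $n_k / (n + \alpha)$ and opens a new table with probability $\alpha / (n + \alpha)$.

```python
import random


def chinese_restaurant_process(n_customers, alpha, rng):
    """Seat customers one at a time and return the list of table sizes."""
    tables = []
    for n in range(n_customers):  # n customers are already seated
        r = rng.random() * (n + alpha)
        if r < alpha:
            tables.append(1)      # open a new table
            continue
        r -= alpha
        for k, size in enumerate(tables):
            if r < size:
                tables[k] += 1    # join table k, chosen in proportion to its size
                break
            r -= size
        else:
            tables[-1] += 1       # guard against floating-point round-off
    return tables


rng = random.Random(0)
tables = chinese_restaurant_process(1000, alpha=1.0, rng=rng)
print(len(tables), sorted(tables, reverse=True)[:5])
```

Runs typically show a few large tables and many small ones, reflecting the "rich get richer" behaviour of the process; the number of occupied tables grows roughly like $\alpha \log n$.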

Bayesian Non Parametrics (BNP) extensions

Hierarchical DP, HDP-HMM, Indian Buffet Process (IBP) 非参贝叶斯扩展: 层次狄利克雷过程，分层狄利克雷过程-隐马尔可夫模型，印度自助餐过程(IBP)

Completely Random Measure (early draft - written in 2015)

Levy-Khintchine representation, Compound Poisson Process, Gamma Process, Negative Binomial Process Levy-Khintchine表示，复合Poisson过程，Gamma过程，负二项过程

Sample correlated integers from HDP and Copula

This is an alternative explanation of our IJCAI 2016 papers. The derivations are different from the paper, but portray the same story. 这是对我的IJCAI2016论文的一个不同解释。虽然写的方法公式推导不同，但描绘的是同一事情

Determinantal Point Process

Explains the details of DPP’s marginal distribution, L-ensemble, its sampling strategy, and our work on time-varying DPP 行列式点过程解释:行列式点过程的边缘分布，L-ensemble，其抽样策略，我们在“时变行列式点过程”中的工作细节

Determinantal Point Process Basics (updated)

This is a rewrite of the previous DPP tutorial without the time-varying part 这是之前DPP教程的重写，没有时间变化部分

Special Thanks

I want to thank all the Universities where I have worked for tolerating me indulging my love of knowledge dissemination. 我要感谢所有我工作过的大学容忍我沉迷于知识传播
I am always looking for high-quality PhD students in Machine Learning, in both probabilistic models and Deep Learning theory. Contact me at xuyida@hkbu.edu.hk 如果你想加入我的机器学习博士生团队或有兴趣合作, 请通过xuyida@hkbu.edu.hk与我联系。

Shuoming/machine-learning-notes