-
Introduction to data science/analytics
- Linear algebra review
- Python environment setup
-
Statistical Description of Structured Data
- Introduction to statistics: random variables, probability distributions, histograms
- Statistical distributions: Gaussian, Poisson, etc.
-
Linear model
- Correlation
- Linear regression
- Likelihood function and maximum likelihood estimator
-
Logistic regression and Poisson regression
- Logistic regression
- Newton-Raphson method and gradient descent
- Poisson regression
-
Generalized Linear Model
- Exponential Family
- Link function
- Generalized Linear Model
-
Statistical Modeling Framework
- Empirical Modeling Practices
- Feature engineering, variable selection
- Model evaluation
-
Machine Learning I: Tree Algorithms
- CART Model
- Entropy and impurity measure
- Random forest and GBM
-
Machine Learning II
- Artificial neurons, activation function
- Feedforward neural networks
- Stochastic gradient descent
- Backpropagation
-
Natural Language Processing I
- Word2vec
- Word embeddings
- Language model
- Similarity measure
-
Natural Language Processing II
- Conditional probability and Bayes' theorem
- Part-of-speech tagging
-
Natural Language Processing III
- Multi-class classification
- IOB tagging
- Named Entity Recognition
- Document classification
Pacteria/Foundations-of-Analytics
Lecture notes, sample code, and slides for the course Foundations of Analytics, which I teach at Washington University in St. Louis