-
Linear Regression
- Cost Function
- Gradient Descent (sketch below)
- Feature Scaling
- Vectorization (SIMD)
- Logistic Regression
- Overfitting, Underfitting
- Regularization
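A minimal sketch of batch gradient descent for one-variable linear regression with a squared-error cost; the learning rate, iteration count, and toy data are arbitrary choices, not values from these notes.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, num_iters=1000):
    """Fit y ~ w*x + b by minimizing mean squared error."""
    w, b = 0.0, 0.0
    m = len(x)
    for _ in range(num_iters):
        err = (w * x + b) - y       # prediction error for every example
        dj_dw = (err @ x) / m       # dJ/dw
        dj_db = err.mean()          # dJ/db
        w -= alpha * dj_dw          # simultaneous parameter update
        b -= alpha * dj_db
    return w, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])  # roughly y = 2x + 1
print(gradient_descent(x, y))
```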
-
Neural Network
- Neural Network Forward Propagation (sketch below)
- NumPy library
- TensorFlow, PyTorch, or pure Python?
- Linear Algebra (vectors and matrices)
- Activation Function alternatives to Sigmoid
- Multi-class Classification
- Softmax
- Adam (introduction)
- Other Layer types (Convolutional)
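A minimal sketch of forward propagation in plain NumPy, assuming one sigmoid hidden layer and a softmax output for multi-class classification; the layer sizes and random weights are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=0, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

def forward(x, W1, b1, W2, b2):
    """x: (n_features, m), one example per column."""
    z1 = W1 @ x + b1        # linear step, layer 1
    a1 = sigmoid(z1)        # activation, layer 1
    z2 = W2 @ a1 + b2       # linear step, output layer
    return softmax(z2)      # class probabilities

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                 # 3 examples, 4 features each
W1, b1 = rng.normal(size=(5, 4)), np.zeros((5, 1))
W2, b2 = rng.normal(size=(3, 5)), np.zeros((3, 1))
print(forward(x, W1, b1, W2, b2).sum(axis=0))   # each column sums to 1
```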
-
What's a good model?
- Test set, Dev set
- Model selection (sketch below)
- Probability and statistics
- Baseline, Bias, Variance review
- Learning curve
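A minimal sketch of model selection with a train/dev/test split: candidate models (here, polynomial degrees fit with np.polyfit, an arbitrary choice) are compared on the dev set, and only the winner is scored on the untouched test set.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=200)
y = 0.5 * x**2 + rng.normal(scale=0.3, size=200)   # made-up quadratic data

idx = rng.permutation(200)                          # 60/20/20 train/dev/test split
tr, dev, te = idx[:120], idx[120:160], idx[160:]

def dev_mse(deg):
    coeffs = np.polyfit(x[tr], y[tr], deg)          # fit on the training set only
    pred = np.polyval(coeffs, x[dev])
    return np.mean((pred - y[dev]) ** 2)            # evaluate on the dev set

best = min(range(1, 6), key=dev_mse)                # degree with lowest dev error
coeffs = np.polyfit(x[tr], y[tr], best)
test_mse = np.mean((np.polyval(coeffs, x[te]) - y[te]) ** 2)
print(best, test_mse)                               # final report uses the test set
```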
-
What to try when model is not good?
- More training data (Fix high variance)
- Try smaller set of features (Fix high variance)
- Get additional features (Fix high bias)
- Add polynomial features (Fix high bias)
- Decrease regularization parameter (Fix high bias)
- Increase regularization parameter (Fix high variance; see the lambda sweep sketch below)
- Neural networks often have low bias, so the usual fix is just to add more data
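A minimal sketch of acting on the bias/variance diagnosis by sweeping the regularization parameter lambda for L2-regularized (ridge) linear regression and comparing train vs. dev error; the data and the lambda grid are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 8))                     # small, noisy dataset
w_true = rng.normal(size=8)
y = X @ w_true + rng.normal(scale=1.0, size=60)
Xtr, ytr, Xdev, ydev = X[:40], y[:40], X[40:], y[40:]

def ridge_fit(X, y, lam):
    n = X.shape[1]
    # closed-form ridge solution: (X^T X + lam*I)^-1 X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

for lam in [0.0, 0.1, 1.0, 10.0, 100.0]:
    w = ridge_fit(Xtr, ytr, lam)
    tr_err = np.mean((Xtr @ w - ytr) ** 2)
    dev_err = np.mean((Xdev @ w - ydev) ** 2)
    # dev >> train  -> high variance: increase lam
    # both high     -> high bias: decrease lam
    print(f"lambda={lam:6.1f}  train={tr_err:.3f}  dev={dev_err:.3f}")
```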
-
How to collect more data?
- Augmentation (transform x -> x' but keep the same y; sketch below)
- Transfer learning
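A minimal sketch of augmentation on an image stored as a NumPy array: each transform yields a new x' while y is unchanged. Which transforms are actually label-preserving depends on the task; the ones here are arbitrary examples.

```python
import numpy as np

def augment(image, label, rng):
    """Return several (x', y) pairs derived from one labeled image; y never changes."""
    out = [(image, label)]
    out.append((np.fliplr(image), label))                    # horizontal mirror
    out.append((np.roll(image, shift=2, axis=1), label))     # small horizontal shift
    noisy = np.clip(image + rng.normal(scale=0.05, size=image.shape), 0.0, 1.0)
    out.append((noisy, label))                               # mild pixel noise
    return out

rng = np.random.default_rng(0)
img = rng.random((28, 28))               # stand-in for a grayscale training image
pairs = augment(img, label=7, rng=rng)
print(len(pairs))                        # 4 training examples from 1 original
```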
-
ML development process
-
Unethical side of ML
-
Skewed dataset, precision, recall, F1 score
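A minimal sketch of precision, recall, and F1 computed from true and predicted labels; the label arrays are made up.

```python
import numpy as np

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives

precision = tp / (tp + fp)                   # of predicted positives, how many are real
recall = tp / (tp + fn)                      # of real positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, f1)                 # ~0.667, 0.5, ~0.571
```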
-
Decision tree
- When to stop splitting?
- Impurity, Entropy
- Information gain (sketch below)
- One hot encoding (for splitting on categorical features)
- Continuous feature
- Regression decision tree (use variance)
- Tree ensemble
- Sampling with replacement, random forest, XGBoost
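A minimal sketch of entropy and information gain for a binary split, using the usual p*log2(p) definition; the labels and the split mask are made up.

```python
import numpy as np

def entropy(y):
    """Entropy (in bits) of a binary label array."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    if p == 0 or p == 1:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def information_gain(y, left_mask):
    """Reduction in entropy from splitting y into left/right by a boolean mask."""
    left, right = y[left_mask], y[~left_mask]
    w_left = len(left) / len(y)
    return entropy(y) - (w_left * entropy(left) + (1 - w_left) * entropy(right))

y = np.array([1, 1, 1, 0, 0, 1, 0, 0, 0, 0])          # labels at the node
split = np.array([True, True, True, True,
                  False, False, False, False, False, False])
print(information_gain(y, split))                     # higher = better split
```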
-
Unsupervised learning
- K-means, Elbow method
- Anomaly detection
- Normal distribution
- Choosing epsilon (sketch below)
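A minimal sketch of Gaussian anomaly detection: fit a per-feature normal distribution, compute p(x) as the product of the per-feature densities, and flag examples with p(x) < epsilon. Here epsilon is hand-picked rather than chosen with a labeled cross-validation set, and all data are made up.

```python
import numpy as np

def fit_gaussian(X):
    """Per-feature mean and variance (independent-feature assumption)."""
    return X.mean(axis=0), X.var(axis=0)

def prob(X, mu, var):
    """p(x) = product over features of the univariate normal density."""
    coef = 1.0 / np.sqrt(2 * np.pi * var)
    dens = coef * np.exp(-((X - mu) ** 2) / (2 * var))
    return dens.prod(axis=1)

rng = np.random.default_rng(3)
X_train = rng.normal(loc=[5.0, 10.0], scale=[1.0, 2.0], size=(500, 2))  # mostly normal data
mu, var = fit_gaussian(X_train)

X_new = np.array([[5.1, 9.8],      # looks normal
                  [0.0, 25.0]])    # clearly anomalous
epsilon = 1e-4                     # flag anything with p(x) < epsilon
print(prob(X_new, mu, var) < epsilon)   # expected: [False  True]
```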
-
Recommender system
- Collaborative filtering (sketch below)
- Content based filtering
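A minimal sketch of collaborative filtering as a toy gradient-descent loop that learns user and movie vectors from the observed ratings only; the ratings matrix, latent dimension, learning rate, and regularization strength are all invented for illustration.

```python
import numpy as np

R = np.array([[5, 4, 0, 1],        # observed ratings (0 = not rated)
              [4, 0, 0, 1],
              [1, 1, 5, 4]], dtype=float)
rated = R > 0                       # mask of observed entries

rng = np.random.default_rng(4)
U = rng.normal(size=(3, 2))         # per-user parameter vectors
M = rng.normal(size=(4, 2))         # per-movie feature vectors
lam, alpha = 0.1, 0.02

for _ in range(2000):               # joint gradient descent on U and M
    err = (U @ M.T - R) * rated     # error only on observed ratings
    U -= alpha * (err @ M + lam * U)
    M -= alpha * (err.T @ U + lam * M)

pred = U @ M.T
print(np.round(pred, 1))            # includes guesses for the unrated entries
```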
-
Reinforcement learning (learned in the AI class this semester)
- Markov Decision Process
- Q-learning
- Bellman equation (sketch at the end of this list)
- Continuous State Space
- Neural Network for Q-learning
- Mini batch, soft update
- Computation graph
- Derivative chain rule (derivative of a composite function)
- Vectorization
- Broadcasting in numpy
- Regularization
- Dropout regularization
- Normalization (from probability and statistics)
- Vanishing/ Exploding gradient (very deep network problem)
- Gradient checking
- Mini batch
- Exponentially weighted average (moving average; sketch at the end of this list)
- Bias correction
- Gradient descent with momentum
- RMSprop
- Adam optimization
- Learning rate decay
- Tuning hyperparameters (learning rate > beta > mini batch size > hidden layer size > number of layers)
- Sampling hyperparameters on an appropriate scale (e.g., log scale)
- Pandas vs Caviar approach (babysit one model vs. train many in parallel)
- Batch normalization (normalize z of layer[i])
- Human Performance, Bayes Optimal Error
- Tuning the model for performance
- Data mismatch
- Train-Dev set
- Multitask and transfer learning
- End-to-end deep learning
- Convolutional Neural Network
- Max pooling, Average pooling
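A minimal sketch of tabular Q-learning on a made-up 5-state chain, using the Bellman-style update Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)); the environment, rewards, and hyperparameters are invented for illustration.

```python
import numpy as np

# Made-up chain of 5 states; reaching the last state gives reward 1 and ends the episode.
n_states, n_actions = 5, 2                  # actions: 0 = left, 1 = right
alpha, gamma = 0.1, 0.9
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(5)

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward, s_next == n_states - 1

for _ in range(500):                        # episodes
    s = 0
    for _ in range(100):                    # cap episode length
        a = int(rng.integers(n_actions))    # off-policy: explore randomly, learn the greedy policy
        s_next, r, done = step(s, a)
        target = r if done else r + gamma * Q[s_next].max()   # Bellman target
        Q[s, a] += alpha * (target - Q[s, a])                 # move Q(s,a) toward the target
        s = s_next
        if done:
            break

print(np.argmax(Q, axis=1))   # greedy action per state: 1 ("right") for every non-terminal state
```

And a minimal sketch of the exponentially weighted average with bias correction that underlies momentum, RMSprop, and Adam; beta and the data series are arbitrary.

```python
import numpy as np

def ewa(values, beta=0.9):
    """Exponentially weighted (moving) average with bias correction."""
    v = 0.0
    out = []
    for t, x in enumerate(values, start=1):
        v = beta * v + (1 - beta) * x        # running average
        out.append(v / (1 - beta ** t))      # bias correction for early steps
    return np.array(out)

temps = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 16.0])
print(np.round(ewa(temps), 2))               # smoothed version of the series
```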