Machine learning models covering classification, regression, NLP, association rule learning, and more. Each folder contains an ipynb notebook and a toy dataset. If you have any questions, please reach out to me at linzi_yu@hms.harvard.edu!
Adapted from the Udemy course Machine Learning A-Z: https://www.udemy.com/course/machinelearning/. For educational and self-learning use only.
- Feature imputation: SimpleImputer
- Feature encoding: OneHotEncoder, LabelEncoder
- Feature normalization: normalization and standardization
- Feature engineering: wrangling dates, binning values into discrete intervals
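
As a rough illustration of the difference between normalization and standardization listed above, here is a minimal from-scratch sketch. It is not the code from the notebooks: the helper names `normalize` and `standardize` and the `ages` column are made up for this example, and it assumes the column is not constant (otherwise both functions would divide by zero). scikit-learn's `MinMaxScaler` and `StandardScaler` apply the same two formulas column by column.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max normalization: rescale values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: subtract the mean, divide by the population std."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = [20, 30, 40, 50, 60]  # hypothetical toy column
print(normalize(ages))    # smallest value maps to 0, largest to 1
print(standardize(ages))  # result has mean 0 and unit variance
```

Normalization keeps the shape of the distribution but is sensitive to outliers at the extremes; standardization is usually the safer default for models that rely on distances or gradients.
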
- Multiple linear regression
- Polynomial Regression
- Support Vector Regression (SVR)
- Decision Tree Regression
- Random Forest Regression
- XGB regressor

Regression models are evaluated with r2_score.
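
To make the regression metric concrete, the sketch below computes the coefficient of determination by hand: one minus the ratio of the residual sum of squares to the total sum of squares, which is the quantity `sklearn.metrics.r2_score` reports. The function name `r2` and the toy values are assumptions for illustration only, not taken from the notebooks.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1 - ss_res / ss_tot


# Hypothetical targets and predictions from some regressor.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]

score = r2(y_true, y_pred)
print(score)
```

A score of 1 means perfect predictions, 0 means the model does no better than always predicting the mean, and negative values mean it does worse.
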
- Logistic regression
- KNN
- Kernel SVM
- Naive Bayes
- Decision tree
- Random forest
- XGBoost
- CatBoost regressor

Classification models are evaluated with a confusion matrix and accuracy score.
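
The two classification metrics can be sketched in a few lines of plain Python. This is an illustrative example under stated assumptions, not the notebooks' code: labels are assumed to be binary (0 or 1), and the helper names and toy labels are made up. The matrix layout follows the convention used by `sklearn.metrics.confusion_matrix`, with rows for true labels and columns for predicted labels.

```python
def confusion_matrix_binary(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for labels in {0, 1}."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1  # row = true label, column = predicted label
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical true labels and classifier predictions.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

cm = confusion_matrix_binary(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(cm, acc)
```

Accuracy equals the sum of the diagonal of the confusion matrix divided by the number of samples, so the matrix shows which kind of error (false positive or false negative) is costing accuracy.
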
- K-Means Clustering
- Hierarchical Clustering
- Apriori
- Eclat
- Thompson Sampling
- Upper Confidence Bound (UCB)
- Creating the Bag of Words model with CountVectorizer
- Artificial Neural Network (keras)
- Convolutional Neural Network (keras)
- PCA
- Kernel PCA
- LDA
- Visualizing the results after the regression model
- k-Fold Cross Validation (detect overfitting, report mean model performance, estimate how well the model generalizes)
- Grid Search to find the best model and the best parameters
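
The idea behind k-fold cross validation and grid search can be sketched without any library. Everything below is a hypothetical illustration, not the notebooks' code: the helpers `k_fold_indices`, `k_fold_scores`, and `make_knn`, the one-dimensional toy data, and the tiny nearest-neighbour classifier are all made up for this example. In scikit-learn the same roles are played by `KFold`, `cross_val_score`, and `GridSearchCV`.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def k_fold_scores(fit, score, X, y, k):
    """Train on k-1 folds, score on the held-out fold, return all k scores."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return scores


def make_knn(n_neighbors):
    """Build a fit function for a toy 1-D nearest-neighbour classifier."""
    def fit(X_train, y_train):
        def predict(x):
            nearest = sorted(zip(X_train, y_train), key=lambda pair: abs(pair[0] - x))
            votes = [label for _, label in nearest[:n_neighbors]]
            return max(set(votes), key=votes.count)  # majority vote
        return predict
    return fit


def accuracy(model, X_test, y_test):
    return sum(model(x) == t for x, t in zip(X_test, y_test)) / len(y_test)


# Hypothetical toy data: small values are class 0, large values are class 1.
X = [1.0, 9.0, 2.0, 8.0, 3.0, 7.0, 1.5, 8.5, 2.5, 7.5]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# Grid search: evaluate each candidate hyperparameter by its mean k-fold score.
grid = [1, 3, 5]
mean_scores = {}
for n in grid:
    fold_scores = k_fold_scores(make_knn(n), accuracy, X, y, k=5)
    mean_scores[n] = sum(fold_scores) / len(fold_scores)

best_n = max(mean_scores, key=mean_scores.get)
print(mean_scores, best_n)
```

Each sample is used for testing exactly once, so the mean score is a less optimistic estimate than a single train/test split. The folds here are contiguous and unshuffled, which is why the toy data interleaves the two classes.
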
- ML roadmap: https://whimsical.com/machine-learning-roadmap-2020-CA7f3ykvXpnJ9Az32vYXva
- In-depth introduction to machine learning in 15 hours of expert videos: https://www.dataschool.io/15-hours-of-expert-machine-learning-videos/
- Python OOP-Corey Schafer: https://www.youtube.com/watch?v=rq8cL2XMM5M&list=PL-osiE80TeTsqhIuOqKhwlXsIBIdSeYtc&index=3
- Best Numpy Functions For Data Science: https://www.kaggle.com/code/abhayparashar31/best-numpy-functions-for-data-science