This is the public repository for the ML Process Course (https://365datascience.com/learn-machine-learning-process-a-z/). In this course, we take you through the end-to-end process of building a machine learning model. We did not build this course alone: we stood on the shoulders of giants. We think it's only fair to credit all the resources we used, as we could not have created this course without the help of the ML community.
- Coding Workbooks for Each Course
- Data Science Blogs
- Applying ML
- Problem Framing
- Data Collection
- Data Preprocessing
- Exploratory Data Analysis
- Feature Engineering
- Cross Validation
- Feature Selection
- Imbalanced Data
- Modeling
- Model Evaluation
- Airbnb Engineering
- Spotify Research
- Netflix Research
- DoorDash ML Blog
- Uber Engineering
- Lyft Engineering
- Shopify Engineering
- Meta Engineering
- LinkedIn Engineering
- Analytics Hierarchy of Needs by Ryan Foley
- The First Rule of ML by Eugene Yan
- ML for Product Analytics by Ron Tidhar
- Getting Data From a Database
- What is SQL - Video
- How Data Analysts Use SQL - Video
- Why Are APIs Important For Data Science - Video
- What is an API?
- How Companies Capture Data
- How Cookies Work
- Web Scraping Basics
- Types of Survey Techniques
- 365 Data Science SQL Course
- All You Need To Know About Different Types Of Missing Data Values And How To Handle It
- 7 Ways to Handle Missing Values in Machine Learning
- Null Values Imputation by Utkarsh Gupta
- What are methods to make a predictive model more robust to outliers?
- Guidelines for Removing and Handling Outliers in Data by Jim Frost
- 10 Types of Statistical Data Distribution Models
- What is EDA?
- Is Data Visualization Important for Data Science? - Video
- Box Plot Details
- Histogram Additional Details
- Types of Data Distributions
- Understanding Skew
- Scatterplots and Correlation Additional Details
- Types of Correlations
- Correlation Matrix Details
- Different correlation matrices in Python
- Creating Pivot Tables Documentation
- When to use different types of charts
- All about Categorical Variable Encoding by Baijayanta Roy
- Ordinal and One-Hot Encodings for Categorical Data by Jason Brownlee
- Feature Engineering Ordinal Variables by Sheng Jun
- Target Encoding by Ryan Holbrook
- Weight of Evidence Coding by Bruce Lund
- About Feature Scaling and Normalization by Sebastian Raschka
- Feature Scaling Techniques in Python – A Complete Guide by Eddie_4072
- Feature Scaling for Machine Learning: Understanding the Difference Between Normalization vs. Standardization by Aniruddha Bhandari
- Robust Scaler - Sklearn Docs
- Log Transformation: Purpose and Interpretation by Kyaw Saw Htoon
- Best exponential transformation to linearize your data with Scipy
- Exponentially scaling your data in order to zoom in on small differences
- Box Cox Transformation by Ted Hessing
- Box-Cox Transformation and Target Variable: Explained
- Understanding 8 types of Cross-Validation by Satyam Kumar
- 7 Types of Cross Validation by Soumyaa Rawat
- k-fold cross-validation explained in plain English by Rukshan Pramoditha
- Machine Learning Fundamentals: Bias and Variance by Josh Starmer (StatQuest)
- A Quick Intro to Leave-One-Out Cross-Validation (LOOCV) by Statology
- Time Series Cross Validation by Rob J Hyndman
- Time Based Cross Validation
- KFold vs. Monte Carlo by Rebecca Patro
- Everything You Need to Know About Feature Selection In Machine Learning by Kartik Menon
- A comprehensive guide to Feature Selection using Wrapper methods in Python
- What is the difference between filter, wrapper, and embedded methods for feature selection? by Sebastian Raschka
- Introduction to Feature Selection methods with an example (or how to select the right variables?)
- Feature selection in Python using the Filter method by Renu Khandelwal
- 10 Techniques to deal with Imbalanced Classes in Machine Learning by Analytics Vidhya
- Oversampling and Undersampling by Kurtis Pykes
- Overcoming Class Imbalance using SMOTE Techniques
- SMOTE explained for noobs – Synthetic Minority Over-sampling TEchnique line by line by Rich Data
- SMOTE by Joos Korstanje
- 5 SMOTE Techniques for Oversampling your Imbalance Data by Cornellius Yudha Wijaya
- Fixing Imbalanced Datasets: An Introduction to ADASYN by Rui Nian
- What does "baseline" mean in the context of machine learning?
- Sklearn's Dummy Estimators
- 7 Hyperparameter Optimization Techniques Every Data Scientist Should Know
- A Comprehensive Guide on Hyperparameter Tuning and its Techniques
- Hyperparameter tuning in Python by Tooba Jamal
- Random Search for Hyper-Parameter Optimization by James Bergstra and Yoshua Bengio
- A Conceptual Explanation of Bayesian Hyperparameter Optimization for Machine Learning by Will Koehrsen
- Bayesian Optimization Primer by SigOpt
- Genetic Algorithms by Marcos Del Cueto
- Simulated Annealing From Scratch in Python by Jason Brownlee
- Optimization Techniques — Simulated Annealing by Frank Liang
- Hyperparameter optimization for Neural Networks
- Ensemble Methods: Elegant Techniques to Produce Improved Machine Learning Results
- A Gentle Introduction to Ensemble Learning Algorithms by Jason Brownlee
- Types of Ensemble methods in Machine learning by Anju Tajbangshi
- Introduction to Ensembling/Stacking in Python by Anisotropic
- Ensembles and Model Stacking by Eshaan Kirpal
- Model Evaluation Metrics in Machine Learning by Nagesh Singh Chauhan
- 11 Important Model Evaluation Metrics for Machine Learning Everyone should know
- How To Interpret R-squared in Regression Analysis by Jim Frost
- Know The Best Evaluation Metrics for Your Regression Model by Raghav Agrawal
- Recall, Precision, F1, ROC, AUC, and everything by Ofir Shalev
- F1 Score vs ROC AUC vs Accuracy vs PR AUC: Which Evaluation Metric Should You Choose? by Jakub Czakon
- Intuition behind Log Loss Score by Gaurav Dembla
- Why is ROC AUC equivalent to the probability that two randomly-selected samples are correctly ranked?
- Mann-Whitney U Test
- Essential Things You Need to Know About F1-Score
- ROC, AUC, precision, and recall visually explained by Paul Vanderlaken
- R-squared Is Not Valid for Nonlinear Regression by Jim Frost
- 3 Best metrics to evaluate Regression Model? by Songhao Wu
- Git For Data Scientists - Video
- Git Documentation
- Git Basics in 10 minutes
- 365 Data Science Git & GitHub Course
- Save and load ML models
- 5 Different Ways to Save your ML Model
- Improve analytics slide decks
- Streamlit Gallery
- Build 12 Streamlit apps - Video
- Streamlit Project Example - Video
- Build an API in Python - Video
- How to create an API in Python
- How to create a Python library