This repository contains slides, notebooks, and datasets for the Machine Learning University (MLU) Decision Trees and Ensemble Methods class. Our mission is to make Machine Learning accessible to everyone. We offer courses across many topics in machine learning and believe that knowledge of ML can be a key enabler for success. This class is designed to help you get started with tree-based models, learn about widely used Machine Learning techniques, and apply them to real-world problems.
Watch all class video recordings in this YouTube playlist on our YouTube channel.
There are five lectures, one final project and five assignments for this class.
| Lecture 1 | Lecture 2 | Lecture 3 | Lecture 4 | Lecture 5 |
|---|---|---|---|---|
| Decision Trees | Bias-variance trade-off | Bootstrapping | Random Forest Proximities | Boosting |
| Impurity Functions | Error Decomposition | Bagging | Some use cases for Proximities | Gradient Boosting |
| CART Algorithm | Extra Trees Algorithm | Random Forests | Feature Importance in Trees | XGBoost, LightGBM and CatBoost |
| Regularization | Bias-variance and Randomized Ensembles | Feature Importance in Random Forests | | |
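As a quick taste of the model families the lectures cover, here is a minimal sketch (assuming scikit-learn is installed; the dataset and hyperparameters below are illustrative, not from the course material) comparing a single CART tree, a random forest, and a gradient-boosted ensemble on a synthetic classification task:

```python
# Illustrative comparison of the three model families covered in the
# lectures: a single CART tree, a bagged random forest, and a
# gradient-boosted ensemble, on a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy dataset stands in for the course data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Decision Tree (CART)": DecisionTreeClassifier(max_depth=4, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")
```

The ensemble methods typically outperform the single tree on held-out data, which is exactly the effect the bagging and boosting lectures explain.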
Final Project: Practice working with a "real-world" computer vision dataset. The final project dataset is in the data/final_project folder. For more details on the final project, check out this notebook.
If you would like to contribute to the project, see CONTRIBUTING for more information.
The license for this repository depends on the section. The dataset for the course is provided to you by permission of Amazon and is subject to the terms of the Amazon License and Access. You are expressly prohibited from copying, modifying, selling, exporting or using this dataset in any way other than for the purpose of completing this course. The lecture slides are released under the CC-BY-SA-4.0 License. The code examples are released under the MIT-0 License. See each section's LICENSE file for details.