mlcourse.ai – Open Machine Learning Course
🇷🇺 Russian version 🇷🇺
❗ The current session launched on October 1, 2018. You can still join: fill in this form to participate ❗
Mirrors (🇬🇧-only): mlcourse.ai (main site), Kaggle Dataset (same notebooks as Kernels)
Outline
This is the list of articles published on medium.com 🇬🇧, habr.com 🇷🇺, and jqr.com 🇨🇳. The flag icons are clickable. Links to Kaggle Kernels (in English) are also given, so one can reproduce everything without installing a single package.
- Exploratory Data Analysis with Pandas 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Visual Data Analysis with Python 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2
- Classification, Decision Trees and k Nearest Neighbors 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Linear Classification and Regression 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2, part3, part4, part5
- Bagging and Random Forest 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernels: part1, part2, part3
- Feature Engineering and Feature Selection 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Unsupervised Learning: Principal Component Analysis and Clustering 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Vowpal Wabbit: Learning with Gigabytes of Data 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
- Time Series Analysis with Python, part 1 🇬🇧 🇷🇺 🇨🇳. Predicting the future with Facebook Prophet, part 2 🇬🇧, Kaggle Kernels: part1, part2
- Gradient Boosting 🇬🇧 🇷🇺 🇨🇳, Kaggle Kernel
Lectures
Video lectures are uploaded to this YouTube playlist.
- Exploratory data analysis with Pandas, video
- Visualization, main plots for EDA, video
- Decision trees: theory and practical part
- Logistic regression: theoretical foundations, practical part (baselines in the "Alice" competition)
- Ensembles and Random Forest – part 1. Classification metrics – part 2. Example of a business task, predicting a customer payment – part 3
- Linear regression and regularization - theory, LASSO & Ridge, LTV prediction - practice
Assignments
- Exploratory Data Analysis of Olympic games with Pandas, nbviewer. Deadline: October 14, 21:59 UTC+2
- Exploratory Data Analysis of US flights, nbviewer. Deadline: October 21, 21:59 UTC+2
- Decision trees. nbviewer. Deadline: October 28, 21:59 UTC+2. Optional: implementing a decision tree algorithm, nbviewer (no web forms or credits, the same deadline)
- Logistic regression. nbviewer. Deadline: November 4, 21:59 UTC+2
- Random Forest and Logistic Regression in credit scoring and movie reviews classification. nbviewer. Deadline: November 11, 21:59 UTC+2
- Beating baselines in "How good is your Medium article?". nbviewer. Deadline: November 18, 21:59 UTC+2
The following are demo versions. They are just for practice and don't affect the rating.
- Exploratory data analysis with Pandas, nbviewer, Kaggle Kernel
- Analyzing cardiovascular disease data, nbviewer, Kaggle Kernel
- Decision trees with a toy task and the UCI Adult dataset, nbviewer, Kaggle Kernel
- Linear Regression as an optimization problem, nbviewer, Kaggle Kernel
- Logistic Regression and Random Forest in the credit scoring problem, nbviewer, Kaggle Kernel
- Exploring OLS, Lasso and Random Forest in a regression task, nbviewer, Kaggle Kernel
- Unsupervised learning, nbviewer, Kaggle Kernel
- Implementing online regressor, nbviewer, Kaggle Kernel
- Time series analysis, nbviewer, Kaggle Kernel
- Gradient boosting and flight delays, nbviewer, Kaggle Kernel
Kaggle competitions
- Catch Me If You Can: Intruder Detection through Webpage Session Tracking. Kaggle Inclass
- How good is your Medium article? Kaggle Inclass
Rating
Throughout the course we maintain a student rating. It takes into account credits scored in assignments and Kaggle competitions. Top students (according to the final rating) will be listed on a special Wiki page.
Community
Discussions between students are held in the #mlcourse_ai channel of the OpenDataScience Slack team. Fill in this form to get an invitation. The form will also ask you some personal questions; don't hesitate 👋
More info
Go to mlcourse.ai
The course is free, but you can support the organizers by making a pledge on Patreon.