30 Days of ML

📋 Day 01 Assignment

  1. Follow the instructions in this notebook to get started with Kaggle
  2. Join 30 Days of ML Discord Community and introduce yourself in the #introductions channel

💡 What You’ll Learn

Today, you’ll set up your Kaggle account, move up from Novice to Contributor, and even make your very first submission to a Kaggle competition! The assignment should take only 45 minutes to complete.

You’ll also be able to join our Discord community and connect with other learners. It is a great place to ask questions and get help from others.

Note that Kaggle team members and Google Developer Experts will be there periodically, but the room is primarily a peer-to-peer space. Please follow Kaggle’s community guidelines found here.

📋 Day 02 Assignment

  1. Read this tutorial (from Lesson 1 of the Python course)
  2. Complete this exercise (from Lesson 1 of the Python course)

💡 What You’ll Learn

In Lesson 1 (Hello Python), you’ll get a feel for Python syntax, and learn how to work with variables and do arithmetic in Python.
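
As a small taste of what variables and arithmetic look like in Python, here is a minimal sketch. The variable names and numbers are made up for illustration and do not come from the lesson.

```python
# Hypothetical values, chosen only for illustration.
eggs = 12             # assign an integer to a variable
price_per_egg = 0.25  # assign a float to a variable

total_cost = eggs * price_per_egg  # multiplication
full_boxes = eggs // 5             # floor division: whole boxes of 5
leftover = eggs % 5                # remainder after filling the boxes
half = eggs / 2                    # true division always gives a float
squared = eggs ** 2                # exponentiation

print(total_cost, full_boxes, leftover, half, squared)
```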

📋 Day 03 Assignment

  1. Read this tutorial (from Lesson 2 of the Python course)
  2. Complete this exercise (from Lesson 2 of the Python course)

💡 What You’ll Learn

In Lesson 2 (Functions and Getting Help), you’ll learn how to work with functions, which are reusable blocks of code designed to perform a task. You’ll also learn how to write your own!
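
To make the idea of a reusable block of code concrete, here is a minimal sketch of two user-defined functions. The function names and arguments are hypothetical examples, not necessarily the ones used in the lesson.

```python
def round_to_two_places(num):
    """Return num rounded to two decimal places."""
    return round(num, 2)


def greet(name, greeting="Hello"):
    """Return a greeting string; `greeting` falls back to a default value."""
    return f"{greeting}, {name}!"


print(round_to_two_places(3.14159))
print(greet("Kaggler"))
print(greet("Kaggler", greeting="Welcome"))

# The docstring is the text that help(round_to_two_places) would display.
print(round_to_two_places.__doc__)
```

Writing a docstring right under the `def` line is what makes your own functions work with Python's built-in `help()`.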

📋 Day 04 Assignment

  1. Read this tutorial (from Lesson 3 of the Python course)
  2. Complete this exercise (from Lesson 3 of the Python course)

💡 What You’ll Learn

In Lesson 3 (Booleans and Conditionals), you’ll learn all about the Boolean data type, which allows you to represent “True” and “False” in Python code. This will provide a strong foundation for understanding how to write conditional statements, which are used to modify how code runs based on whether certain conditions hold.
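
The sketch below shows how comparisons produce Boolean values and how an `if`/`elif`/`else` chain picks which code runs. The function and variable names are hypothetical and serve only as an illustration.

```python
def describe_sign(x):
    """Return a short description of the sign of x."""
    if x > 0:
        return "positive"
    elif x < 0:
        return "negative"
    else:
        return "zero"


age = 19
can_vote = age >= 18   # a comparison evaluates to a bool
has_id = False

print(can_vote and has_id)  # `and` is True only if both sides are True
print(can_vote or has_id)   # `or` is True if at least one side is True
print(describe_sign(-3))
```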

📋 Day 05 Assignment

  1. Read this tutorial (from Lesson 4 of the Python course)
  2. Complete this exercise (from Lesson 4 of the Python course)
  3. Read this tutorial (from Lesson 5 of the Python course)
  4. Complete this exercise (from Lesson 5 of the Python course)

💡 What You’ll Learn

In Lesson 4 (Lists), you’ll learn how to use Python lists to store ordered collections of values. Lists are incredibly useful when writing code to manage several related variables.

In Lesson 5 (Loops and List Comprehensions), you’ll learn an efficient way to repeatedly execute code. With list comprehensions, you’ll often be able to condense code that would have taken several lines to just a single line!
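
Here is a minimal sketch of that condensing effect: the same result is computed first with a `for` loop and then with a one-line list comprehension. The list contents and variable names are illustrative assumptions.

```python
planets = ["Mercury", "Venus", "Earth", "Mars"]

# Lists are ordered, so items can be reached by position.
first, last = planets[0], planets[-1]

# Version 1: a for loop that builds a new list step by step.
short_planets_loop = []
for planet in planets:
    if len(planet) < 6:
        short_planets_loop.append(planet.upper())

# Version 2: the same logic as a single list comprehension.
short_planets = [planet.upper() for planet in planets if len(planet) < 6]

print(first, last)
print(short_planets)
```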

📋 Day 06 Assignment

  1. Read this tutorial (from Lesson 6 of the Python course)
  2. Complete this exercise (from Lesson 6 of the Python course)

💡 What You’ll Learn

In Lesson 6 (Strings and Dictionaries), you’ll learn about strings, a data type that is useful for representing human-readable data such as text. You’ll also meet the dictionary, another new data type that is similar to a list but has important differences that make it incredibly useful in its own right.
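
The following sketch shows a few common string methods and the key difference between a dictionary and a list: values are looked up by key rather than by position. The sample text and names are hypothetical.

```python
claim = "Pluto is a planet"

# Strings come with methods for transforming and splitting text.
words = claim.lower().split()
shout = "-".join(word.upper() for word in words)

# A dictionary maps keys to values.
initials = {"Mercury": "M", "Venus": "V"}
initials["Earth"] = "E"  # add a new key/value pair

# A dictionary comprehension: each word mapped to its length.
word_lengths = {word: len(word) for word in words}

print(shout)
print(initials["Earth"], "Mars" in initials)
print(word_lengths)
```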

📋 Day 07 Assignment

  1. Read this tutorial (from Lesson 7 of the Python course)
  2. Complete this exercise (from Lesson 7 of the Python course)

💡 What You’ll Learn

One of the best things about Python is the vast number of high-quality custom libraries that have been written for it. In Lesson 7 (Working with External Libraries), you’ll learn how to access this pre-written code and use it in your own work.
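
As a rough illustration of using pre-written code, the sketch below imports two modules and calls functions from them. It assumes only the `math` and `statistics` modules that ship with Python, so nothing needs to be installed. Third-party libraries are imported with the same `import` syntax once they are available in your environment.

```python
import math                           # import a whole module
from statistics import mean, median  # import specific names from a module

radius = 2.0
area = math.pi * radius ** 2  # access a constant with module.name
root = math.sqrt(16)          # call a function with module.name(...)

scores = [70, 85, 90, 95]     # hypothetical data
print(area, root)
print(mean(scores), median(scores))
```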

📋 Day 08 Assignment

  1. Read this tutorial (from Lesson 1 of the Intro to ML course)
  2. Read this tutorial (from Lesson 2 of the Intro to ML course)
  3. Complete this exercise (from Lesson 2 of the Intro to ML course)

💡 What You’ll Learn

In Lesson 1 (How Models Work), you will start at the very beginning: what exactly is “machine learning”, and how is it used in the real world? You’ll learn the answers to these questions and explore the basics of decision trees, as you start to build a strong foundation for some of the most cutting-edge techniques in data science.

In Lesson 2 (Basic Data Exploration), you’ll learn all about pandas, the primary tool used by data scientists for exploring and manipulating data. Then, you’ll use your new knowledge to examine a dataset of home prices.

📋 Day 09 Assignment

  1. Read this tutorial (from Lesson 3 of the Intro to ML course)
  2. Complete this exercise (from Lesson 3 of the Intro to ML course)
  3. Read this tutorial (from Lesson 4 of the Intro to ML course)
  4. Complete this exercise (from Lesson 4 of the Intro to ML course)

💡 What You’ll Learn

In Lesson 3 (Your First Machine Learning Model), you’ll create a machine learning model using the scikit-learn library, one of the most popular and efficient tools for data analysis.

Along the way, you’ll learn some basic techniques for working with very large datasets. These skills are especially important for modern data scientists, who often work with “big data” containing millions of variables, many more than a human can conceivably understand! Thankfully, machines excel at discovering useful patterns in datasets that are too large for humans to wrap their heads around. :)

Once you have built a model, how good is it? How exactly should you judge how close the model’s predictions are to what actually happened? In Lesson 4 (Model Validation), you’ll use model validation to measure the quality of your model.
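
One way to put a number on "how close" is mean absolute error (MAE) measured on data the model did not see during training. The sketch below is a plain-Python illustration under loud assumptions: the prices are invented, and the "model" is just a baseline that always predicts the average training price. It is not the scikit-learn workflow from the lesson.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical home prices (in thousands).
prices = [100, 150, 200, 250, 300, 350]

# Hold out the last two prices for validation.
train, valid = prices[:4], prices[4:]

# Baseline "model": always predict the mean of the training prices.
baseline = sum(train) / len(train)

train_mae = mean_absolute_error(train, [baseline] * len(train))
valid_mae = mean_absolute_error(valid, [baseline] * len(valid))

print(train_mae, valid_mae)
```

In this toy example the error on the held-out prices is larger than the error on the training prices, which is why a model should be judged on data it was not fit to.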

📋 Day 10 Assignment

  1. Read this tutorial (from Lesson 5 of the Intro to ML course)
  2. Complete this exercise (from Lesson 5 of the Intro to ML course)
  3. Read this tutorial (from Lesson 6 of the Intro to ML course)
  4. Complete this exercise (from Lesson 6 of the Intro to ML course)

💡 What You’ll Learn

In Lesson 5 (Underfitting and Overfitting), you’ll learn about the fundamental concepts of underfitting and overfitting. Then you'll apply these ideas to gain a deep understanding of why some models succeed and others fail. This knowledge will make you much more efficient at discovering highly accurate machine learning models.

In Lesson 6 (Random Forests), you’ll learn all about random forests, another machine learning model you can add to your growing toolkit. Then, put your new knowledge to use immediately by building your own random forest model that exceeds the performance of the models that you’ve built so far!

📋 Day 11 Assignment

  1. Read this tutorial (from Lesson 7 of the Intro to ML course)
  2. Complete this exercise (from Lesson 7 of the Intro to ML course)

💡 What You’ll Learn

One way to further improve your skills is to participate in machine learning competitions. In Lesson 7 (Machine Learning Competitions), you’ll create and submit your predictions to a Kaggle competition.

📋 Day 12 Assignment

  1. Read this tutorial (from Lesson 1 of the Intermediate ML course)
  2. Complete this exercise (from Lesson 1 of the Intermediate ML course)
  3. Read this tutorial (from Lesson 2 of the Intermediate ML course)
  4. Complete this exercise (from Lesson 2 of the Intermediate ML course)
  5. Read this tutorial (from Lesson 3 of the Intermediate ML course)
  6. Complete this exercise (from Lesson 3 of the Intermediate ML course)

💡 What You’ll Learn

In Lesson 1 (Introduction), you’ll learn more about what the course covers.

Most machine learning libraries (including scikit-learn) give an error if you try to build a model using data with missing values. In Lesson 2 (Missing Values), you’ll learn about three different approaches for dealing with missing values in your data.
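
As a rough illustration of one commonly used approach, the sketch below fills missing entries with the mean of the observed values (mean imputation) and records which entries were filled. It is plain Python with `None` standing in for a missing value; the column name and numbers are invented, and the lesson itself may present the approaches differently.

```python
def impute_mean(values):
    """Replace None entries with the mean of the non-missing entries."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing entries.
lot_area = [8450.0, None, 11250.0, 9550.0, None]

# Optional extra column that remembers which rows were originally missing.
was_missing = [v is None for v in lot_area]

lot_area_imputed = impute_mean(lot_area)
print(lot_area_imputed)
print(was_missing)
```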

A categorical variable is a variable that takes only a limited number of values, and it’s common to encounter them in data. Learn how to work with them in Lesson 3 (Categorical Variables).
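
To show what working with a categorical variable can involve, here is a minimal plain-Python sketch of one widely used encoding, one-hot encoding, which turns each category into its own 0/1 column. The helper name `one_hot` and the sample values are hypothetical, and this is not necessarily how the lesson implements it.

```python
def one_hot(values):
    """Return (categories, rows), where each row has a 1 in its category's column."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


colors = ["red", "green", "red", "blue"]  # hypothetical categorical column
categories, encoded = one_hot(colors)

print(categories)
for color, row in zip(colors, encoded):
    print(color, row)
```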

📋 Day 13 Assignment

  1. Read this tutorial (from Lesson 4 of the Intermediate ML course)
  2. Complete this exercise (from Lesson 4 of the Intermediate ML course)
  3. Read this tutorial (from Lesson 5 of the Intermediate ML course)
  4. Complete this exercise (from Lesson 5 of the Intermediate ML course)

💡 What You’ll Learn

In Lesson 4 (Pipelines), you’ll learn a simple way to keep your data preprocessing and modeling code organized.

You’re already a bit familiar with model validation from the Intro to Machine Learning course. In Lesson 5 (Cross-Validation), you’ll explore a more advanced validation technique that gives a better measure of model performance.
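
The idea behind k-fold cross-validation can be sketched in plain Python: split the data into k folds, hold out each fold once for validation, and average the scores. The sketch below makes several assumptions for illustration: the data is invented, the folds are contiguous and unshuffled, and the "model" is a baseline that predicts the mean of the training targets.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, valid_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        valid = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, valid
        start = stop


y = [10, 20, 30, 40, 50, 60]  # hypothetical target values

scores = []
for train_idx, valid_idx in k_fold_indices(len(y), k=3):
    # Baseline "model": predict the mean of the training targets.
    prediction = sum(y[i] for i in train_idx) / len(train_idx)
    # Mean absolute error on the held-out fold.
    mae = sum(abs(y[i] - prediction) for i in valid_idx) / len(valid_idx)
    scores.append(mae)

average_mae = sum(scores) / len(scores)
print(scores, average_mae)
```

Because every observation is used for validation exactly once, the averaged score depends less on one lucky or unlucky split than a single hold-out set does.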

📋 Day 14 Assignment

  1. Read this tutorial (from Lesson 6 of the Intermediate ML course)
  2. Complete this exercise (from Lesson 6 of the Intermediate ML course)
  3. Read this tutorial (from Lesson 7 of the Intermediate ML course)
  4. Complete this exercise (from Lesson 7 of the Intermediate ML course)

💡 What You’ll Learn

In Lesson 6 (XGBoost), you will learn how to build and optimize models with gradient boosting. This method dominates many Kaggle competitions and achieves state-of-the-art results on a variety of datasets.

In Lesson 7 (Data Leakage), you will learn what data leakage is and how to prevent it. If you don't know how to prevent it, leakage will come up frequently, and it will ruin your models in subtle and dangerous ways. That makes it one of the most important concepts for practicing data scientists.

📋 Week 3 Assignment

This is your personal invitation link to join the competition. Please do not share it with anyone else, and do not post it in the Discord community.

💡 What You’ll Learn

In the link above, you’ll find a detailed introduction to Kaggle competitions (that covers how to work in a team and much more), along with a getting started tutorial that walks you through how to make your very first submission.

📋 Final Week's Assignment

✏️ Important Notes

Competition

If you have not started yet, it’s not too late. This guide has all of the orientation you need.

You can make really strong progress by doing just a little bit each day: aim to submit to the competition at least once each day. Remember you can chat with other participants in the Discussion tab, and you can view code examples from the Kaggle community in the Code tab.

Google Developer Expert Workshops

The workshops are optional for the 30 Days of ML program. Three are available: Intro to Supervised Classification, How to Build a Data Science Portfolio, and Scikit-optimize for LightGBM.

If you have any questions, there is a Question and Answer channel for each workshop in the 30 Days of ML Discord server. Note: the speakers may not be able to get to every question. If you are able to answer a question, feel free to jump in to help others.