/JLL_machine-learning-content

Bootcamp content

Machine Learning Bootcamp

  1. Use the export folder for publishing purposes; all lessons are compiled into that folder.
  2. Compile the original lessons by running bash export.sh

🔥 Remember to run the $ bash export.sh command.

Python (3 days)

  1. Python for Data Science

Probability and Statistics Skills (6 days)

  1. Calculus

  2. Linear Algebra

📝 Calculus and linear algebra problems

  3. Probability

📝 Probability problems

  4. Descriptive Statistics

📝 Descriptive statistics problems

  5. Random Variables

📝 Probability Distribution problems

6.1. Hypothesis Testing

📝 Hypothesis testing problems

Computer Science (1 day)

  1. Optimizing Algorithms

📝 Algorithm optimization problems

Collect and store data (4 days)

1.1. Intro to SQL (Structured Query Language) - external

1.2. Create and connect to SQL databases with Python

📝 Connecting to a SQL database from Python

2.1. Loading Static Files (csv, json, yml)

2.2. Web Scraping tools and techniques

📝 Web scraping data from a website

  3. Project structure

📝 Interacting with the Twitter API

Data Management (3 days)

  1. Exploratory data analysis (EDA)

1.2. Titanic survival notebook to understand EDA (2 hours)

  2. Feature Engineering

2.1. How to deal with outliers

2.2. How to deal with missing data

2.3. Feature encoding for categorical variables

2.4. Feature Scaling

  3. Feature selection techniques

📝 Project: New York City Airbnb exploratory data analysis (2 hours)

Machine Learning (12 days)

1.1. Machine Learning Basics

1.2. Model evaluation

1.3. Model hyperparameter optimization

1.4. Logistic Regression on Titanic notebook

📝 Project: Bank Marketing Campaign (2 hours)

2.1. Linear Regression

2.2. Exploring Linear Regression notebook

📝 Project: Predicting insurance cost (2 hours)

3.1. Regularized Linear Regression

📝 Project: Finding important sociodemographic features that impact health resources (2 hours)

4.1. Decision Trees

4.2. Exploring Decision Trees Notebook

📝 Project: Classifying whether patients have diabetes (2 hours)

5.1. Random Forest

📝 Project: Improving Titanic survival results (2 hours)

6.1. Boosting Algorithms

📝 Project: Boosting your Titanic results with the XGBoost algorithm (2 hours)

7.1. Naive Bayes

7.2. Exploring Naive Bayes notebook

📝 Project: Create a Google Play store reviews classifier (Sentiment Analysis) (2 hours)

8.1. Support Vector Machine

8.2. Intro to Natural Language Processing

8.3. Exploring Natural Language Processing Notebook

📝 Project: Building an email spam detector (2 hours)

9.1. K-nearest neighbors (KNN)

📝 Project: Building a simple movie recommender system (2 hours)

10.1. Unsupervised Learning

📝 Project: Segment houses based on their coordinates and median income (2 hours)

11.1. Time Series Forecasting

11.2. Exploring Time Series Notebook

📝 Project: CPU usage anomaly detection (2 hours)

12.1. Introduction to Deep Learning

12.2. Exploring Neural Networks Notebook

📝 Project: Building an image classifier (2 hours)

Data Science as Software (4 days)

  1. How to create a machine learning web app using Flask and Heroku

📝 Flask app project

  2. How to create a machine learning web app using Streamlit and Heroku

📝 Streamlit app project

Data Warehouse tools: modeling on the cloud (3 days)

  1. Cloud Computing

  2. Intro to AWS SageMaker

Weeks 13-16: Final project