-
Lesson 1
- Motivation
- Provocations: hardware, data, internet, AI, cloud computing
- Tendencies
- Platforms
- References
-
Lesson 2
- Conda essentials
- Platforms (Jupyter, JupyterLab, Colab)
- Python Crash Course
-
Lesson 3
- Modules, Iterations, List Comprehension
- String and date operations
- Introduction to Object-Oriented Programming (OOP)
-
Lesson 4
- Introduction to NumPy
- Introduction to Pandas
-
Lesson 5
- Data Cleaning Basics
-
Lesson 6
- Exploratory Data Analysis I
- Matplotlib
- Line, Bar and Scatter Plots
-
Lesson 7
- Exploratory Data Analysis II
- Histogram and Box Plots
- Wrapper from Pandas to Matplotlib
-
Lesson 8
- Exploratory Data Analysis III
- Case study: gender gap
- Aesthetics
- Colors, line widths
- Annotations
-
Lesson 9
- Exploratory Data Analysis IV
- Case study: Titanic
- Visualizing missing values
- Aggregate data using pivot table
- Storytelling with Seaborn
-
Lesson 10
- Exploratory Data Analysis V
- Visualizing geographical data
- Working with Basemap
- Customizing the plot
- Folium
- Maps with markers
- Marker clusters
- Heatmap
-
Lesson 11
- Exploratory Data Analysis VI
- Case Study #1 - John Snow Map
- Case Study #2 - Open Data Natal-RN
-
Lesson 12
- Exploratory Data Analysis VII
- Case study: IBGE
- GeoJSON
- Importing files
- Creating maps
- Choropleth maps
-
Lesson 13
- Case study: NYC open data (education)
- Data cleaning walkthrough
- Combining data
- Groupby
- Merge (inner, outer, right, left)
-
Lesson 14
- Sampling
- Population and sampling
- Sampling error
- Simple random sampling (SRS)
- Stratified sampling
- Cluster sampling
- Variables in statistics
- Quantitative and qualitative variables
- Scales of measurement (nominal, ordinal, interval, ratio)
-
Lesson 15
- Frequency Distributions
- Sorting frequency distribution tables
- Percentiles and percentile ranks
- Information loss
- Visualizing Distributions
- Bar plots, pie charts, histograms
- Skewed distributions
- Symmetrical Distributions
- Comparing Frequency Distributions
-
Lesson 16
- A brief history of AI
- Key definitions
- Types of Machine Learning
- Machine Learning Workflow
- Main challenges
- End-to-end ML project
-
Lesson 17
- Univariate KNN
- Euclidean distance for univariate
- Function to make predictions
- Error metrics
- Multivariate KNN
- Normalize columns
- Euclidean distance for multivariate
- Hyperparameter optimization
- Cross-Validation
-
Lesson 18
- Linear Regression (one variable)
- Cost function
- Gradient descent
- Refresher on linear algebra concepts
- Linear Regression (multiple variables)
-
Lesson 19
- Classification
- Binary Classification
- Decision Boundary
- Cost Function
- Multiclass Classification
- Regularization
- Hands-on scikit-learn
-
Lesson 20
- Clustering Basics
- K-Means
- Case study: senators' votes, NBA
-
Lesson 21
- Introduction to Decision Tree
- Converting categorical variables
- Splitting Data
- Decision Trees as flows of data
- Entropy
- Information gain
- Applying Decision Trees
- Overfitting problem
-
Lesson 22
- Ensembles (Random Forest)
- Combining predictions
- Why Ensembling works
- Introducing variation with bagging and random features
- Reducing overfitting using Random Forest
- Case study: US Census, predicting bike rentals
-
Lesson 23
- Getting Started with Kaggle
- Feature Preparation, Selection and Engineering
- Model Selection and Tuning
- Creating a Kaggle Workflow
-
Lesson 24
- Deep Learning Fundamentals I
- Representing neural networks
- Nonlinear activation functions
- Hidden Layers
- Case study: build a handwritten digit classifier
-
Lesson 25
- Deep Learning Fundamentals II
- Mathematical building blocks of neural networks
- Getting started with neural networks
- Classifying movie reviews: a binary classification example
- Classifying newswires: a multiclass classification problem
- Predicting house prices: a regression problem
-
Lesson 26
- Deep Learning Fundamentals III
- Formal evaluation procedures for machine learning models
- Preparing data for deep learning
- Feature engineering
- Tackling overfitting
- The universal workflow for approaching machine learning problems
- Case study: Titanic