Data Science & Machine Learning Course

  • Lesson 1

    • Motivation
    • Provocations: hardware, data, internet, AI, cloud computing
    • Trends
    • Platforms
    • References
  • Lesson 2

    • Conda essentials
    • Platforms (Jupyter, JupyterLab, Colab)
    • Python Crash Course
  • Lesson 3

    • Modules, Iterations, List Comprehension
    • String and date operations
    • Introduction to Object-Oriented Programming (OOP)
  • Lesson 4

    • Introduction to NumPy
    • Introduction to Pandas
  • Lesson 5

    • Data Cleaning Basics
  • Lesson 6

    • Exploratory Data Analysis I
    • Matplotlib
    • Line, Bar and Scatter Plots
  • Lesson 7

    • Exploratory Data Analysis II
    • Histogram and Box Plots
    • Wrapper from Pandas to Matplotlib
  • Lesson 8

    • Exploratory Data Analysis III
    • Case study: gender gap
    • Aesthetics
    • Colors, Line Widths
    • Annotations
  • Lesson 9

    • Exploratory Data Analysis IV
    • Case study: Titanic
    • Visualizing missing values
    • Aggregate data using pivot table
    • Storytelling with Seaborn
  • Lesson 10

    • Exploratory Data Analysis V
    • Visualizing geographical data
    • Working with Basemap
    • Customizing the plot
    • Folium
    • Maps with markers
    • Marker clusters
    • Heatmap
  • Lesson 11

    • Exploratory Data Analysis VI
    • Case Study #1 - John Snow Map
    • Case Study #2 - Open Data Natal-RN
  • Lesson 12

    • Exploratory Data Analysis VII
    • Case study: IBGE
    • GeoJSON
    • Importing files
    • Creating maps
    • Choropleth maps
  • Lesson 13

    • Case study: NYC open data (education)
    • Data cleaning walkthrough
    • Combining data
    • Groupby
    • Merge (inner, outer, right, left)
  • Lesson 14

    • Sampling
      • Population and sampling
      • Sampling error
      • Simple random sampling (SRS)
      • Stratified sampling
      • Cluster sampling
    • Variables in statistics
      • Quantitative and qualitative variables
      • Scales of measurement (nominal, ordinal, interval, ratio)
  • Lesson 15

    • Frequency Distributions
      • Sorting frequency distribution tables
      • Percentiles and percentile ranks
      • Information loss
    • Visualizing Distributions
      • Bar plots, pie charts, histograms
      • Skewed distributions
      • Symmetrical Distributions
    • Comparing Frequency Distributions
  • Lesson 16

    • A brief history of AI
    • Key definitions
    • Types of Machine Learning
    • Machine Learning Workflow
    • Main challenges
    • End-to-end ML project
  • Lesson 17

    • Univariate KNN
      • Euclidean distance for univariate
      • Function to make predictions
      • Error metrics
    • Multivariate KNN
      • Normalize columns
      • Euclidean distance for multivariate
    • Hyperparameter optimization
    • Cross-Validation
  • Lesson 18

    • Linear Regression (one variable)
    • Cost function
    • Gradient descent
    • Refresher on linear algebra concepts
    • Linear Regression (multiple variables)
  • Lesson 19

    • Classification
    • Binary Classification
    • Decision Boundary
    • Cost Function
    • Multiclass Classification
    • Regularization
    • Hands-on with scikit-learn
  • Lesson 20

    • Clustering Basics
    • K-Means
    • Case study: senators' votes, NBA
  • Lesson 21

    • Introduction to Decision Tree
    • Converting categorical variables
    • Splitting Data
    • Decision Trees as flows of data
    • Entropy
    • Information gain
    • Applying Decision Trees
    • Overfitting problem
  • Lesson 22

    • Ensembles (Random Forest)
    • Combining predictions
    • Why Ensembling works
    • Introducing variation with bagging and random features
    • Reducing overfitting using Random Forest
    • Case study: US Census, predicting bike rentals
  • Lesson 23

    • Getting Started with Kaggle
    • Feature Preparation, Selection and Engineering
    • Model Selection and Tuning
    • Creating a Kaggle Workflow
  • Lesson 24

    • Deep Learning Fundamentals I
    • Representing neural network
    • Nonlinear activation functions
    • Hidden Layers
    • Case study: building a handwritten digit classifier
  • Lesson 25

    • Deep Learning Fundamentals II
    • Mathematical building blocks of neural networks
    • Getting started with neural networks
    • Classifying movie reviews: a binary classification example
    • Classifying newswires: a multiclass classification problem
    • Predicting house prices: a regression problem
  • Lesson 26

    • Deep Learning Fundamentals III
    • Formal evaluation procedures for machine learning models
    • Preparing data for deep learning
    • Feature engineering
    • Tackling overfitting
    • The universal workflow for approaching machine learning problems
    • Case study: Titanic