LetsPy

This repo contains the code and slides for the LetsPy talk.

Topics

  • Refresher on the basics of the Python language (covered on Day 1)
    • list
    • tuple
    • dictionary
    • sets
  • Introduction to Data Science
    • What is Data Science
    • Business Intelligence vs. Data Science
    • Lifecycle of Data Science
    • Case study
  • Introduction to Pandas
    • Installation - !pip install pandas
    • Basics of the library's main data structures: DataFrame and Series.
    • Working with DataFrames - a deeper look at how to inspect, select, filter, merge, combine, and group your data.
    • Using pandas with the Titanic dataset - Kaggle's beginner-friendly problem, used to apply what was learned in the first two parts to a basic analysis.
  • Introduction to NumPy
    • Installation - !pip install numpy
    • NumPy 1D array
    • NumPy 2D array
    • Multidimensional array
    • Array concatenation, splitting, and subarrays
    • Difference between a dynamically typed Python list and a fixed-type (NumPy-style) array
  • Introduction to Matplotlib
    • Installation - !pip install matplotlib
    • Plot two dataframe columns as a scatter plot
    • Plot column values as a bar plot
    • Line plot with multiple columns
    • Save plot to file
    • Bar plot with group by
    • Stacked bar plot with group by, percentage counts
    • Stacked bar plot with two-level group by
    • Stacked bar plot with two-level group by, percentages normalized to 100%
    • Stacked bar plot with a single group by, percentages normalized to 100%
    • Plot a histogram of column values
    • Date histogram
  • Introduction to scikit-learn - Optical Recognition of Handwritten Digits, a beginner-friendly ML project.
    • Installation - !pip install -U scikit-learn
    • This tutorial covers the basics of ML: how to do Exploratory Data Analysis with the help of Matplotlib and Principal Component Analysis (PCA).
    • Preprocess the data with normalization, then split it into training and test sets.
    • If time permits, run KMeans clustering on top of this.
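
To give a feel for the last topic, here is a minimal sketch of the scikit-learn steps listed above: normalize, split, PCA, and KMeans. It is not the code from the notebooks. It assumes the handwritten digits data is the copy bundled with scikit-learn (load_digits), and the split ratio, the random seed, the choice of StandardScaler, and the number of PCA components are all illustrative choices.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: the digits dataset bundled with scikit-learn
# (1,797 images of 8x8 pixels, flattened to 64 features each).
digits = load_digits()

# Split into training and test sets (ratio and seed are illustrative).
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42
)

# Normalize: fit the scaler on the training set only, then reuse it
# on the test set so no test-set statistics leak into preprocessing.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# PCA for exploratory analysis: project 64 features down to 2
# so the samples can be drawn as a scatter plot.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_train)

# KMeans with one cluster per digit (0-9). Cluster ids are arbitrary
# and do not correspond to the digit labels.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=42)
kmeans.fit(X_train)
test_clusters = kmeans.predict(X_test)
```

X_2d can be passed to a Matplotlib scatter plot, colored by y_train, to see how well the digits separate in two dimensions.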

Sources: