/2023fstep25

Agenda:

08.30 am: Intro, career paths in data and questions

  1. This is how my career is progressing so far
  2. Here's some information about career paths in data
  3. What problems do you want to solve?

09.00 am: Looker Studio

  1. Create a Google account (you can use your personal one too, if you prefer)
  2. Head to Looker Studio, using another tab
  3. Download as CSV from here and upload to Looker Studio
  4. Build a few charts and filters
    • What are the total sales by category and year?
    • What is the total profit by category and year?
    • Which cities have generated the highest sales?
    • What are the margins by segment?
    • Make your dashboard look better using a color palette generator
  5. Practice by creating another page named Customer, and using any chart/table:
    • What is the name of the Customer with the Highest Sales in New York City?
    • What Sub-Category did this Customer buy, and how much Sales and Profit did this Customer generate for the company?
    • Open question: Is the business better or worse off in 2017 than it was in 2014?
  • Source: Superstore sample data from Tableau

10.30 am: Break


10.50 am: Google Colab and Seaborn

  1. Using either your new Google account or your personal account, open Google Colab in another tab
  2. Explore Google Colab's interface and functions, starting with Tools >> Settings:
    • Editor >> Show line numbers (check if you prefer)
    • Miscellaneous >> Corgi mode, Kitty mode (turn on if you like)
    • Test with some basic code:
      • Click Connect at top right
      • Write a simple definition
      • Test a simple math problem
    • Runtime settings
      • Run cells
      • Reset
    • Code and text cells
    • Save
  3. More about Colab’s Markdown here
  4. Refer to colab_intro.ipynb
  5. Open a new notebook on Google Colab
  6. Try out a few plots on Seaborn:
    • histplot
    • displot
    • boxplot
    • lmplot
  7. Practice:
    • Load the CSV into Colab
    • Create a displot, where x-axis represents Region
    • Create a catplot, where x-axis is Profit and y-axis is Sub-Category
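    • Adjust the size of axes-level charts such as boxplot with sns.set(rc={'figure.figsize':(25.7,8.27)}); figure-level charts such as displot and catplot are sized with their height and aspect arguments instead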
    • Create a boxplot where x-axis is Sales, and y-axis is Region

12.30 pm: Break


1.30 pm: data.gov.my

  1. Using either your new Google account or your personal account, sign up for Postman, then log in

    • Look for Workspaces (top left of the page) >> My Workspace (Click)

    • Look for the + sign beside Overview

  2. Open data.gov.my using another tab on your browser >> API Docs

  3. Scroll to Realtime APIs >> Flood Warning API

  4. Try the Flood Warning API Endpoint GET https://api.data.gov.my/flood-warning on Postman

    • Look for the icon circled in the snapshot below

    • image

    • Switch from cURL to Python - Requests

    • Copy the Python code from Postman and paste into a new Colab notebook

  5. Change print(response.text) to response.json() to see the data returned by the API endpoint

  6. Tasks:

    • Convert json output into a dataframe using pd.json_normalize()
    • Store dataframe into a variable df
    • Filter df to keep only rows where df["water_level_indicator"] == "NORMAL" and the reading is from today's date
    • Output into a CSV file, then download and open in Excel and Google My Maps
    • The nearest river to our current location is Sungai Gombak; find its station_id
    • What other questions can you answer?
    • Don't forget to try using the Weather API

2.30 pm: Break


2.50 pm: Send emails of Air Pollution Index (API) data using Python

  1. Create another Google account if you haven't already, and remember to activate 2-Step Verification
  2. Refer to Sending a Plain-Text Email. Look for the code section just above the Sending a Fancy Email header.
  3. Register for an account on aqicn
  4. Using Postman, try making an API call (documentation here)
    • GET https://api.waqi.info/feed/kuala%20lumpur/?token=###INSERT TOKEN HERE###
  5. From Postman, copy the Python code into a Colab notebook
  6. Tasks:
    • Get the API readings of a few stations
    • Send the email containing the API of a chosen station/a few selected locations
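
The two tasks above might be combined along the lines of the sketch below. Treat it as a starting point under stated assumptions: the response shape (`status`, `data.aqi`, `data.city.name`) is assumed from the API's documentation and should be checked against what Postman shows, the helper names are hypothetical, and the sample payload is made up so the sketch runs offline. Sending through Gmail over SSL on port 465 is a common `smtplib` setup; with 2-Step Verification on, Gmail typically expects an app password rather than your normal one.

```python
import smtplib
import ssl
from email.message import EmailMessage
from urllib.parse import quote

import requests


def parse_aqi(payload):
    """Extract (station name, index reading) from a feed response (assumed shape)."""
    if payload.get("status") != "ok":
        raise ValueError(f"API call failed: {payload.get('data')}")
    data = payload["data"]
    return data["city"]["name"], data["aqi"]


def fetch_aqi(city, token):
    """Call the feed endpoint for one city and return (station name, reading)."""
    url = f"https://api.waqi.info/feed/{quote(city)}/"
    response = requests.get(url, params={"token": token}, timeout=30)
    response.raise_for_status()
    return parse_aqi(response.json())


def build_email(sender, recipient, readings):
    """Build a plain-text email with one 'station: reading' line per station."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Air Pollution Index readings"
    message.set_content("\n".join(f"{name}: {aqi}" for name, aqi in readings))
    return message


def send_email(message, password):
    """Send the message through Gmail's SMTP server over SSL."""
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL("smtp.gmail.com", 465, context=context) as server:
        server.login(message["From"], password)
        server.send_message(message)


# Hypothetical payload and addresses, used only to show the flow offline.
sample_payload = {"status": "ok", "data": {"aqi": 57, "city": {"name": "Kuala Lumpur"}}}
readings = [parse_aqi(sample_payload)]
message = build_email("sender@example.com", "recipient@example.com", readings)
```

For a real run, build `readings` with calls such as `fetch_aqi("kuala lumpur", token)` for each station you choose, then call `send_email(message, password)` using your own address and password.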

4.15 pm: Wrap up

  1. Data visualization and coding skills help to automate routine tasks - worth picking up to save time
  2. Get certified for cloud computing