Toronto-Shelter-ETL-Pipeline

In this project, I built an end-to-end data pipeline that processes and analyzes daily occupancy and capacity data from Toronto's shelter and overnight service programs.

Primary language: Python. License: MIT.

Objective

The purpose of this project was to analyze occupancy alongside funded and actual capacity for active overnight shelter services operating in the Toronto area.

Tools & Architecture

Project Architecture

  • Python (Ingestion & Transformation)
  • Google Cloud Platform
    • Cloud Storage (Storage)
    • Cloud Composer/Airflow (Scheduling)
    • BigQuery (Warehouse)
  • Looker Studio (Analytics)
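
As a rough illustration of the Python ingestion and transformation stage listed above, here is a minimal sketch using only the standard library. It is not the project's actual code: the column names (`OCCUPANCY_DATE`, `PROGRAM_ID`, `CAPACITY_FUNDING_BED`, `CAPACITY_ACTUAL_BED`, `OCCUPIED_BEDS`), the sample rows, and the function names are all assumptions made for the example. The output is newline-delimited JSON, a format BigQuery can load from Cloud Storage.

```python
import csv
import io
import json

# Hypothetical sample data; the column names and values are assumptions.
SAMPLE_CSV = """OCCUPANCY_DATE,PROGRAM_ID,CAPACITY_FUNDING_BED,CAPACITY_ACTUAL_BED,OCCUPIED_BEDS
2023-01-01,101,50,40,30
2023-01-01,102,30,30,30
2023-01-01,103,20,0,0
"""


def extract(csv_text):
    """Parse raw CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Cast numeric fields and derive a bed occupancy rate per row."""
    cleaned = []
    for row in rows:
        funded = int(row["CAPACITY_FUNDING_BED"])
        actual = int(row["CAPACITY_ACTUAL_BED"])
        occupied = int(row["OCCUPIED_BEDS"])
        # Avoid dividing by zero when a program reports no actual capacity.
        rate = round(occupied / actual * 100, 1) if actual else None
        cleaned.append(
            {
                "occupancy_date": row["OCCUPANCY_DATE"],
                "program_id": int(row["PROGRAM_ID"]),
                "funded_beds": funded,
                "actual_beds": actual,
                "occupied_beds": occupied,
                "occupancy_rate_pct": rate,
            }
        )
    return cleaned


def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row) for row in rows)


if __name__ == "__main__":
    print(to_ndjson(transform(extract(SAMPLE_CSV))))
```

In a scheduled setup, each of these functions could become its own Airflow task, with the serialized file uploaded to Cloud Storage before the warehouse load.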

Data

The dataset used was provided by the Shelter, Support and Housing Administration division and is available for preview and download at the City of Toronto's open data catalogue.

Data Model Schema:

Data Model

Dashboard

I queried and joined the data in BigQuery, then sent it to Looker Studio, where I built a report highlighting insights about overnight shelter programs in the city. The interactive report can be found here.
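
To make the join-and-aggregate step concrete, the sketch below mimics it in plain Python rather than BigQuery SQL. It joins occupancy records to a program lookup and computes an occupancy rate per sector. The table shapes, field names (`program_id`, `sector`, `occupied_beds`, `actual_beds`), and sample values are hypothetical and are not taken from the project's data model.

```python
from collections import defaultdict

# Hypothetical dimension table: program_id -> program attributes.
PROGRAMS = {
    101: {"sector": "Men"},
    102: {"sector": "Families"},
    103: {"sector": "Men"},
}

# Hypothetical fact rows: one record per program per day.
OCCUPANCY = [
    {"program_id": 101, "occupied_beds": 30, "actual_beds": 40},
    {"program_id": 102, "occupied_beds": 30, "actual_beds": 30},
    {"program_id": 103, "occupied_beds": 10, "actual_beds": 10},
    {"program_id": 999, "occupied_beds": 5, "actual_beds": 5},  # no matching program
]


def occupancy_by_sector(facts, programs):
    """Inner-join facts to programs and return occupancy rate (%) per sector."""
    totals = defaultdict(lambda: {"occupied": 0, "actual": 0})
    for fact in facts:
        program = programs.get(fact["program_id"])
        if program is None:
            # Inner-join semantics: drop facts without a matching program.
            continue
        bucket = totals[program["sector"]]
        bucket["occupied"] += fact["occupied_beds"]
        bucket["actual"] += fact["actual_beds"]
    return {
        sector: round(t["occupied"] / t["actual"] * 100, 1) if t["actual"] else None
        for sector, t in totals.items()
    }


if __name__ == "__main__":
    for sector, rate in sorted(occupancy_by_sector(OCCUPANCY, PROGRAMS).items()):
        print(sector, rate)
```

Summing beds before dividing gives a capacity-weighted rate per sector, so small programs do not skew the figure the way an average of per-program rates would.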

Dashboard