/chicago-etl-pipeline-data-engineering-project

Chicago Taxi Trip Data Analytics - Modern Data Engineering GCP Project

Introduction

This project performs data analytics on Chicago Taxi Trip data using Python, SQL, the Mage data pipeline tool, Google Cloud Storage, a Compute Engine instance, BigQuery, and Looker Studio.

Architecture

Data Source

The Chicago Taxi Trip dataset was extracted through the Chicago Data Portal API endpoint. Its columns record the trip start and end timestamps, trip duration in seconds, trip distance in miles, pickup and dropoff community areas, fare, tips, payment type, company, and pickup and dropoff locations.
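A minimal sketch of how a paged extraction from the portal could be set up. It assumes the endpoint follows the Socrata-style `/resource/<dataset_id>.json` pattern with `$limit` and `$offset` query parameters. The dataset id `xxxx-xxxx`, the page size, and the function names are placeholders for illustration, not values taken from this project.

```python
from urllib.parse import urlencode

# Assumption: the portal serves datasets at /resource/<dataset_id>.json
BASE_URL = "https://data.cityofchicago.org/resource"


def build_page_url(dataset_id, limit, offset):
    """Return the URL for one page of rows (hypothetical helper)."""
    query = urlencode({"$limit": limit, "$offset": offset})
    return f"{BASE_URL}/{dataset_id}.json?{query}"


def page_urls(dataset_id, total_rows, page_size):
    """Return one URL per page needed to cover total_rows rows."""
    return [
        build_page_url(dataset_id, page_size, offset)
        for offset in range(0, total_rows, page_size)
    ]


if __name__ == "__main__":
    # "xxxx-xxxx" is a placeholder; substitute the real dataset id.
    for url in page_urls("xxxx-xxxx", total_rows=2500, page_size=1000):
        print(url)
```

Each URL can then be fetched with any HTTP client and the JSON rows appended to a single table before loading.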

Community area data, which contains the pickup and dropoff community area names, was used to enrich the Chicago Taxi Trip data.
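The enrichment step can be illustrated as a lookup from community area number to name. This is a sketch only: the column names `pickup_community_area` and `dropoff_community_area`, the added `*_name` fields, and the sample rows are assumptions for illustration, and the project's own scripts may do this differently (for example as a SQL or DataFrame join).

```python
def enrich_trips(trips, area_names):
    """Add pickup/dropoff community area names to each trip record.

    trips: list of dicts with community area numbers (assumed column names).
    area_names: dict mapping community area number -> name.
    Unknown or missing area numbers yield None instead of raising.
    """
    enriched = []
    for trip in trips:
        row = dict(trip)  # copy so the input records are left unchanged
        for side in ("pickup", "dropoff"):
            code = trip.get(f"{side}_community_area")
            row[f"{side}_community_area_name"] = area_names.get(code)
        enriched.append(row)
    return enriched


if __name__ == "__main__":
    # Illustrative sample values, not data from the project.
    area_names = {8: "Near North Side", 32: "Loop", 76: "O'Hare"}
    trips = [
        {"pickup_community_area": 8, "dropoff_community_area": 32, "fare": 9.5},
        {"pickup_community_area": 76, "dropoff_community_area": None, "fare": 41.0},
    ]
    for row in enrich_trips(trips, area_names):
        print(row)
```

Returning `None` for unmatched areas keeps every trip in the output, which mirrors a left join rather than an inner join.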

Data Source Link

  1. Chicago Taxi Trip
  2. Community Area

Data Collection and Transformation Scripts

Data Model

Data Visualization

Link: Chicago Taxi Dashboard