/etl-pipeline-uk-crime-analysis

Data engineering pipeline for collecting, transforming, analysing, and visualising UK crime data.

Primary language: HCL

UK Crime Patterns with Cloud Data Engineering

Overview

This project is a work in progress. When finished, it will demonstrate a data engineering pipeline that collects, transforms, and analyses open data on crime in the United Kingdom. The goal is to gain insights into crime trends, geographic distributions, and other key patterns.

Dataset

UK Police Data (CSV): Sourced from https://data.police.uk/data/. Includes monthly CSV files for individual police forces.
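
As a rough illustration of what working with one of these monthly files might look like, the sketch below counts records per crime type using only the Python standard library. The header row and the sample rows are assumptions modelled on the street-level files and are not taken from this repository; check them against a real download before relying on the column names. The function name `count_crime_types` is hypothetical.

```python
import csv
import io
from collections import Counter

# Assumed header and made-up rows for illustration only;
# verify the column names against a real file from data.police.uk.
SAMPLE = """Crime ID,Month,Reported by,Falls within,Longitude,Latitude,Location,LSOA code,LSOA name,Crime type,Last outcome category,Context
a1,2023-01,Example Constabulary,Example Constabulary,-2.5,51.4,On or near Example Street,E01000001,Example 001A,Burglary,Under investigation,
,2023-01,Example Constabulary,Example Constabulary,-2.6,51.5,On or near Sample Road,E01000002,Example 002B,Anti-social behaviour,,
a2,2023-01,Example Constabulary,Example Constabulary,-2.7,51.6,On or near Test Lane,E01000003,Example 003C,Burglary,Under investigation,
"""


def count_crime_types(csv_text):
    """Return a Counter mapping each 'Crime type' value to its row count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Crime type"] for row in reader)


counts = count_crime_types(SAMPLE)
for crime_type, n in counts.most_common():
    print(f"{crime_type}: {n}")
```

For a file on disk, the same function could be fed the text from `open(path, newline="", encoding="utf-8").read()`, although streaming the file through `csv.DictReader` directly would use less memory on large months.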

Technologies

  • Cloud Platform: Google Cloud Platform (GCP)
  • Infrastructure Management: Terraform
  • Data Orchestration: Mage running on GCP
  • Data Storage: Google Cloud Storage (GCS)
  • Data Warehousing: Google BigQuery
  • Data Transformation: dbt (SQL-based transformations, testing)
  • Visualisation: Looker
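
A transformation step between storage and the warehouse could, for example, normalise the CSV column names before loading. The sketch below is not this project's Mage or dbt code; it is a minimal, assumed example of one such cleaning step in plain Python, and the helper names `to_snake_case` and `normalise_row` are hypothetical.

```python
import re


def to_snake_case(name):
    """Convert a header such as 'Crime type' to 'crime_type'."""
    # Replace every run of non-alphanumeric characters with one underscore.
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def normalise_row(row):
    """Rename the keys of one CSV row and turn empty strings into None."""
    return {
        to_snake_case(key): (value if value != "" else None)
        for key, value in row.items()
    }


example = normalise_row({"Crime ID": "", "Month": "2023-01", "LSOA code": "E01000001"})
print(example)
```

Mapping empty strings to `None` is a design choice made here so that missing values can load as NULLs rather than empty text; the real pipeline may handle this differently, for instance inside dbt models.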