Desigining Advanced Data Architectures for Business Intelligence.
This repository contains the data engineering solution for analyzing restaurant inspections and violations happening in Chicago. The project involves data extraction, data cleansing, transformation, exploratory analysis, data visualization, and data flow orchestration of event data on the cloud. The goal of this project is to enable data-driven decisions, improve customer engagement, and drive business growth by providing insights into food establishment inspections in Chicago.
The data for this project is sourced from the Chicago Data Portal. It includes information on inspections of restaurants and other food establishments in Chicago from January 1, 2010, to the present. Inspections are performed by staff from the Chicago Department of Public Health’s Food Protection Program using a standardized procedure. The dataset is updated daily and is available in CSV format.
Number of Rows: 443,000
Number of Columns: 17
Update Frequency: Daily
File Format: CSV
ER/Studio Data Architect for data modeling.
Alteryx for data integration and ETL processes.
Microsoft SQL Server for data warehousing.
Power BI and Tableau for business intelligence reporting and visualization.
This dataset consists of below metadata information:
The project utilizes ER/Studio Data Architect to design and manage data models and metadata for the database. It involves creating conceptual, logical, and physical data models, mapping relationships and dependencies between them, and implementing a multi-fact dimensional model using normalization and schema design strategies.
Create a dynamic dimensional model using ER/Studio to facilitate efficient data analysis, enabling exploration of food inspection data across various dimensions and hierarchies.
The data integration and ETL pipeline is a two-layer process:
Landing or Staging Data Layer: Data is extracted from the source and staged into the SQL Server database with minimal transformation (as-is data load). Audit fields are added to identify newly loaded or updated records.
Schema: CHICAGO_INS
Table: STG_CHICAGO_FOOD_ESTD_INS
Dimensional Data Layer:
Data is loaded into the multi-fact dimensional data warehouse after extraction and transformation with defined rules. Data quality checks are performed post data load, and audit columns are added to each dimension and fact for record audit and maintenance.
Business Intelligence reporting involves preparing, analyzing, and portraying business metrics to support decision making. This project leverages the potential of Power BI and Tableau to derive key business insights by identifying key measurements such as satisfaction level and applying dimensions such as inspection date, restaurant, and violation code information.
BI reports and dashboards have been developed to identify violations in health norms and regulations at restaurants and cafeterias in Chicago.