
The SQL project involved designing tables to hold data from six CSV files, creating a table schema for each file, importing the data into SQL tables, and performing data analysis. The analysis involved answering various questions about the data, such as listing employee information and department managers and ...

MIT License


This project analyses the legacy employee data of Pewlett Hackard, a fictional company, from the 1980s and 1990s. The goal is to perform data modelling, data engineering, and data analysis on the data available in six CSV files.

Project Overview

The project is divided into three main parts:

Part 1: Data Modelling

Inspected the CSV files and created an Entity Relationship Diagram (ERD) of the tables using QuickDBD.

Part 2: Data Engineering

1. Created a table schema for each of the six CSV files, specifying data types, primary keys, foreign keys, and other constraints.
2. Imported each CSV file into its corresponding SQL table.
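
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual schema: it uses Python's built-in sqlite3 module with an in-memory database so it runs anywhere, and the table names, column names, and sample rows are all assumptions made up for the example. Only three of the six tables are shown.

```python
import csv
import io
import sqlite3

# Hypothetical schema: table and column names are assumptions,
# not taken from the project's CSV files.
SCHEMA = """
CREATE TABLE departments (
    dept_no   TEXT PRIMARY KEY,
    dept_name TEXT NOT NULL UNIQUE
);
CREATE TABLE employees (
    emp_no     INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    hire_date  TEXT NOT NULL
);
CREATE TABLE dept_emp (
    emp_no  INTEGER NOT NULL REFERENCES employees (emp_no),
    dept_no TEXT    NOT NULL REFERENCES departments (dept_no),
    PRIMARY KEY (emp_no, dept_no)
);
"""


def load_csv(conn, table, csv_file):
    """Insert every data row of a CSV file (first row is the header) into `table`."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", reader
    )


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript(SCHEMA)

# Made-up sample data standing in for the real CSV files.
departments_csv = io.StringIO("dept_no,dept_name\nd001,Marketing\nd002,Finance\n")
employees_csv = io.StringIO(
    "emp_no,first_name,last_name,hire_date\n"
    "10001,Ana,Silva,1986-03-14\n"
    "10002,Ben,Okoro,1991-07-02\n"
)
dept_emp_csv = io.StringIO("emp_no,dept_no\n10001,d001\n10002,d002\n")

# Parent tables are loaded before the table that references them,
# otherwise the foreign key constraints would reject the rows.
load_csv(conn, "departments", departments_csv)
load_csv(conn, "employees", employees_csv)
load_csv(conn, "dept_emp", dept_emp_csv)
conn.commit()
```

With real files, `io.StringIO(...)` would be replaced by `open(path, newline="")`. The import order matters whenever foreign keys are enforced: a table that references another can only be loaded after the table it points to.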

Part 3: Data Analysis

Wrote SQL queries to answer questions about the data, covering employee details, hire dates, managers, department information, and frequency counts of employee last names.
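
As an illustration of this kind of analysis, the sketch below runs two such queries: a frequency count of last names and a hire-date filter. It is not the project's own query set. The `employees` table, its columns, the sample rows, and the choice of 1986 as the year are all assumptions for the example, and sqlite3 is used only so the snippet is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and made-up rows, for illustration only.
conn.execute(
    "CREATE TABLE employees ("
    "emp_no INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, hire_date TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "Silva", "1986-03-14"),
        (2, "Ben", "Okoro", "1991-07-02"),
        (3, "Cara", "Silva", "1986-11-30"),
        (4, "Dev", "Okoro", "1989-01-09"),
        (5, "Eli", "Silva", "1994-05-21"),
    ],
)

# How many employees share each last name, most common first.
last_name_counts = conn.execute(
    """
    SELECT last_name, COUNT(*) AS frequency
    FROM employees
    GROUP BY last_name
    ORDER BY frequency DESC, last_name
    """
).fetchall()

# Employees hired in one calendar year (ISO dates compare correctly as text).
hired_1986 = conn.execute(
    """
    SELECT first_name, last_name, hire_date
    FROM employees
    WHERE hire_date BETWEEN '1986-01-01' AND '1986-12-31'
    ORDER BY emp_no
    """
).fetchall()

print(last_name_counts)
print(hired_1986)
```

Storing dates as ISO 8601 text (`YYYY-MM-DD`) is what makes the `BETWEEN` range filter work here; a database with a native `DATE` type would use that type instead.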


Data Modelling: ERD or table schemas
Data Engineering: Table definitions, column data types, primary keys, foreign keys, constraints, and table relationships
Data Analysis: SQL queries for various data analysis tasks


The project is hosted in a GitHub repository, which includes the files for data modelling, data engineering, and data analysis.


The data used in this project was generated by Mockaroo, LLC (2022), a realistic data generator.