/Pewlett-Hackard-employees

For this project, I designed tables to hold the data from the CSV files, imported the CSV files into a SQL database, and then queried the database for some specific information the data.

Description

I performed a research about people at Pewlett Hackard employed during the 1980s and 1990s. All that remains of the employee database from that period are six CSV files. For this project, I designed tables to hold the data from the CSV files, imported the CSV files into a SQL database, and then queried the database for some specific information the data. The project work flow used for this project is captured in the chart below:


Tools Used

Project Steps

Data Modelling

A conceptual schema was created in order to identify relationships amongst the data. The conceptual schema was tansformed into an Entity Relationship Diagram (ERD). I used QuickDBD app to make the ERD sketch. In the ERD I identified and showed all primary keys and mapped foreign keys. The foreign mapping included one to many and many to many relationship.

image

Data Engineering

In this step, I used the information from the ERD to create a table schema for each of the six CSV files. I also included the primary keys, foreign keys, and other constraints into each of the table schema. To uniquely identify a row, I added a unique constraint to the primary keys. Extra care was taken to create tables in the correct order to handle the foreign keys. Thereafter, I Imported each CSV file into its corresponding SQL table. I also attempted an alternative way to import the csv files into the created tables using COPY FROM method available in Postgres. However, in other to use this method, it is required to set file permission to Full control. As best practice, at the beginning of the data engineering process, I parsed a query to drop tables before Creating Tables. The CASCADE statement allows tables with foreigns to be dropped.

Data Analysis

The last step of this project was to query the database using sql SELECT statement in a bid to provide a robust analysis on the following tasks:

  • #1 A list of the employee number, last name, first name, sex, and salary of each employee.
  • #2 A list of the first name, last name, and hire date for the employees who were hired in 1986.
  • #3 A list of the manager of each department along with their department number, department name, employee number, last name, and first name.
  • #4 A list of the department number for each employee along with that employee’s employee number, last name, first name, and department name.
  • #5 A list of first name, last name, and sex of each employee whose first name is Hercules and whose last name begins with the letter B.
  • #6 A list of each employee in the Sales department, including their employee number, last name, and first name.
  • #7 A list of each employee in the Sales and Development departments, including their employee number, last name, first name, and department name.
  • #8 A list of the frequency counts, in descending order, of all the employee last names (that is, how many employees share each last name).

Results

A snapshot of the output of the analysis can be found here .

References

Data generated by Mockaroo, LLCLinks to an external site., (2022). Realistic Data Generator.