/sql_challenge

Challenge consisted of creating a database, to investigate and assess employment data.

Primary LanguageJupyter Notebook

sql_challenge

Table of Contents

  1. Background
  2. File Description
  3. Technologies

Background

Employee Database: A Mystery in Two Parts

As a recently hired data engineer at Pewlett Hackard. Our first major task is a research project on employees of the corporation from the 1980s and 1990s. All that remain of the database of employees from that period are six CSV files.

In this assignment, we will design the tables to hold data in the CSVs, import the CSVs into a SQL database, and answer questions about the data. In other words, we are to perform:

  1. Data Engineering

  2. Data Analysis

My final results included the following files:

  • QuickDBD-table schema.txt
  • QuickDBD-export.sql
  • QuickDBD-export.png
  • Table_Schema.sql
  • Queries.sql
  • SQL_Challenge_Bonus.ipynb

File Descriptions

  • File names: departments.csv (2 columns)
  • File names: dept_emp.csv (2 columns)
  • File names: dept_manager.csv (2 columns)
  • File names: employees.csv (7 columns)
  • File names: salaries.csv (2 columns)
  • File names: titles.csv (2 columns)

Technologies

Programming and Scripting languages used the following programs pgAdmin4 (Version 5.2 (4280.88)) and Jupyter Notebook (Version 6.3.0).

Applied use of:

  • QuickDBD
  • pgAdmin4: Importing Data, Joins, OrderBy, Foreign Keys,
  • Jupter Noteook: SqlAlchemy, Pandas, Matplotlib
  • DataFrame Merge
  • Groupby Function