This analysis involves setting up sample relational employee database using data from the Employee CSV Files.
An Entity Relationship Diagram reflecting the data in the CSV files is first created to illustrate the data relationships:
The data can be imported from the CSV files to SQL using this Table Schemata based on the ERD.
With the employee database set, the data can be analyzed. Provided in the employee_queries are various SQL queries performing the following:
-
Lists the following details of each employee: employee number, last name, first name, sex, and salary.
-
Lists first name, last name, and hire date for employees who were hired in 1986.
-
Lists the manager of each department with the following information: department number, department name, the manager's employee number, last name, first name.
-
Lists the department of each employee with the following information: employee number, last name, first name, and department name.
-
Lists first name, last name, and sex for employees whose first name is "Hercules" and last names begin with "B."
-
Lists all employees in the Sales department, including their employee number, last name, first name, and department name.
-
Lists all employees in the Sales and Development departments, including their employee number, last name, first name, and department name.
-
In descending order, lists the frequency count of employee last names, i.e., how many employees share each last name.
Example query:
Further analysis involving Pandas and Matplotlib visualization via SQLAlchemy is available in the bonus_analysis Jupyter Notebook for which an additional SQL View is needed.
This Notebook imports the data from SQL to create:
A histogram to visualize the most common salary ranges for employees.
A bar chart of average salary by title.