Employee Performance Analysis for HR Analytics Using Python and MySQL

This project is aimed at analyzing employee performance data using Python and MySQL. The data consists of various parameters such as age, gender, department, recruitment channel, education level, training scores, and previous year ratings, among others. The analysis of this data helps HR managers to identify areas of improvement and make data-driven decisions to optimize the performance of their employees.

Preprocessing of Data Using Python

The data preprocessing steps were carried out using Python. The following tasks were performed:

  1. Removing duplicate rows.
  2. Removing rows for which numeric columns are having irrelevant data type values.
  3. Filling null values.
  4. Visualizing categorical and numerical columns using different graphs.
  5. Visualizing the relationship between numeric columns using scatterplots.
  6. Saving the dataset into an Excel file.

Analysis of Data Using SQL

The data analysis was performed using SQL. The following tasks were carried out:

  1. Finding the average age of employees in each department and gender group.
  2. Listing the top 3 departments with the highest average training scores.
  3. Finding the percentage of employees who have won awards in each region.
  4. Showing the number of employees who have met more than 80% of KPIs for each recruitment channel and education level.
  5. Finding the average length of service for employees in each department, considering only employees with previous year ratings greater than or equal to 4.
  6. Listing the top 5 regions with the highest average previous year ratings.
  7. Listing the departments with more than 100 employees having a length of service greater than 5 years.
  8. Showing the average length of service for employees who have attended more than 3 trainings, grouped by department and gender.
  9. Finding the percentage of female employees who have won awards, per department. Also, show the number of female employees who won awards and total female employees.
  10. Calculating the percentage of employees per department who have a length of service between 5 and 10 years.
  11. Finding the top 3 regions with the highest number of employees who have met more than 80% of their KPIs and received at least one award, grouped by department and region.
  12. Calculating the average length of service for employees per education level and gender, considering only those employees who have completed more than 2 trainings and have an average training score greater than 75.
  13. For each department and recruitment channel, finding the total number of employees who have met more than 80% of their KPIs, have a previous_year_rating of 5, and have a length of service greater than 10 years.
  14. Calculating the percentage of employees in each department who have received awards, have a previous_year_rating of 4 or 5, and an average training score above 70, grouped by department and gender.
  15. Listing the top 5 recruitment channels with the highest average length of service for employees who have met more than 80% of their KPIs, have a previous_year_rating of 5, and an age between 25 and 45 years, grouped by department and recruitment channel.

Conclusion

In conclusion, the analysis of employee performance data using Python and MySQL can help HR managers to make data-driven decisions and optimize the performance of their employees. The data preprocessing steps and SQL queries presented in this project can serve as a starting point for analyzing employee performance data in other organizations. By using these methods, HR managers can gain valuable insights into employee performance, identify areas for improvement, and take steps to enhance the overall productivity of their workforce.