iamAntimPal/LeetCode_SQL_Database

Add a Challenging SQL & Pandas Problem: "Employee Performance Analysis"

Opened this issue · 0 comments

Add a Challenging SQL & Pandas Problem: "Employee Performance Analysis"

Description

I propose that we add a new challenging question to our repository that requires solving using both SQL and Pandas. The question involves real-world data analysis and combines multiple concepts. Below is the proposed problem statement.


Problem: Employee Performance Analysis

Context:
A company maintains a database of its employees along with their monthly performance metrics. The data is stored in two tables: Employees and Performance.

Table: Employees

Column Name Type Description
employee_id INT Unique identifier for each employee.
name VARCHAR Employee name.
department VARCHAR Department of the employee.

Table: Performance

Column Name Type Description
employee_id INT Foreign key referencing Employees.employee_id.
month DATE The month of the performance record (assume it is the first day of the month).
sales INT Number of sales made by the employee in that month.
targets_met INT Number of targets met by the employee in that month.

Task:
Write a solution that reports, for each department:

  • Employee Count: Total number of employees.
  • Average Monthly Sales: Average monthly sales per employee (across all available months).
  • Target Achievement Rate: The overall rate of target achievement calculated as:
    [
    \text{Target Achievement Rate} = \frac{\text{Total Targets Met}}{\text{Total Targets Assigned}}
    ]
    (Assume each employee has a fixed target of 10 per month.)

Return the result table with the following columns:
department, employee_count, avg_monthly_sales, target_achievement_rate.
Round avg_monthly_sales and target_achievement_rate to two decimal places.


Requirements:

  1. SQL Solution:

    • Write a query that computes the required results using appropriate JOINs, GROUP BY, and aggregation functions.
    • Ensure the query handles cases where performance records might be missing for some employees.
  2. Pandas (Python) Solution:

    • Write a function that accepts two DataFrames (one for Employees and one for Performance) and returns a DataFrame with the required results.
    • Use Pandas operations such as merge, groupby, and aggregation functions to compute the values.

Example (Simplified):

Given sample data, your output might look like:

department employee_count avg_monthly_sales target_achievement_rate
IT 5 250.75 0.92
Finance 3 180.50 0.88
HR 2 200.00 0.95

Additional Notes:

  • The SQL solution should utilize functions like ROUND() to format the results.
  • The Pandas solution should be encapsulated in a function that takes the two DataFrames as input and returns the resulting DataFrame.
  • This question combines multiple SQL concepts (JOINs, GROUP BY, aggregations, rounding) along with equivalent Pandas operations, making it a great practice problem for both SQL and Python skills.