LeetCode 615: Average Salary: Departments VS Company

Problem Restatement

We are given two tables: Salary and Employee.

The Salary table stores each employee’s salary for a month.

Column	Type	Meaning
`id`	int	Salary record id
`employee_id`	int	Employee id
`amount`	int	Salary amount
`pay_date`	date	Salary payment date

The Employee table stores each employee’s department.

Column	Type	Meaning
`employee_id`	int	Employee id
`department_id`	int	Department id

We need to compare each department’s average salary with the company’s average salary for the same month.

Return:

Column	Meaning
`pay_month`	Month in `YYYY-MM` format
`department_id`	Department id
`comparison`	`higher`, `lower`, or `same`

The problem asks for the comparison result of each department’s average salary against the company average salary.

Input and Output

Input tables:

Salary
Employee

Output columns:

pay_month, department_id, comparison

The comparison is:

Condition	Output
Department average > company average	`higher`
Department average < company average	`lower`
Department average = company average	`same`

Example

Salary table:

id	employee_id	amount	pay_date
1	1	9000	2017-03-31
2	2	6000	2017-03-31
3	3	10000	2017-03-31
4	1	7000	2017-02-28
5	2	6000	2017-02-28
6	3	8000	2017-02-28

Employee table:

employee_id	department_id
1	1
2	2
3	2

For March 2017, the company average is:

(9000 + 6000 + 10000) / 3 = 8333.33

Department 1 average:

So department 1 is higher.

Department 2 average:

(6000 + 10000) / 2 = 8000

So department 2 is lower.

For February 2017, the company average is:

(7000 + 6000 + 8000) / 3 = 7000

Department 1 average is 7000.

Department 2 average is also:

(6000 + 8000) / 2 = 7000

So both are same.

Output:

pay_month	department_id	comparison
2017-02	1	same
2017-03	1	higher
2017-02	2	same
2017-03	2	lower

First Thought: Compute Two Averages Separately

We need two values for each month and department:

Average	Grouping
Company average	Month
Department average	Month and department

A direct approach is to build one query for company averages and another query for department averages, then join them.

That works, but it is more verbose.

A window function can compute both averages from the joined salary and employee rows.

Key Insight

Once we join Salary to Employee, each salary row knows its department.

Then we can compute:

AVG(amount) OVER (PARTITION BY pay_date)

for the company average in that month.

And:

AVG(amount) OVER (PARTITION BY pay_date, department_id)

for the department average in that month.

Because pay_date is a full date but the output needs month format, we also convert it with:

DATE_FORMAT(pay_date, '%Y-%m')

Algorithm

Join Salary with Employee using employee_id.

For each joined row:

Compute pay_month.
Compute company monthly average.
Compute department monthly average.
Compare the two averages with CASE.
Use DISTINCT because each employee row in the same department-month would otherwise repeat the same result.

SQL Solution

WITH averages AS (
    SELECT
        DATE_FORMAT(s.pay_date, '%Y-%m') AS pay_month,
        e.department_id,
        AVG(s.amount) OVER (
            PARTITION BY s.pay_date
        ) AS company_avg,
        AVG(s.amount) OVER (
            PARTITION BY s.pay_date, e.department_id
        ) AS department_avg
    FROM Salary AS s
    JOIN Employee AS e
        ON s.employee_id = e.employee_id
)
SELECT DISTINCT
    pay_month,
    department_id,
    CASE
        WHEN department_avg > company_avg THEN 'higher'
        WHEN department_avg < company_avg THEN 'lower'
        ELSE 'same'
    END AS comparison
FROM averages;

Code Explanation

First, join salaries to departments:

FROM Salary AS s
JOIN Employee AS e
    ON s.employee_id = e.employee_id

Now every salary row has a department_id.

The output month is created with:

DATE_FORMAT(s.pay_date, '%Y-%m') AS pay_month

The company average is partitioned only by month:

AVG(s.amount) OVER (
    PARTITION BY s.pay_date
) AS company_avg

This groups all employees paid in the same month.

The department average is partitioned by both month and department:

AVG(s.amount) OVER (
    PARTITION BY s.pay_date, e.department_id
) AS department_avg

This groups employees from the same department in the same month.

Finally, compare both averages:

CASE
    WHEN department_avg > company_avg THEN 'higher'
    WHEN department_avg < company_avg THEN 'lower'
    ELSE 'same'
END AS comparison

We use SELECT DISTINCT because the CTE has one row per employee salary record, but the answer needs one row per month and department.

Correctness

After the join, each salary amount is associated with exactly one department.

For each month, company_avg is computed over all salary rows in that month, so it equals the company-wide average salary for that month.

For each month and department, department_avg is computed over exactly the salary rows belonging to that department in that month, so it equals that department’s average salary for that month.

The CASE expression returns higher, lower, or same according to the comparison between these two averages.

Since DISTINCT removes duplicate rows for employees in the same department-month, the final result contains exactly one comparison for each department in each month.

Therefore, the query returns the required comparison result.

Complexity

Let n be the number of salary rows.

Metric	Value	Why
Time	`O(n log n)`	Window partitions may require sorting by month and department
Space	`O(n)`	The intermediate joined rows and window results may be stored

With indexes on Salary.employee_id, Employee.employee_id, and Salary.pay_date, the join and partitioning can be optimized.

Alternative: Group Then Join

We can compute department and company averages separately, then join them.

WITH department_average AS (
    SELECT
        DATE_FORMAT(s.pay_date, '%Y-%m') AS pay_month,
        e.department_id,
        AVG(s.amount) AS department_avg
    FROM Salary AS s
    JOIN Employee AS e
        ON s.employee_id = e.employee_id
    GROUP BY
        DATE_FORMAT(s.pay_date, '%Y-%m'),
        e.department_id
),
company_average AS (
    SELECT
        DATE_FORMAT(pay_date, '%Y-%m') AS pay_month,
        AVG(amount) AS company_avg
    FROM Salary
    GROUP BY DATE_FORMAT(pay_date, '%Y-%m')
)
SELECT
    d.pay_month,
    d.department_id,
    CASE
        WHEN d.department_avg > c.company_avg THEN 'higher'
        WHEN d.department_avg < c.company_avg THEN 'lower'
        ELSE 'same'
    END AS comparison
FROM department_average AS d
JOIN company_average AS c
    ON d.pay_month = c.pay_month;

This version is slightly longer, but it avoids DISTINCT because each CTE already has the desired granularity.

Testing

Sample data:

CREATE TABLE Salary (
    id INT,
    employee_id INT,
    amount INT,
    pay_date DATE
);

CREATE TABLE Employee (
    employee_id INT,
    department_id INT
);

INSERT INTO Salary (id, employee_id, amount, pay_date) VALUES
    (1, 1, 9000, '2017-03-31'),
    (2, 2, 6000, '2017-03-31'),
    (3, 3, 10000, '2017-03-31'),
    (4, 1, 7000, '2017-02-28'),
    (5, 2, 6000, '2017-02-28'),
    (6, 3, 8000, '2017-02-28');

INSERT INTO Employee (employee_id, department_id) VALUES
    (1, 1),
    (2, 2),
    (3, 2);

Expected output:

pay_month	department_id	comparison
2017-02	1	same
2017-02	2	same
2017-03	1	higher
2017-03	2	lower