A clear explanation of Median Employee Salary using SQL window functions to rank employees inside each company.
Problem Restatement
We are given an Employee table.
| Column | Type | Meaning |
|---|---|---|
id | int | Employee id |
company | varchar | Company name |
salary | int | Employee salary |
We need to return the rows that contain the median salary of each company.
When sorting salaries inside a company, ties are broken by id.
If a company has an odd number of employees, return the one middle row.
If a company has an even number of employees, return the two middle rows.
Return the result table in any order. The problem specifically asks for the employee rows containing the median salary, not only the median value.
Input and Output
| Item | Meaning |
|---|---|
| Input | Employee table |
| Output | Rows containing each company’s median salary |
| Sort order | salary ASC, then id ASC |
| Odd count | Return one middle row |
| Even count | Return two middle rows |
Expected output columns:
id, company, salaryExample
Input:
| id | company | salary |
|---|---|---|
| 1 | A | 2341 |
| 2 | A | 341 |
| 3 | A | 15 |
| 4 | A | 15314 |
| 5 | A | 451 |
| 6 | A | 513 |
| 7 | B | 15 |
| 8 | B | 13 |
| 9 | B | 1154 |
| 10 | B | 1345 |
| 11 | B | 1221 |
| 12 | B | 234 |
| 13 | C | 2345 |
| 14 | C | 2645 |
| 15 | C | 2645 |
| 16 | C | 2652 |
| 17 | C | 65 |
For company A, sorted by salary:
| Position | id | salary |
|---|---|---|
| 1 | 3 | 15 |
| 2 | 2 | 341 |
| 3 | 5 | 451 |
| 4 | 6 | 513 |
| 5 | 1 | 2341 |
| 6 | 4 | 15314 |
There are six employees, so the median rows are positions 3 and 4.
For company C, sorted by salary and then id:
| Position | id | salary |
|---|---|---|
| 1 | 17 | 65 |
| 2 | 13 | 2345 |
| 3 | 14 | 2645 |
| 4 | 15 | 2645 |
| 5 | 16 | 2652 |
There are five employees, so the median row is position 3.
Output:
| id | company | salary |
|---|---|---|
| 5 | A | 451 |
| 6 | A | 513 |
| 12 | B | 234 |
| 9 | B | 1154 |
| 14 | C | 2645 |
First Thought: Group and Compute Median
A normal median query might group by company and compute one median salary value.
But this problem asks for rows.
For even-sized companies, it returns the two middle employee rows, not their averaged salary.
So we need to rank employees inside each company and then select the middle rank or middle two ranks.
Key Insight
Use two window functions:
ROW_NUMBER() OVER (
PARTITION BY company
ORDER BY salary, id
)This gives each employee a position inside their company after sorting by salary and id.
Then use:
COUNT(*) OVER (
PARTITION BY company
)This gives the total number of employees in that company.
Once we have both row_number and count, we can keep the middle row or rows.
Median Position Rule
Suppose a company has cnt employees and an employee has rank rn.
For odd cnt, such as 5, the median position is:
(cnt + 1) / 2 = 3For even cnt, such as 6, the median positions are:
cnt / 2 = 3
cnt / 2 + 1 = 4A compact condition that works for both cases is:
rn >= cnt / 2
AND rn <= cnt / 2 + 1In MySQL, / returns decimal division, so for cnt = 5:
cnt / 2 = 2.5
cnt / 2 + 1 = 3.5The only integer rank between 2.5 and 3.5 is 3.
For cnt = 6:
cnt / 2 = 3
cnt / 2 + 1 = 4So ranks 3 and 4 are selected.
Algorithm
- Build a CTE that selects every employee.
- Inside the CTE, compute:
rn, the employee’s row number inside the company.cnt, the number of employees in the company.
- Select only rows where
rnis betweencnt / 2andcnt / 2 + 1. - Return
id,company, andsalary.
Correctness
The window function ROW_NUMBER() assigns each employee a unique rank inside their company after sorting by salary and then id. This matches the required median ordering.
The window function COUNT(*) gives the number of employees in the same company for every row.
If the company has an odd count, the condition:
rn >= cnt / 2 AND rn <= cnt / 2 + 1selects exactly the middle integer rank.
If the company has an even count, the same condition selects exactly the two middle ranks.
Therefore, the query returns exactly the employee rows containing the median salary for each company.
Complexity
Let n be the number of rows in Employee.
| Metric | Value | Why |
|---|---|---|
| Time | O(n log n) | Rows are sorted inside each company partition |
| Space | O(n) | The database may materialize window function results |
The exact runtime depends on the database engine and indexes.
Implementation
WITH ranked AS (
SELECT
id,
company,
salary,
ROW_NUMBER() OVER (
PARTITION BY company
ORDER BY salary, id
) AS rn,
COUNT(*) OVER (
PARTITION BY company
) AS cnt
FROM Employee
)
SELECT
id,
company,
salary
FROM ranked
WHERE rn >= cnt / 2
AND rn <= cnt / 2 + 1;Code Explanation
The CTE creates one ranked version of the table:
WITH ranked AS (...)Inside it, this assigns a sorted position to every employee within their company:
ROW_NUMBER() OVER (
PARTITION BY company
ORDER BY salary, id
) AS rnThe PARTITION BY company part means each company is ranked independently.
The ORDER BY salary, id part means salaries are sorted ascending, and equal salaries are ordered by employee id.
Then this computes the company size:
COUNT(*) OVER (
PARTITION BY company
) AS cntFinally, the outer query keeps only the median rank or ranks:
WHERE rn >= cnt / 2
AND rn <= cnt / 2 + 1Alternative Explicit Condition
Some readers find the explicit odd and even cases easier to understand.
WITH ranked AS (
SELECT
id,
company,
salary,
ROW_NUMBER() OVER (
PARTITION BY company
ORDER BY salary, id
) AS rn,
COUNT(*) OVER (
PARTITION BY company
) AS cnt
FROM Employee
)
SELECT
id,
company,
salary
FROM ranked
WHERE
(cnt % 2 = 1 AND rn = (cnt + 1) / 2)
OR
(cnt % 2 = 0 AND rn IN (cnt / 2, cnt / 2 + 1));This version says the same thing more directly.
Testing
Create sample data:
CREATE TABLE Employee (
id INT PRIMARY KEY,
company VARCHAR(10),
salary INT
);
INSERT INTO Employee (id, company, salary) VALUES
(1, 'A', 2341),
(2, 'A', 341),
(3, 'A', 15),
(4, 'A', 15314),
(5, 'A', 451),
(6, 'A', 513),
(7, 'B', 15),
(8, 'B', 13),
(9, 'B', 1154),
(10, 'B', 1345),
(11, 'B', 1221),
(12, 'B', 234),
(13, 'C', 2345),
(14, 'C', 2645),
(15, 'C', 2645),
(16, 'C', 2652),
(17, 'C', 65);Expected median rows:
| id | company | salary |
|---|---|---|
| 5 | A | 451 |
| 6 | A | 513 |
| 12 | B | 234 |
| 9 | B | 1154 |
| 14 | C | 2645 |
Additional test cases:
| Case | Expected behavior |
|---|---|
| One employee in a company | Return that employee |
| Two employees in a company | Return both employees |
| Odd number of employees | Return one middle row |
| Even number of employees | Return two middle rows |
| Equal salaries | Break ties by id |