A clear explanation of finding the first device used by each player using SQL aggregation and a join.
Problem Restatement
We are given an Activity table.
Each row records one player login on one date, including the device used and the number of games played.
We need to report the device that each player used on their first login date.
The result can be returned in any order. The table has columns player_id, device_id, event_date, and games_played; (player_id, event_date) is the primary key. The task is to report the first logged-in device for each player.
Table
| Column | Type | Meaning |
|---|---|---|
player_id | int | Player identifier |
device_id | int | Device used by the player |
event_date | date | Login date |
games_played | int | Number of games played before logout |
The primary key is:
(player_id, event_date)This means each player has at most one row on a given date.
Output
| Column | Meaning |
|---|---|
player_id | Player identifier |
device_id | Device used on the player’s first login date |
Example
Input:
| player_id | device_id | event_date | games_played |
|---|---|---|---|
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
For player 1, the first login date is:
2016-03-01The device used on that date is:
2For player 2, the only login date is:
2017-06-25The device is:
3For player 3, the first login date is:
2016-03-02The device is:
1Output:
| player_id | device_id |
|---|---|
| 1 | 2 |
| 2 | 3 |
| 3 | 1 |
First Thought: Find the First Date
This problem is close to LeetCode 511.
In LeetCode 511, we only needed:
player_id, MIN(event_date)But here we need the device_id from the same row as the first login date.
So this query is not enough:
SELECT
player_id,
MIN(event_date) AS first_login
FROM Activity
GROUP BY player_id;It gives the first date, but it does not give the device used on that date.
Key Insight
We should solve this in two steps:
- Find each player’s first login date.
- Join that result back to
Activityto get the device from that exact row.
The grouped subquery gives:
player_id, first_loginThen the join matches:
Activity.player_id = first_login.player_id
Activity.event_date = first_login.first_loginSince (player_id, event_date) is the primary key, this match identifies exactly one row for each player.
Algorithm
Create a derived table:
SELECT
player_id,
MIN(event_date) AS first_login
FROM Activity
GROUP BY player_idThen join it with Activity.
Return:
Activity.player_id,
Activity.device_idfrom the row where the event date equals the first login date.
Correctness
The subquery groups rows by player_id.
For each player group, MIN(event_date) returns that player’s earliest login date.
Joining this result back to Activity on both player_id and event_date selects the row that represents the player’s first login.
Because (player_id, event_date) is the primary key, there is only one such row per player.
The selected device_id therefore belongs to the device used on that player’s first login date.
So the query returns exactly the required result.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n) or O(n log n) | Depends on how the database executes grouping and joining |
| Space | O(p) | The grouped result stores one row per player |
Here:
| Symbol | Meaning |
|---|---|
n | Number of rows in Activity |
p | Number of distinct players |
An index on (player_id, event_date) helps because the query groups and joins using those columns.
Implementation
SELECT
a.player_id,
a.device_id
FROM Activity AS a
JOIN (
SELECT
player_id,
MIN(event_date) AS first_login
FROM Activity
GROUP BY player_id
) AS first_activity
ON a.player_id = first_activity.player_id
AND a.event_date = first_activity.first_login;Code Explanation
The subquery finds the first login date per player:
SELECT
player_id,
MIN(event_date) AS first_login
FROM Activity
GROUP BY player_idThe outer query reads from the original table:
FROM Activity AS aThen joins the first-login table:
JOIN (...) AS first_activityThe join condition matches the same player:
a.player_id = first_activity.player_idand the first login date:
a.event_date = first_activity.first_loginFinally, the query returns the player and the first-login device:
SELECT
a.player_id,
a.device_idAlternative: Window Function
Some SQL engines support FIRST_VALUE.
SELECT DISTINCT
player_id,
FIRST_VALUE(device_id) OVER (
PARTITION BY player_id
ORDER BY event_date
) AS device_id
FROM Activity;This partitions rows by player and chooses the device from the earliest event_date. A public solution also uses this FIRST_VALUE pattern.
The join version is usually easier to explain and works well for this problem.
Testing
Test table:
CREATE TABLE Activity (
player_id INT,
device_id INT,
event_date DATE,
games_played INT,
PRIMARY KEY (player_id, event_date)
);
INSERT INTO Activity (player_id, device_id, event_date, games_played) VALUES
(1, 2, '2016-03-01', 5),
(1, 2, '2016-05-02', 6),
(2, 3, '2017-06-25', 1),
(3, 1, '2016-03-02', 0),
(3, 4, '2018-07-03', 5);Expected result:
| player_id | device_id |
|---|---|
| 1 | 2 |
| 2 | 3 |
| 3 | 1 |
Additional test case:
INSERT INTO Activity (player_id, device_id, event_date, games_played) VALUES
(4, 9, '2020-01-10', 0),
(4, 8, '2019-12-31', 3);For player 4, the first login date is:
2019-12-31So the returned device should be:
8Test meaning:
| Test | Why |
|---|---|
| Player with multiple rows | Checks earliest date selection |
| Player with one row | The only device is returned |
| Earlier row inserted later | Confirms insertion order does not matter |
games_played = 0 | Confirms games played does not affect the result |