A clear explanation of finding each player's first login date using SQL aggregation.
Problem Restatement
We are given an Activity table.
Each row records one player login on one date, together with the device used and the number of games played.
We need to report the first login date for each player.
The result can be returned in any order. The table has columns player_id, device_id, event_date, and games_played, and the goal is to find the earliest event_date for every player_id.
Table
| Column | Type | Meaning |
|---|---|---|
player_id | int | Player identifier |
device_id | int | Device used by the player |
event_date | date | Login date |
games_played | int | Number of games played before logout |
The primary key is:
(player_id, event_date)This means one player has at most one row for a given date.
Output
| Column | Meaning |
|---|---|
player_id | Player identifier |
first_login | Earliest login date for that player |
Example
Input:
| player_id | device_id | event_date | games_played |
|---|---|---|---|
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
For player 1, the login dates are:
2016-03-01
2016-05-02The first login is:
2016-03-01For player 2, there is only one row, so the first login is:
2017-06-25For player 3, the first login is:
2016-03-02Output:
| player_id | first_login |
|---|---|
| 1 | 2016-03-01 |
| 2 | 2017-06-25 |
| 3 | 2016-03-02 |
First Thought: One Row Per Player
The result needs one row per player.
Whenever a SQL problem asks for one row per group, we should think about:
GROUP BYHere, the group is:
player_idInside each player group, we need the earliest date.
The aggregate function for the earliest value is:
MIN(event_date)Key Insight
The question does not ask for the whole first login row.
It only asks for:
player_id
first_login dateSo we do not need joins, subqueries, or window functions.
We can group all rows by player and take the minimum login date in each group.
Algorithm
Use SQL aggregation:
- Read rows from
Activity. - Group rows by
player_id. - For each group, compute
MIN(event_date). - Alias that value as
first_login.
Correctness
GROUP BY player_id partitions the table into one group per player.
Within each group, MIN(event_date) returns the earliest login date for that player.
The query outputs exactly one row for each player group, with the player id and that earliest date.
Therefore the result contains the first login date for every player.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n) or O(n log n) | Depends on the database execution plan for grouping |
| Space | O(p) | The database stores aggregate state for p players |
Here:
| Symbol | Meaning |
|---|---|
n | Number of rows in Activity |
p | Number of distinct players |
With a suitable index on (player_id, event_date), the database can compute this efficiently.
Implementation
SELECT
player_id,
MIN(event_date) AS first_login
FROM Activity
GROUP BY player_id;Code Explanation
Select the player id:
SELECT
player_id,Compute the earliest login date for that player:
MIN(event_date) AS first_loginRead from the table:
FROM ActivityGroup rows by player:
GROUP BY player_id;After grouping, each output row corresponds to exactly one player.
Testing
Test table:
CREATE TABLE Activity (
player_id INT,
device_id INT,
event_date DATE,
games_played INT,
PRIMARY KEY (player_id, event_date)
);
INSERT INTO Activity (player_id, device_id, event_date, games_played) VALUES
(1, 2, '2016-03-01', 5),
(1, 2, '2016-05-02', 6),
(2, 3, '2017-06-25', 1),
(3, 1, '2016-03-02', 0),
(3, 4, '2018-07-03', 5);Expected result:
| player_id | first_login |
|---|---|
| 1 | 2016-03-01 |
| 2 | 2017-06-25 |
| 3 | 2016-03-02 |
Additional test case:
INSERT INTO Activity (player_id, device_id, event_date, games_played) VALUES
(4, 9, '2020-01-10', 0),
(4, 8, '2019-12-31', 3);For player 4, the first login should be:
2019-12-31Test meaning:
| Test | Why |
|---|---|
| Player with multiple rows | Checks MIN(event_date) |
| Player with one row | First login is the only login |
| Unordered dates | Confirms SQL does not rely on insertion order |
games_played = 0 | Confirms games played does not affect login date |