Skip to content

LeetCode 512: Game Play Analysis II

A clear explanation of finding the first device used by each player using SQL aggregation and a join.

Problem Restatement

We are given an Activity table.

Each row records one player login on one date, including the device used and the number of games played.

We need to report the device that each player used on their first login date.

The result can be returned in any order. The table has columns player_id, device_id, event_date, and games_played; (player_id, event_date) is the primary key. The task is to report the first logged-in device for each player.

Table

ColumnTypeMeaning
player_idintPlayer identifier
device_idintDevice used by the player
event_datedateLogin date
games_playedintNumber of games played before logout

The primary key is:

(player_id, event_date)

This means each player has at most one row on a given date.

Output

ColumnMeaning
player_idPlayer identifier
device_idDevice used on the player’s first login date

Example

Input:

player_iddevice_idevent_dategames_played
122016-03-015
122016-05-026
232017-06-251
312016-03-020
342018-07-035

For player 1, the first login date is:

2016-03-01

The device used on that date is:

2

For player 2, the only login date is:

2017-06-25

The device is:

3

For player 3, the first login date is:

2016-03-02

The device is:

1

Output:

player_iddevice_id
12
23
31

First Thought: Find the First Date

This problem is close to LeetCode 511.

In LeetCode 511, we only needed:

player_id, MIN(event_date)

But here we need the device_id from the same row as the first login date.

So this query is not enough:

SELECT
    player_id,
    MIN(event_date) AS first_login
FROM Activity
GROUP BY player_id;

It gives the first date, but it does not give the device used on that date.

Key Insight

We should solve this in two steps:

  1. Find each player’s first login date.
  2. Join that result back to Activity to get the device from that exact row.

The grouped subquery gives:

player_id, first_login

Then the join matches:

Activity.player_id = first_login.player_id
Activity.event_date = first_login.first_login

Since (player_id, event_date) is the primary key, this match identifies exactly one row for each player.

Algorithm

Create a derived table:

SELECT
    player_id,
    MIN(event_date) AS first_login
FROM Activity
GROUP BY player_id

Then join it with Activity.

Return:

Activity.player_id,
Activity.device_id

from the row where the event date equals the first login date.

Correctness

The subquery groups rows by player_id.

For each player group, MIN(event_date) returns that player’s earliest login date.

Joining this result back to Activity on both player_id and event_date selects the row that represents the player’s first login.

Because (player_id, event_date) is the primary key, there is only one such row per player.

The selected device_id therefore belongs to the device used on that player’s first login date.

So the query returns exactly the required result.

Complexity

MetricValueWhy
TimeO(n) or O(n log n)Depends on how the database executes grouping and joining
SpaceO(p)The grouped result stores one row per player

Here:

SymbolMeaning
nNumber of rows in Activity
pNumber of distinct players

An index on (player_id, event_date) helps because the query groups and joins using those columns.

Implementation

SELECT
    a.player_id,
    a.device_id
FROM Activity AS a
JOIN (
    SELECT
        player_id,
        MIN(event_date) AS first_login
    FROM Activity
    GROUP BY player_id
) AS first_activity
    ON a.player_id = first_activity.player_id
   AND a.event_date = first_activity.first_login;

Code Explanation

The subquery finds the first login date per player:

SELECT
    player_id,
    MIN(event_date) AS first_login
FROM Activity
GROUP BY player_id

The outer query reads from the original table:

FROM Activity AS a

Then joins the first-login table:

JOIN (...) AS first_activity

The join condition matches the same player:

a.player_id = first_activity.player_id

and the first login date:

a.event_date = first_activity.first_login

Finally, the query returns the player and the first-login device:

SELECT
    a.player_id,
    a.device_id

Alternative: Window Function

Some SQL engines support FIRST_VALUE.

SELECT DISTINCT
    player_id,
    FIRST_VALUE(device_id) OVER (
        PARTITION BY player_id
        ORDER BY event_date
    ) AS device_id
FROM Activity;

This partitions rows by player and chooses the device from the earliest event_date. A public solution also uses this FIRST_VALUE pattern.

The join version is usually easier to explain and works well for this problem.

Testing

Test table:

CREATE TABLE Activity (
    player_id INT,
    device_id INT,
    event_date DATE,
    games_played INT,
    PRIMARY KEY (player_id, event_date)
);

INSERT INTO Activity (player_id, device_id, event_date, games_played) VALUES
    (1, 2, '2016-03-01', 5),
    (1, 2, '2016-05-02', 6),
    (2, 3, '2017-06-25', 1),
    (3, 1, '2016-03-02', 0),
    (3, 4, '2018-07-03', 5);

Expected result:

player_iddevice_id
12
23
31

Additional test case:

INSERT INTO Activity (player_id, device_id, event_date, games_played) VALUES
    (4, 9, '2020-01-10', 0),
    (4, 8, '2019-12-31', 3);

For player 4, the first login date is:

2019-12-31

So the returned device should be:

8

Test meaning:

TestWhy
Player with multiple rowsChecks earliest date selection
Player with one rowThe only device is returned
Earlier row inserted laterConfirms insertion order does not matter
games_played = 0Confirms games played does not affect the result