Skip to content

LeetCode 262: Trips and Users

A clear explanation of the Trips and Users SQL problem using joins, filtering, grouping, and conditional aggregation.

Problem Restatement

We are given two tables: Trips and Users.

Trips stores taxi trip requests.

Users stores user information, including whether a user is banned.

We need to find the cancellation rate for each day from:

'2013-10-01'

to:

'2013-10-03'

Only trips where both the client and driver are unbanned should be counted.

The cancellation rate is:

cancelled requests / total requests

Cancelled means the trip status is either:

'cancelled_by_driver'

or:

'cancelled_by_client'

The result should show each day and its cancellation rate rounded to two decimal places. The problem states both client and driver must be unbanned, and the date range is from 2013-10-01 to 2013-10-03.

Tables

Trips

ColumnMeaning
idTrip id
client_idUser id of the client
driver_idUser id of the driver
city_idCity id
statusTrip status
request_atRequest date

Possible status values include:

'completed'
'cancelled_by_driver'
'cancelled_by_client'

Users

ColumnMeaning
users_idUser id
bannedWhether this user is banned
roleUser role

Possible banned values are:

'Yes'
'No'

Output

The result table should contain:

ColumnMeaning
DayRequest date
Cancellation RateCancelled requests divided by total requests, rounded to 2 decimals

Example output shape:

DayCancellation Rate
2013-10-010.33
2013-10-020.00
2013-10-030.50

First Thought: Count All Trips Per Day

A first attempt might group trips by day:

SELECT
    request_at AS Day,
    COUNT(*) AS total_requests
FROM Trips
GROUP BY request_at;

This gives total requests per day.

But it is incomplete because the problem says banned users must be excluded.

A trip should be counted only when both users are unbanned:

  1. The client is not banned.
  2. The driver is not banned.

So we need to join Users twice.

Key Insight

The Trips table has two user references:

Trips columnMeaningJoin target
client_idClient user idUsers.users_id
driver_idDriver user idUsers.users_id

Because both client and driver come from the same Users table, we join Users two times with different aliases.

JOIN Users AS c
    ON t.client_id = c.users_id

JOIN Users AS d
    ON t.driver_id = d.users_id

Then we filter:

c.banned = 'No'
AND d.banned = 'No'

Now each remaining row is a valid request with unbanned users.

Counting Cancelled Requests

After filtering valid trips, cancellation rate per day is:

cancelled_count / total_count

We can count cancelled trips using conditional aggregation.

In MySQL, this works:

SUM(t.status != 'completed')

Because t.status != 'completed' evaluates to 1 for cancelled trips and 0 for completed trips.

A more portable version uses CASE:

SUM(
    CASE
        WHEN t.status IN ('cancelled_by_driver', 'cancelled_by_client') THEN 1
        ELSE 0
    END
)

Then divide by:

COUNT(*)

Finally round to two decimal places:

ROUND(cancelled_count / total_count, 2)

Algorithm

  1. Start from Trips.
  2. Join Users as clients.
  3. Join Users as drivers.
  4. Keep only rows where both users are unbanned.
  5. Keep only rows from 2013-10-01 through 2013-10-03.
  6. Group by request_at.
  7. Compute:
    • cancelled trip count
    • total trip count
    • cancellation rate
  8. Round the rate to two decimals.

Correctness

After joining Trips with Users as c, each row contains the client information for that trip.

After joining Trips with Users as d, each row also contains the driver information for that trip.

The filter:

c.banned = 'No'
AND d.banned = 'No'

keeps exactly the trips where both users are unbanned.

The date filter keeps exactly the requested days.

For each remaining day group, COUNT(*) gives the total number of valid requests on that day.

The conditional sum counts exactly the valid requests whose status is cancelled by the client or cancelled by the driver.

Therefore the division computes the required cancellation rate for that day.

Rounding to two decimals gives the required output format.

Complexity

Let T be the number of rows in Trips, and U be the number of rows in Users.

MetricValueWhy
TimeDepends on indexesThe database joins Trips to Users twice, filters dates, then groups by day
SpaceDepends on execution planThe database may allocate memory for joins and aggregation

Useful indexes would include:

Users(users_id)
Trips(client_id)
Trips(driver_id)
Trips(request_at)

Users(users_id) is usually already indexed because it is the primary key.

Implementation

SELECT
    t.request_at AS Day,
    ROUND(
        SUM(
            CASE
                WHEN t.status IN ('cancelled_by_driver', 'cancelled_by_client') THEN 1
                ELSE 0
            END
        ) / COUNT(*),
        2
    ) AS `Cancellation Rate`
FROM Trips AS t
JOIN Users AS c
    ON t.client_id = c.users_id
JOIN Users AS d
    ON t.driver_id = d.users_id
WHERE c.banned = 'No'
  AND d.banned = 'No'
  AND t.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY t.request_at;

MySQL Short Version

SELECT
    t.request_at AS Day,
    ROUND(SUM(t.status != 'completed') / COUNT(*), 2) AS `Cancellation Rate`
FROM Trips AS t
JOIN Users AS c
    ON t.client_id = c.users_id
JOIN Users AS d
    ON t.driver_id = d.users_id
WHERE c.banned = 'No'
  AND d.banned = 'No'
  AND t.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY t.request_at;

This works in MySQL because boolean expressions can be summed as 1 or 0.

Code Explanation

This part selects the request date:

t.request_at AS Day

The first join gets the client row:

JOIN Users AS c
    ON t.client_id = c.users_id

The second join gets the driver row:

JOIN Users AS d
    ON t.driver_id = d.users_id

The banned-user filter is:

WHERE c.banned = 'No'
  AND d.banned = 'No'

The date filter is:

t.request_at BETWEEN '2013-10-01' AND '2013-10-03'

The conditional sum counts cancellations:

SUM(
    CASE
        WHEN t.status IN ('cancelled_by_driver', 'cancelled_by_client') THEN 1
        ELSE 0
    END
)

The denominator counts all valid requests for that day:

COUNT(*)

The final value is rounded:

ROUND(..., 2)

Testing

Sample data:

CREATE TABLE Trips (
    id INT PRIMARY KEY,
    client_id INT,
    driver_id INT,
    city_id INT,
    status VARCHAR(30),
    request_at DATE
);

CREATE TABLE Users (
    users_id INT PRIMARY KEY,
    banned VARCHAR(3),
    role VARCHAR(20)
);

INSERT INTO Trips VALUES
(1, 1, 10, 1, 'completed', '2013-10-01'),
(2, 2, 11, 1, 'cancelled_by_driver', '2013-10-01'),
(3, 3, 12, 6, 'completed', '2013-10-01'),
(4, 4, 13, 6, 'cancelled_by_client', '2013-10-01'),
(5, 1, 10, 1, 'completed', '2013-10-02'),
(6, 2, 11, 6, 'completed', '2013-10-02'),
(7, 3, 12, 6, 'completed', '2013-10-02'),
(8, 2, 12, 12, 'completed', '2013-10-03'),
(9, 3, 10, 12, 'completed', '2013-10-03'),
(10, 4, 13, 12, 'cancelled_by_driver', '2013-10-03');

INSERT INTO Users VALUES
(1, 'No', 'client'),
(2, 'Yes', 'client'),
(3, 'No', 'client'),
(4, 'No', 'client'),
(10, 'No', 'driver'),
(11, 'No', 'driver'),
(12, 'No', 'driver'),
(13, 'No', 'driver');

Expected result:

DayCancellation Rate
2013-10-010.33
2013-10-020.00
2013-10-030.50

Test meaning:

TestWhy
Banned client tripMust be excluded from numerator and denominator
Completed tripCounts in denominator only
Cancelled tripCounts in numerator and denominator
Multiple daysConfirms grouping by date
RoundingConfirms two decimal places