A clear SQL guide for selecting customers who were not referred by customer 2, including customers with no referee.
Problem Restatement
We are given a table called Customer.
Each row stores a customer, their name, and the ID of the customer who referred them.
We need to return the names of customers who were not referred by the customer with id = 2.
That includes two kinds of customers:
- Customers whose
referee_idis not2. - Customers whose
referee_idisNULL, meaning they were not referred by anyone.
Return the result table in any order. The table has columns id, name, and referee_id.
Table
Customer
| Column | Type | Meaning |
|---|---|---|
id | int | Customer ID |
name | varchar | Customer name |
referee_id | int | ID of the customer who referred this customer |
id is the primary key.
Example
Input:
| id | name | referee_id |
|---|---|---|
| 1 | Will | null |
| 2 | Jane | null |
| 3 | Alex | 2 |
| 4 | Bill | null |
| 5 | Zack | 1 |
| 6 | Mark | 2 |
Output:
| name |
|---|
| Will |
| Jane |
| Bill |
| Zack |
Alex and Mark are excluded because their referee_id is 2.
Will, Jane, and Bill are included because they were not referred by anyone.
Zack is included because he was referred by customer 1, not customer 2.
First Thought: Use Not Equal
A first attempt is:
SELECT name
FROM Customer
WHERE referee_id <> 2;This looks reasonable, but it misses rows where referee_id is NULL.
In SQL, comparing NULL with a value does not return true. The condition:
referee_id <> 2does not keep NULL rows.
So we need to handle NULL explicitly.
Key Insight
The required customers satisfy:
referee_id <> 2or:
referee_id IS NULLThe IS NULL condition includes customers who were not referred by anyone.
Algorithm
- Read rows from
Customer. - Keep rows where
referee_idis not equal to2. - Also keep rows where
referee_idisNULL. - Return only the
namecolumn.
Correctness
For any customer whose referee_id = 2, the customer was referred by customer 2, so the query must exclude them. This row fails both referee_id <> 2 and referee_id IS NULL.
For any customer whose referee_id is a value other than 2, the customer was referred by someone else. The condition referee_id <> 2 is true, so the query includes them.
For any customer whose referee_id is NULL, the customer was not referred by anyone. The condition referee_id IS NULL is true, so the query includes them.
Therefore, the query returns exactly the customers not referred by customer 2.
Complexity
Let:
n = number of rows in Customer| Metric | Value | Why |
|---|---|---|
| Time | O(n) | The table is scanned and filtered |
| Space | O(1) | Apart from the result table, no extra structure is needed |
The exact execution plan depends on the database engine.
Implementation
SELECT name
FROM Customer
WHERE referee_id <> 2
OR referee_id IS NULL;Code Explanation
This selects only the required output column:
SELECT nameThis reads from the customer table:
FROM CustomerThis condition keeps customers referred by someone other than customer 2:
referee_id <> 2This condition keeps customers with no referee:
referee_id IS NULLThe OR combines the two valid cases:
WHERE referee_id <> 2
OR referee_id IS NULL;Testing
Sample data:
CREATE TABLE Customer (
id INT PRIMARY KEY,
name VARCHAR(255),
referee_id INT
);
INSERT INTO Customer (id, name, referee_id) VALUES
(1, 'Will', NULL),
(2, 'Jane', NULL),
(3, 'Alex', 2),
(4, 'Bill', NULL),
(5, 'Zack', 1),
(6, 'Mark', 2);Query:
SELECT name
FROM Customer
WHERE referee_id <> 2
OR referee_id IS NULL;Expected result:
| name |
|---|
| Will |
| Jane |
| Bill |
| Zack |
Additional test cases:
| Case | Expected behavior |
|---|---|
referee_id = 2 | Excluded |
referee_id = 1 | Included |
referee_id = NULL | Included |
All customers referred by 2 | Empty result |
No customers referred by 2 | All customers returned |