# LeetCode 182: Duplicate Emails

## Problem Restatement

We are given a table named `Person`.

| Column | Type |
|---|---|
| id | int |
| email | varchar |

The `id` column is the primary key.

Each row stores one email address. The `email` field is guaranteed to be non-null.

We need to report all duplicate emails. A duplicate email means an email value that appears more than once in the table. The result can be returned in any order. The official statement asks us to report all duplicate emails from `Person`.

## Input and Output

| Item | Meaning |
|---|---|
| Input | Table `Person(id, email)` |
| Output | One column named `Email` |
| Rule | Return email values that appear more than once |
| Order | Any order is accepted |

Expected output column:

```sql
Email
```

## Examples

Input:

```text
Person
+----+---------+
| id | email   |
+----+---------+
| 1  | a@b.com |
| 2  | c@d.com |
| 3  | a@b.com |
+----+---------+
```

The email `a@b.com` appears two times.

The email `c@d.com` appears one time.

So only `a@b.com` is a duplicate.

Output:

```text
+---------+
| Email   |
+---------+
| a@b.com |
+---------+
```

## First Thought: Count Each Email

The problem asks for email values that appear more than once.

So we should group rows by email and count how many rows each email has.

For the example table:

```text
a@b.com -> 2
c@d.com -> 1
```

Then we keep only groups where the count is greater than `1`.

## Key Insight

`WHERE` filters individual rows before grouping.

`HAVING` filters groups after grouping.

Since we need to filter by `COUNT(*)`, we must use `HAVING`.

The core SQL shape is:

```sql
GROUP BY email
HAVING COUNT(*) > 1
```

This produces one row per duplicate email.

## Algorithm

1. Read rows from `Person`.
2. Group rows by `email`.
3. Count how many rows are in each email group.
4. Keep only groups where the count is greater than `1`.
5. Return the email value as `Email`.

## Correctness

After `GROUP BY email`, each group contains all rows with the same email address.

For any email `e`, `COUNT(*)` gives the number of rows in `Person` whose email is `e`.

The condition:

```sql
HAVING COUNT(*) > 1
```

keeps exactly the email groups that appear at least twice.

Therefore, every returned email is duplicate, and every duplicate email is returned.

## Complexity

| Metric | Value | Why |
|---|---|---|
| Time | `O(n)` average | The database scans rows and groups by email |
| Space | `O(k)` | The database stores one group per distinct email |

Here, `n` is the number of rows, and `k` is the number of distinct email values.

## Implementation

```sql
SELECT
    email AS Email
FROM Person
GROUP BY email
HAVING COUNT(*) > 1;
```

## Code Explanation

This reads the table:

```sql
FROM Person
```

This groups all rows with the same email:

```sql
GROUP BY email
```

This keeps only email groups with more than one row:

```sql
HAVING COUNT(*) > 1
```

Finally, this returns the email under the expected output name:

```sql
SELECT
    email AS Email
```

## Alternative Implementation

We can also solve this with a self join.

```sql
SELECT DISTINCT
    p1.email AS Email
FROM Person p1
JOIN Person p2
    ON p1.email = p2.email
   AND p1.id <> p2.id;
```

This finds pairs of different rows with the same email.

`DISTINCT` is needed because one duplicated email can create many matching pairs.

For this problem, `GROUP BY` with `HAVING` is simpler and closer to the requirement.

## Testing

Test case 1:

```sql
CREATE TABLE Person (
    id INT PRIMARY KEY,
    email VARCHAR(255)
);

INSERT INTO Person (id, email) VALUES
(1, 'a@b.com'),
(2, 'c@d.com'),
(3, 'a@b.com');

SELECT
    email AS Email
FROM Person
GROUP BY email
HAVING COUNT(*) > 1;
```

Expected:

```text
+---------+
| Email   |
+---------+
| a@b.com |
+---------+
```

Test case 2:

```sql
DELETE FROM Person;

INSERT INTO Person (id, email) VALUES
(1, 'a@b.com'),
(2, 'c@d.com');
```

Expected:

```text
+-------+
| Email |
+-------+
```

No email appears more than once.

Test case 3:

```sql
DELETE FROM Person;

INSERT INTO Person (id, email) VALUES
(1, 'a@b.com'),
(2, 'a@b.com'),
(3, 'a@b.com');
```

Expected:

```text
+---------+
| Email   |
+---------+
| a@b.com |
+---------+
```

Even if an email appears three times, it should appear once in the result.

Test case 4:

```sql
DELETE FROM Person;

INSERT INTO Person (id, email) VALUES
(1, 'a@b.com'),
(2, 'b@c.com'),
(3, 'a@b.com'),
(4, 'b@c.com'),
(5, 'd@e.com');
```

Expected:

```text
+---------+
| Email   |
+---------+
| a@b.com |
| b@c.com |
+---------+
```

