# LeetCode 929: Unique Email Addresses

## Problem Restatement

We are given a list of email addresses.

Each email has two parts:

```text
local-name@domain-name
```

The problem defines two special rules that apply only to the local name.

### Rule 1: Ignore Dots

In the local name:

```text
"."
```

characters are ignored.

For example:

```text
"alice.z" == "alicez"
```

### Rule 2: Ignore Everything After '+'

If a plus sign appears in the local name:

```text
"+"
```

then everything after it is ignored.

For example:

```text
"m.y+name" -> "my"
```

These rules do not apply to the domain name.

We must count how many unique email addresses actually receive mail after normalization.

The official statement defines the same dot-removal and plus-ignore rules for the local name only. ([leetcode.com](https://leetcode.com/problems/unique-email-addresses/?utm_source=chatgpt.com))

## Input and Output

| Item | Meaning |
|---|---|
| Input | A list of email strings |
| Output | Number of unique normalized emails |
| Dot rule | Dots in local name are ignored |
| Plus rule | Ignore everything after `+` in local name |
| Domain rule | Domain remains unchanged |

Function shape:

```python
class Solution:
    def numUniqueEmails(self, emails: list[str]) -> int:
        ...
```

## Examples

Example 1:

```python
emails = [
    "test.email+alex@leetcode.com",
    "test.e.mail+bob.cathy@leetcode.com",
    "testemail+david@lee.tcode.com",
]
```

Normalize the first email.

Split into:

```text
local  = "test.email+alex"
domain = "leetcode.com"
```

Remove everything after `+`:

```text
"test.email"
```

Remove dots:

```text
"testemail"
```

Final normalized email:

```text
"testemail@leetcode.com"
```

The second email becomes the same normalized address:

```text
"testemail@leetcode.com"
```

The third email has a different domain:

```text
"testemail@lee.tcode.com"
```

So there are:

```python
2
```

unique addresses.

Example 2:

```python
emails = [
    "a@leetcode.com",
    "b@leetcode.com",
    "c@leetcode.com",
]
```

All addresses are already different.

Answer:

```python
3
```

## First Thought

The rules only change the local name.

So for every email:

1. Split it into local name and domain name.
2. Process the local name.
3. Rebuild the normalized email.
4. Store it in a set.

At the end, the size of the set is the answer.

## Key Insight

A hash set automatically removes duplicates.

If two original emails normalize into the same string, the set stores only one copy.

So the whole problem becomes a normalization problem.

The normalization steps are:

1. Split by `'@'`
2. Remove everything after `'+'`
3. Remove all dots `'.'`
4. Combine with the original domain

## Algorithm

Create an empty set:

```python
seen = set()
```

For each email:

Split into:

```python
local, domain = email.split("@")
```

Remove the plus section:

```python
local = local.split("+")[0]
```

Remove dots:

```python
local = local.replace(".", "")
```

Build the normalized email:

```python
normalized = local + "@" + domain
```

Insert into the set:

```python
seen.add(normalized)
```

Return:

```python
len(seen)
```

## Walkthrough

Use:

```python
email = "test.email+alex@leetcode.com"
```

Split:

```text
local  = "test.email+alex"
domain = "leetcode.com"
```

Remove everything after `'+'`:

```text
"test.email"
```

Remove dots:

```text
"testemail"
```

Rebuild:

```text
"testemail@leetcode.com"
```

That is the normalized address.

## Correctness

For each email, the algorithm applies exactly the two rules defined in the problem.

First, the local name is truncated at the first `'+'`, so every character after `'+'` is ignored.

Second, all dots are removed from the remaining local name.

The domain name is preserved unchanged.

Therefore, the produced normalized string is exactly the address that receives the email according to the problem rules.

If two original emails normalize to the same receiving address, the algorithm inserts the same normalized string into the set, and the set stores only one copy.

If two emails normalize to different receiving addresses, they become different strings and both remain in the set.

So after processing all emails, the set contains exactly the unique receiving addresses.

The algorithm returns the size of that set, which is the correct answer.

## Complexity

Suppose:

```python
n = len(emails)
```

and the average email length is:

```python
m
```

| Metric | Value | Why |
|---|---|---|
| Time | `O(n * m)` | Each email is scanned a constant number of times |
| Space | `O(n * m)` | The set stores normalized emails |

## Implementation

```python
class Solution:
    def numUniqueEmails(self, emails: list[str]) -> int:
        seen = set()

        for email in emails:
            local, domain = email.split("@")

            local = local.split("+")[0]
            local = local.replace(".", "")

            normalized = local + "@" + domain

            seen.add(normalized)

        return len(seen)
```

## Code Explanation

We use a set to store unique normalized addresses:

```python
seen = set()
```

Split the email into local and domain parts:

```python
local, domain = email.split("@")
```

Remove everything after `'+'`:

```python
local = local.split("+")[0]
```

Remove all dots:

```python
local = local.replace(".", "")
```

Rebuild the normalized email:

```python
normalized = local + "@" + domain
```

Insert into the set:

```python
seen.add(normalized)
```

Finally return the number of unique entries:

```python
return len(seen)
```

## Testing

```python
def run_tests():
    s = Solution()

    assert s.numUniqueEmails([
        "test.email+alex@leetcode.com",
        "test.e.mail+bob.cathy@leetcode.com",
        "testemail+david@lee.tcode.com",
    ]) == 2

    assert s.numUniqueEmails([
        "a@leetcode.com",
        "b@leetcode.com",
        "c@leetcode.com",
    ]) == 3

    assert s.numUniqueEmails([
        "a.b.c+d@leetcode.com",
        "abc@leetcode.com",
    ]) == 1

    assert s.numUniqueEmails([
        "x+y@leetcode.com",
        "x@leetcode.com",
    ]) == 1

    assert s.numUniqueEmails([
        "user.name+tag+sorting@gmail.com",
        "username@gmail.com",
    ]) == 1

    assert s.numUniqueEmails([
        "a@leetcode.com",
        "a@lee.tcode.com",
    ]) == 2

    print("all tests passed")

run_tests()
```

| Test | Why |
|---|---|
| Official example | Basic normalization |
| Different addresses | No duplicates |
| Dot removal | Dots ignored |
| Plus handling | Ignore suffix after `+` |
| Multiple plus sections | Split at first `+` |
| Different domains | Domain is preserved |