LeetCode 929: Unique Email Addresses

Problem Restatement

We are given a list of email addresses.

Each email has two parts:

local-name@domain-name

The problem defines two special rules that apply only to the local name.

Rule 1: Ignore Dots

In the local name:

"."

characters are ignored.

For example:

"alice.z" == "alicez"

Rule 2: Ignore Everything After ‘+’

If a plus sign appears in the local name:

"+"

then everything after it is ignored.

For example:

"m.y+name" -> "my"

These rules do not apply to the domain name.

We must count how many unique email addresses actually receive mail after normalization.

The official statement defines the same dot-removal and plus-ignore rules for the local name only. (leetcode.com)

Input and Output

Item	Meaning
Input	A list of email strings
Output	Number of unique normalized emails
Dot rule	Dots in local name are ignored
Plus rule	Ignore everything after `+` in local name
Domain rule	Domain remains unchanged

Function shape:

class Solution:
    def numUniqueEmails(self, emails: list[str]) -> int:
        ...

Examples

Example 1:

emails = [
    "test.email+alex@leetcode.com",
    "test.e.mail+bob.cathy@leetcode.com",
    "testemail+david@lee.tcode.com",
]

Normalize the first email.

Split into:

local  = "test.email+alex"
domain = "leetcode.com"

Remove everything after +:

"test.email"

Remove dots:

"testemail"

Final normalized email:

"testemail@leetcode.com"

The second email becomes the same normalized address:

"testemail@leetcode.com"

The third email has a different domain:

"testemail@lee.tcode.com"

So there are 2 unique addresses.

Example 2:

emails = [
    "a@leetcode.com",
    "b@leetcode.com",
    "c@leetcode.com",
]

All addresses are already different.

Answer:

3

First Thought

The rules only change the local name.

So for every email:

Split it into local name and domain name.
Process the local name.
Rebuild the normalized email.
Store it in a set.

At the end, the size of the set is the answer.

Key Insight

A hash set automatically removes duplicates.

If two original emails normalize into the same string, the set stores only one copy.

So the whole problem becomes a normalization problem.

The normalization steps are:

Split by '@'
Remove everything after '+'
Remove all dots '.'
Combine with the original domain

Algorithm

Create an empty set:

seen = set()

For each email:

Split into:

local, domain = email.split("@")

Remove the plus section:

local = local.split("+")[0]

Remove dots:

local = local.replace(".", "")

Build the normalized email:

normalized = local + "@" + domain

Insert into the set:

seen.add(normalized)

Return:

len(seen)

Walkthrough

Use:

email = "test.email+alex@leetcode.com"

Split:

local  = "test.email+alex"
domain = "leetcode.com"

Remove everything after '+':

"test.email"

Remove dots:

"testemail"

Rebuild:

"testemail@leetcode.com"

That is the normalized address.
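The walkthrough above can be reproduced with a small tracing helper. This is only a sketch: `trace_normalize` is a hypothetical name introduced here for illustration, and it assumes the email contains exactly one '@'.

```python
def trace_normalize(email: str) -> list[str]:
    """Return the intermediate strings produced while normalizing one email.

    Hypothetical helper for illustration; assumes exactly one '@'.
    """
    local, domain = email.split("@")
    without_plus = local.split("+")[0]            # drop the '+' suffix, if any
    without_dots = without_plus.replace(".", "")  # drop every dot
    rebuilt = without_dots + "@" + domain         # domain is left untouched
    return [local, domain, without_plus, without_dots, rebuilt]


# Print each stage of the walkthrough in order.
for step in trace_normalize("test.email+alex@leetcode.com"):
    print(step)
```

Each printed line corresponds to one stage of the walkthrough: the local name, the domain, the local name without the plus suffix, the local name without dots, and the rebuilt address.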

Correctness

For each email, the algorithm applies exactly the two rules defined in the problem.

First, the local name is truncated at the first '+', so every character after '+' is ignored.

Second, all dots are removed from the remaining local name.

The domain name is preserved unchanged.

Therefore, the produced normalized string is exactly the address that receives the email according to the problem rules.

If two original emails normalize to the same receiving address, the algorithm inserts the same normalized string into the set, and the set stores only one copy.

If two emails normalize to different receiving addresses, they become different strings and both remain in the set.

So after processing all emails, the set contains exactly the unique receiving addresses.

The algorithm returns the size of that set, which is the correct answer.
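One way to see this argument concretely is to group the original emails by their normalized form: each group corresponds to exactly one entry in the set. The sketch below assumes well-formed input with a single '@'. The names `normalize` and `group_by_receiver` and the sample addresses are made up for illustration.

```python
from collections import defaultdict


def normalize(email: str) -> str:
    """Apply the plus rule and the dot rule to the local name only."""
    local, domain = email.split("@")
    local = local.split("+")[0].replace(".", "")
    return local + "@" + domain


def group_by_receiver(emails: list[str]) -> dict[str, list[str]]:
    """Map each normalized address to the original emails that reach it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for email in emails:
        groups[normalize(email)].append(email)
    return dict(groups)


# Illustrative input: the first two reach the same mailbox,
# the third differs only in its domain.
sample = [
    "a.b+x@example.com",
    "ab@example.com",
    "ab@exam.ple.com",
]
groups = group_by_receiver(sample)
print(len(groups))  # number of distinct receiving addresses
```

The number of keys in `groups` equals the size of the set built by the algorithm, because both are keyed by the same normalized string.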

Complexity

Suppose:

n = len(emails)

and the average email length is m.

Metric	Value	Why
Time	`O(n * m)`	Each email is scanned a constant number of times
Space	`O(n * m)`	The set stores normalized emails
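The split-and-replace approach scans each email a few times. If a single left-to-right scan is preferred, the same normalization can be sketched as below. This is an alternative written for illustration, not the implementation used in this article. `normalize_single_pass` is a hypothetical name, and the code assumes the email contains an '@'.

```python
def normalize_single_pass(email: str) -> str:
    """Normalize one email with a single scan of the local name.

    Assumes the email contains an '@'; otherwise the loop would
    run past the end of the string.
    """
    kept = []
    ignoring = False  # becomes True once the first '+' is seen
    i = 0
    while email[i] != "@":
        ch = email[i]
        if ch == "+":
            ignoring = True
        elif ch != "." and not ignoring:
            kept.append(ch)
        i += 1
    # Copy '@' and the domain unchanged.
    kept.append(email[i:])
    return "".join(kept)
```

The asymptotic cost is still `O(m)` per email. Only the constant factor changes, so the overall bounds in the table are unaffected.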

Implementation

class Solution:
    def numUniqueEmails(self, emails: list[str]) -> int:
        seen = set()

        for email in emails:
            local, domain = email.split("@")

            local = local.split("+")[0]
            local = local.replace(".", "")

            normalized = local + "@" + domain

            seen.add(normalized)

        return len(seen)

Code Explanation

We use a set to store unique normalized addresses:

seen = set()

Split the email into local and domain parts:

local, domain = email.split("@")

Remove everything after '+':

local = local.split("+")[0]

Remove all dots:

local = local.replace(".", "")

Rebuild the normalized email:

normalized = local + "@" + domain

Insert into the set:

seen.add(normalized)

Finally return the number of unique entries:

return len(seen)
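The same steps can also be written more compactly with `str.partition` and a set comprehension. This is a stylistic variant under the same assumption of one '@' per email. The name `count_unique_emails` is introduced here for illustration only.

```python
def count_unique_emails(emails: list[str]) -> int:
    """Count distinct receiving addresses using partition-based normalization."""

    def normalize(email: str) -> str:
        local, _, domain = email.partition("@")  # split at the '@'
        kept, _, _ = local.partition("+")        # keep text before the first '+'
        return kept.replace(".", "") + "@" + domain

    return len({normalize(email) for email in emails})
```

`partition` splits at the first occurrence of the separator and always returns three parts, so it does not raise when there is no '+' in the local name.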

Testing

def run_tests():
    s = Solution()

    assert s.numUniqueEmails([
        "test.email+alex@leetcode.com",
        "test.e.mail+bob.cathy@leetcode.com",
        "testemail+david@lee.tcode.com",
    ]) == 2

    assert s.numUniqueEmails([
        "a@leetcode.com",
        "b@leetcode.com",
        "c@leetcode.com",
    ]) == 3

    assert s.numUniqueEmails([
        "alice.z@leetcode.com",
        "alicez@leetcode.com",
    ]) == 1

    assert s.numUniqueEmails([
        "my+name@leetcode.com",
        "my@leetcode.com",
    ]) == 1

    assert s.numUniqueEmails([
        "a+b+c@leetcode.com",
        "a@leetcode.com",
    ]) == 1

    assert s.numUniqueEmails([
        "test@leetcode.com",
        "test@example.com",
    ]) == 2

    print("all tests passed")

run_tests()

Test	Why
Official example	Basic normalization
Different addresses	No duplicates
Dot removal	Dots ignored
Plus handling	Ignore suffix after `+`
Multiple plus sections	Split at first `+`
Different domains	Domain is preserved