LeetCode 811: Subdomain Visit Count

Problem Restatement

We are given a list of count-paired domains.

Each item has this format:

"count domain"

For example:

"9001 discuss.leetcode.com"

This means discuss.leetcode.com was visited 9001 times.

A visit to a domain also counts as a visit to all of its parent subdomains:

discuss.leetcode.com
leetcode.com
com

Return a list of count-paired domains showing the total visit count for every domain and subdomain. The result may be returned in any order. The official problem states that the output should use the same "count domain" format.

Input and Output

Item	Meaning
Input	A list of strings `cpdomains`
Output	A list of `"count domain"` strings
Domain format	Count, space, then domain
Visit rule	A visit to a domain also visits all parent subdomains
Output order	Any order is accepted

Examples

Example 1:

cpdomains = ["9001 discuss.leetcode.com"]

The domain contributes 9001 visits to:

discuss.leetcode.com
leetcode.com
com

So a valid answer is:

[
    "9001 discuss.leetcode.com",
    "9001 leetcode.com",
    "9001 com",
]

Example 2:

cpdomains = [
    "900 google.mail.com",
    "50 yahoo.com",
    "1 intel.mail.com",
    "5 wiki.org",
]

The counts accumulate like this:

Domain	Count
`google.mail.com`	900
`intel.mail.com`	1
`mail.com`	901
`yahoo.com`	50
`wiki.org`	5
`com`	951
`org`	5

A valid answer is:

[
    "900 google.mail.com",
    "901 mail.com",
    "951 com",
    "50 yahoo.com",
    "1 intel.mail.com",
    "5 wiki.org",
    "5 org",
]

First Thought: Parse and Count

Each input string gives us one visit count and one full domain.

For every full domain, we can generate all subdomains and add the same count to each one.

Since the same subdomain may appear from many input entries, we need a hash map:

domain -> total_count

Key Insight

A domain’s subdomains are obtained by removing labels from the left.

For example:

discuss.leetcode.com

has these counted domains:

discuss.leetcode.com
leetcode.com
com

So after parsing the count and full domain, we split the domain by ".".

Then for every starting position, join the remaining labels.

Algorithm

For each count-paired domain string:

Split it into count and domain:

count_text, domain = item.split()
count = int(count_text)

Split the domain into labels:

parts = domain.split(".")

Generate every subdomain:

".".join(parts[i:])

Add count to that subdomain in the hash map.

After processing all input strings, format each hash map entry:

f"{count} {domain}"

Return the formatted list.

Correctness

For each input item, the algorithm parses the visit count and the full domain exactly once.

A domain consists of labels separated by dots. Every parent subdomain is formed by removing zero or more labels from the left. The loop over i generates exactly those suffixes:

parts[i:]

When i = 0, it generates the full domain. When i moves right, it generates each parent subdomain. Therefore, every domain that should receive the visit count receives it.

The hash map adds counts for identical domains. So if multiple input domains share the same parent, such as google.mail.com and intel.mail.com, their visits both contribute to mail.com and com.

After all entries are processed, the hash map contains exactly the total count for every domain and subdomain. Formatting each entry as "count domain" gives the required output.

Complexity

Let L be the total number of characters across all input strings.

Metric	Value	Why
Time	`O(L)`	Each domain is split and its subdomains are generated
Space	`O(L)`	The hash map stores generated domain strings

Implementation

from collections import defaultdict

class Solution:
    def subdomainVisits(self, cpdomains: list[str]) -> list[str]:
        counts = defaultdict(int)

        for item in cpdomains:
            count_text, domain = item.split()
            count = int(count_text)

            parts = domain.split(".")

            for i in range(len(parts)):
                subdomain = ".".join(parts[i:])
                counts[subdomain] += count

        answer = []

        for domain, count in counts.items():
            answer.append(f"{count} {domain}")

        return answer

Code Explanation

The hash map stores accumulated visits:

counts = defaultdict(int)

Each input string is split into its count and domain:

count_text, domain = item.split()
count = int(count_text)

Then the domain is split into labels:

parts = domain.split(".")

For example:

"discuss.leetcode.com".split(".")

gives:

["discuss", "leetcode", "com"]

The loop creates each suffix domain:

for i in range(len(parts)):
    subdomain = ".".join(parts[i:])

For the example above, this creates:

discuss.leetcode.com
leetcode.com
com

Each receives the same visit count:

counts[subdomain] += count

Finally, each hash map entry is converted back to the required output format:

answer.append(f"{count} {domain}")

Testing

def normalize(items):
    return sorted(items)

def run_tests():
    s = Solution()

    assert normalize(s.subdomainVisits(
        ["9001 discuss.leetcode.com"]
    )) == normalize([
        "9001 discuss.leetcode.com",
        "9001 leetcode.com",
        "9001 com",
    ])

    assert normalize(s.subdomainVisits([
        "900 google.mail.com",
        "50 yahoo.com",
        "1 intel.mail.com",
        "5 wiki.org",
    ])) == normalize([
        "900 google.mail.com",
        "901 mail.com",
        "951 com",
        "50 yahoo.com",
        "1 intel.mail.com",
        "5 wiki.org",
        "5 org",
    ])

    assert normalize(s.subdomainVisits(
        ["1 a.com", "2 b.com"]
    )) == normalize([
        "1 a.com",
        "2 b.com",
        "3 com",
    ])

    assert normalize(s.subdomainVisits(
        ["10 com"]
    )) == normalize([
        "10 com",
    ])

    print("all tests passed")

run_tests()

Test	Why
Single three-label domain	Generates full domain and both parents
Multiple domains sharing parents	Checks count accumulation
Different domains under same top-level domain	Checks shared `com` count
One-label domain	Checks no dot case