# LeetCode 49: Group Anagrams

## Problem Restatement

We are given an array of strings `strs`.

We need to group strings that are anagrams of each other.

Two strings are anagrams if they contain the same characters with the same frequencies, but possibly in a different order.

The answer can be returned in any order. LeetCode states the input strings contain lowercase English letters, with `1 <= strs.length <= 10^4` and `0 <= strs[i].length <= 100`.

Example:

```python
strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
```

The output can be:

```python
[
    ["bat"],
    ["nat", "tan"],
    ["ate", "eat", "tea"],
]
```

The order of groups does not matter.

The order inside each group also does not matter.

## Input and Output

| Item | Meaning |
|---|---|
| Input | A list of lowercase strings `strs` |
| Output | A list of groups, where each group contains anagrams |
| Group order | Any order is allowed |
| String order inside group | Any order is allowed |

Function shape:

```python
def groupAnagrams(strs: list[str]) -> list[list[str]]:
    ...
```

## First Thought: Compare Every Pair

A direct approach is to compare every string with every other string.

To check whether two strings are anagrams, we can sort both strings and compare them.

For example:

```python
sorted("eat") == sorted("tea")
```

Both become:

```python
["a", "e", "t"]
```

So they are anagrams.

But comparing every pair is too slow. If there are many strings, pairwise comparison wastes work.

We need to compute one stable signature for each string and use that signature to group strings directly.

## Key Insight

All anagrams share the same sorted form.

For example:

| String | Sorted Signature |
|---|---|
| `"eat"` | `"aet"` |
| `"tea"` | `"aet"` |
| `"ate"` | `"aet"` |
| `"tan"` | `"ant"` |
| `"nat"` | `"ant"` |
| `"bat"` | `"abt"` |

So we can use the sorted string as a hash map key.

The map stores:

```python
signature -> list of strings with that signature
```

For example:

```python
{
    "aet": ["eat", "tea", "ate"],
    "ant": ["tan", "nat"],
    "abt": ["bat"],
}
```

At the end, the values of the map are the required groups.

## Algorithm

Create an empty hash map called `groups`.

For each word in `strs`:

1. Sort the characters in the word.
2. Convert the sorted characters into a string key.
3. Append the original word to `groups[key]`.

Return all values from the map.

## Walkthrough

Use:

```python
strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
```

Start:

```python
groups = {}
```

Process `"eat"`:

```python
key = "aet"
groups = {
    "aet": ["eat"]
}
```

Process `"tea"`:

```python
key = "aet"
groups = {
    "aet": ["eat", "tea"]
}
```

Process `"tan"`:

```python
key = "ant"
groups = {
    "aet": ["eat", "tea"],
    "ant": ["tan"]
}
```

Process `"ate"`:

```python
key = "aet"
groups = {
    "aet": ["eat", "tea", "ate"],
    "ant": ["tan"]
}
```

Process `"nat"`:

```python
key = "ant"
groups = {
    "aet": ["eat", "tea", "ate"],
    "ant": ["tan", "nat"]
}
```

Process `"bat"`:

```python
key = "abt"
groups = {
    "aet": ["eat", "tea", "ate"],
    "ant": ["tan", "nat"],
    "abt": ["bat"]
}
```

Return:

```python
[
    ["eat", "tea", "ate"],
    ["tan", "nat"],
    ["bat"],
]
```

## Correctness

Two strings are anagrams exactly when they have the same characters with the same frequencies.

Sorting a string puts its characters into a fixed order. Therefore, two strings have the same sorted signature if and only if they contain the same characters with the same frequencies.

The algorithm uses this sorted signature as the hash map key. Every string is placed into the group for its signature.

So all anagrams go into the same group.

Also, strings that are not anagrams have different sorted signatures, so they cannot be placed into the same group.

Since every input string is processed once and added to exactly one group, the returned groups contain all input strings, grouped correctly.

## Complexity

Let:

```python
m = len(strs)
k = maximum length of a string
```

| Metric | Value | Why |
|---|---|---|
| Time | `O(m * k log k)` | We sort each string |
| Space | `O(m * k)` | The groups store all strings, and keys store sorted signatures |

The auxiliary map stores one list per signature.

## Implementation

```python
from collections import defaultdict

class Solution:
    def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
        groups = defaultdict(list)

        for word in strs:
            key = "".join(sorted(word))
            groups[key].append(word)

        return list(groups.values())
```

## Code Explanation

We use a `defaultdict(list)`:

```python
groups = defaultdict(list)
```

This lets us append to a group without checking whether the key already exists.

For each word:

```python
for word in strs:
```

we build its sorted signature:

```python
key = "".join(sorted(word))
```

Then we add the original word to the matching group:

```python
groups[key].append(word)
```

We store the original word, not the sorted key, because the output must contain the original strings.

At the end, return all groups:

```python
return list(groups.values())
```

## Testing

Because the output order can vary, we normalize the result before comparing.

```python
def normalize(groups):
    return sorted(sorted(group) for group in groups)

def run_tests():
    s = Solution()

    assert normalize(s.groupAnagrams(["eat", "tea", "tan", "ate", "nat", "bat"])) == normalize([
        ["bat"],
        ["nat", "tan"],
        ["ate", "eat", "tea"],
    ])

    assert normalize(s.groupAnagrams([""])) == normalize([
        [""],
    ])

    assert normalize(s.groupAnagrams(["a"])) == normalize([
        ["a"],
    ])

    assert normalize(s.groupAnagrams(["abc", "bca", "cab", "xyz", "zyx"])) == normalize([
        ["abc", "bca", "cab"],
        ["xyz", "zyx"],
    ])

    assert normalize(s.groupAnagrams(["a", "b", "c"])) == normalize([
        ["a"],
        ["b"],
        ["c"],
    ])

    print("all tests passed")

run_tests()
```

| Test | Expected Group Count | Reason |
|---|---:|---|
| `["eat", "tea", "tan", "ate", "nat", "bat"]` | `3` | Standard example |
| `[""]` | `1` | Empty string is valid |
| `["a"]` | `1` | Single string |
| `["abc", "bca", "cab", "xyz", "zyx"]` | `2` | Multiple groups |
| `["a", "b", "c"]` | `3` | No two strings are anagrams |

