# LeetCode 269: Alien Dictionary

## Problem Restatement

We are given a list of words sorted according to an unknown alien alphabet.

We need to infer one possible order of the alien alphabet.

Return a string containing all unique letters in a valid order.

If no valid order exists, return:

```python
""
```

If multiple valid orders exist, any one of them is accepted.

The constraints are `1 <= words.length <= 100`, `1 <= words[i].length <= 100`, and each word contains only lowercase English letters. The official examples include `["wrt","wrf","er","ett","rftt"] -> "wertf"`, `["z","x"] -> "zx"`, and `["z","x","z"] -> ""`.

## Input and Output

| Item | Meaning |
|---|---|
| Input | A list of words sorted by an alien alphabet |
| Output | One valid character order |
| Invalid case | Return `""` |
| Characters | Unique lowercase letters that appear in `words` |

Example function shape:

```python
def alienOrder(words: list[str]) -> str:
    ...
```

## Examples

Example 1:

```python
words = ["wrt", "wrf", "er", "ett", "rftt"]
```

Compare adjacent words.

| Pair | First difference | Rule |
|---|---|---|
| `"wrt"`, `"wrf"` | `t` vs `f` | `t` before `f` |
| `"wrf"`, `"er"` | `w` vs `e` | `w` before `e` |
| `"er"`, `"ett"` | `r` vs `t` | `r` before `t` |
| `"ett"`, `"rftt"` | `e` vs `r` | `e` before `r` |

These rules form:

```text
w -> e -> r -> t -> f
```

Answer:

```python
"wertf"
```

Example 2:

```python
words = ["z", "x"]
```

The first word comes before the second word, so:

```text
z -> x
```

Answer:

```python
"zx"
```

Example 3:

```python
words = ["z", "x", "z"]
```

From `"z"` before `"x"`:

```text
z -> x
```

From `"x"` before `"z"`:

```text
x -> z
```

This creates a cycle.

No valid alphabet order exists.

Answer:

```python
""
```

## First Thought: Compare All Words

A tempting idea is to compare every pair of words and collect ordering rules.

But only adjacent words are needed.

The input is already sorted. In a sorted list, each adjacent pair gives the strongest local constraint. Comparing non-adjacent pairs can add redundant information and can make reasoning harder.

So we compare:

```python
words[i]
words[i + 1]
```

for every adjacent pair.

## Key Insight

When two sorted words differ, the first differing character determines the alphabet order.

For example:

```python
"wrt"
"wrf"
```

The first two characters are the same:

```python
w == w
r == r
```

The first difference is:

```python
t != f
```

Since `"wrt"` appears before `"wrf"`, we know:

```text
t -> f
```

This is a directed graph edge.

After collecting all such edges, the problem becomes:

> Return a topological ordering of the graph.

If the graph has a cycle, no valid ordering exists.

## Invalid Prefix Case

There is one special invalid case.

If a longer word appears before its own prefix, the order is impossible.

Example:

```python
words = ["abc", "ab"]
```

No alphabet order can make `"abc"` come before `"ab"` when `"ab"` is its prefix.

So we must return:

```python
""
```

This case happens when:

```python
len(first) > len(second)
```

and no differing character is found.

## Algorithm

1. Create a graph node for every unique character in `words`.
2. Compare every adjacent word pair.
3. For each pair:
   - Find the first differing character.
   - Add a directed edge from the first word's character to the second word's character.
   - If no difference exists and the first word is longer, return `""`.
4. Compute indegrees for every character.
5. Put all characters with indegree `0` into a queue.
6. Repeatedly:
   - Pop a character from the queue.
   - Append it to the answer.
   - Decrease indegree of its neighbors.
   - If a neighbor reaches indegree `0`, push it into the queue.
7. If the answer length equals the number of unique characters, return it.
8. Otherwise, return `""` because a cycle exists.

## Walkthrough

Use:

```python
words = ["wrt", "wrf", "er", "ett", "rftt"]
```

Unique characters:

```python
w, r, t, f, e
```

Build edges:

```text
t -> f
w -> e
r -> t
e -> r
```

Indegrees:

| Character | Indegree |
|---|---:|
| `w` | 0 |
| `e` | 1 |
| `r` | 1 |
| `t` | 1 |
| `f` | 1 |

Queue starts with:

```python
["w"]
```

Process `w`.

Answer:

```python
"w"
```

Decrease indegree of `e` to `0`, so push `e`.

Process `e`.

Answer:

```python
"we"
```

Decrease indegree of `r` to `0`.

Process `r`.

Answer:

```python
"wer"
```

Decrease indegree of `t` to `0`.

Process `t`.

Answer:

```python
"wert"
```

Decrease indegree of `f` to `0`.

Process `f`.

Answer:

```python
"wertf"
```

All characters are processed.

Return:

```python
"wertf"
```

## Correctness

The graph contains one node for every character that appears in the input. Therefore every possible output character is represented.

For every adjacent pair of words, the first differing character is the only character comparison that determines their relative order. Adding an edge from the first word's character to the second word's character captures exactly that required constraint.

If a longer word appears before its own prefix, no lexicographic alphabet can satisfy the input order, so returning `""` is correct.

After the graph is built, any valid alien alphabet must place every edge source before its destination. This is exactly the definition of a topological ordering.

Kahn's algorithm repeatedly selects characters with no remaining prerequisites. When a character is removed, all of its outgoing constraints are satisfied, so reducing its neighbors' indegrees preserves the remaining dependency structure.

If all characters are processed, the produced order satisfies every edge and is a valid alien alphabet order.

If some characters remain unprocessed, every remaining character depends on another remaining character. That means a cycle exists, so no valid order exists.

Therefore the algorithm returns a valid order exactly when one exists.

## Complexity

Let `C` be the total number of characters across all words, and let `U` be the number of unique letters.

| Metric | Value | Why |
|---|---|---|
| Time | `O(C + U)` | We scan words, build edges, and topologically sort characters |
| Space | `O(U + E)` | Graph stores unique characters and ordering edges |

Since the alphabet is lowercase English letters, `U <= 26`.

## Implementation

```python
from collections import defaultdict, deque
from typing import List

class Solution:
    def alienOrder(self, words: List[str]) -> str:
        graph = {ch: set() for word in words for ch in word}
        indegree = {ch: 0 for ch in graph}

        for first, second in zip(words, words[1:]):
            min_len = min(len(first), len(second))
            found_difference = False

            for i in range(min_len):
                if first[i] != second[i]:
                    a = first[i]
                    b = second[i]

                    if b not in graph[a]:
                        graph[a].add(b)
                        indegree[b] += 1

                    found_difference = True
                    break

            if not found_difference and len(first) > len(second):
                return ""

        queue = deque(ch for ch in graph if indegree[ch] == 0)
        order = []

        while queue:
            ch = queue.popleft()
            order.append(ch)

            for nei in graph[ch]:
                indegree[nei] -= 1

                if indegree[nei] == 0:
                    queue.append(nei)

        if len(order) != len(graph):
            return ""

        return "".join(order)
```

## Code Explanation

Create one graph node for every character:

```python
graph = {ch: set() for word in words for ch in word}
```

Initialize every indegree to zero:

```python
indegree = {ch: 0 for ch in graph}
```

Compare adjacent words:

```python
for first, second in zip(words, words[1:]):
```

Only the first differing character matters:

```python
if first[i] != second[i]:
```

Add an edge:

```python
graph[a].add(b)
indegree[b] += 1
```

The `set` prevents duplicate edges from increasing indegree multiple times.

If no difference exists and the first word is longer:

```python
if not found_difference and len(first) > len(second):
    return ""
```

the input violates lexicographic order.

The queue starts with all characters that have no prerequisites:

```python
queue = deque(ch for ch in graph if indegree[ch] == 0)
```

Each popped character is appended to the answer.

When a neighbor's indegree becomes zero, it is ready to be processed.

Finally:

```python
if len(order) != len(graph):
    return ""
```

detects cycles.

## Testing

```python
def is_valid_order(order: str, words: list[str]) -> bool:
    if order == "":
        return False

    rank = {ch: i for i, ch in enumerate(order)}

    for first, second in zip(words, words[1:]):
        min_len = min(len(first), len(second))
        valid_pair = False

        for i in range(min_len):
            if first[i] != second[i]:
                if rank[first[i]] > rank[second[i]]:
                    return False
                valid_pair = True
                break

        if not valid_pair and len(first) > len(second):
            return False

    return True

def run_tests():
    s = Solution()

    words = ["wrt", "wrf", "er", "ett", "rftt"]
    assert is_valid_order(s.alienOrder(words), words)

    assert s.alienOrder(["z", "x"]) == "zx"
    assert s.alienOrder(["z", "x", "z"]) == ""
    assert s.alienOrder(["abc", "ab"]) == ""

    words = ["ab", "adc"]
    assert is_valid_order(s.alienOrder(words), words)

    words = ["a"]
    assert s.alienOrder(words) == "a"

    words = ["za", "zb", "ca", "cb"]
    assert is_valid_order(s.alienOrder(words), words)

    print("all tests passed")

run_tests()
```

Test meaning:

| Test | Why |
|---|---|
| Standard example | Builds a full dependency chain |
| `["z", "x"]` | Simple one-edge graph |
| `["z", "x", "z"]` | Cycle detection |
| `["abc", "ab"]` | Invalid prefix case |
| `["ab", "adc"]` | Partial order with free characters |
| Single word | Any order of its characters is valid |
| Multiple valid orders | Confirms checker accepts any valid topological order |