LeetCode 126: Word Ladder II

Problem Restatement

We are given three things:

Name	Meaning
`beginWord`	The word where the transformation starts
`endWord`	The word we want to reach
`wordList`	The dictionary of valid transformed words

We need to return all shortest transformation sequences from beginWord to endWord.

A valid transformation sequence looks like this:

beginWord -> s1 -> s2 -> ... -> endWord

Each step must obey two rules:

Adjacent words must differ by exactly one character.
Every transformed word after beginWord must exist in wordList.

The result should contain only the shortest valid sequences. If no sequence exists, return an empty list. LeetCode states this problem as returning all shortest transformation sequences, or an empty list when no such sequence exists.

Examples

Example 1:

beginWord = "hit"
endWord = "cog"
wordList = ["hot", "dot", "dog", "lot", "log", "cog"]

Output:

[
    ["hit", "hot", "dot", "dog", "cog"],
    ["hit", "hot", "lot", "log", "cog"]
]

Both sequences have length 5.

There are longer valid paths, but we do not return them because the problem asks only for shortest sequences.

Example 2:

beginWord = "hit"
endWord = "cog"
wordList = ["hot", "dot", "dog", "lot", "log"]

Output:

[]

Here, "cog" does not appear in wordList, so no valid sequence can end at "cog".

Input and Output

Item	Meaning
Input	`beginWord: str`, `endWord: str`, `wordList: list[str]`
Output	`list[list[str]]`
Rule	Change one letter at a time
Dictionary rule	Every transformed word must be in `wordList`
Goal	Return all shortest sequences

The function shape is:

class Solution:
    def findLadders(
        self,
        beginWord: str,
        endWord: str,
        wordList: list[str],
    ) -> list[list[str]]:
        ...

First Thought: Treat Words as a Graph

Each word is a node.

There is an edge between two words if they differ by exactly one character.

For example:

hit -> hot
hot -> dot
hot -> lot
dot -> dog
lot -> log
dog -> cog
log -> cog

So the problem becomes:

Find all shortest paths from beginWord to endWord in an unweighted graph.

Because every edge has the same cost, BFS is the right tool for finding shortest distances.

But this problem needs more than one shortest path. We need all of them.

So the solution has two phases:

Use BFS to find shortest paths and record parent links.
Use backtracking to rebuild all shortest sequences from endWord back to beginWord.

Brute Force Idea

A brute force solution could try every possible sequence of valid word changes.

At each word, we try every word in wordList and check whether it differs by one character.

This is too slow because the number of possible paths can grow very quickly.

Even worse, paths can revisit words and form cycles unless we carefully track visited states.

We need BFS because BFS explores the graph level by level.

The first level where we reach endWord gives the shortest possible sequence length.

Key Insight

We should not store full paths inside the BFS queue.

That can create huge memory usage because many paths may share the same prefix.

Instead, during BFS, we store parent relationships:

parents[child].add(parent)

For example, in the sample case:

parents["hot"] = {"hit"}
parents["dot"] = {"hot"}
parents["lot"] = {"hot"}
parents["dog"] = {"dot"}
parents["log"] = {"lot"}
parents["cog"] = {"dog", "log"}

Then we backtrack from "cog":

cog <- dog <- dot <- hot <- hit
cog <- log <- lot <- hot <- hit

After reversing each path, we get the final answer.

Why BFS Must Be Level-Based

This problem has one subtle detail.

Suppose two different words in the same BFS level can both reach the same next word.

We must keep both parents.

For example:

dog -> cog
log -> cog

If "dog" and "log" are at the same distance from beginWord, then both parent links are valid shortest-path links.

So we cannot remove a word immediately after seeing it once inside the same level.

Instead, we collect all words visited in the current level, then remove them after the level is complete.

That allows all shortest parents from the same level to be recorded.

Algorithm

First, put all words into a set for fast lookup:

word_set = set(wordList)

If endWord is missing, return immediately:

if endWord not in word_set:
    return []

Then run BFS from beginWord.

We keep:

Variable	Meaning
`level`	Current BFS frontier
`parents`	Maps each word to all previous words on shortest paths
`word_set`	Words that have not been fully visited yet
`found`	Whether this BFS level reaches `endWord`

For every word in the current level, generate all possible one-letter changes.

For each position, replace the character with 'a' to 'z'.

If the generated word exists in word_set, then it is a valid next word.

Record the parent:

parents[next_word].add(word)

After processing the whole level, remove all words found in that level from word_set.

Once endWord is found, stop BFS after finishing that level.

Then backtrack from endWord to beginWord.

Correctness

BFS explores words by distance from beginWord.

At level 1, it finds all words reachable in one transformation.

At level 2, it finds all words reachable in two transformations.

This continues until endWord is reached.

Because BFS processes levels in increasing distance, the first level that reaches endWord gives the shortest possible transformation length.

The parent map records only edges from the previous BFS level into the current BFS level. Therefore, every parent link belongs to a shortest path prefix.

When multiple previous-level words reach the same child, the algorithm keeps all of them. This preserves every shortest sequence.

Backtracking starts from endWord and follows all recorded parent links until beginWord. Since every parent link moves one level closer to beginWord, each reconstructed path has shortest length.

Therefore, the algorithm returns exactly all shortest transformation sequences.

Complexity

Let:

Symbol	Meaning
`n`	Number of words in `wordList`
`m`	Length of each word
`a`	Alphabet size, normally `26`
`p`	Number of shortest output paths

For each visited word, we try changing each of its m positions into 26 possible letters.

So BFS costs roughly:

O(n * m * 26)

Backtracking depends on the number of shortest paths. If there are many shortest paths, the output itself is large.

Metric	Value
Time	`O(n * m * 26 + total output size)`
Space	`O(n * m + parent links + total output size)`

The important point is that the algorithm avoids comparing every pair of words, which would cost O(n^2 * m).

Implementation

from collections import defaultdict

class Solution:
    def findLadders(
        self,
        beginWord: str,
        endWord: str,
        wordList: list[str],
    ) -> list[list[str]]:
        word_set = set(wordList)

        if endWord not in word_set:
            return []

        parents = defaultdict(set)
        level = {beginWord}
        found = False

        while level and not found:
            next_level = set()

            for word in level:
                if word in word_set:
                    word_set.remove(word)

            for word in level:
                chars = list(word)

                for i in range(len(chars)):
                    original = chars[i]

                    for code in range(ord("a"), ord("z") + 1):
                        chars[i] = chr(code)
                        next_word = "".join(chars)

                        if next_word not in word_set:
                            continue

                        parents[next_word].add(word)
                        next_level.add(next_word)

                        if next_word == endWord:
                            found = True

                    chars[i] = original

            level = next_level

        if not found:
            return []

        result = []
        path = [endWord]

        def backtrack(word: str) -> None:
            if word == beginWord:
                result.append(path[::-1])
                return

            for parent in parents[word]:
                path.append(parent)
                backtrack(parent)
                path.pop()

        backtrack(endWord)
        return result

Code Explanation

We first convert wordList into a set:

word_set = set(wordList)

This makes membership checks fast.

Then we handle the impossible case:

if endWord not in word_set:
    return []

Every transformed word must come from wordList, including endWord.

The parent map stores all shortest-path parents:

parents = defaultdict(set)

We use a set because one child may have multiple parents.

The BFS starts from beginWord:

level = {beginWord}

Before exploring the current level, we remove the current level words from word_set:

for word in level:
    if word in word_set:
        word_set.remove(word)

This prevents cycles and prevents longer paths from overwriting shortest paths.

Then for each word, we generate every one-letter mutation:

for i in range(len(chars)):
    original = chars[i]

    for code in range(ord("a"), ord("z") + 1):
        chars[i] = chr(code)
        next_word = "".join(chars)

If the generated word is still available, it belongs to the next BFS level:

if next_word not in word_set:
    continue

We record the parent relation:

parents[next_word].add(word)

If we find endWord, we mark it:

if next_word == endWord:
    found = True

We still finish the current level, because other words in the same level may also reach endWord.

After BFS, we backtrack from endWord:

path = [endWord]

When we reach beginWord, we reverse the path:

result.append(path[::-1])

That turns:

cog -> dog -> dot -> hot -> hit

into:

hit -> hot -> dot -> dog -> cog

Testing

def normalize(paths: list[list[str]]) -> list[list[str]]:
    return sorted(paths)

def run_tests():
    s = Solution()

    assert normalize(
        s.findLadders(
            "hit",
            "cog",
            ["hot", "dot", "dog", "lot", "log", "cog"],
        )
    ) == normalize(
        [
            ["hit", "hot", "dot", "dog", "cog"],
            ["hit", "hot", "lot", "log", "cog"],
        ]
    )

    assert s.findLadders(
        "hit",
        "cog",
        ["hot", "dot", "dog", "lot", "log"],
    ) == []

    assert s.findLadders(
        "a",
        "c",
        ["a", "b", "c"],
    ) in (
        [["a", "c"]],
        [["a", "b", "c"]],
    )

    assert normalize(
        s.findLadders(
            "red",
            "tax",
            ["ted", "tex", "red", "tax", "tad", "den", "rex", "pee"],
        )
    ) == normalize(
        [
            ["red", "ted", "tad", "tax"],
            ["red", "ted", "tex", "tax"],
            ["red", "rex", "tex", "tax"],
        ]
    )

    print("all tests passed")

run_tests()

For the single-character case, "a" can directly become "c" if "c" is in the word list. The shortest sequence is therefore:

["a", "c"]

So a stricter version of that test should be:

assert s.findLadders(
    "a",
    "c",
    ["a", "b", "c"],
) == [["a", "c"]]

Final cleaned test suite:

def normalize(paths: list[list[str]]) -> list[list[str]]:
    return sorted(paths)

def run_tests():
    s = Solution()

    assert normalize(
        s.findLadders(
            "hit",
            "cog",
            ["hot", "dot", "dog", "lot", "log", "cog"],
        )
    ) == normalize(
        [
            ["hit", "hot", "dot", "dog", "cog"],
            ["hit", "hot", "lot", "log", "cog"],
        ]
    )

    assert s.findLadders(
        "hit",
        "cog",
        ["hot", "dot", "dog", "lot", "log"],
    ) == []

    assert s.findLadders(
        "a",
        "c",
        ["a", "b", "c"],
    ) == [["a", "c"]]

    assert normalize(
        s.findLadders(
            "red",
            "tax",
            ["ted", "tex", "red", "tax", "tad", "den", "rex", "pee"],
        )
    ) == normalize(
        [
            ["red", "ted", "tad", "tax"],
            ["red", "ted", "tex", "tax"],
            ["red", "rex", "tex", "tax"],
        ]
    )

    print("all tests passed")

run_tests()

Final Notes on Ordering

LeetCode usually accepts the answer in any order.

The parent sets do not preserve deterministic order. If exact printed order matters for local testing, sort during backtracking:

for parent in sorted(parents[word]):
    path.append(parent)
    backtrack(parent)
    path.pop()

The algorithm remains the same.