# LeetCode 943: Find the Shortest Superstring

## Problem Restatement

We are given an array of strings:

```python
words
```

We need to return the shortest string that contains every string in `words` as a substring.

If several shortest answers exist, we may return any of them.

The problem also says that no string in `words` is a substring of another string in `words`. This lets us focus on ordering and overlapping the strings, instead of removing already-contained strings. The official statement defines this shortest superstring task and allows any valid shortest result.

## Input and Output

| Item | Meaning |
|---|---|
| Input | A list of strings `words` |
| Output | A shortest string containing every word as a substring |
| Multiple answers | Any shortest answer is accepted |
| Important condition | No word is a substring of another word |

Function shape:

```python
class Solution:
    def shortestSuperstring(self, words: list[str]) -> str:
        ...
```

## Examples

Example 1:

```python
words = ["alex", "loves", "leetcode"]
```

There is no useful overlap between these words.

One valid answer is:

```python
"alexlovesleetcode"
```

Other permutations such as:

```python
"lovesleetcodealex"
```

also contain every word and have the same length, so they may also be accepted.

Example 2:

```python
words = ["catg", "ctaagt", "gcta", "ttca", "atgcatc"]
```

One shortest answer is:

```python
"gctaagttcatgcatc"
```

It contains every word as a substring.

## First Thought: Try Every Ordering

If we choose an order of the words, we can merge them from left to right.

For example, if one word ends with the prefix of the next word, we can overlap them:

```text
"abcde" + "cdefg" -> "abcdefg"
```

The problem becomes:

Find the ordering of words that gives the maximum total overlap.

A direct solution tries all permutations of `words`.

That is too slow because there are:

```text
n!
```

possible orders.

We need dynamic programming over subsets.

## Key Insight

This is similar to a shortest Hamiltonian path problem on a complete directed graph.

Each word is a node.

The edge from word `i` to word `j` has a cost:

```text
extra characters needed to append words[j] after words[i]
```

For example:

```python
a = "abcde"
b = "cdefg"
```

The overlap is:

```text
"cde"
```

So appending `b` after `a` needs only:

```python
"fg"
```

The cost is `2`.

Then we use bitmask DP:

```python
dp[mask][last]
```

means the minimum length of a superstring that contains the set of words in `mask` and ends with word `last`.

To reconstruct the actual string, we also store the parent word used before `last`.

## Precompute Append Costs

For each pair `(i, j)`, compute:

```python
cost[i][j]
```

where `cost[i][j]` is the number of extra characters needed to append `words[j]` after `words[i]`.

If the maximum overlap length is `k`, then:

```python
cost[i][j] = len(words[j]) - k
```

Example:

```python
words[i] = "abcde"
words[j] = "cdefg"
```

Maximum suffix-prefix overlap is:

```text
"cde"
```

So:

```python
cost[i][j] = 5 - 3 = 2
```

## Algorithm

Let:

```python
n = len(words)
```

First compute all pairwise append costs.

Then initialize:

```python
dp[1 << i][i] = len(words[i])
```

This means a superstring containing only `words[i]` and ending at `i` is just `words[i]`.

Then transition:

For each `mask`, for each `last` inside `mask`, try appending a word `nxt` not in `mask`:

```python
new_mask = mask | (1 << nxt)
new_cost = dp[mask][last] + cost[last][nxt]
```

If `new_cost` improves `dp[new_mask][nxt]`, update it and record:

```python
parent[new_mask][nxt] = last
```

At the end, choose the best ending word from:

```python
dp[(1 << n) - 1]
```

Then reconstruct the order of words using `parent`.

Finally build the superstring from that order.

## Walkthrough

Use a small example:

```python
words = ["abc", "bcd", "cde"]
```

Pairwise overlaps:

| Append | Overlap | Extra |
|---|---|---:|
| `"abc" -> "bcd"` | `"bc"` | 1 |
| `"bcd" -> "cde"` | `"cd"` | 1 |
| `"abc" -> "cde"` | `"c"` | 2 |

The best order is:

```python
["abc", "bcd", "cde"]
```

Build:

```text
"abc" + "d" + "e" = "abcde"
```

The result contains all three words.

## Correctness

For a fixed set of words `mask` and ending word `last`, `dp[mask][last]` stores the minimum possible superstring length among all orderings that use exactly the words in `mask` and end at `last`.

The base case is correct because a set with one word has only one possible superstring: the word itself.

For the transition, suppose an optimal ordering for `new_mask` ends at `nxt`. The word before `nxt` must be some `last` in the smaller set `mask`. The best length for that smaller ordering is `dp[mask][last]`, and appending `nxt` after `last` adds exactly `cost[last][nxt]` characters.

The algorithm tries every possible previous ending word and every possible next word, so it considers the final step of every valid ordering.

Therefore, by induction over subset size, every DP state stores the optimal length for that state.

At the end, every word must be included, so the mask is `(1 << n) - 1`. The answer may end at any word, so choosing the minimum over all ending words gives the shortest possible superstring length.

The parent table records the transitions that produced the optimal DP value, so reconstructing the path gives an ordering that achieves that shortest length. Building the string from that ordering using the same overlaps produces a shortest valid superstring.

## Complexity

Let:

```python
n = len(words)
```

and let `L` be the maximum word length.

| Metric | Value | Why |
|---|---|---|
| Time | `O(n^2 * L^2 + 2^n * n^2)` | Pairwise overlap computation plus bitmask DP |
| Space | `O(2^n * n + n^2)` | DP table, parent table, and cost table |

The overlap part can be optimized, but the simple version is clear and accepted for the usual constraints.

## Implementation

```python
class Solution:
    def shortestSuperstring(self, words: list[str]) -> str:
        n = len(words)

        cost = [[0] * n for _ in range(n)]

        for i in range(n):
            for j in range(n):
                if i == j:
                    continue

                overlap = 0
                max_len = min(len(words[i]), len(words[j]))

                for k in range(1, max_len + 1):
                    if words[i][-k:] == words[j][:k]:
                        overlap = k

                cost[i][j] = len(words[j]) - overlap

        full = (1 << n) - 1
        INF = 10**18

        dp = [[INF] * n for _ in range(1 << n)]
        parent = [[-1] * n for _ in range(1 << n)]

        for i in range(n):
            dp[1 << i][i] = len(words[i])

        for mask in range(1 << n):
            for last in range(n):
                if dp[mask][last] == INF:
                    continue

                for nxt in range(n):
                    if mask & (1 << nxt):
                        continue

                    new_mask = mask | (1 << nxt)
                    new_cost = dp[mask][last] + cost[last][nxt]

                    if new_cost < dp[new_mask][nxt]:
                        dp[new_mask][nxt] = new_cost
                        parent[new_mask][nxt] = last

        last = min(range(n), key=lambda i: dp[full][i])

        order = []
        mask = full

        while last != -1:
            order.append(last)
            prev = parent[mask][last]
            mask ^= 1 << last
            last = prev

        order.reverse()

        ans = words[order[0]]

        for a, b in zip(order, order[1:]):
            ans += words[b][len(words[b]) - cost[a][b]:]

        return ans
```

## Code Explanation

We first compute how many extra characters are needed to append one word after another:

```python
cost[i][j] = len(words[j]) - overlap
```

Then we create the DP table:

```python
dp = [[INF] * n for _ in range(1 << n)]
```

Each state is a subset of words and the last word in the current ordering.

The base case is:

```python
dp[1 << i][i] = len(words[i])
```

Then we try adding each unused word:

```python
new_mask = mask | (1 << nxt)
new_cost = dp[mask][last] + cost[last][nxt]
```

If this improves the result, we store the previous word:

```python
parent[new_mask][nxt] = last
```

After filling the DP table, we choose the best final word:

```python
last = min(range(n), key=lambda i: dp[full][i])
```

Then we walk backward through `parent` to recover the word order.

Finally, we build the actual answer by appending only the non-overlapping suffix of each next word:

```python
ans += words[b][len(words[b]) - cost[a][b]:]
```

## Testing

```python
def contains_all(result, words):
    return all(word in result for word in words)

def run_tests():
    s = Solution()

    words = ["alex", "loves", "leetcode"]
    ans = s.shortestSuperstring(words)
    assert contains_all(ans, words)
    assert len(ans) == len("alexlovesleetcode")

    words = ["catg", "ctaagt", "gcta", "ttca", "atgcatc"]
    ans = s.shortestSuperstring(words)
    assert contains_all(ans, words)
    assert len(ans) == len("gctaagttcatgcatc")

    words = ["abc", "bcd", "cde"]
    ans = s.shortestSuperstring(words)
    assert contains_all(ans, words)
    assert ans == "abcde"

    words = ["ab", "ba"]
    ans = s.shortestSuperstring(words)
    assert contains_all(ans, words)
    assert len(ans) == 3

    print("all tests passed")

run_tests()
```

| Test | Why |
|---|---|
| `["alex","loves","leetcode"]` | No useful overlaps |
| `["catg","ctaagt","gcta","ttca","atgcatc"]` | Standard hard example |
| `["abc","bcd","cde"]` | Clean chain overlap |
| `["ab","ba"]` | Cyclic overlap with multiple valid answers |