A clear explanation of solving Find the Shortest Superstring using pairwise overlaps and bitmask dynamic programming.
Problem Restatement
We are given an array of strings:
wordsWe need to return the shortest string that contains every string in words as a substring.
If several shortest answers exist, we may return any of them.
The problem also says that no string in words is a substring of another string in words. This lets us focus on ordering and overlapping the strings, instead of removing already-contained strings. The official statement defines this shortest superstring task and allows any valid shortest result.
Input and Output
| Item | Meaning |
|---|---|
| Input | A list of strings words |
| Output | A shortest string containing every word as a substring |
| Multiple answers | Any shortest answer is accepted |
| Important condition | No word is a substring of another word |
Function shape:
class Solution:
def shortestSuperstring(self, words: list[str]) -> str:
...Examples
Example 1:
words = ["alex", "loves", "leetcode"]There is no useful overlap between these words.
One valid answer is:
"alexlovesleetcode"Other permutations such as:
"lovesleetcodealex"also contain every word and have the same length, so they may also be accepted.
Example 2:
words = ["catg", "ctaagt", "gcta", "ttca", "atgcatc"]One shortest answer is:
"gctaagttcatgcatc"It contains every word as a substring.
First Thought: Try Every Ordering
If we choose an order of the words, we can merge them from left to right.
For example, if one word ends with the prefix of the next word, we can overlap them:
"abcde" + "cdefg" -> "abcdefg"The problem becomes:
Find the ordering of words that gives the maximum total overlap.
A direct solution tries all permutations of words.
That is too slow because there are:
n!possible orders.
We need dynamic programming over subsets.
Key Insight
This is similar to a shortest Hamiltonian path problem on a complete directed graph.
Each word is a node.
The edge from word i to word j has a cost:
extra characters needed to append words[j] after words[i]For example:
a = "abcde"
b = "cdefg"The overlap is:
"cde"So appending b after a needs only:
"fg"The cost is 2.
Then we use bitmask DP:
dp[mask][last]means the minimum length of a superstring that contains the set of words in mask and ends with word last.
To reconstruct the actual string, we also store the parent word used before last.
Precompute Append Costs
For each pair (i, j), compute:
cost[i][j]where cost[i][j] is the number of extra characters needed to append words[j] after words[i].
If the maximum overlap length is k, then:
cost[i][j] = len(words[j]) - kExample:
words[i] = "abcde"
words[j] = "cdefg"Maximum suffix-prefix overlap is:
"cde"So:
cost[i][j] = 5 - 3 = 2Algorithm
Let:
n = len(words)First compute all pairwise append costs.
Then initialize:
dp[1 << i][i] = len(words[i])This means a superstring containing only words[i] and ending at i is just words[i].
Then transition:
For each mask, for each last inside mask, try appending a word nxt not in mask:
new_mask = mask | (1 << nxt)
new_cost = dp[mask][last] + cost[last][nxt]If new_cost improves dp[new_mask][nxt], update it and record:
parent[new_mask][nxt] = lastAt the end, choose the best ending word from:
dp[(1 << n) - 1]Then reconstruct the order of words using parent.
Finally build the superstring from that order.
Walkthrough
Use a small example:
words = ["abc", "bcd", "cde"]Pairwise overlaps:
| Append | Overlap | Extra |
|---|---|---|
"abc" -> "bcd" | "bc" | 1 |
"bcd" -> "cde" | "cd" | 1 |
"abc" -> "cde" | "c" | 2 |
The best order is:
["abc", "bcd", "cde"]Build:
"abc" + "d" + "e" = "abcde"The result contains all three words.
Correctness
For a fixed set of words mask and ending word last, dp[mask][last] stores the minimum possible superstring length among all orderings that use exactly the words in mask and end at last.
The base case is correct because a set with one word has only one possible superstring: the word itself.
For the transition, suppose an optimal ordering for new_mask ends at nxt. The word before nxt must be some last in the smaller set mask. The best length for that smaller ordering is dp[mask][last], and appending nxt after last adds exactly cost[last][nxt] characters.
The algorithm tries every possible previous ending word and every possible next word, so it considers the final step of every valid ordering.
Therefore, by induction over subset size, every DP state stores the optimal length for that state.
At the end, every word must be included, so the mask is (1 << n) - 1. The answer may end at any word, so choosing the minimum over all ending words gives the shortest possible superstring length.
The parent table records the transitions that produced the optimal DP value, so reconstructing the path gives an ordering that achieves that shortest length. Building the string from that ordering using the same overlaps produces a shortest valid superstring.
Complexity
Let:
n = len(words)and let L be the maximum word length.
| Metric | Value | Why |
|---|---|---|
| Time | O(n^2 * L^2 + 2^n * n^2) | Pairwise overlap computation plus bitmask DP |
| Space | O(2^n * n + n^2) | DP table, parent table, and cost table |
The overlap part can be optimized, but the simple version is clear and accepted for the usual constraints.
Implementation
class Solution:
def shortestSuperstring(self, words: list[str]) -> str:
n = len(words)
cost = [[0] * n for _ in range(n)]
for i in range(n):
for j in range(n):
if i == j:
continue
overlap = 0
max_len = min(len(words[i]), len(words[j]))
for k in range(1, max_len + 1):
if words[i][-k:] == words[j][:k]:
overlap = k
cost[i][j] = len(words[j]) - overlap
full = (1 << n) - 1
INF = 10**18
dp = [[INF] * n for _ in range(1 << n)]
parent = [[-1] * n for _ in range(1 << n)]
for i in range(n):
dp[1 << i][i] = len(words[i])
for mask in range(1 << n):
for last in range(n):
if dp[mask][last] == INF:
continue
for nxt in range(n):
if mask & (1 << nxt):
continue
new_mask = mask | (1 << nxt)
new_cost = dp[mask][last] + cost[last][nxt]
if new_cost < dp[new_mask][nxt]:
dp[new_mask][nxt] = new_cost
parent[new_mask][nxt] = last
last = min(range(n), key=lambda i: dp[full][i])
order = []
mask = full
while last != -1:
order.append(last)
prev = parent[mask][last]
mask ^= 1 << last
last = prev
order.reverse()
ans = words[order[0]]
for a, b in zip(order, order[1:]):
ans += words[b][len(words[b]) - cost[a][b]:]
return ansCode Explanation
We first compute how many extra characters are needed to append one word after another:
cost[i][j] = len(words[j]) - overlapThen we create the DP table:
dp = [[INF] * n for _ in range(1 << n)]Each state is a subset of words and the last word in the current ordering.
The base case is:
dp[1 << i][i] = len(words[i])Then we try adding each unused word:
new_mask = mask | (1 << nxt)
new_cost = dp[mask][last] + cost[last][nxt]If this improves the result, we store the previous word:
parent[new_mask][nxt] = lastAfter filling the DP table, we choose the best final word:
last = min(range(n), key=lambda i: dp[full][i])Then we walk backward through parent to recover the word order.
Finally, we build the actual answer by appending only the non-overlapping suffix of each next word:
ans += words[b][len(words[b]) - cost[a][b]:]Testing
def contains_all(result, words):
return all(word in result for word in words)
def run_tests():
s = Solution()
words = ["alex", "loves", "leetcode"]
ans = s.shortestSuperstring(words)
assert contains_all(ans, words)
assert len(ans) == len("alexlovesleetcode")
words = ["catg", "ctaagt", "gcta", "ttca", "atgcatc"]
ans = s.shortestSuperstring(words)
assert contains_all(ans, words)
assert len(ans) == len("gctaagttcatgcatc")
words = ["abc", "bcd", "cde"]
ans = s.shortestSuperstring(words)
assert contains_all(ans, words)
assert ans == "abcde"
words = ["ab", "ba"]
ans = s.shortestSuperstring(words)
assert contains_all(ans, words)
assert len(ans) == 3
print("all tests passed")
run_tests()| Test | Why |
|---|---|
["alex","loves","leetcode"] | No useful overlaps |
["catg","ctaagt","gcta","ttca","atgcatc"] | Standard hard example |
["abc","bcd","cde"] | Clean chain overlap |
["ab","ba"] | Cyclic overlap with multiple valid answers |