A clear explanation of finding all words that can be formed by concatenating at least two shorter words from the same list.
Problem Restatement
We are given an array of unique strings:
wordsWe need to return all words in the array that can be formed by concatenating at least two shorter words from the same array.
The shorter words do not need to be distinct. The same word may be used more than once. The official statement defines a concatenated word as a word comprised entirely of at least two shorter words from the given array.
For example:
words = ["cat", "cats", "dog", "catsdog"]The word:
"catsdog"can be formed as:
"cats" + "dog"So it is a concatenated word.
Input and Output
| Item | Meaning |
|---|---|
| Input | words, a list of unique strings |
| Output | All concatenated words in any order |
| Concatenated word | A word made from at least two shorter words |
| Word reuse | Allowed |
| Important rule | A word cannot count as concatenated by using only itself |
Function shape:
def findAllConcatenatedWordsInADict(words: list[str]) -> list[str]:
...Examples
Example 1:
words = [
"cat",
"cats",
"catsdogcats",
"dog",
"dogcatsdog",
"hippopotamuses",
"rat",
"ratcatdogcat",
]Concatenated words:
"catsdogcats" = "cats" + "dog" + "cats"
"dogcatsdog" = "dog" + "cats" + "dog"
"ratcatdogcat" = "rat" + "cat" + "dog" + "cat"Answer:
["catsdogcats", "dogcatsdog", "ratcatdogcat"]Example 2:
words = ["cat", "dog", "catdog"]The word "catdog" can be formed as:
"cat" + "dog"Answer:
["catdog"]Example 3:
words = ["cat", "dog", "bird"]No word can be formed from at least two shorter words.
Answer:
[]First Thought: Try Every Split Recursively
For each word, try every split:
prefix + suffixIf the prefix is in the dictionary, recursively check whether the suffix can also be built.
For example:
word = "catsdogcats"Try:
"c" + "atsdogcats"
"ca" + "tsdogcats"
"cat" + "sdogcats"
"cats" + "dogcats"When we see "cats" in the dictionary, we recursively check "dogcats".
This is the right idea, but plain recursion can repeat the same suffix many times.
Problem With Plain Recursion
The same substring may be checked repeatedly.
For example:
word = "aaaaaaaaaaaaaaaa"with dictionary words:
"a", "aa", "aaa", "aaaa"There are many ways to split the same suffix.
Without dynamic programming, recursion may explore an exponential number of split sequences.
We need to cache or use iterative DP.
Key Insight
For one word, this is the same structure as the Word Break problem.
Let:
dp[i]mean:
word[:i] can be formed from dictionary wordsInitialize:
dp[0] = TrueThen for each ending position i, check whether there is a previous position j such that:
dp[j] is True
word[j:i] is in the dictionaryIf yes:
dp[i] = TrueTo avoid letting a word use itself as one piece, process words from shortest to longest.
When checking a word, the dictionary contains only shorter words already seen.
That automatically enforces the rule that a concatenated word must be made from shorter words.
Algorithm
Sort words by length.
Create an empty set:
word_set = set()Create an empty answer list.
For each word in sorted order:
- If
wordis empty, skip it. - Check whether
wordcan be formed from words currently inword_set. - If yes, append it to the answer.
- Add
wordtoword_set.
The helper can_form(word) uses Word Break DP:
dp[0] = True.- For every
ifrom1tolen(word):- Try every
jfrom0toi - 1. - If
dp[j]andword[j:i] in word_set, setdp[i] = True.
- Try every
- Return
dp[len(word)].
Walkthrough
Use:
words = ["cat", "cats", "dog", "catsdog"]Sort by length:
["cat", "dog", "cats", "catsdog"]Start:
word_set = set()
answer = []Check "cat".
No shorter words exist, so it cannot be formed.
Add "cat".
Check "dog".
It also cannot be formed.
Add "dog".
Check "cats".
The prefix "cat" exists, but suffix "s" does not.
So "cats" cannot be formed.
Add "cats".
Check "catsdog".
DP finds:
"cats" + "dog"So append "catsdog".
Final answer:
["catsdog"]Correctness
Words are processed from shortest to longest.
When checking a word, word_set contains only shorter words. Therefore, if the helper returns True, the word is formed entirely from shorter words in the original array.
The DP helper is correct because dp[i] is set to True exactly when some valid earlier prefix word[:j] can be formed and the remaining segment word[j:i] is a dictionary word.
By induction over i, dp[i] correctly describes whether word[:i] can be formed from the available shorter words.
Thus, dp[len(word)] is True exactly when the whole word can be formed by concatenating available shorter words.
Since the helper is run for every word, the algorithm returns exactly all concatenated words.
Complexity
Let:
n = len(words)
L = maximum word length| Metric | Value | Why |
|---|---|---|
| Time | O(n * L^3) in Python slicing form | Each word tries O(L^2) substrings, and slicing can cost O(L) |
| Space | O(n * L + L) | Store words plus one DP array |
In practice this passes because word lengths are moderate.
The trie version can avoid substring slicing cost, but the hash-set DP version is simpler and usually preferred for clarity.
Implementation
class Solution:
def findAllConcatenatedWordsInADict(self, words: list[str]) -> list[str]:
words.sort(key=len)
word_set = set()
answer = []
def can_form(word: str) -> bool:
if not word_set:
return False
n = len(word)
dp = [False] * (n + 1)
dp[0] = True
for i in range(1, n + 1):
for j in range(i):
if not dp[j]:
continue
if word[j:i] in word_set:
dp[i] = True
break
return dp[n]
for word in words:
if word == "":
continue
if can_form(word):
answer.append(word)
word_set.add(word)
return answerCode Explanation
We sort by length:
words.sort(key=len)This ensures that when we check a word, only shorter words have already been added to word_set.
The set:
word_set = set()allows fast lookup of dictionary words.
The DP array:
dp = [False] * (n + 1)
dp[0] = Truemeans the empty prefix can always be formed.
For each end index:
for i in range(1, n + 1):we try all split positions:
for j in range(i):If the prefix is formable and the suffix piece is a known shorter word:
if dp[j] and word[j:i] in word_set:then the prefix ending at i is formable:
dp[i] = TrueAfter checking the word, we add it to the set:
word_set.add(word)so longer words may use it later.
Alternative: DFS With Memoization
This version checks a word recursively.
class Solution:
def findAllConcatenatedWordsInADict(self, words: list[str]) -> list[str]:
word_set = set(words)
answer = []
def can_form(word: str, start: int, count: int, memo: dict[tuple[int, int], bool]) -> bool:
if start == len(word):
return count >= 2
key = (start, count)
if key in memo:
return memo[key]
for end in range(start + 1, len(word) + 1):
part = word[start:end]
if part not in word_set:
continue
if part == word and count == 0:
continue
if can_form(word, end, count + 1, memo):
memo[key] = True
return True
memo[key] = False
return False
for word in words:
if word and can_form(word, 0, 0, {}):
answer.append(word)
return answerThe sorted DP version avoids the special case where a word tries to use itself.
Testing
def run_tests():
s = Solution()
assert sorted(s.findAllConcatenatedWordsInADict([
"cat",
"cats",
"catsdogcats",
"dog",
"dogcatsdog",
"hippopotamuses",
"rat",
"ratcatdogcat",
])) == sorted([
"catsdogcats",
"dogcatsdog",
"ratcatdogcat",
])
assert s.findAllConcatenatedWordsInADict([
"cat",
"dog",
"catdog",
]) == ["catdog"]
assert s.findAllConcatenatedWordsInADict([
"cat",
"dog",
"bird",
]) == []
assert sorted(s.findAllConcatenatedWordsInADict([
"a",
"aa",
"aaa",
"aaaa",
])) == sorted([
"aa",
"aaa",
"aaaa",
])
assert s.findAllConcatenatedWordsInADict([
"constructor",
]) == []
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
| Standard example | Checks multiple concatenated words |
"catdog" | Simple two-word concatenation |
| No concatenated words | Should return empty list |
| Repeated same word | Same shorter word may be reused |
| Single word only | A word cannot be formed by itself |