A clear explanation of marking matching substrings and merging overlapping bold ranges.
Problem Restatement
We are given:
words
swords is a list of strings.
s is the target string.
We need to wrap substrings of s with:
<b> ... </b>if those substrings match any word in words.
There are two important rules:
- If two bold parts overlap, merge them into one bold section.
- If two bold parts are adjacent, merge them too.
Return the final formatted string.
Input and Output
| Item | Meaning |
|---|---|
| Input | words, a list of strings |
| Input | s, the target string |
| Output | The string with bold tags inserted |
| Merge Rule | Overlapping or adjacent bold ranges become one section |
Example function shape:
def boldWords(words: list[str], s: str) -> str:
...Examples
Example 1:
words = ["ab", "bc"]
s = "aabcd"Output:
"a<b>abc</b>d"Explanation:
The substring "ab" matches:
a[ab]cdThe substring "bc" matches:
aa[bc]dThese matches overlap inside "abc".
So they merge into one bold section:
<b>abc</b>Final answer:
"a<b>abc</b>d"Example 2:
words = ["ab", "cb"]
s = "aabcd"Output:
"a<b>ab</b>cd"Only "ab" appears.
So only that substring becomes bold.
First Thought: Find Every Match
We can scan the string and find every occurrence of every word.
For example:
words = ["ab", "bc"]
s = "aabcd"Matches are:
| Word | Interval |
|---|---|
"ab" | [1, 2] |
"bc" | [2, 3] |
These intervals overlap.
So after finding all matches, we need to merge the covered positions.
Key Insight
Instead of storing intervals explicitly, we can mark every character position that should be bold.
For example:
s = "aabcd"Index table:
| Index | Character |
|---|---|
0 | a |
1 | a |
2 | b |
3 | c |
4 | d |
If "ab" matches at index 1, then positions:
1, 2become bold.
If "bc" matches at index 2, then positions:
2, 3become bold.
Final marked positions:
1, 2, 3Then we build the answer by:
- inserting
<b>when entering a bold region - inserting
</b>when leaving a bold region
This automatically merges overlapping and adjacent ranges.
Marking Bold Positions
Create a boolean array:
bold = [False] * len(s)Then for every starting index and every word:
if s.startswith(word, i):mark all covered positions:
for j in range(i, i + len(word)):
bold[j] = TrueBuilding the Result
After marking positions, scan the string from left to right.
When entering a bold section:
bold[i] is True
and
(i == 0 or not bold[i - 1])insert:
<b>When leaving a bold section:
bold[i] is True
and
(i == len(s) - 1 or not bold[i + 1])insert:
</b>This merges both overlapping and adjacent ranges automatically because the whole continuous region stays inside one pair of tags.
Algorithm
- Create a boolean array
bold. - For every index
iins:- For every word in
words:- If the word starts at
i, mark all covered positions as bold.
- If the word starts at
- For every word in
- Create an answer list.
- Scan the string:
- Insert
<b>when entering a bold region. - Append the current character.
- Insert
</b>when leaving a bold region.
- Insert
- Join and return the result.
Correctness
The algorithm marks a position as bold exactly when that position belongs to at least one substring matching a word in words.
For every index i and every word, the condition:
s.startswith(word, i)is true exactly when the word appears starting at position i. The algorithm then marks all positions covered by that occurrence. Therefore, after processing all words and positions, the array bold correctly identifies all characters that must appear inside bold tags.
While building the result, the algorithm inserts <b> only when entering a maximal continuous bold region, meaning the current position is bold and the previous position is not bold.
Similarly, it inserts </b> only when leaving a maximal continuous bold region, meaning the current position is bold and the next position is not bold.
Therefore:
- every bold character appears inside bold tags
- no non-bold character appears inside bold tags
- overlapping and adjacent bold regions are merged into one continuous section
So the produced string satisfies the problem requirements.
Complexity
Let:
n = len(s)
m = number of words
L = maximum word length| Metric | Value | Why |
|---|---|---|
| Time | O(n * m * L) | Each position may compare against every word |
| Space | O(n) | The bold array stores one flag per character |
Implementation
class Solution:
def boldWords(self, words: list[str], s: str) -> str:
n = len(s)
bold = [False] * n
for i in range(n):
for word in words:
if s.startswith(word, i):
for j in range(i, i + len(word)):
bold[j] = True
ans = []
for i in range(n):
if bold[i] and (i == 0 or not bold[i - 1]):
ans.append("<b>")
ans.append(s[i])
if bold[i] and (i == n - 1 or not bold[i + 1]):
ans.append("</b>")
return "".join(ans)Code Explanation
We first create a boolean array:
bold = [False] * nEach position tells whether the corresponding character should appear inside bold tags.
For every index and every word:
if s.startswith(word, i):we mark the matching substring:
for j in range(i, i + len(word)):
bold[j] = TrueAfter all matches are processed, bold[i] is True exactly when s[i] belongs to at least one matching word.
Then we build the final string.
If we are entering a bold region:
if bold[i] and (i == 0 or not bold[i - 1]):we insert:
<b>Then we always append the current character:
ans.append(s[i])If we are leaving a bold region:
if bold[i] and (i == n - 1 or not bold[i + 1]):we insert:
</b>Finally:
return "".join(ans)builds the final formatted string.
Testing
def run_tests():
s = Solution()
assert s.boldWords(
["ab", "bc"],
"aabcd",
) == "a<b>abc</b>d"
assert s.boldWords(
["ab", "cb"],
"aabcd",
) == "a<b>ab</b>cd"
assert s.boldWords(
["aa", "b"],
"aaabb",
) == "<b>aaabb</b>"
assert s.boldWords(
["xyz"],
"abc",
) == "abc"
assert s.boldWords(
["a"],
"aaa",
) == "<b>aaa</b>"
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
| Overlapping matches | Confirms merging |
| Single match | Basic behavior |
| Adjacent matches | Adjacent bold regions must merge |
| No matches | No tags inserted |
| Repeated matches | Continuous bold section across whole string |