A string-marking guide for adding bold tags around all matched words while merging overlapping and adjacent bold regions.
Problem Restatement
We are given a string s and an array of strings words.
We need to wrap every substring of s that appears in words with a pair of bold tags:
<b>...</b>If two matching substrings overlap, they should be wrapped together with one pair of tags.
If two matching substrings are adjacent, they should also be combined into one bold region.
Return the final string after inserting the bold tags.
The problem states that substrings in s matching words should be wrapped with <b> and </b>, with overlapping or consecutive regions merged.
Input and Output
Function signature:
def addBoldTag(s: str, words: list[str]) -> str:
...Input:
| Parameter | Meaning |
|---|---|
s | Original string |
words | List of words to bold when they appear as substrings |
Output:
| Type | Meaning |
|---|---|
str | String with bold tags inserted |
Examples
Example 1:
s = "abcxyz123"
words = ["abc", "123"]The matching substrings are:
| Match | Range |
|---|---|
"abc" | indices 0..2 |
"123" | indices 6..8 |
They are separate, so the output is:
"<b>abc</b>xyz<b>123</b>"Example 2:
s = "aaabbcc"
words = ["aaa", "aab", "bc"]Matches:
| Word | Covered Range |
|---|---|
"aaa" | 0..2 |
"aab" | 1..3 |
"bc" | 4..5 |
The first two matches overlap, and the third match is adjacent to the merged range.
So the final bold range is:
0..5Output:
"<b>aaabbc</b>c"First Thought: Find Intervals and Merge Them
Each word occurrence creates an interval in s.
For example:
s = "aaabbcc"
words = ["aaa", "aab", "bc"]The intervals are:
[0, 2], [1, 3], [4, 5]After merging overlapping or adjacent intervals:
[0, 5]Then we insert <b> before index 0 and </b> after index 5.
This works, but there is an even simpler way: mark each character that should be bold.
Key Insight
Create a boolean array bold with the same length as s.
bold[i] = Truemeans character s[i] should be inside bold tags.
For every word occurrence, mark all covered positions as True.
After all matches are processed, consecutive True positions automatically represent one merged bold region.
This handles both overlap and adjacency without an explicit interval merge step.
Algorithm
Create:
bold = [False] * len(s)For each index i in s:
- For each word in
words. - Check whether
sstarts with that word at indexi. - If yes, mark the covered positions in
bold.
Then build the answer:
- Scan
sfrom left to right. - If
bold[i]is true and eitheri == 0orbold[i - 1]is false, append<b>. - Append
s[i]. - If
bold[i]is true and eitheri == len(s) - 1orbold[i + 1]is false, append</b>.
Implementation
class Solution:
def addBoldTag(self, s: str, words: list[str]) -> str:
n = len(s)
bold = [False] * n
for i in range(n):
for word in words:
if s.startswith(word, i):
for j in range(i, i + len(word)):
bold[j] = True
result = []
for i, ch in enumerate(s):
if bold[i] and (i == 0 or not bold[i - 1]):
result.append("<b>")
result.append(ch)
if bold[i] and (i == n - 1 or not bold[i + 1]):
result.append("</b>")
return "".join(result)Code Explanation
We create one boolean flag per character:
bold = [False] * nThen we check every starting position:
for i in range(n):For each word, we test whether the word begins at index i:
if s.startswith(word, i):When a match is found, we mark every covered index:
for j in range(i, i + len(word)):
bold[j] = TrueNow all bold regions are encoded in the bold array.
When building the result, this condition detects the start of a bold region:
if bold[i] and (i == 0 or not bold[i - 1]):
result.append("<b>")This condition detects the end of a bold region:
if bold[i] and (i == n - 1 or not bold[i + 1]):
result.append("</b>")Because adjacent and overlapping matches create continuous True values, they naturally receive one pair of tags.
Correctness
Every substring match from words is detected by checking each word at each possible starting index.
When a match is found, the algorithm marks exactly the characters covered by that substring as bold.
Therefore, after the marking phase, bold[i] is true exactly when character s[i] belongs to at least one matched substring.
During output construction, the algorithm inserts <b> only at the first character of each maximal continuous bold region, and inserts </b> only after the last character of that region.
Because overlapping and adjacent matches produce one continuous bold region in the bold array, they are wrapped with one pair of tags.
Therefore, the final string contains exactly the required bold tags.
Complexity
Let:
| Symbol | Meaning |
|---|---|
n | Length of s |
m | Number of words |
k | Average word length |
| Metric | Value | Why |
|---|---|---|
| Time | O(n * m * k) | For each position, we may check each word |
| Space | O(n) | The bold array and output buffer are used |
The constraints allow this direct marking solution because s.length <= 1000 and words.length <= 100.
Alternative: Interval Merging
We can also collect intervals first.
class Solution:
def addBoldTag(self, s: str, words: list[str]) -> str:
intervals = []
for i in range(len(s)):
for word in words:
if s.startswith(word, i):
intervals.append((i, i + len(word)))
if not intervals:
return s
intervals.sort()
merged = []
start, end = intervals[0]
for next_start, next_end in intervals[1:]:
if next_start <= end:
end = max(end, next_end)
else:
merged.append((start, end))
start, end = next_start, next_end
merged.append((start, end))
result = []
prev = 0
for start, end in merged:
result.append(s[prev:start])
result.append("<b>")
result.append(s[start:end])
result.append("</b>")
prev = end
result.append(s[prev:])
return "".join(result)This version treats intervals as half-open ranges:
[start, end)Adjacent intervals merge because:
next_start <= endTesting
def run_tests():
sol = Solution()
assert sol.addBoldTag(
"abcxyz123",
["abc", "123"]
) == "<b>abc</b>xyz<b>123</b>"
assert sol.addBoldTag(
"aaabbcc",
["aaa", "aab", "bc"]
) == "<b>aaabbc</b>c"
assert sol.addBoldTag(
"aaabbb",
["aa", "b"]
) == "<b>aaabbb</b>"
assert sol.addBoldTag(
"abcdef",
["gh"]
) == "abcdef"
assert sol.addBoldTag(
"abc",
["abc"]
) == "<b>abc</b>"
print("all tests passed")
run_tests()Test coverage:
| Case | Why |
|---|---|
| Separate matches | Confirms multiple bold regions |
| Overlapping and adjacent matches | Confirms merge behavior |
| Many adjacent single-character matches | Confirms continuous tag region |
| No matches | Confirms original string is returned |
| Whole string match | Confirms tags at both boundaries |