LeetCode 616: Add Bold Tag in String

Problem Restatement

We are given a string s and an array of strings words.

We need to wrap every substring of s that appears in words with a pair of bold tags:

<b>...</b>

If two matching substrings overlap, they should be wrapped together with one pair of tags.

If two matching substrings are adjacent, they should also be combined into one bold region.

Return the final string after inserting the bold tags.

The problem states that substrings in s matching words should be wrapped with  and , with overlapping or consecutive regions merged.

Input and Output

Function signature:

def addBoldTag(s: str, words: list[str]) -> str:
    ...

Input:

Parameter	Meaning
`s`	Original string
`words`	List of words to bold when they appear as substrings

Output:

Type	Meaning
`str`	String with bold tags inserted

Examples

Example 1:

s = "abcxyz123"
words = ["abc", "123"]

The matching substrings are:

Match	Range
`"abc"`	indices `0..2`
`"123"`	indices `6..8`

They are separate, so the output is:

"<b>abc</b>xyz<b>123</b>"

Example 2:

s = "aaabbcc"
words = ["aaa", "aab", "bc"]

Matches:

Word	Covered Range
`"aaa"`	`0..2`
`"aab"`	`1..3`
`"bc"`	`4..5`

The first two matches overlap, and the third match is adjacent to the merged range.

So the final bold range is:

0..5

Output:

"<b>aaabbc</b>c"

First Thought: Find Intervals and Merge Them

Each word occurrence creates an interval in s.

For example:

s = "aaabbcc"
words = ["aaa", "aab", "bc"]

The intervals are:

[0, 2], [1, 3], [4, 5]

After merging overlapping or adjacent intervals:

[0, 5]

Then we insert  before index 0 and  after index 5.

This works, but there is an even simpler way: mark each character that should be bold.

Key Insight

Create a boolean array bold with the same length as s.

bold[i] = True

means character s[i] should be inside bold tags.

For every word occurrence, mark all covered positions as True.

After all matches are processed, consecutive True positions automatically represent one merged bold region.

This handles both overlap and adjacency without an explicit interval merge step.

Algorithm

Create:

bold = [False] * len(s)

For each index i in s:

For each word in words.
Check whether s starts with that word at index i.
If yes, mark the covered positions in bold.

Then build the answer:

Scan s from left to right.
If bold[i] is true and either i == 0 or bold[i - 1] is false, append .
Append s[i].
If bold[i] is true and either i == len(s) - 1 or bold[i + 1] is false, append .

Implementation

class Solution:
    def addBoldTag(self, s: str, words: list[str]) -> str:
        n = len(s)
        bold = [False] * n

        for i in range(n):
            for word in words:
                if s.startswith(word, i):
                    for j in range(i, i + len(word)):
                        bold[j] = True

        result = []

        for i, ch in enumerate(s):
            if bold[i] and (i == 0 or not bold[i - 1]):
                result.append("<b>")

            result.append(ch)

            if bold[i] and (i == n - 1 or not bold[i + 1]):
                result.append("</b>")

        return "".join(result)

Code Explanation

We create one boolean flag per character:

bold = [False] * n

Then we check every starting position:

for i in range(n):

For each word, we test whether the word begins at index i:

if s.startswith(word, i):

When a match is found, we mark every covered index:

for j in range(i, i + len(word)):
    bold[j] = True

Now all bold regions are encoded in the bold array.

When building the result, this condition detects the start of a bold region:

if bold[i] and (i == 0 or not bold[i - 1]):
    result.append("<b>")

This condition detects the end of a bold region:

if bold[i] and (i == n - 1 or not bold[i + 1]):
    result.append("</b>")

Because adjacent and overlapping matches create continuous True values, they naturally receive one pair of tags.

Correctness

Every substring match from words is detected by checking each word at each possible starting index.

When a match is found, the algorithm marks exactly the characters covered by that substring as bold.

Therefore, after the marking phase, bold[i] is true exactly when character s[i] belongs to at least one matched substring.

During output construction, the algorithm inserts  only at the first character of each maximal continuous bold region, and inserts  only after the last character of that region.

Because overlapping and adjacent matches produce one continuous bold region in the bold array, they are wrapped with one pair of tags.

Therefore, the final string contains exactly the required bold tags.

Complexity

Let:

Symbol	Meaning
`n`	Length of `s`
`m`	Number of words
`k`	Average word length

Metric	Value	Why
Time	`O(n * m * k)`	For each position, we may check each word
Space	`O(n)`	The `bold` array and output buffer are used

The constraints allow this direct marking solution because s.length <= 1000 and words.length <= 100.

Alternative: Interval Merging

We can also collect intervals first.

class Solution:
    def addBoldTag(self, s: str, words: list[str]) -> str:
        intervals = []

        for i in range(len(s)):
            for word in words:
                if s.startswith(word, i):
                    intervals.append((i, i + len(word)))

        if not intervals:
            return s

        intervals.sort()

        merged = []
        start, end = intervals[0]

        for next_start, next_end in intervals[1:]:
            if next_start <= end:
                end = max(end, next_end)
            else:
                merged.append((start, end))
                start, end = next_start, next_end

        merged.append((start, end))

        result = []
        prev = 0

        for start, end in merged:
            result.append(s[prev:start])
            result.append("<b>")
            result.append(s[start:end])
            result.append("</b>")
            prev = end

        result.append(s[prev:])

        return "".join(result)

This version treats intervals as half-open ranges:

[start, end)

Adjacent intervals merge because:

next_start <= end

Testing

def run_tests():
    sol = Solution()

    assert sol.addBoldTag(
        "abcxyz123",
        ["abc", "123"]
    ) == "<b>abc</b>xyz<b>123</b>"

    assert sol.addBoldTag(
        "aaabbcc",
        ["aaa", "aab", "bc"]
    ) == "<b>aaabbc</b>c"

    assert sol.addBoldTag(
        "aaabbb",
        ["aa", "b"]
    ) == "<b>aaabbb</b>"

    assert sol.addBoldTag(
        "abcdef",
        ["gh"]
    ) == "abcdef"

    assert sol.addBoldTag(
        "abc",
        ["abc"]
    ) == "<b>abc</b>"

    print("all tests passed")

run_tests()

Test coverage:

Case	Why
Separate matches	Confirms multiple bold regions
Overlapping and adjacent matches	Confirms merge behavior
Many adjacent single-character matches	Confirms continuous tag region
No matches	Confirms original string is returned
Whole string match	Confirms tags at both boundaries