A clear explanation of Remove Invalid Parentheses using BFS to guarantee the minimum number of removals.
Problem Restatement
We are given a string s containing parentheses and letters.
We need to remove the minimum number of invalid parentheses so that the remaining string becomes valid.
Return all unique valid strings that can be produced with the minimum number of removals. The answer may be returned in any order.
A valid parentheses string means:
- Every
(has a matching). - Every
)matches some earlier(. - Letters do not affect validity.
For example:
s = "()())()"The valid answers are:
["(())()", "()()()"]Both remove exactly one invalid parenthesis.
Input and Output
| Item | Meaning |
|---|---|
| Input | A string s containing letters, (, and ) |
| Output | A list of unique valid strings |
| Goal | Remove the fewest parentheses possible |
| Order | Any order is accepted |
Function shape:
def removeInvalidParentheses(s: str) -> list[str]:
...Examples
Example 1:
s = "()())()"Output:
["(())()", "()()()"]We only need to remove one ).
Removing the third ) gives:
"(())()"Removing the fourth ) gives:
"()()()"Example 2:
s = "(a)())()"Output:
["(a())()", "(a)()()"]The letter a stays in the string. It has no effect on the parentheses balance.
Example 3:
s = ")("Output:
[""]Both parentheses must be removed.
First Thought: Brute Force
One direct idea is to generate every possible string by removing some parentheses.
Then we check which generated strings are valid.
Among all valid strings, we keep only the ones with the greatest length, because greatest length means fewest removals.
This works, but it generates too many strings.
For a string of length n, every parenthesis can either be kept or removed. In the worst case, this creates exponential work.
Problem With Brute Force
The problem asks for the minimum number of removals.
That phrase suggests a shortest-path style search.
Each operation removes one parenthesis.
So we can think of each string as a node, and each deletion as moving to the next level.
Level 0: original string
Level 1: strings after removing one parenthesis
Level 2: strings after removing two parentheses
Level 3: strings after removing three parentheses
The first level that contains valid strings gives the minimum number of removals.
This is exactly what BFS is good at.
Key Insight
Use BFS.
At each BFS level, remove one parenthesis from every current string.
Before expanding the next level, check whether any string at the current level is valid.
Once we find valid strings at some level, we stop.
We do not need to search deeper, because deeper levels remove more parentheses.
Validity Check
To check whether a string has valid parentheses, scan from left to right.
Use a counter called balance.
When we see (, increase balance.
When we see ), decrease balance.
If balance ever becomes negative, we saw a closing parenthesis before a matching opening parenthesis.
At the end, the string is valid only if balance == 0.
def is_valid(expr: str) -> bool:
balance = 0
for ch in expr:
if ch == "(":
balance += 1
elif ch == ")":
balance -= 1
if balance < 0:
return False
return balance == 0Letters are ignored.
Algorithm
Start with the original string in a queue.
Also keep a visited set so we do not process the same string multiple times.
Then repeat:
- Process all strings in the current BFS level.
- Check which strings are valid.
- If any valid strings are found, return them.
- Otherwise, generate the next level by removing one parenthesis from each string.
- Skip removing letters, because only parentheses can make the string invalid.
- Skip strings already seen.
The moment we find valid strings, they are guaranteed to use the minimum number of removals.
Correctness
BFS processes strings by number of removals.
The original string uses 0 removals.
All strings generated from it use 1 removal.
All strings generated after that use 2 removals.
So when BFS first finds a valid string, no valid string with fewer removals exists. If such a string existed, BFS would have found it at an earlier level.
At the level where we find valid strings, we collect all valid strings from that level.
Because all strings in the same level use the same number of removals, every collected string has the minimum number of removals.
The visited set does not remove any needed answer. It only prevents processing the same string more than once. If the same string can be produced in multiple ways, its content is identical, so keeping one copy is enough.
Therefore, the algorithm returns exactly the unique valid strings with the minimum number of removals.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n * 2^n) | In the worst case, we may generate many substrings, and each validity check costs O(n) |
| Space | O(2^n) | The queue and visited set may store many generated strings |
The practical runtime is usually better because BFS stops as soon as it finds the first valid level.
Implementation
from collections import deque
class Solution:
def removeInvalidParentheses(self, s: str) -> list[str]:
def is_valid(expr: str) -> bool:
balance = 0
for ch in expr:
if ch == "(":
balance += 1
elif ch == ")":
balance -= 1
if balance < 0:
return False
return balance == 0
queue = deque([s])
visited = {s}
while queue:
level_size = len(queue)
valid = []
for _ in range(level_size):
cur = queue.popleft()
if is_valid(cur):
valid.append(cur)
if valid:
continue
for i, ch in enumerate(cur):
if ch != "(" and ch != ")":
continue
nxt = cur[:i] + cur[i + 1:]
if nxt not in visited:
visited.add(nxt)
queue.append(nxt)
if valid:
return valid
return [""]Code Explanation
We first define is_valid.
balance = 0balance counts unmatched opening parentheses.
When we see (, we add one.
if ch == "(":
balance += 1When we see ), we subtract one.
elif ch == ")":
balance -= 1If balance becomes negative, the string has too many closing parentheses too early.
if balance < 0:
return FalseAt the end, the string is valid only when all opening parentheses have been matched.
return balance == 0Now we start BFS.
queue = deque([s])
visited = {s}The queue stores strings to process.
The visited set prevents duplicate work.
For each BFS level, we store how many strings are currently in the queue.
level_size = len(queue)
valid = []This matters because we only want to process one removal depth at a time.
For each string in the current level, we check validity.
if is_valid(cur):
valid.append(cur)If a valid string is found, we still finish checking the rest of the current level, because other strings at the same level may also be valid answers.
But we do not expand this string further.
if valid:
continueThis avoids generating deeper strings after we already know the current level has valid answers.
To generate the next level, remove one parenthesis at a time.
for i, ch in enumerate(cur):
if ch != "(" and ch != ")":
continue
nxt = cur[:i] + cur[i + 1:]We do not remove letters, because letters cannot cause invalid parentheses.
If the new string has not been seen, add it to the queue.
if nxt not in visited:
visited.add(nxt)
queue.append(nxt)After processing the full level, return all valid strings if any were found.
if valid:
return validTesting
def normalize(ans):
return sorted(ans)
def run_tests():
s = Solution()
assert normalize(s.removeInvalidParentheses("()())()")) == normalize([
"(())()",
"()()()",
])
assert normalize(s.removeInvalidParentheses("(a)())()")) == normalize([
"(a())()",
"(a)()()",
])
assert normalize(s.removeInvalidParentheses(")(")) == normalize([
"",
])
assert normalize(s.removeInvalidParentheses("abc")) == normalize([
"abc",
])
assert normalize(s.removeInvalidParentheses("(((")) == normalize([
"",
])
assert normalize(s.removeInvalidParentheses("(()")) == normalize([
"()",
])
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
"()())()" | Standard case with two valid answers |
"(a)())()" | Letters must be preserved |
")(" | All parentheses must be removed |
"abc" | String with no parentheses is already valid |
"(((" | Only opening parentheses |
"(()" | Remove one unmatched opening parenthesis |