A clear explanation of the Positions of Large Groups problem using a simple two-pointer scan.
Problem Restatement
We are given a string s containing lowercase English letters.
Consecutive equal characters form a group.
For example:
s = "abbxxxxzyy"The groups are:
| Group | Interval |
|---|---|
"a" | [0, 0] |
"bb" | [1, 2] |
"xxxx" | [3, 6] |
"z" | [7, 7] |
"yy" | [8, 9] |
A group is large if its length is at least 3.
We need to return the start and end indices of every large group, sorted by increasing start index. Since we scan from left to right, the result naturally has this order.
Input and Output
| Item | Meaning |
|---|---|
| Input | A string s |
| Output | A list of intervals [start, end] |
| Large group | A consecutive group with length at least 3 |
| Indexing | start and end are inclusive |
Function shape:
class Solution:
def largeGroupPositions(self, s: str) -> list[list[int]]:
...Examples
Example 1:
s = "abbxxxxzzy"The groups are:
"a", "bb", "xxxx", "zz", "y"Only "xxxx" has length at least 3.
It starts at index 3 and ends at index 6.
So the answer is:
[[3, 6]]Example 2:
s = "abc"The groups are:
"a", "b", "c"No group has length at least 3.
So the answer is:
[]Example 3:
s = "abcdddeeeeaabbbcd"The large groups are:
| Group | Interval |
|---|---|
"ddd" | [3, 5] |
"eeee" | [6, 9] |
"bbb" | [12, 14] |
So the answer is:
[[3, 5], [6, 9], [12, 14]]First Thought: Count Each Group
The problem is asking us to identify runs of equal adjacent characters.
A run starts at some index i.
It continues while the next characters are the same as s[i].
Once the character changes, the run ends.
If the run length is at least 3, we add its interval to the answer.
Key Insight
We do not need any extra data structure to group the string.
Use two pointers:
| Pointer | Meaning |
|---|---|
i | Start index of the current group |
j | First index after the current group |
For each group:
- Set
j = i. - Move
jwhiles[j] == s[i]. - The group is from
itoj - 1. - Its length is
j - i. - If
j - i >= 3, add[i, j - 1]. - Move
itoj.
Algorithm
Initialize an empty answer list.
Start at index i = 0.
While i is inside the string:
- Let
j = i. - Move
jforward whilej < len(s)ands[j] == s[i]. - Check the group length.
- If the length is at least
3, append[i, j - 1]. - Set
i = jto move to the next group.
Walkthrough
Use:
s = "abbxxxxzzy"Start with i = 0.
The current character is "a".
The group is:
"a"Its interval is [0, 0], and its length is 1.
Not large.
Move to i = 1.
The current character is "b".
The group is:
"bb"Its interval is [1, 2], and its length is 2.
Not large.
Move to i = 3.
The current character is "x".
The group is:
"xxxx"Its interval is [3, 6], and its length is 4.
This is large, so we add:
[3, 6]Move to i = 7.
The current character is "z".
The group is:
"zz"Its interval is [7, 8], and its length is 2.
Not large.
Move to i = 9.
The current character is "y".
The group is:
"y"Its interval is [9, 9], and its length is 1.
Not large.
Final answer:
[[3, 6]]Correctness
The algorithm processes the string from left to right.
At the start of each iteration, i is the first index of a group that has not been processed yet.
The inner loop advances j until either the string ends or s[j] differs from s[i]. Therefore, all indices from i to j - 1 belong to the same group, and index j, if it exists, starts a different group.
So [i, j - 1] is exactly the interval of the current group.
The algorithm adds this interval exactly when:
j - i >= 3which is exactly the definition of a large group.
After processing the group, the algorithm sets:
i = jso the next iteration starts at the next unprocessed group.
Thus every group is checked once, every large group is included, and no non-large group is included.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | Each character is visited once by the scan |
| Space | O(1) extra | Apart from the output list, only pointers are used |
The output list can contain up to O(n) intervals in the worst case, but auxiliary memory is constant.
Implementation
class Solution:
def largeGroupPositions(self, s: str) -> list[list[int]]:
answer = []
n = len(s)
i = 0
while i < n:
j = i
while j < n and s[j] == s[i]:
j += 1
if j - i >= 3:
answer.append([i, j - 1])
i = j
return answerCode Explanation
We store the result here:
answer = []The pointer i marks the start of the current group:
i = 0For each group, we start j at the same position:
j = iThen we extend j while the characters stay equal:
while j < n and s[j] == s[i]:
j += 1When this loop ends, the group is:
[i, j - 1]Its length is:
j - iIf that length is at least 3, we save the interval:
if j - i >= 3:
answer.append([i, j - 1])Finally, we skip the whole group:
i = jand continue with the next group.
Testing
def run_tests():
s = Solution()
assert s.largeGroupPositions("abbxxxxzzy") == [[3, 6]]
assert s.largeGroupPositions("abc") == []
assert s.largeGroupPositions("abcdddeeeeaabbbcd") == [[3, 5], [6, 9], [12, 14]]
assert s.largeGroupPositions("aaa") == [[0, 2]]
assert s.largeGroupPositions("aaaa") == [[0, 3]]
assert s.largeGroupPositions("aaabbb") == [[0, 2], [3, 5]]
assert s.largeGroupPositions("a") == []
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
"abbxxxxzzy" | Standard case with one large group |
"abc" | No large groups |
"abcdddeeeeaabbbcd" | Multiple large groups |
"aaa" | Minimum large group length |
"aaaa" | Group longer than minimum |
"aaabbb" | Adjacent large groups |
"a" | Minimum string length |