A clear explanation of partitioning a string into the maximum number of parts so each character appears in at most one part.
Problem Restatement
We are given a string s.
We need to split it into as many parts as possible so that each letter appears in at most one part.
After concatenating all parts in order, we must get the original string back.
Return the sizes of the parts.
Input and Output
| Item | Meaning |
|---|---|
| Input | A string s |
| Output | A list of partition sizes |
| Goal | Make as many partitions as possible |
| Rule | Each letter may appear in at most one partition |
Example function shape:
def partitionLabels(s: str) -> list[int]:
...Examples
Example 1:
s = "ababcbacadefegdehijhklij"Output:
[9, 7, 8]The partitions are:
"ababcbaca"
"defegde"
"hijhklij"Every letter appears in at most one part.
For example, a, b, and c appear only in the first part. The letters d, e, f, and g appear only in the second part. The letters h, i, j, k, and l appear only in the third part.
Example 2:
s = "eccbbbbdec"Output:
[10]The whole string must stay together because several letters force the partition to extend to the end.
First Thought: Cut Whenever Possible
We want many partitions, so we want each partition to be as small as possible.
But we cannot cut after an index if some character in the current part appears again later.
For example:
ababcbacaOnce the partition includes a, it must include the last a.
Once it includes b, it must include the last b.
Once it includes c, it must include the last c.
So a partition can end only when every character seen in it has already reached its final occurrence.
Key Insight
For every character, record its last position in the string.
Then scan from left to right.
While scanning the current partition, keep the farthest last occurrence of any character seen so far.
Call this value:
endThe current partition cannot end before end.
When the current index reaches end, all characters in the current partition have no occurrence outside it. So we can cut there.
This greedy cut is always best because it creates the smallest valid current partition, which leaves the rest of the string available for more partitions.
Last Occurrence Map
Build a dictionary:
last = {}For every character, store its final index.
for i, ch in enumerate(s):
last[ch] = iFor:
s = "ababcbaca"the last positions are:
| Character | Last Index |
|---|---|
a | 8 |
b | 5 |
c | 7 |
So the first partition must go at least to index 8.
Algorithm
- Store the last index of each character.
- Initialize:
start = 0end = 0answer = []
- Scan the string with index
i. - For each character
ch, update:
end = max(end, last[ch])- If
i == end, the current partition is complete. - Append its size:
end - start + 1- Start the next partition at
i + 1. - Return
answer.
Walkthrough
Take:
s = "ababcbacadefegdehijhklij"The first character is a.
The last a is at index 8, so the first partition must reach at least 8.
As we scan, we also see b and c.
Their last positions are inside the same range:
a last index = 8
b last index = 5
c last index = 7So the first partition ends at index 8.
Its size is:
8 - 0 + 1 = 9The next partition starts at index 9.
It contains d, e, f, and g, and must end at index 15.
Its size is:
15 - 9 + 1 = 7The final partition has size 8.
So the result is:
[9, 7, 8]Correctness
For each character, last[ch] is the final index where that character appears in the string.
During the scan of a partition, end is the maximum last occurrence of all characters seen in the current partition. Therefore, the partition cannot legally end before end, because at least one character would appear again outside the partition.
When the scan reaches index end, every character seen in the current partition has its final occurrence at or before end. Therefore, no character in this partition appears later, so cutting here is valid.
The algorithm cuts at the earliest valid position. This creates the smallest possible current partition. Since the current partition cannot end earlier, no other valid solution can create more partitions before this point.
After making this cut, the same reasoning applies to the remaining suffix of the string.
Therefore, the greedy algorithm returns the maximum possible number of valid partitions and their correct sizes.
Complexity
Let n = len(s).
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | We scan the string twice |
| Space | O(1) | There are only 26 lowercase English letters |
If written for a general character set, the space complexity is O(m), where m is the number of distinct characters.
Implementation
class Solution:
def partitionLabels(self, s: str) -> list[int]:
last = {}
for i, ch in enumerate(s):
last[ch] = i
answer = []
start = 0
end = 0
for i, ch in enumerate(s):
end = max(end, last[ch])
if i == end:
answer.append(end - start + 1)
start = i + 1
return answerCode Explanation
First, we record the last occurrence of each character:
last = {}
for i, ch in enumerate(s):
last[ch] = iIf a character appears multiple times, the dictionary value is overwritten, so the final stored value is its last index.
Then we scan the string again.
start = 0
end = 0start is the beginning of the current partition.
end is the farthest index the current partition must reach.
For each character:
end = max(end, last[ch])If this character appears later, we extend the current partition.
When:
if i == end:we know all characters in the current partition are fully contained inside it.
So we append the partition size:
answer.append(end - start + 1)Then the next partition begins after the current index:
start = i + 1Finally:
return answerreturns all partition lengths.
Testing
def run_tests():
s = Solution()
assert s.partitionLabels("ababcbacadefegdehijhklij") == [9, 7, 8]
assert s.partitionLabels("eccbbbbdec") == [10]
assert s.partitionLabels("abc") == [1, 1, 1]
assert s.partitionLabels("aaaa") == [4]
assert s.partitionLabels("abac") == [3, 1]
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
| Main example | Confirms standard multi-part partition |
"eccbbbbdec" | Whole string must stay together |
"abc" | Every character appears once |
"aaaa" | Repeated character forces one partition |
"abac" | First and last partition sizes differ |