A guide to implementing a lazy iterator over a run-length encoded string without fully decompressing it.
Problem Restatement
We need to design a class called StringIterator.
The input is a compressed string. The compressed string has repeated groups of:
letter + positive integerThe integer tells us how many times that letter appears in the original uncompressed string.
For example:
"L1e2t1C1o1d1e1"represents:
"LeetCode"because:
| Compressed Part | Meaning |
|---|---|
L1 | L appears once |
e2 | e appears twice |
t1 | t appears once |
C1 | C appears once |
o1 | o appears once |
d1 | d appears once |
e1 | e appears once |
The class must support two operations:
| Method | Behavior |
|---|---|
next() | Return the next character from the uncompressed string. If no character remains, return a single space " " |
hasNext() | Return True if there is still at least one character left, otherwise return False |
The official problem defines the compressed string as each letter followed by a positive integer, and asks us to implement next() and hasNext().
Input and Output
Constructor input:
StringIterator(compressedString)Method outputs:
| Call | Output |
|---|---|
next() | A one-character string, or " " if exhausted |
hasNext() | Boolean |
Example object usage:
iterator = StringIterator("L1e2t1C1o1d1e1")Example
Calls:
StringIterator("L1e2t1C1o1d1e1")
next()
next()
next()
next()
next()
next()
hasNext()
next()
hasNext()Output:
null
"L"
"e"
"e"
"t"
"C"
"o"
true
"d"
trueThe iterator expands characters only when needed.
It does not need to build the full string "LeetCode" in memory.
First Thought: Fully Decompress the String
A simple approach is to expand the compressed string during initialization.
For example:
"a3b2"becomes:
"aaabb"Then next() just returns the next character from this expanded string.
This is easy, but it can waste memory.
For example:
"a1000000"represents one million characters. Building that full string is unnecessary because the iterator may only ask for a few characters.
Key Insight
We should keep the compressed form and consume it lazily.
At any time, the iterator only needs to know:
| State | Meaning |
|---|---|
index | Current position inside the compressed string |
current_char | Character currently being returned |
remaining | How many times current_char is still available |
When remaining becomes 0, we parse the next letter and its count.
This gives us an iterator that avoids full decompression.
Algorithm
Store the compressed string.
Initialize:
self.s = compressedString
self.index = 0
self.current_char = " "
self.remaining = 0For next():
- If
hasNext()is false, return" ". - If
remaining == 0, parse the next letter and its count fromself.s. - Decrease
remainingby1. - Return
current_char.
For hasNext():
Return true if either:
- The current parsed character still has remaining copies.
- There is still unparsed compressed input.
That condition is:
self.remaining > 0 or self.index < len(self.s)Implementation
class StringIterator:
def __init__(self, compressedString: str):
self.s = compressedString
self.index = 0
self.current_char = " "
self.remaining = 0
def next(self) -> str:
if not self.hasNext():
return " "
if self.remaining == 0:
self.current_char = self.s[self.index]
self.index += 1
count = 0
while self.index < len(self.s) and self.s[self.index].isdigit():
count = count * 10 + int(self.s[self.index])
self.index += 1
self.remaining = count
self.remaining -= 1
return self.current_char
def hasNext(self) -> bool:
return self.remaining > 0 or self.index < len(self.s)Code Explanation
The constructor stores the compressed string and initializes the iterator state:
self.s = compressedString
self.index = 0
self.current_char = " "
self.remaining = 0The variable index points to the next unread position in the compressed string.
The variable remaining stores how many copies of current_char are still available.
The first check in next() handles exhaustion:
if not self.hasNext():
return " "If there are no more characters, the problem asks us to return a single white space.
When remaining == 0, the iterator needs to load a new compressed group:
self.current_char = self.s[self.index]
self.index += 1After reading the letter, we parse all following digits:
count = 0
while self.index < len(self.s) and self.s[self.index].isdigit():
count = count * 10 + int(self.s[self.index])
self.index += 1This supports multi-digit counts such as:
"a12"After parsing the count, next() consumes one copy:
self.remaining -= 1
return self.current_charThe hasNext() method checks both already-parsed and not-yet-parsed characters:
return self.remaining > 0 or self.index < len(self.s)This matters because the iterator may still have characters left in either place.
Correctness
The iterator processes the compressed string from left to right.
Whenever remaining == 0, it reads exactly one compressed group: one letter followed by its positive integer count. It stores the letter in current_char and stores the count in remaining.
Each call to next() returns current_char once and decreases remaining by one. Therefore, a group like a5 returns exactly five copies of "a".
After all copies of one group are consumed, remaining becomes zero, so the next call to next() parses the next compressed group.
Because groups are parsed in their original order, characters are returned in the same order as the uncompressed string.
If no parsed characters remain and no compressed input remains, hasNext() returns false, and next() returns " ". Therefore, the implementation matches the required iterator behavior.
Complexity
Let n be the length of compressedString.
| Operation | Time | Space | Why |
|---|---|---|---|
| Constructor | O(1) | O(1) | It only stores references and counters |
next() | Amortized O(1) | O(1) | Each compressed character and digit is parsed once total |
hasNext() | O(1) | O(1) | It checks two variables |
Across all calls, every character in the compressed string is scanned at most once.
Alternative: Parse All Groups First
We can also parse the compressed string during initialization into pairs:
[("L", 1), ("e", 2), ("t", 1)]Then next() walks through this list.
class StringIterator:
def __init__(self, compressedString: str):
self.groups = []
self.pos = 0
i = 0
while i < len(compressedString):
ch = compressedString[i]
i += 1
count = 0
while i < len(compressedString) and compressedString[i].isdigit():
count = count * 10 + int(compressedString[i])
i += 1
self.groups.append([ch, count])
def next(self) -> str:
if not self.hasNext():
return " "
ch, count = self.groups[self.pos]
self.groups[self.pos][1] -= 1
if self.groups[self.pos][1] == 0:
self.pos += 1
return ch
def hasNext(self) -> bool:
return self.pos < len(self.groups)This version is simple, but it uses O(n) space for the parsed groups.
The lazy version above uses less extra space.
Testing
def run_tests():
iterator = StringIterator("L1e2t1C1o1d1e1")
assert iterator.next() == "L"
assert iterator.next() == "e"
assert iterator.next() == "e"
assert iterator.next() == "t"
assert iterator.next() == "C"
assert iterator.next() == "o"
assert iterator.hasNext() is True
assert iterator.next() == "d"
assert iterator.hasNext() is True
assert iterator.next() == "e"
assert iterator.hasNext() is False
assert iterator.next() == " "
iterator = StringIterator("a12")
for _ in range(12):
assert iterator.next() == "a"
assert iterator.hasNext() is False
assert iterator.next() == " "
iterator = StringIterator("x1y1")
assert iterator.hasNext() is True
assert iterator.next() == "x"
assert iterator.hasNext() is True
assert iterator.next() == "y"
assert iterator.hasNext() is False
print("all tests passed")
run_tests()Test coverage:
| Case | Why |
|---|---|
"L1e2t1C1o1d1e1" | Official-style example |
"a12" | Multi-digit count |
"x1y1" | Several single-count groups |
| Exhausted iterator | Confirms next() returns " " |