A clear explanation of finding the kth character in a decoded string without building the full decoded string.
Problem Restatement
We are given an encoded string s and an integer k.
The string is decoded from left to right using these rules:
| Character | Meaning |
|---|---|
| Letter | Append the letter to the tape |
Digit d | Repeat the entire current tape d - 1 more times |
Return the kth character of the final decoded string, using 1-based indexing.
The decoded string can be extremely large, so we must not build it directly. The problem guarantees that k is within the decoded length, and the decoded length is less than 2^63.
Input and Output
| Item | Meaning |
|---|---|
| Input | Encoded string s and integer k |
| Output | The kth decoded character |
| Indexing | 1-based |
| Digits | Only 2 through 9 |
| String | Starts with a lowercase English letter |
Function shape:
def decodeAtIndex(self, s: str, k: int) -> str:
...Examples
Example 1:
Input: s = "leet2code3", k = 10
Output: "o"The decoded string is:
leetleetcodeleetleetcodeleetleetcodeThe 10th character is "o".
Example 2:
Input: s = "ha22", k = 5
Output: "h"Decode step by step:
h
ha
haha
hahahahaThe 5th character is "h".
Example 3:
Input: s = "a2345678999999999999999", k = 1
Output: "a"The decoded string is a huge repetition of "a", so the first character is "a".
First Thought: Build the Decoded String
The direct approach is to simulate the tape.
When we see a letter, append it.
When we see a digit, repeat the whole current string.
class Solution:
def decodeAtIndex(self, s: str, k: int) -> str:
decoded = ""
for ch in s:
if ch.isdigit():
decoded *= int(ch)
else:
decoded += ch
return decoded[k - 1]This matches the decoding rule, but it fails for large inputs.
For example:
s = "a2345678999999999999999"The decoded string is far too large to store in memory.
Problem With Direct Decoding
The encoded string has length at most 100, but the decoded string may contain billions or trillions of characters.
So the encoded input is small, but the expanded output can be enormous.
We only need one character. We should track lengths and positions instead of constructing the decoded string.
Key Insight
First compute the length of the decoded string.
Let size be the decoded length after reading some prefix of s.
When we read a letter:
size += 1When we read a digit d:
size *= dAfter this forward pass, size is the total decoded length.
Then work backward through s.
The idea is to reverse the encoding process.
If the current character is a digit d, then the current decoded tape was created by repeating the previous tape d times.
So if the current length is size, the previous length was:
size // dThe target index k can be mapped back into one copy using modulo.
If the current character is a letter, then it was appended at the end of the previous tape.
If k points to the end of the current tape, that letter is the answer.
Algorithm
First pass:
- Set
size = 0. - For each character
chins:- If
chis a digit, multiplysizeby that digit. - Otherwise, add
1tosize.
- If
Second pass, from right to left:
- Reduce
kusing:
k %= size- If
k == 0and the current character is a letter, return it. - If the current character is a digit
d, undo the repeat:
size //= d- If the current character is a letter, undo the append:
size -= 1Walkthrough
Use:
s = "leet2code3"
k = 10First compute decoded lengths:
| Character | Operation | Size |
|---|---|---|
l | add letter | 1 |
e | add letter | 2 |
e | add letter | 3 |
t | add letter | 4 |
2 | repeat tape twice | 8 |
c | add letter | 9 |
o | add letter | 10 |
d | add letter | 11 |
e | add letter | 12 |
3 | repeat tape three times | 36 |
Now go backward.
At size 36, the 10th character in the full string is the same as the 10th character inside one copy of length 12, because the final 3 repeats a length-12 tape.
Then we inspect backward through code.
When we reach the letter o, the current size is 10.
Since:
10 % 10 == 0and the current character is a letter, the answer is "o".
Correctness
The forward pass computes the decoded length after each prefix of the encoded string.
For a letter, the decoded tape gains one character, so increasing size by 1 is correct.
For a digit d, the current tape is repeated until there are d total copies, so multiplying size by d is correct.
Now consider the reverse pass.
At any reverse step, size is the decoded length produced by the prefix ending at the current character. The target k refers to a position inside that decoded tape.
If the current character is a digit d, the current tape consists of d identical copies of the previous tape. Therefore, position k in the current tape corresponds to position k % previous_size in the previous tape, with remainder 0 meaning the last character of a copy. Since previous_size = size // d, dividing size by d correctly restores the previous tape length.
If the current character is a letter, then that letter is the final character of the current tape. If k % size == 0, then k points exactly to this final character, so the current letter is the answer. Otherwise, the target lies earlier in the tape, and reducing size by 1 correctly removes this appended letter from consideration.
The loop applies these reverse steps until it finds the unique letter that occupies the requested decoded position. Therefore, the algorithm returns the correct character.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | One forward pass and one backward pass over s |
| Space | O(1) | Only size, k, and loop variables are stored |
Implementation
class Solution:
def decodeAtIndex(self, s: str, k: int) -> str:
size = 0
for ch in s:
if ch.isdigit():
size *= int(ch)
else:
size += 1
for ch in reversed(s):
k %= size
if k == 0 and ch.isalpha():
return ch
if ch.isdigit():
size //= int(ch)
else:
size -= 1
return ""Code Explanation
We first compute the decoded length:
size = 0
for ch in s:
if ch.isdigit():
size *= int(ch)
else:
size += 1This does not build the decoded string. It only records how long the decoded string would be.
Then we walk backward:
for ch in reversed(s):At each step, we map k into the current decoded length:
k %= sizeIf k == 0 and the current character is a letter, then the requested position is exactly this letter:
if k == 0 and ch.isalpha():
return chIf the current character is a digit, we undo the repetition:
size //= int(ch)If the current character is a letter, we undo the append:
size -= 1The final return "" is only a fallback. Valid LeetCode input always returns inside the loop.
Testing
def run_tests():
s = Solution()
assert s.decodeAtIndex("leet2code3", 10) == "o"
assert s.decodeAtIndex("ha22", 5) == "h"
assert s.decodeAtIndex("a2345678999999999999999", 1) == "a"
assert s.decodeAtIndex("abc3", 8) == "b"
assert s.decodeAtIndex("y959q969u3hb22odq595", 222280369) == "y"
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
"leet2code3", 10 | Standard mixed letters and repeats |
"ha22", 5 | Repeated tape more than once |
Large repeated "a" | Huge decoded length without building string |
"abc3", 8 | Position inside repeated copy |
| Long encoded string | Checks large k and multiple reverse reductions |