Skip to content

LeetCode 880: Decoded String at Index

A clear explanation of finding the kth character in a decoded string without building the full decoded string.

Problem Restatement

We are given an encoded string s and an integer k.

The string is decoded from left to right using these rules:

CharacterMeaning
LetterAppend the letter to the tape
Digit dRepeat the entire current tape d - 1 more times

Return the kth character of the final decoded string, using 1-based indexing.

The decoded string can be extremely large, so we must not build it directly. The problem guarantees that k is within the decoded length, and the decoded length is less than 2^63.

Input and Output

ItemMeaning
InputEncoded string s and integer k
OutputThe kth decoded character
Indexing1-based
DigitsOnly 2 through 9
StringStarts with a lowercase English letter

Function shape:

def decodeAtIndex(self, s: str, k: int) -> str:
    ...

Examples

Example 1:

Input: s = "leet2code3", k = 10
Output: "o"

The decoded string is:

leetleetcodeleetleetcodeleetleetcode

The 10th character is "o".

Example 2:

Input: s = "ha22", k = 5
Output: "h"

Decode step by step:

h
ha
haha
hahahaha

The 5th character is "h".

Example 3:

Input: s = "a2345678999999999999999", k = 1
Output: "a"

The decoded string is a huge repetition of "a", so the first character is "a".

First Thought: Build the Decoded String

The direct approach is to simulate the tape.

When we see a letter, append it.

When we see a digit, repeat the whole current string.

class Solution:
    def decodeAtIndex(self, s: str, k: int) -> str:
        decoded = ""

        for ch in s:
            if ch.isdigit():
                decoded *= int(ch)
            else:
                decoded += ch

        return decoded[k - 1]

This matches the decoding rule, but it fails for large inputs.

For example:

s = "a2345678999999999999999"

The decoded string is far too large to store in memory.

Problem With Direct Decoding

The encoded string has length at most 100, but the decoded string may contain billions or trillions of characters.

So the encoded input is small, but the expanded output can be enormous.

We only need one character. We should track lengths and positions instead of constructing the decoded string.

Key Insight

First compute the length of the decoded string.

Let size be the decoded length after reading some prefix of s.

When we read a letter:

size += 1

When we read a digit d:

size *= d

After this forward pass, size is the total decoded length.

Then work backward through s.

The idea is to reverse the encoding process.

If the current character is a digit d, then the current decoded tape was created by repeating the previous tape d times.

So if the current length is size, the previous length was:

size // d

The target index k can be mapped back into one copy using modulo.

If the current character is a letter, then it was appended at the end of the previous tape.

If k points to the end of the current tape, that letter is the answer.

Algorithm

First pass:

  1. Set size = 0.
  2. For each character ch in s:
    1. If ch is a digit, multiply size by that digit.
    2. Otherwise, add 1 to size.

Second pass, from right to left:

  1. Reduce k using:
k %= size
  1. If k == 0 and the current character is a letter, return it.
  2. If the current character is a digit d, undo the repeat:
size //= d
  1. If the current character is a letter, undo the append:
size -= 1

Walkthrough

Use:

s = "leet2code3"
k = 10

First compute decoded lengths:

CharacterOperationSize
ladd letter1
eadd letter2
eadd letter3
tadd letter4
2repeat tape twice8
cadd letter9
oadd letter10
dadd letter11
eadd letter12
3repeat tape three times36

Now go backward.

At size 36, the 10th character in the full string is the same as the 10th character inside one copy of length 12, because the final 3 repeats a length-12 tape.

Then we inspect backward through code.

When we reach the letter o, the current size is 10.

Since:

10 % 10 == 0

and the current character is a letter, the answer is "o".

Correctness

The forward pass computes the decoded length after each prefix of the encoded string.

For a letter, the decoded tape gains one character, so increasing size by 1 is correct.

For a digit d, the current tape is repeated until there are d total copies, so multiplying size by d is correct.

Now consider the reverse pass.

At any reverse step, size is the decoded length produced by the prefix ending at the current character. The target k refers to a position inside that decoded tape.

If the current character is a digit d, the current tape consists of d identical copies of the previous tape. Therefore, position k in the current tape corresponds to position k % previous_size in the previous tape, with remainder 0 meaning the last character of a copy. Since previous_size = size // d, dividing size by d correctly restores the previous tape length.

If the current character is a letter, then that letter is the final character of the current tape. If k % size == 0, then k points exactly to this final character, so the current letter is the answer. Otherwise, the target lies earlier in the tape, and reducing size by 1 correctly removes this appended letter from consideration.

The loop applies these reverse steps until it finds the unique letter that occupies the requested decoded position. Therefore, the algorithm returns the correct character.

Complexity

MetricValueWhy
TimeO(n)One forward pass and one backward pass over s
SpaceO(1)Only size, k, and loop variables are stored

Implementation

class Solution:
    def decodeAtIndex(self, s: str, k: int) -> str:
        size = 0

        for ch in s:
            if ch.isdigit():
                size *= int(ch)
            else:
                size += 1

        for ch in reversed(s):
            k %= size

            if k == 0 and ch.isalpha():
                return ch

            if ch.isdigit():
                size //= int(ch)
            else:
                size -= 1

        return ""

Code Explanation

We first compute the decoded length:

size = 0

for ch in s:
    if ch.isdigit():
        size *= int(ch)
    else:
        size += 1

This does not build the decoded string. It only records how long the decoded string would be.

Then we walk backward:

for ch in reversed(s):

At each step, we map k into the current decoded length:

k %= size

If k == 0 and the current character is a letter, then the requested position is exactly this letter:

if k == 0 and ch.isalpha():
    return ch

If the current character is a digit, we undo the repetition:

size //= int(ch)

If the current character is a letter, we undo the append:

size -= 1

The final return "" is only a fallback. Valid LeetCode input always returns inside the loop.

Testing

def run_tests():
    s = Solution()

    assert s.decodeAtIndex("leet2code3", 10) == "o"
    assert s.decodeAtIndex("ha22", 5) == "h"
    assert s.decodeAtIndex("a2345678999999999999999", 1) == "a"
    assert s.decodeAtIndex("abc3", 8) == "b"
    assert s.decodeAtIndex("y959q969u3hb22odq595", 222280369) == "y"

    print("all tests passed")

run_tests()

Test meaning:

TestWhy
"leet2code3", 10Standard mixed letters and repeats
"ha22", 5Repeated tape more than once
Large repeated "a"Huge decoded length without building string
"abc3", 8Position inside repeated copy
Long encoded stringChecks large k and multiple reverse reductions