Skip to content

LeetCode 316: Remove Duplicate Letters

A clear explanation of Remove Duplicate Letters using a greedy monotonic stack.

Problem Restatement

We are given a string s containing lowercase English letters.

We need to remove duplicate letters so that every distinct letter appears exactly once.

Among all valid results, return the lexicographically smallest one.

The official constraints are:

ConstraintValue
s.length1 <= s.length <= 10^4
CharactersLowercase English letters

This problem is also the same as LeetCode 1081, Smallest Subsequence of Distinct Characters. (github.com)

Input and Output

ItemMeaning
InputA lowercase string s
OutputSmallest lexicographical string containing each distinct letter once
OperationRemove characters only
Important ruleRelative order of kept characters must come from the original string

Function shape:

def removeDuplicateLetters(s: str) -> str:
    ...

Examples

Example 1:

s = "bcabc"

Output:

"abc"

We can keep the later b and later c, giving "abc".

Another possible result is "bca", but "abc" is lexicographically smaller.

Example 2:

s = "cbacdcbc"

Output:

"acdb"

The result must contain:

a, b, c, d

exactly once.

The smallest valid subsequence is:

"acdb"

We cannot simply sort the letters into "abcd", because "abcd" is not a subsequence of "cbacdcbc".

First Thought: Sort Unique Characters

A tempting idea is:

return "".join(sorted(set(s)))

For:

s = "bcabc"

this gives:

"abc"

That works for this case.

But for:

s = "cbacdcbc"

it gives:

"abcd"

This is invalid, because the letters must remain in an order possible by deleting characters from the original string.

We can delete characters, but we cannot reorder them.

Key Insight

We build the answer from left to right using a stack.

The stack stores the current best answer prefix.

When we see a new character ch, we want to place it as early as possible if it makes the result smaller.

If the top of the stack is lexicographically larger than ch, then replacing that top character with ch would make the result smaller.

But we are allowed to pop the top character only if it appears again later.

If it does not appear later, popping it would lose that required character forever.

So the pop condition is:

stack[-1] > ch and last[stack[-1]] > i

Meaning:

  1. The stack top is larger than the current character.
  2. The stack top appears again later.
  3. So we can safely remove it now and use its later occurrence.

Last Occurrence Map

Before scanning, record the last index of every character.

For:

s = "cbacdcbc"

the last occurrence map is:

CharacterLast Index
c7
b6
a2
d4

This tells us whether a character can be safely removed from the stack.

For example, at index 6, the current character is b.

If the stack top is d, we cannot pop d, because its last occurrence was index 4.

That means there is no later d.

Algorithm

  1. Record the last occurrence index of each character.

  2. Create an empty stack.

  3. Create a set used for letters already in the stack.

  4. Scan s from left to right.

  5. If the current character is already in used, skip it.

  6. Otherwise, while:

    • stack is not empty,
    • stack top is greater than current character,
    • stack top appears again later,

    pop the stack top and remove it from used.

  7. Push the current character and mark it as used.

  8. Return "".join(stack).

Walkthrough

Use:

s = "bcabc"

Last positions:

CharacterLast Index
b3
c4
a2

Start:

stack = []
used = set()

Index 0, character b.

b is unused, so push it.

stack = ["b"]

Index 1, character c.

c is larger than b, so push it.

stack = ["b", "c"]

Index 2, character a.

Stack top c is greater than a, and c appears again later at index 4, so pop c.

stack = ["b"]

Now stack top b is greater than a, and b appears again later at index 3, so pop b.

stack = []

Push a.

stack = ["a"]

Index 3, character b.

Push b.

stack = ["a", "b"]

Index 4, character c.

Push c.

stack = ["a", "b", "c"]

Final answer:

"abc"

Correctness

The stack always contains distinct characters because we skip any character already in used.

When processing a character ch, the algorithm removes larger characters from the end of the current result only when those characters appear again later. This makes the prefix lexicographically smaller while preserving the ability to include every required character.

If a larger stack-top character does not appear later, the algorithm keeps it. Removing it would make it impossible to produce a valid answer containing every distinct letter.

So every pop is safe, and every refused pop is necessary.

Because the algorithm applies this rule at every position from left to right, the earliest possible position is always given to the smallest character that can safely appear there. This is exactly the greedy condition needed for the lexicographically smallest valid subsequence.

Therefore, the final stack contains every distinct character exactly once and forms the smallest valid result.

Complexity

MetricValueWhy
TimeO(n)Each character is pushed and popped at most once
SpaceO(1)Only lowercase English letters are used

If we write space in terms of the alphabet size k, it is O(k).

Implementation

class Solution:
    def removeDuplicateLetters(self, s: str) -> str:
        last = {
            ch: i
            for i, ch in enumerate(s)
        }

        stack = []
        used = set()

        for i, ch in enumerate(s):
            if ch in used:
                continue

            while (
                stack
                and stack[-1] > ch
                and last[stack[-1]] > i
            ):
                removed = stack.pop()
                used.remove(removed)

            stack.append(ch)
            used.add(ch)

        return "".join(stack)

Code Explanation

First, record where each character appears last.

last = {
    ch: i
    for i, ch in enumerate(s)
}

The stack stores the answer being built.

stack = []

The used set tells us which characters are already in the stack.

used = set()

Scan the string:

for i, ch in enumerate(s):

If ch already exists in the stack, skip it.

if ch in used:
    continue

Now remove larger characters from the stack when safe.

while (
    stack
    and stack[-1] > ch
    and last[stack[-1]] > i
):

This condition says:

  1. There is something to remove.
  2. The current character gives a smaller lexicographical prefix.
  3. The removed character can still be used later.

When we pop, we must also remove it from used.

removed = stack.pop()
used.remove(removed)

Finally, add the current character.

stack.append(ch)
used.add(ch)

The answer is the stack joined into one string.

return "".join(stack)

Testing

def run_tests():
    s = Solution()

    assert s.removeDuplicateLetters("bcabc") == "abc"
    assert s.removeDuplicateLetters("cbacdcbc") == "acdb"
    assert s.removeDuplicateLetters("abacb") == "abc"
    assert s.removeDuplicateLetters("bbcaac") == "bac"
    assert s.removeDuplicateLetters("a") == "a"
    assert s.removeDuplicateLetters("aaaa") == "a"
    assert s.removeDuplicateLetters("edebbed") == "bed"

    print("all tests passed")

run_tests()

Test meaning:

TestWhy
"bcabc"Basic lexicographical improvement
"cbacdcbc"Standard example
"abacb"Skips duplicates after characters are already used
"bbcaac"Cannot always sort distinct letters
"a"Single character
"aaaa"All duplicates collapse to one
"edebbed"Tests popping only when a later copy exists