Skip to content

LeetCode 960: Delete Columns to Make Sorted III

A clear explanation of deleting the minimum number of columns so every remaining row is individually sorted.

Problem Restatement

We are given an array strs of equal-length strings.

We may delete any set of column indices. When we delete a column, that position is removed from every string.

After deletions, every remaining row must be lexicographically sorted by itself:

strs[row][0] <= strs[row][1] <= strs[row][2] <= ...

Return the minimum number of deleted columns.

Important: the array of strings does not need to be sorted. Each individual row must be sorted after deletion. This is different from LeetCode 955.

The official statement describes deleting indices from every string and requiring every final row to be lexicographically ordered.

Input and Output

ItemMeaning
InputArray strs of equal-length lowercase strings
OutputMinimum number of columns to delete
OperationDelete the same column indices from every row
RequirementEvery remaining row must be nondecreasing

Example function shape:

def minDeletionSize(strs: list[str]) -> int:
    ...

Examples

Example 1:

strs = ["babca", "bbazb"]

Output:

3

Delete columns 0, 1, and 4.

The remaining rows are:

"bc"
"az"

Both rows are individually sorted.

Example 2:

strs = ["edcba"]

Output:

4

With one row, we need to delete enough columns so the remaining characters are nondecreasing.

The longest sorted subsequence is any one character, so we keep 1 column and delete 4.

Example 3:

strs = ["ghi", "def", "abc"]

Output:

0

Every row is already sorted, so no column must be deleted.

First Thought: Try All Column Sets

If each string has length m, we could try every subset of columns to keep.

There are:

2 ** m

possible subsets.

For each subset, we would check whether every row is sorted after keeping those columns.

This is too slow because m can be 100.

Key Insight

Instead of choosing columns to delete, choose the longest valid sequence of columns to keep.

If we can keep L columns, then we delete:

m - L

So the problem becomes:

Find the longest sequence of column indices:

c1 < c2 < c3 < ...

such that for every row:

strs[row][c1] <= strs[row][c2] <= strs[row][c3] <= ...

This is a longest increasing subsequence style dynamic programming problem.

The difference is that one column can follow another column only if the condition holds for all rows.

Column Compatibility

Column prev can come before column cur if:

strs[row][prev] <= strs[row][cur]

for every row.

Example:

strs = ["babca", "bbazb"]

Column 2 can come before column 3 if:

strs[0][2] <= strs[0][3]
strs[1][2] <= strs[1][3]

That means:

'b' <= 'c'
'a' <= 'z'

Both are true, so column 2 can precede column 3.

Algorithm

Let:

m = len(strs[0])

Define:

dp[i] = longest valid kept-column sequence ending at column i

Initialize every value to 1, because any single column can be kept alone.

For every column cur from left to right:

  1. Try every previous column prev < cur.
  2. Check whether prev can come before cur in all rows.
  3. If yes, update:
dp[cur] = max(dp[cur], dp[prev] + 1)

After filling dp, the maximum number of columns we can keep is:

max(dp)

So the answer is:

m - max(dp)

Correctness

We prove that the algorithm returns the minimum number of deleted columns.

A valid final result is exactly a sequence of kept columns in increasing index order. Since deleting columns preserves the original order of the remaining columns, we only need to choose which column indices stay.

For any two consecutive kept columns prev and cur, every row must satisfy:

strs[row][prev] <= strs[row][cur]

Otherwise, that row would decrease at this adjacent kept-column pair, so the final row would not be sorted.

The dynamic programming state dp[cur] stores the longest valid kept-column sequence that ends at cur.

The initialization dp[cur] = 1 is correct because a single column is always valid.

For each earlier column prev, if prev can legally precede cur in every row, then any valid sequence ending at prev can be extended by cur. This gives a candidate length dp[prev] + 1.

If prev cannot precede cur, then no valid sorted row sequence can place cur immediately after prev, so that transition is correctly rejected.

Thus every valid kept-column sequence is considered by the DP, and every DP transition creates a valid kept-column sequence. Therefore, max(dp) is the maximum number of columns that can be kept.

Since every column is either kept or deleted, minimizing deletions is the same as maximizing kept columns. The answer is m - max(dp).

Complexity

Let:

n = len(strs)
m = len(strs[0])
MetricValueWhy
TimeO(n * m^2)For each pair of columns, check all rows
SpaceO(m)The DP array stores one value per column

The official constraints allow n <= 100 and m <= 100, so this is efficient.

Implementation

class Solution:
    def minDeletionSize(self, strs: list[str]) -> int:
        m = len(strs[0])
        dp = [1] * m

        for cur in range(m):
            for prev in range(cur):
                can_follow = True

                for row in strs:
                    if row[prev] > row[cur]:
                        can_follow = False
                        break

                if can_follow:
                    dp[cur] = max(dp[cur], dp[prev] + 1)

        return m - max(dp)

Code Explanation

We start with:

m = len(strs[0])
dp = [1] * m

Each column can form a valid kept sequence by itself.

Then we try to end a sequence at every column:

for cur in range(m):

For each cur, we test all previous columns:

for prev in range(cur):

Now we check whether prev can come before cur for every row:

for row in strs:
    if row[prev] > row[cur]:
        can_follow = False
        break

If any row decreases, this transition is invalid.

If all rows pass, we extend the best sequence ending at prev:

dp[cur] = max(dp[cur], dp[prev] + 1)

Finally:

return m - max(dp)

The maximum value in dp is the most columns we can keep. The rest must be deleted.

Testing

def run_tests():
    s = Solution()

    assert s.minDeletionSize(["babca", "bbazb"]) == 3
    assert s.minDeletionSize(["edcba"]) == 4
    assert s.minDeletionSize(["ghi", "def", "abc"]) == 0

    assert s.minDeletionSize(["abc"]) == 0
    assert s.minDeletionSize(["cba"]) == 2
    assert s.minDeletionSize(["aaa", "aaa"]) == 0
    assert s.minDeletionSize(["zyx", "wvu", "tsr"]) == 2
    assert s.minDeletionSize(["abcdef", "uvwxyz"]) == 0

    print("all tests passed")

run_tests()

Test meaning:

TestWhy
["babca","bbazb"]Official-style example with shared deletion set
["edcba"]Single decreasing row
["ghi","def","abc"]Rows already sorted
["abc"]Single sorted row
["cba"]Single-row LIS behavior
Equal rowsNondecreasing allows equality
All rows decreasingCan keep only one column
Two increasing rowsNo deletion needed