Skip to content

LeetCode 955: Delete Columns to Make Sorted II

A clear explanation of deleting the minimum number of columns so rows become lexicographically sorted.

Problem Restatement

We are given an array strs of strings.

All strings have the same length.

We may delete any set of column indices. When we delete a column, we delete that character position from every string.

After deleting some columns, the remaining strings must be in lexicographic order:

strs[0] <= strs[1] <= strs[2] <= ... <= strs[n - 1]

Return the minimum number of deleted columns.

The official constraints are:

ConstraintValue
1 <= strs.length <= 100Number of strings
1 <= strs[i].length <= 100Length of each string
CharactersLowercase English letters

Source: LeetCode problem statement.

Input and Output

ItemMeaning
InputArray strs of equal-length lowercase strings
OutputMinimum number of columns to delete
OperationDelete one character position from every string

Example function shape:

def minDeletionSize(strs: list[str]) -> int:
    ...

Examples

Example 1:

strs = ["ca", "bb", "ac"]

Output:

1

If we delete column 0, the strings become:

["a", "b", "c"]

Now the array is sorted.

Example 2:

strs = ["xc", "yb", "za"]

Output:

0

The array is already sorted lexicographically:

"xc" <= "yb" <= "za"

We do not need to delete any column.

Example 3:

strs = ["zyx", "wvu", "tsr"]

Output:

3

Every column creates an ordering problem, so all columns must be deleted.

First Thought: Try All Column Subsets

A direct approach is to try every subset of columns.

For each subset, delete those columns and check whether the resulting rows are sorted.

But if each string has length m, there are:

2^m

possible subsets.

Since m can be 100, this is far too large.

We need a greedy method.

Key Insight

Lexicographic order is decided from left to right.

Once two adjacent rows are already ordered by an earlier kept column, later columns cannot break that pair.

For example:

row1 = "a..."
row2 = "b..."

Since a < b, we already know:

row1 < row2

No later character matters for this pair.

So we only need to worry about adjacent row pairs that are still tied by all kept columns so far.

We track these pairs with an array:

sorted_pair[i] = True

meaning:

strs[i] is already strictly less than strs[i + 1]

using previous kept columns.

Algorithm

Let:

n = len(strs)
m = len(strs[0])

Create:

sorted_pair = [False] * (n - 1)

Then scan columns from left to right.

For each column col:

  1. Check whether keeping this column would break any unresolved adjacent pair.
  2. A pair is unresolved if sorted_pair[row] == False.
  3. If for any unresolved pair:
strs[row][col] > strs[row + 1][col]

then this column must be deleted.

  1. If the column is deleted, do not update sorted_pair.
  2. If the column is kept, mark newly ordered pairs:
strs[row][col] < strs[row + 1][col]

as resolved.

The answer is the number of deleted columns.

Why Deleting a Bad Column Is Safe

Suppose a column causes this for an unresolved pair:

strs[i][col] > strs[i + 1][col]

Unresolved means all previously kept columns were equal for these two rows.

So if we keep this column, the first differing character between the two rows becomes this bad column.

That would force:

strs[i] > strs[i + 1]

which violates sorted order.

Therefore, this column cannot be kept in any valid solution extending the previous kept columns. We must delete it.

Correctness

We prove that the algorithm deletes the minimum number of columns.

The algorithm processes columns from left to right, the same order used by lexicographic comparison.

For each adjacent pair of rows, once an earlier kept column makes the upper row strictly smaller than the lower row, that pair is permanently safe. Later columns cannot change their relative order.

Now consider a column that the algorithm deletes. It deletes the column only when there exists an unresolved adjacent pair i such that:

strs[i][col] > strs[i + 1][col]

Because the pair is unresolved, all previously kept columns are equal for these two rows. If this column were kept, it would be the first differing kept column for this pair, making the upper row larger than the lower row. That would make the final array unsorted.

So every deleted column is necessary.

For a column that the algorithm keeps, it does not create any inversion among unresolved adjacent pairs. Already resolved pairs are safe by earlier columns. Therefore, keeping the column preserves the possibility of a sorted final array.

After all columns are processed, every adjacent pair is either already resolved by some kept column, or all kept characters are equal between the two rows. In both cases, the upper row is less than or equal to the lower row.

Thus the final array is sorted, and since every deleted column was necessary, the number of deletions is minimal.

Complexity

Let:

n = len(strs)
m = len(strs[0])
MetricValueWhy
TimeO(n * m)Each column checks adjacent row pairs
SpaceO(n)The sorted_pair array stores state for adjacent pairs

Implementation

class Solution:
    def minDeletionSize(self, strs: list[str]) -> int:
        n = len(strs)
        m = len(strs[0])

        sorted_pair = [False] * (n - 1)
        deletions = 0

        for col in range(m):
            must_delete = False

            for row in range(n - 1):
                if (
                    not sorted_pair[row]
                    and strs[row][col] > strs[row + 1][col]
                ):
                    must_delete = True
                    break

            if must_delete:
                deletions += 1
                continue

            for row in range(n - 1):
                if strs[row][col] < strs[row + 1][col]:
                    sorted_pair[row] = True

        return deletions

Code Explanation

We store whether each adjacent pair is already safely ordered:

sorted_pair = [False] * (n - 1)

For example, sorted_pair[2] describes the pair:

strs[2], strs[3]

For each column:

for col in range(m):

we first check whether this column would create an invalid ordering.

if (
    not sorted_pair[row]
    and strs[row][col] > strs[row + 1][col]
):

This condition means:

  1. This adjacent pair is still tied by previous kept columns.
  2. The current column would make the upper row larger.

So this column must be deleted.

If the column is deleted:

deletions += 1
continue

we skip the update step, because deleted columns cannot help sort any pair.

If the column is kept, we update resolved pairs:

if strs[row][col] < strs[row + 1][col]:
    sorted_pair[row] = True

Once a pair is resolved, it remains resolved.

Finally:

return deletions

returns the minimum number of deleted columns.

Testing

def run_tests():
    s = Solution()

    assert s.minDeletionSize(["ca", "bb", "ac"]) == 1
    assert s.minDeletionSize(["xc", "yb", "za"]) == 0
    assert s.minDeletionSize(["zyx", "wvu", "tsr"]) == 3

    assert s.minDeletionSize(["abc"]) == 0
    assert s.minDeletionSize(["aa", "aa"]) == 0
    assert s.minDeletionSize(["ba", "aa"]) == 1
    assert s.minDeletionSize(["abx", "agz", "bgc", "bfc"]) == 1

    print("all tests passed")

run_tests()

Test meaning:

TestWhy
["ca","bb","ac"]Basic deletion example
["xc","yb","za"]Already sorted
["zyx","wvu","tsr"]Every column must be deleted
Single stringAlways sorted
Equal stringsNondecreasing order allows equality
["ba","aa"]Bad first column must be deleted
Mixed resolved and unresolved pairsChecks greedy state tracking