A clear explanation of deleting the minimum number of columns so every remaining row is individually sorted.
Problem Restatement
We are given an array strs of equal-length strings.
We may delete any set of column indices. When we delete a column, that position is removed from every string.
After deletions, every remaining row must be lexicographically sorted by itself:
strs[row][0] <= strs[row][1] <= strs[row][2] <= ...Return the minimum number of deleted columns.
Important: the array of strings does not need to be sorted. Each individual row must be sorted after deletion. This is different from LeetCode 955.
The official statement describes deleting indices from every string and requiring every final row to be lexicographically ordered.
Input and Output
| Item | Meaning |
|---|---|
| Input | Array strs of equal-length lowercase strings |
| Output | Minimum number of columns to delete |
| Operation | Delete the same column indices from every row |
| Requirement | Every remaining row must be nondecreasing |
Example function shape:
def minDeletionSize(strs: list[str]) -> int:
...Examples
Example 1:
strs = ["babca", "bbazb"]Output:
3Delete columns 0, 1, and 4.
The remaining rows are:
"bc"
"az"Both rows are individually sorted.
Example 2:
strs = ["edcba"]Output:
4With one row, we need to delete enough columns so the remaining characters are nondecreasing.
The longest sorted subsequence is any one character, so we keep 1 column and delete 4.
Example 3:
strs = ["ghi", "def", "abc"]Output:
0Every row is already sorted, so no column must be deleted.
First Thought: Try All Column Sets
If each string has length m, we could try every subset of columns to keep.
There are:
2 ** mpossible subsets.
For each subset, we would check whether every row is sorted after keeping those columns.
This is too slow because m can be 100.
Key Insight
Instead of choosing columns to delete, choose the longest valid sequence of columns to keep.
If we can keep L columns, then we delete:
m - LSo the problem becomes:
Find the longest sequence of column indices:
c1 < c2 < c3 < ...such that for every row:
strs[row][c1] <= strs[row][c2] <= strs[row][c3] <= ...This is a longest increasing subsequence style dynamic programming problem.
The difference is that one column can follow another column only if the condition holds for all rows.
Column Compatibility
Column prev can come before column cur if:
strs[row][prev] <= strs[row][cur]for every row.
Example:
strs = ["babca", "bbazb"]Column 2 can come before column 3 if:
strs[0][2] <= strs[0][3]
strs[1][2] <= strs[1][3]That means:
'b' <= 'c'
'a' <= 'z'Both are true, so column 2 can precede column 3.
Algorithm
Let:
m = len(strs[0])Define:
dp[i] = longest valid kept-column sequence ending at column iInitialize every value to 1, because any single column can be kept alone.
For every column cur from left to right:
- Try every previous column
prev < cur. - Check whether
prevcan come beforecurin all rows. - If yes, update:
dp[cur] = max(dp[cur], dp[prev] + 1)After filling dp, the maximum number of columns we can keep is:
max(dp)So the answer is:
m - max(dp)Correctness
We prove that the algorithm returns the minimum number of deleted columns.
A valid final result is exactly a sequence of kept columns in increasing index order. Since deleting columns preserves the original order of the remaining columns, we only need to choose which column indices stay.
For any two consecutive kept columns prev and cur, every row must satisfy:
strs[row][prev] <= strs[row][cur]Otherwise, that row would decrease at this adjacent kept-column pair, so the final row would not be sorted.
The dynamic programming state dp[cur] stores the longest valid kept-column sequence that ends at cur.
The initialization dp[cur] = 1 is correct because a single column is always valid.
For each earlier column prev, if prev can legally precede cur in every row, then any valid sequence ending at prev can be extended by cur. This gives a candidate length dp[prev] + 1.
If prev cannot precede cur, then no valid sorted row sequence can place cur immediately after prev, so that transition is correctly rejected.
Thus every valid kept-column sequence is considered by the DP, and every DP transition creates a valid kept-column sequence. Therefore, max(dp) is the maximum number of columns that can be kept.
Since every column is either kept or deleted, minimizing deletions is the same as maximizing kept columns. The answer is m - max(dp).
Complexity
Let:
n = len(strs)
m = len(strs[0])| Metric | Value | Why |
|---|---|---|
| Time | O(n * m^2) | For each pair of columns, check all rows |
| Space | O(m) | The DP array stores one value per column |
The official constraints allow n <= 100 and m <= 100, so this is efficient.
Implementation
class Solution:
def minDeletionSize(self, strs: list[str]) -> int:
m = len(strs[0])
dp = [1] * m
for cur in range(m):
for prev in range(cur):
can_follow = True
for row in strs:
if row[prev] > row[cur]:
can_follow = False
break
if can_follow:
dp[cur] = max(dp[cur], dp[prev] + 1)
return m - max(dp)Code Explanation
We start with:
m = len(strs[0])
dp = [1] * mEach column can form a valid kept sequence by itself.
Then we try to end a sequence at every column:
for cur in range(m):For each cur, we test all previous columns:
for prev in range(cur):Now we check whether prev can come before cur for every row:
for row in strs:
if row[prev] > row[cur]:
can_follow = False
breakIf any row decreases, this transition is invalid.
If all rows pass, we extend the best sequence ending at prev:
dp[cur] = max(dp[cur], dp[prev] + 1)Finally:
return m - max(dp)The maximum value in dp is the most columns we can keep. The rest must be deleted.
Testing
def run_tests():
s = Solution()
assert s.minDeletionSize(["babca", "bbazb"]) == 3
assert s.minDeletionSize(["edcba"]) == 4
assert s.minDeletionSize(["ghi", "def", "abc"]) == 0
assert s.minDeletionSize(["abc"]) == 0
assert s.minDeletionSize(["cba"]) == 2
assert s.minDeletionSize(["aaa", "aaa"]) == 0
assert s.minDeletionSize(["zyx", "wvu", "tsr"]) == 2
assert s.minDeletionSize(["abcdef", "uvwxyz"]) == 0
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
["babca","bbazb"] | Official-style example with shared deletion set |
["edcba"] | Single decreasing row |
["ghi","def","abc"] | Rows already sorted |
["abc"] | Single sorted row |
["cba"] | Single-row LIS behavior |
| Equal rows | Nondecreasing allows equality |
| All rows decreasing | Can keep only one column |
| Two increasing rows | No deletion needed |