# LeetCode 955: Delete Columns to Make Sorted II

## Problem Restatement

We are given an array `strs` of strings.

All strings have the same length.

We may delete any set of column indices. When we delete a column, we delete that character position from every string.

After deleting some columns, the remaining strings must be in lexicographic order:

```python
strs[0] <= strs[1] <= strs[2] <= ... <= strs[n - 1]
```

Return the minimum number of deleted columns.

The official constraints are:

| Constraint | Value |
|---|---|
| `1 <= strs.length <= 100` | Number of strings |
| `1 <= strs[i].length <= 100` | Length of each string |
| Characters | Lowercase English letters |

Source: LeetCode problem statement.

## Input and Output

| Item | Meaning |
|---|---|
| Input | Array `strs` of equal-length lowercase strings |
| Output | Minimum number of columns to delete |
| Operation | Delete one character position from every string |

Example function shape:

```python
def minDeletionSize(strs: list[str]) -> int:
    ...
```

## Examples

Example 1:

```python
strs = ["ca", "bb", "ac"]
```

Output:

```python
1
```

If we delete column `0`, the strings become:

```python
["a", "b", "c"]
```

Now the array is sorted.

Example 2:

```python
strs = ["xc", "yb", "za"]
```

Output:

```python
0
```

The array is already sorted lexicographically:

```python
"xc" <= "yb" <= "za"
```

We do not need to delete any column.

Example 3:

```python
strs = ["zyx", "wvu", "tsr"]
```

Output:

```python
3
```

Every column creates an ordering problem, so all columns must be deleted.

## First Thought: Try All Column Subsets

A direct approach is to try every subset of columns.

For each subset, delete those columns and check whether the resulting rows are sorted.

But if each string has length `m`, there are:

```python
2^m
```

possible subsets.

Since `m` can be `100`, this is far too large.

We need a greedy method.

## Key Insight

Lexicographic order is decided from left to right.

Once two adjacent rows are already ordered by an earlier kept column, later columns cannot break that pair.

For example:

```python
row1 = "a..."
row2 = "b..."
```

Since `a < b`, we already know:

```python
row1 < row2
```

No later character matters for this pair.

So we only need to worry about adjacent row pairs that are still tied by all kept columns so far.

We track these pairs with an array:

```python
sorted_pair[i] = True
```

meaning:

```python
strs[i] is already strictly less than strs[i + 1]
```

using previous kept columns.

## Algorithm

Let:

```python
n = len(strs)
m = len(strs[0])
```

Create:

```python
sorted_pair = [False] * (n - 1)
```

Then scan columns from left to right.

For each column `col`:

1. Check whether keeping this column would break any unresolved adjacent pair.
2. A pair is unresolved if `sorted_pair[row] == False`.
3. If for any unresolved pair:

```python
strs[row][col] > strs[row + 1][col]
```

then this column must be deleted.

4. If the column is deleted, do not update `sorted_pair`.
5. If the column is kept, mark newly ordered pairs:

```python
strs[row][col] < strs[row + 1][col]
```

as resolved.

The answer is the number of deleted columns.

## Why Deleting a Bad Column Is Safe

Suppose a column causes this for an unresolved pair:

```python
strs[i][col] > strs[i + 1][col]
```

Unresolved means all previously kept columns were equal for these two rows.

So if we keep this column, the first differing character between the two rows becomes this bad column.

That would force:

```python
strs[i] > strs[i + 1]
```

which violates sorted order.

Therefore, this column cannot be kept in any valid solution extending the previous kept columns. We must delete it.

## Correctness

We prove that the algorithm deletes the minimum number of columns.

The algorithm processes columns from left to right, the same order used by lexicographic comparison.

For each adjacent pair of rows, once an earlier kept column makes the upper row strictly smaller than the lower row, that pair is permanently safe. Later columns cannot change their relative order.

Now consider a column that the algorithm deletes. It deletes the column only when there exists an unresolved adjacent pair `i` such that:

```python
strs[i][col] > strs[i + 1][col]
```

Because the pair is unresolved, all previously kept columns are equal for these two rows. If this column were kept, it would be the first differing kept column for this pair, making the upper row larger than the lower row. That would make the final array unsorted.

So every deleted column is necessary.

For a column that the algorithm keeps, it does not create any inversion among unresolved adjacent pairs. Already resolved pairs are safe by earlier columns. Therefore, keeping the column preserves the possibility of a sorted final array.

After all columns are processed, every adjacent pair is either already resolved by some kept column, or all kept characters are equal between the two rows. In both cases, the upper row is less than or equal to the lower row.

Thus the final array is sorted, and since every deleted column was necessary, the number of deletions is minimal.

## Complexity

Let:

```python
n = len(strs)
m = len(strs[0])
```

| Metric | Value | Why |
|---|---|---|
| Time | `O(n * m)` | Each column checks adjacent row pairs |
| Space | `O(n)` | The `sorted_pair` array stores state for adjacent pairs |

## Implementation

```python
class Solution:
    def minDeletionSize(self, strs: list[str]) -> int:
        n = len(strs)
        m = len(strs[0])

        sorted_pair = [False] * (n - 1)
        deletions = 0

        for col in range(m):
            must_delete = False

            for row in range(n - 1):
                if (
                    not sorted_pair[row]
                    and strs[row][col] > strs[row + 1][col]
                ):
                    must_delete = True
                    break

            if must_delete:
                deletions += 1
                continue

            for row in range(n - 1):
                if strs[row][col] < strs[row + 1][col]:
                    sorted_pair[row] = True

        return deletions
```

## Code Explanation

We store whether each adjacent pair is already safely ordered:

```python
sorted_pair = [False] * (n - 1)
```

For example, `sorted_pair[2]` describes the pair:

```python
strs[2], strs[3]
```

For each column:

```python
for col in range(m):
```

we first check whether this column would create an invalid ordering.

```python
if (
    not sorted_pair[row]
    and strs[row][col] > strs[row + 1][col]
):
```

This condition means:

1. This adjacent pair is still tied by previous kept columns.
2. The current column would make the upper row larger.

So this column must be deleted.

If the column is deleted:

```python
deletions += 1
continue
```

we skip the update step, because deleted columns cannot help sort any pair.

If the column is kept, we update resolved pairs:

```python
if strs[row][col] < strs[row + 1][col]:
    sorted_pair[row] = True
```

Once a pair is resolved, it remains resolved.

Finally:

```python
return deletions
```

returns the minimum number of deleted columns.

## Testing

```python
def run_tests():
    s = Solution()

    assert s.minDeletionSize(["ca", "bb", "ac"]) == 1
    assert s.minDeletionSize(["xc", "yb", "za"]) == 0
    assert s.minDeletionSize(["zyx", "wvu", "tsr"]) == 3

    assert s.minDeletionSize(["abc"]) == 0
    assert s.minDeletionSize(["aa", "aa"]) == 0
    assert s.minDeletionSize(["ba", "aa"]) == 1
    assert s.minDeletionSize(["abx", "agz", "bgc", "bfc"]) == 1

    print("all tests passed")

run_tests()
```

Test meaning:

| Test | Why |
|---|---|
| `["ca","bb","ac"]` | Basic deletion example |
| `["xc","yb","za"]` | Already sorted |
| `["zyx","wvu","tsr"]` | Every column must be deleted |
| Single string | Always sorted |
| Equal strings | Nondecreasing order allows equality |
| `["ba","aa"]` | Bad first column must be deleted |
| Mixed resolved and unresolved pairs | Checks greedy state tracking |

