# LeetCode 72: Edit Distance

## Problem Restatement

We are given two strings, `word1` and `word2`.

We need to return the minimum number of operations required to convert `word1` into `word2`.

There are three allowed operations:

1. Insert one character
2. Delete one character
3. Replace one character

The official constraints are `0 <= word1.length, word2.length <= 500`, and both strings contain lowercase English letters.

## Input and Output

| Item | Meaning |
|---|---|
| Input | Two strings `word1` and `word2` |
| Output | Minimum number of edit operations |
| Operation 1 | Insert a character |
| Operation 2 | Delete a character |
| Operation 3 | Replace a character |

Function shape:

```python
def minDistance(word1: str, word2: str) -> int:
    ...
```

## Examples

For:

```python
word1 = "horse"
word2 = "ros"
```

One optimal conversion is:

```text
horse -> rorse
rorse -> rose
rose -> ros
```

The operations are:

| Step | Operation |
|---|---|
| `horse -> rorse` | Replace `h` with `r` |
| `rorse -> rose` | Delete one `r` |
| `rose -> ros` | Delete `e` |

So the answer is:

```python
3
```

For:

```python
word1 = "intention"
word2 = "execution"
```

The answer is:

```python
5
```

One optimal sequence uses delete, replace, replace, replace, and insert.

## First Thought: Try All Operations

A direct recursive idea is:

If the first characters are equal, move to the next characters.

If they are different, try all three possible operations:

1. Insert
2. Delete
3. Replace

Then take the minimum result.

This matches the problem, but it recomputes the same substring pairs many times.

For example, different edit sequences may lead to the same remaining pair of prefixes. That means the problem has overlapping subproblems.

## Key Insight

Instead of thinking about full strings, think about prefixes.

Let:

```python
dp[i][j]
```

mean:

```text
the minimum number of operations needed to convert
word1[:i] into word2[:j]
```

So:

```python
dp[m][n]
```

is the final answer, where:

```python
m = len(word1)
n = len(word2)
```

## Base Cases

If `word2` is empty, we must delete every character from `word1`.

```python
dp[i][0] = i
```

If `word1` is empty, we must insert every character from `word2`.

```python
dp[0][j] = j
```

For example:

```python
word1 = "abc"
word2 = ""
```

The answer is `3`, because we delete `a`, `b`, and `c`.

## Transition

Now consider `dp[i][j]`.

We compare the last characters of the two prefixes:

```python
word1[i - 1]
word2[j - 1]
```

If they are equal, no new operation is needed for the last character:

```python
dp[i][j] = dp[i - 1][j - 1]
```

If they are different, we have three choices.

Delete from `word1`:

```python
dp[i - 1][j] + 1
```

Insert into `word1`:

```python
dp[i][j - 1] + 1
```

Replace the last character of `word1`:

```python
dp[i - 1][j - 1] + 1
```

So when the characters differ:

```python
dp[i][j] = 1 + min(
    dp[i - 1][j],
    dp[i][j - 1],
    dp[i - 1][j - 1],
)
```

## Algorithm

Create a DP table of size `(m + 1) x (n + 1)`.

Initialize:

```python
dp[i][0] = i
dp[0][j] = j
```

Then fill the table from top-left to bottom-right.

For each `i` from `1` to `m`, and each `j` from `1` to `n`:

1. If the current characters match, copy the diagonal value.
2. Otherwise, take one plus the minimum of delete, insert, and replace.

Return:

```python
dp[m][n]
```

## Correctness

`dp[i][j]` represents the minimum number of operations needed to convert `word1[:i]` into `word2[:j]`.

The base cases are correct. Converting a non-empty prefix into an empty string requires deleting every character. Converting an empty string into a non-empty prefix requires inserting every character.

For `i > 0` and `j > 0`, consider the last characters of both prefixes.

If they are equal, the last character already matches. The problem reduces to converting the previous prefixes, so `dp[i - 1][j - 1]` is sufficient.

If they differ, the final operation in an optimal solution must be one of the three allowed operations. A delete comes from `dp[i - 1][j]`, an insert comes from `dp[i][j - 1]`, and a replace comes from `dp[i - 1][j - 1]`. Taking the minimum of these three possibilities and adding one gives the optimal value for `dp[i][j]`.

The algorithm fills every state after its dependencies are available. Therefore every table entry is correct, including `dp[m][n]`.

## Complexity

| Metric | Value | Why |
|---|---|---|
| Time | `O(m * n)` | We compute one state for every pair of prefix lengths |
| Space | `O(m * n)` | The DP table stores `(m + 1) * (n + 1)` values |

Here, `m = len(word1)` and `n = len(word2)`.

## Implementation

```python
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        m = len(word1)
        n = len(word2)

        dp = [[0] * (n + 1) for _ in range(m + 1)]

        for i in range(m + 1):
            dp[i][0] = i

        for j in range(n + 1):
            dp[0][j] = j

        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if word1[i - 1] == word2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1]
                else:
                    dp[i][j] = 1 + min(
                        dp[i - 1][j],
                        dp[i][j - 1],
                        dp[i - 1][j - 1],
                    )

        return dp[m][n]
```

## Code Explanation

Get both lengths:

```python
m = len(word1)
n = len(word2)
```

Create the DP table:

```python
dp = [[0] * (n + 1) for _ in range(m + 1)]
```

Initialize the first column:

```python
for i in range(m + 1):
    dp[i][0] = i
```

This means deleting `i` characters.

Initialize the first row:

```python
for j in range(n + 1):
    dp[0][j] = j
```

This means inserting `j` characters.

Then fill the table:

```python
for i in range(1, m + 1):
    for j in range(1, n + 1):
```

If the current characters match:

```python
if word1[i - 1] == word2[j - 1]:
    dp[i][j] = dp[i - 1][j - 1]
```

Otherwise, try all three operations:

```python
dp[i][j] = 1 + min(
    dp[i - 1][j],
    dp[i][j - 1],
    dp[i - 1][j - 1],
)
```

Return the full-string answer:

```python
return dp[m][n]
```

## Space Optimization

Each row depends only on:

1. The current row
2. The previous row

So we can reduce space to `O(n)`.

`dp[j]` stores the value for the previous row before update and the current row after update.

We also keep `prev_diag`, which represents `dp[i - 1][j - 1]`.

```python
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        m = len(word1)
        n = len(word2)

        dp = list(range(n + 1))

        for i in range(1, m + 1):
            prev_diag = dp[0]
            dp[0] = i

            for j in range(1, n + 1):
                old_dp_j = dp[j]

                if word1[i - 1] == word2[j - 1]:
                    dp[j] = prev_diag
                else:
                    dp[j] = 1 + min(
                        dp[j],
                        dp[j - 1],
                        prev_diag,
                    )

                prev_diag = old_dp_j

        return dp[n]
```

## Testing

```python
def run_tests():
    s = Solution()

    assert s.minDistance("horse", "ros") == 3
    assert s.minDistance("intention", "execution") == 5
    assert s.minDistance("", "") == 0
    assert s.minDistance("abc", "") == 3
    assert s.minDistance("", "abc") == 3
    assert s.minDistance("abc", "abc") == 0
    assert s.minDistance("abc", "axc") == 1
    assert s.minDistance("kitten", "sitting") == 3

    print("all tests passed")

run_tests()
```

| Test | Why |
|---|---|
| `"horse"`, `"ros"` | Official example |
| `"intention"`, `"execution"` | Official example |
| `""`, `""` | Both strings empty |
| `"abc"`, `""` | Only deletions |
| `""`, `"abc"` | Only insertions |
| `"abc"`, `"abc"` | No operation needed |
| `"abc"`, `"axc"` | Single replacement |
| `"kitten"`, `"sitting"` | Standard edit-distance case |