# LeetCode 10: Regular Expression Matching

## Problem Restatement

We are given two strings:

```python
s
p
```

`s` is the input string.

`p` is the pattern.

The pattern supports two special characters:

| Pattern symbol | Meaning |
|---|---|
| `.` | Matches any single character |
| `*` | Matches zero or more of the preceding element |

The match must cover the entire input string, not only part of it. This is the key rule in the problem statement.

For example:

```text
s = "aa"
p = "a"
```

This returns `false`, because `"a"` matches only one character, not the whole string `"aa"`.

But:

```text
s = "aa"
p = "a*"
```

returns `true`, because `a*` can match `"aa"`.

## Input and Output

| Item | Meaning |
|---|---|
| Input | A string `s` and a pattern `p` |
| Output | `true` if `p` matches the whole string `s`, otherwise `false` |
| `.` | Matches any one character |
| `*` | Matches zero or more copies of the previous pattern element |
| Match type | Full-string match |

Example function shape:

```python
def isMatch(s: str, p: str) -> bool:
    ...
```

## Examples

Example 1:

```text
s = "aa"
p = "a"
```

Output:

```text
false
```

The pattern only matches one `a`.

The string has two characters.

Example 2:

```text
s = "aa"
p = "a*"
```

Output:

```text
true
```

`a*` means zero or more `a` characters.

So it can match `"aa"`.

Example 3:

```text
s = "ab"
p = ".*"
```

Output:

```text
true
```

`.` can match any single character.

`.*` can match any sequence of characters, including `"ab"`.

## First Thought: Compare Characters One by One

If the pattern had only normal characters and `.`, the problem would be simple.

We could compare `s[i]` and `p[j]`.

They match if:

```python
s[i] == p[j] or p[j] == "."
```

Then move both pointers forward.

But `*` changes the problem.

For example:

```text
s = "aaa"
p = "a*"
```

The pattern element `a*` can match:

```text
""
"a"
"aa"
"aaa"
```

So one pattern position can match many string positions.

That branching is why we need dynamic programming.

## Key Insight

Think of the problem as matching suffixes.

Let:

```python
dp(i, j)
```

mean:

```text
Does s[i:] match p[j:]?
```

Then the answer is:

```python
dp(0, 0)
```

At each state, we inspect the current pattern character `p[j]`.

There are two main cases.

## Case 1: No Star After Current Pattern Character

Suppose the next pattern character is not `*`.

Then the current characters must match once.

They match if:

```python
i < len(s) and (s[i] == p[j] or p[j] == ".")
```

If they match, continue with:

```python
dp(i + 1, j + 1)
```

If they do not match, the answer is `false`.

## Case 2: Star After Current Pattern Character

Suppose:

```python
p[j + 1] == "*"
```

Then `p[j] *` gives us two choices.

First, use it zero times.

That means we skip this pattern block:

```python
dp(i, j + 2)
```

Second, use it one or more times.

This is allowed only if the current string character matches `p[j]`.

Then we consume one character from `s`, but keep the same pattern position:

```python
dp(i + 1, j)
```

We keep `j` unchanged because `*` may match more copies.

So the recurrence is:

```python
dp(i, j) = dp(i, j + 2) or (first_match and dp(i + 1, j))
```

This is the core of the solution.

## Visual Walkthrough

Use:

```text
s = "aa"
p = "a*"
```

Start:

```text
dp(0, 0)
```

Pattern block:

```text
a*
```

There are two choices.

Choice 1: match zero `a` characters.

```text
dp(0, 2)
```

But pattern is exhausted while string still has `"aa"`, so this is `false`.

Choice 2: match one `a`.

```text
dp(1, 0)
```

Again with `a*`.

Choice 1 from here: match zero more `a`.

```text
dp(1, 2)
```

Pattern exhausted while string still has `"a"`, so false.

Choice 2: match one more `a`.

```text
dp(2, 0)
```

Now the string is exhausted.

From here, `a*` can match zero more `a`.

```text
dp(2, 2)
```

Both string and pattern are exhausted.

So this is `true`.

Therefore:

```text
isMatch("aa", "a*") == true
```

## Algorithm

Use recursion with memoization.

For a state `dp(i, j)`:

1. If `j == len(p)`, return whether `i == len(s)`.
2. Compute whether the first current character matches.
3. If `j + 1 < len(p)` and `p[j + 1] == "*"`, return:
   - zero occurrences: `dp(i, j + 2)`
   - or one or more occurrences: `first_match and dp(i + 1, j)`
4. Otherwise, return:
   - `first_match and dp(i + 1, j + 1)`

Memoization stores each `(i, j)` state so repeated branches do not recompute the same suffix match.

## Correctness

Define `dp(i, j)` as whether `s[i:]` matches `p[j:]`.

If `j == len(p)`, no pattern remains. The match succeeds exactly when no string remains, so the base case is correct.

For a normal pattern character, the pattern must match exactly one string character. The algorithm checks that the current characters match, then recursively matches the remaining suffixes. This directly follows the meaning of normal characters and `.`.

For a starred pattern block `p[j]*`, there are only two possible categories:

1. It matches zero copies of `p[j]`.
2. It matches at least one copy of `p[j]`.

The first case skips the block with `dp(i, j + 2)`.

The second case requires the current string character to match `p[j]`, consumes one string character, and keeps the same pattern block with `dp(i + 1, j)`.

These two cases cover every valid use of `*`.

Since the recurrence exactly follows the definition of the pattern language, and since the answer is `dp(0, 0)`, the algorithm returns `true` exactly when the pattern matches the entire string.

## Complexity

| Metric | Value | Why |
|---|---|---|
| Time | `O(mn)` | There are at most `m * n` suffix states |
| Space | `O(mn)` | Memoization stores states |

Here:

```text
m = len(s)
n = len(p)
```

Strictly, the number of states is `(m + 1) * (n + 1)`.

## Implementation

```python
from functools import cache

class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        @cache
        def dp(i: int, j: int) -> bool:
            if j == len(p):
                return i == len(s)

            first_match = (
                i < len(s)
                and (s[i] == p[j] or p[j] == ".")
            )

            if j + 1 < len(p) and p[j + 1] == "*":
                return (
                    dp(i, j + 2)
                    or (first_match and dp(i + 1, j))
                )

            return first_match and dp(i + 1, j + 1)

        return dp(0, 0)
```

## Code Explanation

The memoized function:

```python
dp(i, j)
```

means:

```text
s[i:] matches p[j:]
```

The base case:

```python
if j == len(p):
    return i == len(s)
```

If the pattern is exhausted, the string must also be exhausted.

The current character match check:

```python
first_match = (
    i < len(s)
    and (s[i] == p[j] or p[j] == ".")
)
```

This handles both ordinary characters and `.`.

The star case:

```python
if j + 1 < len(p) and p[j + 1] == "*":
```

The zero-occurrence branch:

```python
dp(i, j + 2)
```

The one-or-more branch:

```python
first_match and dp(i + 1, j)
```

The normal case:

```python
return first_match and dp(i + 1, j + 1)
```

Finally:

```python
return dp(0, 0)
```

## Testing

```python
def run_tests():
    s = Solution()

    assert s.isMatch("aa", "a") is False
    assert s.isMatch("aa", "a*") is True
    assert s.isMatch("ab", ".*") is True
    assert s.isMatch("aab", "c*a*b") is True
    assert s.isMatch("mississippi", "mis*is*p*.") is False
    assert s.isMatch("", "c*") is True
    assert s.isMatch("", ".") is False
    assert s.isMatch("ab", ".*c") is False
    assert s.isMatch("aaa", "ab*a*c*a") is True

    print("all tests passed")

run_tests()
```

| Test | Why |
|---|---|
| `"aa"`, `"a"` | Pattern must match the whole string |
| `"aa"`, `"a*"` | `*` matches multiple characters |
| `"ab"`, `".*"` | Dot with star can match any sequence |
| `"aab"`, `"c*a*b"` | Star can match zero or more copies |
| `"mississippi"`, `"mis*is*p*."` | Complex false case |
| `""`, `"c*"` | Star can match empty string |
| `""`, `"."` | Dot needs one character |
| `"ab"`, `".*c"` | Full match prevents partial success |
| `"aaa"`, `"ab*a*c*a"` | Mixed zero and nonzero star use |

