# LeetCode 244: Shortest Word Distance II

## Problem Restatement

We need to design a class called `WordDistance`.

The class receives a list of words in the constructor.

Then it must support a method called `shortest(word1, word2)`.

This method returns the shortest index distance between `word1` and `word2` in the original list.

The important detail is that `shortest` will be called many times with different pairs of words. So we should avoid scanning the whole word list every time. The problem guarantees that `word1` and `word2` are different and both exist in the list.

## Input and Output

| Item | Meaning |
|---|---|
| Constructor input | A list of words |
| Method input | Two different words, `word1` and `word2` |
| Method output | The minimum distance between any occurrence of the two words |
| Guarantee | Both words exist in the list |
| Repeated calls | `shortest` may be called many times |

Example class shape:

```python
class WordDistance:

    def __init__(self, wordsDict: list[str]):
        ...

    def shortest(self, word1: str, word2: str) -> int:
        ...
```

## Examples

Consider:

```python
wordsDict = ["practice", "makes", "perfect", "coding", "makes"]
```

Query:

```python
word1 = "coding"
word2 = "practice"
```

`"coding"` appears at index `3`.

`"practice"` appears at index `0`.

The distance is:

```python
abs(3 - 0) = 3
```

So the answer is:

```python
3
```

Another query:

```python
word1 = "makes"
word2 = "coding"
```

`"makes"` appears at indices `1` and `4`.

`"coding"` appears at index `3`.

The closest pair is index `4` and index `3`.

The answer is:

```python
1
```

## First Thought: Scan Every Time

For each call to `shortest`, we could scan the whole list and use the same idea from LeetCode 243.

```python
class WordDistance:

    def __init__(self, wordsDict: list[str]):
        self.wordsDict = wordsDict

    def shortest(self, word1: str, word2: str) -> int:
        last1 = -1
        last2 = -1
        answer = float("inf")

        for i, word in enumerate(self.wordsDict):
            if word == word1:
                last1 = i
            elif word == word2:
                last2 = i

            if last1 != -1 and last2 != -1:
                answer = min(answer, abs(last1 - last2))

        return answer
```

This is correct, but each query costs `O(n)`.

Since the method may be called many times, repeated full scans are wasteful.

## Key Insight

The word list does not change after construction.

So we can preprocess it once.

For each word, store all indices where it appears.

For example:

```python
wordsDict = ["practice", "makes", "perfect", "coding", "makes"]
```

We store:

```python
{
    "practice": [0],
    "makes": [1, 4],
    "perfect": [2],
    "coding": [3],
}
```

Now each query only needs the two index lists.

Because we collect indices from left to right, each list is already sorted.

So the problem becomes:

Given two sorted lists of indices, find the smallest absolute difference between one index from the first list and one index from the second list.

## Algorithm

During construction:

1. Create a hash map from word to list of indices.
2. Scan `wordsDict`.
3. Append each index to the list for that word.

During `shortest(word1, word2)`:

1. Get the index list for `word1`.
2. Get the index list for `word2`.
3. Use two pointers, one for each list.
4. Compare the current indices.
5. Update the best distance.
6. Move the pointer with the smaller index.

Why move the smaller index?

Suppose:

```python
indices1[i] < indices2[j]
```

Moving `j` forward only makes `indices2[j]` larger, so the distance from `indices1[i]` cannot improve.

The only useful move is to advance `i`, because that may bring the first index closer to `indices2[j]`.

## Correctness

The constructor stores every occurrence of every word. Therefore, for any query, the two lists retrieved by `shortest` contain exactly all valid positions of `word1` and `word2`.

The two lists are sorted because the constructor scans the original list from left to right.

The `shortest` method uses two pointers over these sorted lists. At each step, it checks the distance between the current pair. If the left index is smaller, advancing the right pointer cannot make that left index closer, because the right index would only increase. Therefore, the only pair that may improve the answer uses the next left index. The symmetric argument holds when the right index is smaller.

Thus, the algorithm discards only pairs that cannot improve the current best distance. Since it checks every pair that can possibly be optimal, the final answer is the shortest distance.

## Complexity

Let `n` be the number of words in `wordsDict`.

Let `a` be the number of occurrences of `word1`.

Let `b` be the number of occurrences of `word2`.

| Operation | Time | Space |
|---|---:|---:|
| Constructor | `O(n)` | `O(n)` |
| `shortest(word1, word2)` | `O(a + b)` | `O(1)` |

The hash map stores all indices once, so constructor space is `O(n)`.

The query uses only two pointers and one answer variable.

## Implementation

```python
from collections import defaultdict

class WordDistance:

    def __init__(self, wordsDict: list[str]):
        self.positions = defaultdict(list)

        for i, word in enumerate(wordsDict):
            self.positions[word].append(i)

    def shortest(self, word1: str, word2: str) -> int:
        positions1 = self.positions[word1]
        positions2 = self.positions[word2]

        i = 0
        j = 0
        answer = float("inf")

        while i < len(positions1) and j < len(positions2):
            answer = min(answer, abs(positions1[i] - positions2[j]))

            if positions1[i] < positions2[j]:
                i += 1
            else:
                j += 1

        return answer
```

## Code Explanation

The constructor builds a dictionary:

```python
self.positions = defaultdict(list)
```

Each key is a word.

Each value is a list of indices where that word appears.

```python
for i, word in enumerate(wordsDict):
    self.positions[word].append(i)
```

Since `enumerate` visits indices in increasing order, every index list is sorted automatically.

In `shortest`, we retrieve the two sorted lists:

```python
positions1 = self.positions[word1]
positions2 = self.positions[word2]
```

Then we use two pointers:

```python
i = 0
j = 0
```

At every step, we compute the distance between the current two positions:

```python
answer = min(answer, abs(positions1[i] - positions2[j]))
```

Then we move the pointer with the smaller index:

```python
if positions1[i] < positions2[j]:
    i += 1
else:
    j += 1
```

This moves the smaller index forward, giving it a chance to get closer to the larger index.

## Testing

```python
def run_tests():
    wd = WordDistance(["practice", "makes", "perfect", "coding", "makes"])

    assert wd.shortest("coding", "practice") == 3
    assert wd.shortest("makes", "coding") == 1

    wd = WordDistance(["a", "b"])
    assert wd.shortest("a", "b") == 1

    wd = WordDistance(["a", "x", "b", "a", "b"])
    assert wd.shortest("a", "b") == 1

    wd = WordDistance(["x", "a", "x", "x", "b", "x", "a"])
    assert wd.shortest("a", "b") == 2

    print("all tests passed")

run_tests()
```

| Test | Why |
|---|---|
| Standard example | Checks the given behavior |
| Repeated word | Confirms closest occurrence is selected |
| Two words only | Checks the smallest useful input |
| Multiple occurrences on both sides | Checks two-pointer movement |
| Larger gap | Confirms distance is computed correctly |

