A clear explanation of the Shortest Word Distance II problem using preprocessing and two pointers.
Problem Restatement
We need to design a class called WordDistance.
The class receives a list of words in the constructor.
Then it must support a method called shortest(word1, word2).
This method returns the shortest index distance between word1 and word2 in the original list.
The important detail is that shortest will be called many times with different pairs of words. So we should avoid scanning the whole word list every time. The problem guarantees that word1 and word2 are different and both exist in the list.
Input and Output
| Item | Meaning |
|---|---|
| Constructor input | A list of words |
| Method input | Two different words, word1 and word2 |
| Method output | The minimum distance between any occurrence of the two words |
| Guarantee | Both words exist in the list |
| Repeated calls | shortest may be called many times |
Example class shape:
class WordDistance:
def __init__(self, wordsDict: list[str]):
...
def shortest(self, word1: str, word2: str) -> int:
...Examples
Consider:
wordsDict = ["practice", "makes", "perfect", "coding", "makes"]Query:
word1 = "coding"
word2 = "practice""coding" appears at index 3.
"practice" appears at index 0.
The distance is:
abs(3 - 0) = 3So the answer is:
3Another query:
word1 = "makes"
word2 = "coding""makes" appears at indices 1 and 4.
"coding" appears at index 3.
The closest pair is index 4 and index 3.
The answer is:
1First Thought: Scan Every Time
For each call to shortest, we could scan the whole list and use the same idea from LeetCode 243.
class WordDistance:
def __init__(self, wordsDict: list[str]):
self.wordsDict = wordsDict
def shortest(self, word1: str, word2: str) -> int:
last1 = -1
last2 = -1
answer = float("inf")
for i, word in enumerate(self.wordsDict):
if word == word1:
last1 = i
elif word == word2:
last2 = i
if last1 != -1 and last2 != -1:
answer = min(answer, abs(last1 - last2))
return answerThis is correct, but each query costs O(n).
Since the method may be called many times, repeated full scans are wasteful.
Key Insight
The word list does not change after construction.
So we can preprocess it once.
For each word, store all indices where it appears.
For example:
wordsDict = ["practice", "makes", "perfect", "coding", "makes"]We store:
{
"practice": [0],
"makes": [1, 4],
"perfect": [2],
"coding": [3],
}Now each query only needs the two index lists.
Because we collect indices from left to right, each list is already sorted.
So the problem becomes:
Given two sorted lists of indices, find the smallest absolute difference between one index from the first list and one index from the second list.
Algorithm
During construction:
- Create a hash map from word to list of indices.
- Scan
wordsDict. - Append each index to the list for that word.
During shortest(word1, word2):
- Get the index list for
word1. - Get the index list for
word2. - Use two pointers, one for each list.
- Compare the current indices.
- Update the best distance.
- Move the pointer with the smaller index.
Why move the smaller index?
Suppose:
indices1[i] < indices2[j]Moving j forward only makes indices2[j] larger, so the distance from indices1[i] cannot improve.
The only useful move is to advance i, because that may bring the first index closer to indices2[j].
Correctness
The constructor stores every occurrence of every word. Therefore, for any query, the two lists retrieved by shortest contain exactly all valid positions of word1 and word2.
The two lists are sorted because the constructor scans the original list from left to right.
The shortest method uses two pointers over these sorted lists. At each step, it checks the distance between the current pair. If the left index is smaller, advancing the right pointer cannot make that left index closer, because the right index would only increase. Therefore, the only pair that may improve the answer uses the next left index. The symmetric argument holds when the right index is smaller.
Thus, the algorithm discards only pairs that cannot improve the current best distance. Since it checks every pair that can possibly be optimal, the final answer is the shortest distance.
Complexity
Let n be the number of words in wordsDict.
Let a be the number of occurrences of word1.
Let b be the number of occurrences of word2.
| Operation | Time | Space |
|---|---|---|
| Constructor | O(n) | O(n) |
shortest(word1, word2) | O(a + b) | O(1) |
The hash map stores all indices once, so constructor space is O(n).
The query uses only two pointers and one answer variable.
Implementation
from collections import defaultdict
class WordDistance:
def __init__(self, wordsDict: list[str]):
self.positions = defaultdict(list)
for i, word in enumerate(wordsDict):
self.positions[word].append(i)
def shortest(self, word1: str, word2: str) -> int:
positions1 = self.positions[word1]
positions2 = self.positions[word2]
i = 0
j = 0
answer = float("inf")
while i < len(positions1) and j < len(positions2):
answer = min(answer, abs(positions1[i] - positions2[j]))
if positions1[i] < positions2[j]:
i += 1
else:
j += 1
return answerCode Explanation
The constructor builds a dictionary:
self.positions = defaultdict(list)Each key is a word.
Each value is a list of indices where that word appears.
for i, word in enumerate(wordsDict):
self.positions[word].append(i)Since enumerate visits indices in increasing order, every index list is sorted automatically.
In shortest, we retrieve the two sorted lists:
positions1 = self.positions[word1]
positions2 = self.positions[word2]Then we use two pointers:
i = 0
j = 0At every step, we compute the distance between the current two positions:
answer = min(answer, abs(positions1[i] - positions2[j]))Then we move the pointer with the smaller index:
if positions1[i] < positions2[j]:
i += 1
else:
j += 1This moves the smaller index forward, giving it a chance to get closer to the larger index.
Testing
def run_tests():
wd = WordDistance(["practice", "makes", "perfect", "coding", "makes"])
assert wd.shortest("coding", "practice") == 3
assert wd.shortest("makes", "coding") == 1
wd = WordDistance(["a", "b"])
assert wd.shortest("a", "b") == 1
wd = WordDistance(["a", "x", "b", "a", "b"])
assert wd.shortest("a", "b") == 1
wd = WordDistance(["x", "a", "x", "x", "b", "x", "a"])
assert wd.shortest("a", "b") == 2
print("all tests passed")
run_tests()| Test | Why |
|---|---|
| Standard example | Checks the given behavior |
| Repeated word | Confirms closest occurrence is selected |
| Two words only | Checks the smallest useful input |
| Multiple occurrences on both sides | Checks two-pointer movement |
| Larger gap | Confirms distance is computed correctly |