LeetCode 737: Sentence Similarity II

Problem Restatement

We are given two sentences represented as arrays of words:

sentence1
sentence2

We are also given a list of similar word pairs:

similarPairs

Each pair:

[x, y]

means word x is similar to word y.

Two sentences are similar if:

They have the same number of words.
For every index i, sentence1[i] is similar to sentence2[i].

A word is always similar to itself.

Similarity is symmetric and transitive. If a is similar to b, and b is similar to c, then a is similar to c. This is the main difference from LeetCode 734.

Input and Output

Item	Meaning
Input	`sentence1`, `sentence2`, and `similarPairs`
Output	`True` if the sentences are similar, otherwise `False`
Same word	Always considered similar
Symmetric	If `x` is similar to `y`, then `y` is similar to `x`
Transitive	If `x ~ y` and `y ~ z`, then `x ~ z`

Example function shape:

def areSentencesSimilarTwo(
    sentence1: list[str],
    sentence2: list[str],
    similarPairs: list[list[str]],
) -> bool:
    ...

Examples

Example 1

sentence1 = ["great", "acting", "skills"]
sentence2 = ["fine", "drama", "talent"]
similarPairs = [
    ["great", "good"],
    ["fine", "good"],
    ["drama", "acting"],
    ["skills", "talent"],
]

"great" is similar to "fine" because:

great -> good -> fine

The other aligned pairs are also similar.

So the answer is:

True

Example 2

sentence1 = ["I", "love", "leetcode"]
sentence2 = ["I", "love", "onepiece"]
similarPairs = [
    ["manga", "onepiece"],
    ["platform", "anime"],
    ["leetcode", "platform"],
    ["anime", "manga"],
]

The first two words are equal.

For the last word:

leetcode -> platform -> anime -> manga -> onepiece

So "leetcode" is similar to "onepiece" by transitivity.

The answer is:

True

Example 3

sentence1 = ["great"]
sentence2 = ["doubleplus", "good"]
similarPairs = [
    ["great", "good"],
]

The sentence lengths are different.

So the answer is:

False

First Thought: Build a Graph

Each word is a node.

Each similar pair is an undirected edge.

Then two words are similar if they are in the same connected component.

We could run DFS or BFS for every aligned pair. That works, but it may repeat the same graph search many times.

A better fit is union-find.

Key Insight

Union-find keeps connected components.

For every similar pair (a, b), we merge the components of a and b.

After all pairs are processed, two words are similar exactly when they have the same root.

So the problem becomes:

Union every similar pair.
Compare every aligned sentence word pair.
If the words are equal, continue.
Otherwise, check whether they have the same union-find root.

Algorithm

Create a union-find structure over strings.

For each pair [a, b] in similarPairs:

Add a and b if they are not already present.
Union their components.

Then check the two sentences:

If their lengths differ, return False.
For every aligned pair (a, b):
- If a == b, continue.
- If either word is missing from union-find, return False.
- If find(a) != find(b), return False.
Return True.

Correctness

Each similar pair says that two words belong to the same similarity group. The union operation merges exactly those two groups. Since union is repeated for every pair, all words connected by a chain of similar pairs end up in the same component.

Because similarity is transitive, every word in the same connected component is similar to every other word in that component. Because similarity is symmetric, the edges can be treated as undirected.

When checking the sentences, equal words are always similar. For different words, the algorithm accepts them only when their union-find roots are equal. That means they belong to the same connected component, so they are similar by the transitive rule.

If any aligned pair has different roots, there is no chain of similar pairs connecting them, so the sentences are not similar.

Therefore the algorithm returns True exactly when the two sentences satisfy the problem definition.

Complexity

Let p = len(similarPairs) and n = len(sentence1).

Metric	Value	Why
Time	`O((p + n) * α(w))`	Union and find are almost constant time
Space	`O(w)`	Store parent pointers for distinct words

Here w is the number of distinct words in similarPairs, and α is the inverse Ackermann function.

For practical input sizes, union-find operations are effectively constant time.

Implementation

class Solution:
    def areSentencesSimilarTwo(
        self,
        sentence1: list[str],
        sentence2: list[str],
        similarPairs: list[list[str]],
    ) -> bool:
        if len(sentence1) != len(sentence2):
            return False

        parent = {}

        def find(x: str) -> str:
            if x not in parent:
                parent[x] = x

            if parent[x] != x:
                parent[x] = find(parent[x])

            return parent[x]

        def union(a: str, b: str) -> None:
            root_a = find(a)
            root_b = find(b)

            if root_a != root_b:
                parent[root_b] = root_a

        for a, b in similarPairs:
            union(a, b)

        for a, b in zip(sentence1, sentence2):
            if a == b:
                continue

            if a not in parent or b not in parent:
                return False

            if find(a) != find(b):
                return False

        return True

Code Explanation

We first reject sentences of different lengths:

if len(sentence1) != len(sentence2):
    return False

The parent dictionary stores the union-find forest:

parent = {}

The find function creates a new singleton set when needed:

if x not in parent:
    parent[x] = x

Then it follows parent links until it reaches the component root.

Path compression makes future lookups faster:

if parent[x] != x:
    parent[x] = find(parent[x])

The union function connects two components:

if root_a != root_b:
    parent[root_b] = root_a

After processing all similar pairs, every connected group has one representative root.

For each aligned word pair, equal words pass immediately:

if a == b:
    continue

If two different words do not appear in the similarity structure, no pair chain can connect them:

if a not in parent or b not in parent:
    return False

Otherwise, they must share the same root:

if find(a) != find(b):
    return False

Testing

def run_tests():
    s = Solution()

    assert s.areSentencesSimilarTwo(
        ["great", "acting", "skills"],
        ["fine", "drama", "talent"],
        [
            ["great", "good"],
            ["fine", "good"],
            ["drama", "acting"],
            ["skills", "talent"],
        ],
    ) is True

    assert s.areSentencesSimilarTwo(
        ["I", "love", "leetcode"],
        ["I", "love", "onepiece"],
        [
            ["manga", "onepiece"],
            ["platform", "anime"],
            ["leetcode", "platform"],
            ["anime", "manga"],
        ],
    ) is True

    assert s.areSentencesSimilarTwo(
        ["great"],
        ["doubleplus", "good"],
        [["great", "good"]],
    ) is False

    assert s.areSentencesSimilarTwo(
        ["great"],
        ["great"],
        [],
    ) is True

    assert s.areSentencesSimilarTwo(
        ["great"],
        ["fine"],
        [["great", "good"], ["good", "fine"]],
    ) is True

    assert s.areSentencesSimilarTwo(
        ["great"],
        ["fine"],
        [["great", "good"], ["bad", "fine"]],
    ) is False

    print("all tests passed")

run_tests()

Test	Why
Similar through `"good"`	Checks transitive similarity
Longer chain to `"onepiece"`	Checks multi-step connectivity
Different sentence lengths	Must fail immediately
Same word with no pairs	Self-similarity
Two-edge chain	Confirms transitivity
Separate components	Must fail