# LeetCode 737: Sentence Similarity II

## Problem Restatement

We are given two sentences represented as arrays of words:

```python
sentence1
sentence2
```

We are also given a list of similar word pairs:

```python
similarPairs
```

Each pair:

```python
[x, y]
```

means word `x` is similar to word `y`.

Two sentences are similar if:

1. They have the same number of words.
2. For every index `i`, `sentence1[i]` is similar to `sentence2[i]`.

A word is always similar to itself.

Similarity is symmetric and transitive. If `a` is similar to `b`, and `b` is similar to `c`, then `a` is similar to `c`. This is the main difference from LeetCode 734.

## Input and Output

| Item | Meaning |
|---|---|
| Input | `sentence1`, `sentence2`, and `similarPairs` |
| Output | `True` if the sentences are similar, otherwise `False` |
| Same word | Always considered similar |
| Symmetric | If `x` is similar to `y`, then `y` is similar to `x` |
| Transitive | If `x ~ y` and `y ~ z`, then `x ~ z` |

Example function shape:

```python
def areSentencesSimilarTwo(
    sentence1: list[str],
    sentence2: list[str],
    similarPairs: list[list[str]],
) -> bool:
    ...
```

## Examples

### Example 1

```python
sentence1 = ["great", "acting", "skills"]
sentence2 = ["fine", "drama", "talent"]
similarPairs = [
    ["great", "good"],
    ["fine", "good"],
    ["drama", "acting"],
    ["skills", "talent"],
]
```

`"great"` is similar to `"fine"` because:

```text
great -> good -> fine
```

The other aligned pairs are also similar.

So the answer is:

```python
True
```

### Example 2

```python
sentence1 = ["I", "love", "leetcode"]
sentence2 = ["I", "love", "onepiece"]
similarPairs = [
    ["manga", "onepiece"],
    ["platform", "anime"],
    ["leetcode", "platform"],
    ["anime", "manga"],
]
```

The first two words are equal.

For the last word:

```text
leetcode -> platform -> anime -> manga -> onepiece
```

So `"leetcode"` is similar to `"onepiece"` by transitivity.

The answer is:

```python
True
```

### Example 3

```python
sentence1 = ["great"]
sentence2 = ["doubleplus", "good"]
similarPairs = [
    ["great", "good"],
]
```

The sentence lengths are different.

So the answer is:

```python
False
```

## First Thought: Build a Graph

Each word is a node.

Each similar pair is an undirected edge.

Then two words are similar if they are in the same connected component.

We could run DFS or BFS for every aligned pair. That works, but it may repeat the same graph search many times.

A better fit is union-find.

## Key Insight

Union-find keeps connected components.

For every similar pair `(a, b)`, we merge the components of `a` and `b`.

After all pairs are processed, two words are similar exactly when they have the same root.

So the problem becomes:

1. Union every similar pair.
2. Compare every aligned sentence word pair.
3. If the words are equal, continue.
4. Otherwise, check whether they have the same union-find root.

## Algorithm

Create a union-find structure over strings.

For each pair `[a, b]` in `similarPairs`:

1. Add `a` and `b` if they are not already present.
2. Union their components.

Then check the two sentences:

1. If their lengths differ, return `False`.
2. For every aligned pair `(a, b)`:
   - If `a == b`, continue.
   - If either word is missing from union-find, return `False`.
   - If `find(a) != find(b)`, return `False`.
3. Return `True`.

## Correctness

Each similar pair says that two words belong to the same similarity group. The union operation merges exactly those two groups. Since union is repeated for every pair, all words connected by a chain of similar pairs end up in the same component.

Because similarity is transitive, every word in the same connected component is similar to every other word in that component. Because similarity is symmetric, the edges can be treated as undirected.

When checking the sentences, equal words are always similar. For different words, the algorithm accepts them only when their union-find roots are equal. That means they belong to the same connected component, so they are similar by the transitive rule.

If any aligned pair has different roots, there is no chain of similar pairs connecting them, so the sentences are not similar.

Therefore the algorithm returns `True` exactly when the two sentences satisfy the problem definition.

## Complexity

Let `p = len(similarPairs)` and `n = len(sentence1)`.

| Metric | Value | Why |
|---|---:|---|
| Time | `O((p + n) * α(w))` | Union and find are almost constant time |
| Space | `O(w)` | Store parent pointers for distinct words |

Here `w` is the number of distinct words in `similarPairs`, and `α` is the inverse Ackermann function.

For practical input sizes, union-find operations are effectively constant time.

## Implementation

```python
class Solution:
    def areSentencesSimilarTwo(
        self,
        sentence1: list[str],
        sentence2: list[str],
        similarPairs: list[list[str]],
    ) -> bool:
        if len(sentence1) != len(sentence2):
            return False

        parent = {}

        def find(x: str) -> str:
            if x not in parent:
                parent[x] = x

            if parent[x] != x:
                parent[x] = find(parent[x])

            return parent[x]

        def union(a: str, b: str) -> None:
            root_a = find(a)
            root_b = find(b)

            if root_a != root_b:
                parent[root_b] = root_a

        for a, b in similarPairs:
            union(a, b)

        for a, b in zip(sentence1, sentence2):
            if a == b:
                continue

            if a not in parent or b not in parent:
                return False

            if find(a) != find(b):
                return False

        return True
```

## Code Explanation

We first reject sentences of different lengths:

```python
if len(sentence1) != len(sentence2):
    return False
```

The `parent` dictionary stores the union-find forest:

```python
parent = {}
```

The `find` function creates a new singleton set when needed:

```python
if x not in parent:
    parent[x] = x
```

Then it follows parent links until it reaches the component root.

Path compression makes future lookups faster:

```python
if parent[x] != x:
    parent[x] = find(parent[x])
```

The `union` function connects two components:

```python
if root_a != root_b:
    parent[root_b] = root_a
```

After processing all similar pairs, every connected group has one representative root.

For each aligned word pair, equal words pass immediately:

```python
if a == b:
    continue
```

If two different words do not appear in the similarity structure, no pair chain can connect them:

```python
if a not in parent or b not in parent:
    return False
```

Otherwise, they must share the same root:

```python
if find(a) != find(b):
    return False
```

## Testing

```python
def run_tests():
    s = Solution()

    assert s.areSentencesSimilarTwo(
        ["great", "acting", "skills"],
        ["fine", "drama", "talent"],
        [
            ["great", "good"],
            ["fine", "good"],
            ["drama", "acting"],
            ["skills", "talent"],
        ],
    ) is True

    assert s.areSentencesSimilarTwo(
        ["I", "love", "leetcode"],
        ["I", "love", "onepiece"],
        [
            ["manga", "onepiece"],
            ["platform", "anime"],
            ["leetcode", "platform"],
            ["anime", "manga"],
        ],
    ) is True

    assert s.areSentencesSimilarTwo(
        ["great"],
        ["doubleplus", "good"],
        [["great", "good"]],
    ) is False

    assert s.areSentencesSimilarTwo(
        ["great"],
        ["great"],
        [],
    ) is True

    assert s.areSentencesSimilarTwo(
        ["great"],
        ["fine"],
        [["great", "good"], ["good", "fine"]],
    ) is True

    assert s.areSentencesSimilarTwo(
        ["great"],
        ["fine"],
        [["great", "good"], ["bad", "fine"]],
    ) is False

    print("all tests passed")

run_tests()
```

| Test | Why |
|---|---|
| Similar through `"good"` | Checks transitive similarity |
| Longer chain to `"onepiece"` | Checks multi-step connectivity |
| Different sentence lengths | Must fail immediately |
| Same word with no pairs | Self-similarity |
| Two-edge chain | Confirms transitivity |
| Separate components | Must fail |