# Array Union

# Array Union

Array union combines elements from two arrays and returns all distinct values. It can also be defined as a multiset union that preserves counts.

You use it when merging datasets while eliminating duplicates.

## Problem

Given arrays $A$ and $B$, compute:

$$
A \cup B
$$

Two common variants:

* distinct union: each value appears once
* multiset union: each value appears $\max(count_A, count_B)$ times

## Algorithm

For distinct union, use a set.

```id="array-union-distinct"
union_distinct(A, B):
    S = empty set

    for x in A:
        add x to S

    for x in B:
        add x to S

    return elements of S
```

For multiset union, use counts.

```id="array-union-multiset"
union_multiset(A, B):
    countA = frequency map of A
    countB = frequency map of B

    R = empty array

    for each key in union of keys:
        k = max(countA[key], countB[key])
        append key k times to R

    return R
```

## Example

Let

$$
A = [8, 3, 5, 3]
$$

and

$$
B = [3, 3, 8, 1]
$$

Distinct union:

| step  | set          |
| ----- | ------------ |
| add A | {8, 3, 5}    |
| add B | {8, 3, 5, 1} |

Result:

$$
[8, 3, 5, 1]
$$

Multiset union:

| value | count in A | count in B | max |
| ----- | ---------: | ---------: | --: |
| 8     |          1 |          1 |   1 |
| 3     |          2 |          2 |   2 |
| 5     |          1 |          0 |   1 |
| 1     |          0 |          1 |   1 |

Result:

$$
[8, 3, 3, 5, 1]
$$

## Correctness

For the distinct version, the set contains all unique elements from both arrays. Adding elements from both arrays ensures that every value present in either array appears exactly once.

For the multiset version, taking the maximum count for each value ensures that the resulting frequency matches the definition of multiset union.

## Complexity

| variant  | time       | space      |
| -------- | ---------- | ---------- |
| distinct | $O(n + m)$ | $O(n + m)$ |
| multiset | $O(n + m)$ | $O(n + m)$ |

## When to Use

Array union is appropriate when:

* combining datasets
* removing duplicates
* computing coverage of values

If arrays are sorted, a two-pointer method can compute the union in:

$$
O(n + m)
$$

time with:

$$
O(1)
$$

extra space.

## Implementation

```python id="array-union-python"
def union_distinct(a, b):
    return list(set(a) | set(b))

def union_multiset(a, b):
    from collections import Counter
    ca = Counter(a)
    cb = Counter(b)

    result = []
    keys = set(ca) | set(cb)

    for k in keys:
        count = max(ca[k], cb[k])
        result.extend([k] * count)

    return result
```

```go id="array-union-go"
func UnionDistinct(a, b []int) []int {
    s := make(map[int]struct{})

    for _, x := range a {
        s[x] = struct{}{}
    }

    for _, x := range b {
        s[x] = struct{}{}
    }

    result := make([]int, 0, len(s))
    for k := range s {
        result = append(result, k)
    }

    return result
}

func UnionMultiset(a, b []int) []int {
    countA := make(map[int]int)
    countB := make(map[int]int)

    for _, x := range a {
        countA[x]++
    }

    for _, x := range b {
        countB[x]++
    }

    result := []int{}
    keys := make(map[int]struct{})

    for k := range countA {
        keys[k] = struct{}{}
    }
    for k := range countB {
        keys[k] = struct{}{}
    }

    for k := range keys {
        c := countA[k]
        if countB[k] > c {
            c = countB[k]
        }

        for i := 0; i < c; i++ {
            result = append(result, k)
        }
    }

    return result
}
```

