# K Way Loser Tree Merge

# K Way Loser Tree Merge

K way loser tree merge is an optimized method for merging multiple sorted runs. It replaces the standard heap with a tournament tree that stores the loser of each comparison. This reduces the number of comparisons and improves performance in external sorting.

It is widely used in high-performance external merge sort implementations.

## Problem

Given $k$ sorted input streams, merge them into one sorted output efficiently.

The goal is to minimize comparisons and maintain good cache and I/O behavior.

## Core Idea

Organize the $k$ input streams into a binary tournament tree.

* each leaf represents a stream
* internal nodes store the loser of a comparison
* the root stores the overall winner (smallest element)

After outputting the winner, only the path from that leaf to the root needs to be updated.

## Structure

```text id="r5c7mb"
           root (winner)
          /            \
      loser           loser
     /    \          /    \
   ...    ...      ...    ...
  leaf   leaf    leaf   leaf
```

Each internal node stores the loser of its two children. The winner propagates upward.

## Algorithm

### Initialization

```text id="m8x2qt"
build_loser_tree(streams):
    for each stream i:
        read first element

    build tournament tree:
        compare leaves
        store losers in internal nodes
        propagate winners upward
```

### Merge Loop

```text id="v9k1sd"
loser_tree_merge(streams):
    build_loser_tree(streams)

    while streams not empty:
        winner = root
        output winner value

        advance winner's stream

        if stream not empty:
            update path from leaf to root
        else:
            mark leaf as exhausted
```

## Example

Suppose we merge 4 sorted runs:

$$
A = [1, 5, 9],\quad
B = [2, 6, 10],\quad
C = [3, 7],\quad
D = [4, 8]
$$

Initial tournament selects:

$$
1 \text{ (from A)}
$$

After outputting 1, replace with 5 and update only the path involving A.

Next winners:

$$
2, 3, 4, 5, 6, 7, 8, 9, 10
$$

## Correctness

The tournament tree maintains the smallest element among all current stream heads at the root. Each internal node compares two candidates and stores the loser, while the winner advances upward.

After emitting the root value, only the corresponding leaf changes. Recomputing comparisons along the path from that leaf to the root restores the invariant that the root contains the smallest remaining element.

Since each stream is sorted, no later element in a stream can be smaller than its current head. Therefore the root always holds the smallest remaining element across all streams, and the output sequence is sorted.

## Complexity

Let:

* $N$ be total elements
* $k$ be number of streams

Initialization cost:

$$
O(k)
$$

Each output step updates a path of length $\log k$:

$$
O(\log k)
$$

Total time:

$$
O(N \log k)
$$

The constant factor is smaller than heap-based merging because fewer comparisons are performed per update.

## Comparison with Heap Merge

| method      | comparisons per output                  | structure       |
| ----------- | --------------------------------------- | --------------- |
| binary heap | about $\log k$ with more swaps          | heap            |
| loser tree  | exactly $\log k$ with fewer comparisons | tournament tree |

Loser trees avoid repeated comparisons between unchanged elements.

## When to Use

Use loser tree merge when:

* merging many sorted runs
* performance of merge phase is critical
* external sorting with large $k$
* minimizing comparisons matters
* stable, predictable merge cost is needed

It is common in database systems and high-performance sort engines.

## Implementation Sketch

```python id="p8k4rs"
class LoserTree:
    def __init__(self, streams):
        self.k = len(streams)
        self.streams = streams
        self.tree = [-1] * self.k
        self.keys = [None] * self.k

        for i in range(self.k):
            self._advance(i)

        for i in range(self.k):
            self._play(i)
```

```python id="z2w9ac"
    def _advance(self, i):
        try:
            self.keys[i] = next(self.streams[i])
        except StopIteration:
            self.keys[i] = float("inf")
```

```python id="y1t6qp"
    def _play(self, i):
        parent = (i + self.k) // 2
        while parent > 0:
            j = self.tree[parent]
            if j == -1 or self.keys[i] < self.keys[j]:
                self.tree[parent], i = i, j
            parent //= 2
        self.tree[0] = i
```

```python id="n4d7bm"
    def pop(self):
        winner = self.tree[0]
        value = self.keys[winner]
        self._advance(winner)
        self._play(winner)
        return value
```

## Notes

Loser trees are specialized for merging. They reduce comparison overhead compared to heaps and fit well with external sorting, where merging large numbers of runs is the dominant cost.

