Balanced K Way Merge Sort

Balanced k way merge sort is an external sorting algorithm that repeatedly merges sorted runs in groups of at most $k$ . Unlike polyphase merge sort, it tries to keep the merge work evenly distributed across passes.

It is a simple and common model for large file sorting. The input is first divided into memory-sized chunks, each chunk is sorted internally, and then the resulting sorted runs are merged level by level.

Problem

Given a dataset too large to fit in memory, sort it using bounded memory and sequential external I/O.

The algorithm assumes that memory can hold:

one input buffer for each of $k$ runs
one output buffer
a small priority queue or loser tree

Core Idea

Generate sorted runs first. Then repeatedly merge up to $k$ runs into larger sorted runs.

If there are $R$ initial runs, each merge pass reduces the number of runs to about:

\left\lceil \frac{R}{k} \right\rceil

The process stops when only one run remains.

Algorithm

balanced_k_way_merge_sort(input, k):
    runs = generate_sorted_runs(input)

    while length(runs) > 1:
        next_runs = []

        for each group G of at most k runs:
            merged = k_way_merge(G)
            next_runs.append(merged)

        runs = next_runs

    return runs[0]

The merge step keeps the smallest current element from each input run in a min-heap.

k_way_merge(runs):
    heap = empty min-heap

    for each run r:
        read first element from r
        push (value, r) into heap

    while heap is not empty:
        (value, r) = extract_min(heap)
        write value to output

        if r has another element:
            read next value from r
            push (next value, r) into heap

Example

Suppose there are 10 sorted runs and $k = 3$ .

Initial runs:

R = 10

Pass 1 merges groups of 3:

group	input runs	output runs
1	3	1
2	3	1
3	3	1
4	1	1

After pass 1:

R = 4

Pass 2 merges:

group	input runs	output runs
1	3	1
2	1	1

After pass 2:

R = 2

Pass 3 merges both remaining runs:

R = 1

The final run is the sorted output.

Correctness

Each initial run is sorted by an internal sorting algorithm. During a k way merge, the algorithm repeatedly outputs the smallest head element among all active runs. Since every run is sorted, no later element in a run can be smaller than its current head. Therefore each output element is the smallest remaining element overall.

Each merge pass preserves all elements and produces sorted runs. The number of runs decreases after each pass. When one run remains, it contains every input element in sorted order.

Complexity

Let:

$N$ be the number of elements
$M$ be the memory capacity in elements
$R = \lceil N / M \rceil$ be the number of initial runs
$B$ be the block size
$k$ be the merge fan-in

The number of merge passes is:

\left\lceil \log_k R \right\rceil

The I/O cost is:

O\left(\frac{N}{B} \log_k R\right)

The CPU cost of heap-based merging is:

O(N \log k)

per full merge level, with total comparison cost proportional to all merge passes.

Design Choices

choice	effect
larger run size	fewer initial runs
larger $k$	fewer merge passes
smaller $k$	cheaper heap operations
larger buffers	fewer I/O calls
loser tree instead of heap	fewer comparisons during merge

When to Use

Use balanced k way merge sort when:

data is larger than memory
sequential I/O is preferred
many sorted runs must be merged
implementation simplicity matters
file handles and memory buffers are sufficient for $k$ inputs

It is commonly used in database sort operators, external file sorting, and log processing pipelines.

Implementation Sketch

import heapq

def k_way_merge(runs):
    heap = []
    iters = [iter(run) for run in runs]

    for run_id, it in enumerate(iters):
        try:
            value = next(it)
            heapq.heappush(heap, (value, run_id))
        except StopIteration:
            pass

    output = []

    while heap:
        value, run_id = heapq.heappop(heap)
        output.append(value)

        try:
            nxt = next(iters[run_id])
            heapq.heappush(heap, (nxt, run_id))
        except StopIteration:
            pass

    return output

def balanced_k_way_merge_sort(runs, k):
    while len(runs) > 1:
        next_runs = []

        for i in range(0, len(runs), k):
            group = runs[i:i + k]
            next_runs.append(k_way_merge(group))

        runs = next_runs

    return runs[0] if runs else []

Notes

Balanced k way merge sort favors predictable passes and simple scheduling. It usually fits modern storage better than polyphase merge sort because modern systems can often keep many files open and use large sequential buffers.