Skip to content

Database External Sort

External sorting procedure used by database systems for ORDER BY, GROUP BY, DISTINCT, and sort-merge joins.

Database external sort is the disk-based sorting procedure used inside relational database query engines. It appears in query plans for operations such as ORDER BY, GROUP BY, DISTINCT, window functions, and sort-merge joins.

The method is usually a variant of external merge sort with database-specific concerns: rows, tuple comparison, memory quotas, spill files, collation rules, NULL ordering, and stable execution under limited working memory.

Problem

Given a stream of database rows and a sort key, produce rows ordered by that key.

The input may exceed the memory granted to the query operator, so the sort must spill intermediate data to disk.

Core Idea

The database sort operator has two main modes:

  1. In-memory sort If all rows fit in the memory grant, sort directly and emit results.

  2. External sort If rows exceed memory, sort chunks into temporary runs, write them to disk, then merge the runs.

database_external_sort(rows, key, memory_limit):
    runs = []

    while more rows:
        chunk = read rows until memory_limit
        sort chunk by key
        write chunk as temp run
        runs.append(temp run)

    if runs is empty:
        return sorted in-memory chunk

    return merge_runs(runs, key)

Run Generation

Rows are read from the child operator until the memory limit is reached. The engine sorts those rows in memory and writes a sorted run to temporary storage.

generate_runs(input, key, memory_limit):
    runs = []

    while input has rows:
        chunk = read rows until memory_limit
        sort chunk using database comparator
        runs.append(write_temp_run(chunk))

    return runs

The comparator must respect database semantics:

  • ascending or descending order
  • NULLS FIRST or NULLS LAST
  • collation for strings
  • multiple key columns
  • tie-breaking if stable order is required

Merge Phase

The merge phase performs a k-way merge over sorted runs.

merge_runs(runs, key):
    heap = empty priority queue

    for each run:
        read first row
        push row into heap

    while heap not empty:
        row = extract smallest row
        emit row

        read next row from same run
        if row exists:
            push row into heap

Example

Input rows:

idnamescore
1Ada91
2Ken75
3Lin88
4Bea75
5Sol99

Sort key:

ORDER BY score ASC, name ASC

If memory holds only three rows, run generation may produce:

runrows
R1(75, Ken), (88, Lin), (91, Ada)
R2(75, Bea), (99, Sol)

The merge compares run heads:

stepcandidatesemitted
1Ken 75, Bea 75Bea 75
2Ken 75, Sol 99Ken 75
3Lin 88, Sol 99Lin 88
4Ada 91, Sol 99Ada 91
5Sol 99Sol 99

Final order:

idnamescore
4Bea75
2Ken75
3Lin88
1Ada91
5Sol99

Correctness

Each generated run is sorted according to the database comparator. The merge phase repeatedly emits the smallest available row among all run heads.

Since each run is sorted, no later row in a run can compare smaller than that run head. Therefore each emitted row is the smallest remaining row globally. After all runs are exhausted, every input row has been emitted once in sorted order.

Complexity

Let:

  • NN be the number of rows
  • MM be the memory grant in rows
  • R=N/MR = \lceil N / M \rceil be the number of runs
  • BB be the block size
  • kk be merge fan-in

Run generation costs:

O(NlogM) O(N \log M)

CPU time for in-memory sorting.

Merge CPU cost:

O(Nlogk) O(N \log k)

per merge level.

I/O cost:

O(NBlogkR) O\left(\frac{N}{B} \log_k R\right)

block transfers, plus the initial read and final write or output stream.

Database-Specific Details

concerneffect
memory grantdetermines whether sort spills
temp storageholds sorted runs
row widthaffects number of rows per run
collationaffects string comparison cost
NULL orderingaffects comparator logic
LIMITmay allow top-k sort instead of full sort
DISTINCTmay remove duplicates during merge
GROUP BYmay aggregate while scanning sorted groups

When to Use

Database external sort is used when a query requires ordered rows and no suitable index already provides the needed order.

Common cases:

  • ORDER BY
  • sort-based GROUP BY
  • DISTINCT
  • window functions with ordering
  • sort-merge join
  • index creation

Implementation Sketch

import heapq

def database_external_sort(rows, key, memory_limit):
    runs = []

    for i in range(0, len(rows), memory_limit):
        chunk = rows[i:i + memory_limit]
        chunk.sort(key=key)
        runs.append(chunk)

    return merge_sorted_runs(runs, key)
def merge_sorted_runs(runs, key):
    heap = []
    iters = [iter(run) for run in runs]

    for run_id, it in enumerate(iters):
        try:
            row = next(it)
            heapq.heappush(heap, (key(row), run_id, row))
        except StopIteration:
            pass

    out = []

    while heap:
        _, run_id, row = heapq.heappop(heap)
        out.append(row)

        try:
            nxt = next(iters[run_id])
            heapq.heappush(heap, (key(nxt), run_id, nxt))
        except StopIteration:
            pass

    return out

Notes

Database external sort is an execution operator, not only a textbook algorithm. Its performance depends heavily on memory grants, row representation, temporary storage, and whether the optimizer can avoid sorting by using an existing index order.