Memory Mapped External Sort

External sorting method that uses memory-mapped files so large data can be accessed through virtual memory while still sorting in chunks and merging runs.

Memory mapped external sort uses memory-mapped files to sort data larger than ordinary working memory. Instead of manually reading and writing file blocks with explicit I/O calls, the program maps file regions into virtual memory and lets the operating system page data in and out.

This can simplify implementation, but it does not remove the external-memory problem. The algorithm still needs chunking, sorted runs, and merging to avoid random page faults and excessive memory pressure.

Problem

Given a large file of records, sort the records without loading the entire file into normal memory.

The file may be larger than available RAM, but the operating system can map file pages into the process address space.

Core Idea

Use memory mapping for file access, but keep the sorting pattern external-memory friendly:

  1. map a region of the input file
  2. sort a chunk that fits comfortably in memory
  3. write or map the sorted chunk as a run
  4. merge the sorted runs sequentially

The memory map is an access mechanism. The sorting strategy is still external merge sort.
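The mapping step can be sketched with Python's mmap module. This is a minimal, illustrative sketch: it assumes fixed-width 4-byte little-endian records, and the function name sort_chunk_via_mmap is invented for this example.

```python
import mmap
import os
import struct
import tempfile

REC = struct.Struct("<i")  # fixed-width 4-byte records (an assumption)

def sort_chunk_via_mmap(path, offset, count):
    """Map the file, read one chunk region through the mapping,
    sort it in memory, and return the sorted run."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            raw = mm[offset:offset + count * REC.size]
    # The slice copies the region out before unmapping; the OS paged
    # it in on first access rather than via explicit read calls.
    values = [REC.unpack_from(raw, i * REC.size)[0] for i in range(count)]
    values.sort()
    return values

# Small demo file with eight records.
data = [8, 3, 5, 1, 9, 2, 7, 4]
with tempfile.NamedTemporaryFile(delete=False) as f:
    for v in data:
        f.write(REC.pack(v))
    path = f.name

print(sort_chunk_via_mmap(path, 0, 3))  # [3, 5, 8]
os.unlink(path)
```

A production version would also write each run back to a file; here the run is returned in memory to keep the sketch short.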

Algorithm

memory_mapped_external_sort(file, chunk_size):
    runs = []

    for each chunk region in file:
        A = map chunk region
        sort A in memory
        run = write sorted chunk
        runs.append(run)
        unmap A

    return merge_runs(runs)

The merge phase should read runs sequentially.

merge_mapped_runs(runs):
    map small windows from each run
    keep current value from each run in a heap

    while heap is not empty:
        output minimum value
        advance inside that run window

        if window is exhausted:
            map next window from that run
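The windowed merge above can be sketched as follows. For brevity the run "windows" are simulated with in-memory list slices standing in for mapped file regions, and merge_windowed is an illustrative name, not part of any library:

```python
import heapq

def merge_windowed(runs, window_size=2):
    """k-way merge that keeps only a small window of each run in
    memory, refilling a window when it is exhausted (each refill
    stands in for mapping the next region of a run file)."""
    windows = [list(run[:window_size]) for run in runs]
    next_pos = [min(window_size, len(run)) for run in runs]

    # Seed the heap with the head of each non-empty run window.
    heap = []
    for rid, w in enumerate(windows):
        if w:
            heapq.heappush(heap, (w.pop(0), rid))

    out = []
    while heap:
        value, rid = heapq.heappop(heap)
        out.append(value)

        # Window exhausted: "map" the next window from that run.
        if not windows[rid] and next_pos[rid] < len(runs[rid]):
            windows[rid] = list(runs[rid][next_pos[rid]:next_pos[rid] + window_size])
            next_pos[rid] += len(windows[rid])

        if windows[rid]:
            heapq.heappush(heap, (windows[rid].pop(0), rid))
    return out

print(merge_windowed([[3, 5, 8], [1, 2, 9], [4, 7]]))
# [1, 2, 3, 4, 5, 7, 8, 9]
```

The window size plays the role of the mapped window in the pseudocode: smaller windows reduce resident memory but trigger more refills.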

Example

Suppose the file contains:

[8, 3, 5, 1, 9, 2, 7, 4]

and memory allows chunks of 3 records.

Run generation:

chunk        sorted run
[8, 3, 5]    [3, 5, 8]
[1, 9, 2]    [1, 2, 9]
[7, 4]       [4, 7]

Merge the runs:

[3, 5, 8], [1, 2, 9], [4, 7]

Final output:

[1, 2, 3, 4, 5, 7, 8, 9]
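The whole example can be reproduced in a few lines of Python; heapq.merge is used here as a stand-in for the merge phase:

```python
import heapq

data = [8, 3, 5, 1, 9, 2, 7, 4]
chunk_size = 3  # memory allows chunks of 3 records

# Run generation: sort each chunk independently.
runs = [sorted(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)]
print(runs)                       # [[3, 5, 8], [1, 2, 9], [4, 7]]

# Merge the sorted runs.
print(list(heapq.merge(*runs)))   # [1, 2, 3, 4, 5, 7, 8, 9]
```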

Correctness

Each mapped chunk is sorted before being written as a run. Therefore every generated run is sorted.

The merge phase repeatedly emits the smallest current element among all run heads. Since each run is sorted, no later element in a run can be smaller than its current head. Thus every emitted element is the smallest remaining element overall.

All records belong to exactly one chunk and then exactly one run. The final merge emits every record once, so the output is a sorted permutation of the input.
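This argument can be spot-checked with a small in-memory model of the algorithm (runs as sorted lists, heapq.merge as the merge phase) against random inputs and chunk sizes:

```python
import heapq
import random

def external_sort_model(data, chunk_size):
    """In-memory model: chunk, sort each chunk into a run, merge."""
    runs = [sorted(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]
    return list(heapq.merge(*runs))

rng = random.Random(0)
for _ in range(100):
    data = [rng.randrange(1000) for _ in range(rng.randrange(0, 50))]
    for chunk_size in (1, 3, 7):
        # Output must be a sorted permutation of the input.
        assert external_sort_model(data, chunk_size) == sorted(data)
print("ok")
```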

Complexity

Let:

  • N be the number of records
  • M be the chosen chunk size
  • R = ⌈N/M⌉ be the number of runs
  • B be the page or block size
  • k be the merge fan-in

CPU cost for run generation:

O(N log M)

Merge CPU cost:

O(N log k)

per merge level.

I/O cost is similar to external merge sort:

O((N / B) log_k R)

The actual cost depends heavily on page faults, readahead, dirty-page writeback, and mapping granularity.
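Plugging in illustrative numbers makes the bounds concrete. The values below are assumptions chosen only for the arithmetic:

```python
import math

N = 10**9   # records (illustrative)
M = 10**7   # records per chunk (illustrative)
k = 16      # merge fan-in (illustrative)

R = math.ceil(N / M)                # number of runs
levels = math.ceil(math.log(R, k))  # merge levels needed

print(R, levels)  # 100 runs, 2 merge levels
```

With a fan-in of 16, the 100 runs collapse in two merge levels, so every record is read and written roughly three times in total (run generation plus two merge passes).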

Design Choices

choice               effect
chunk size           controls memory pressure and run count
mapped window size   controls page-fault behavior during merge
sequential access    improves OS readahead
random access        may cause many page faults
explicit unmapping   releases address space and page pressure
temp run layout      affects merge locality

When to Use

Use memory mapped external sort when:

  • the data is file-backed
  • record layout is fixed-width or easy to address
  • the platform has good memory-mapping support
  • implementation simplicity matters
  • sequential access can be preserved

Avoid relying on memory mapping alone for huge random-access sorts. The operating system may thrash if the working set is much larger than RAM.

Implementation Sketch

This simplified example uses arrays, but the structure matches a memory-mapped external sort.

import heapq

def make_sorted_runs(data, chunk_size):
    """Split the input into chunks and sort each chunk into a run."""
    runs = []

    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        runs.append(sorted(chunk))

    return runs

def merge_runs(runs):
    """k-way merge of sorted runs using a min-heap of run heads."""
    heap = []
    iters = [iter(run) for run in runs]

    # Seed the heap with the head of each non-empty run.
    for run_id, it in enumerate(iters):
        try:
            heapq.heappush(heap, (next(it), run_id))
        except StopIteration:
            pass

    out = []

    while heap:
        value, run_id = heapq.heappop(heap)
        out.append(value)

        # Refill from the run the minimum came from.
        try:
            nxt = next(iters[run_id])
            heapq.heappush(heap, (nxt, run_id))
        except StopIteration:
            pass

    return out

def memory_mapped_external_sort_model(data, chunk_size):
    runs = make_sorted_runs(data, chunk_size)
    return merge_runs(runs)
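A file-backed version of the same structure, using Python's mmap module for both the input chunks and the run files, might look like the sketch below. It assumes fixed-width 4-byte little-endian records; mmap_external_sort and REC are illustrative names.

```python
import heapq
import mmap
import os
import struct
import tempfile

REC = struct.Struct("<i")  # fixed-width 4-byte records (an assumption)

def mmap_external_sort(path, chunk_records):
    """Sort the records of a file via memory-mapped chunks and
    file-backed sorted runs; returns the sorted values."""
    run_paths = []
    total = os.path.getsize(path) // REC.size

    # Run generation: read each chunk through the mapping, sort it,
    # and write the run to a temporary file.
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for start in range(0, total, chunk_records):
            stop = min(start + chunk_records, total)
            chunk = [REC.unpack_from(mm, i * REC.size)[0]
                     for i in range(start, stop)]
            chunk.sort()
            with tempfile.NamedTemporaryFile(delete=False) as run:
                for v in chunk:
                    run.write(REC.pack(v))
                run_paths.append(run.name)

    def run_values(run_path):
        # Stream a run sequentially through its own mapping.
        with open(run_path, "rb") as rf, \
             mmap.mmap(rf.fileno(), 0, access=mmap.ACCESS_READ) as rm:
            for i in range(len(rm) // REC.size):
                yield REC.unpack_from(rm, i * REC.size)[0]

    out = list(heapq.merge(*(run_values(p) for p in run_paths)))
    for p in run_paths:
        os.unlink(p)
    return out

# Demo on the example data.
data = [8, 3, 5, 1, 9, 2, 7, 4]
with tempfile.NamedTemporaryFile(delete=False) as f:
    for v in data:
        f.write(REC.pack(v))
    src = f.name

print(mmap_external_sort(src, chunk_records=3))  # [1, 2, 3, 4, 5, 7, 8, 9]
os.unlink(src)
```

Both phases walk each mapping strictly front to back, which is what lets the operating system's readahead do the heavy lifting.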

Notes

Memory mapped external sort is mostly an engineering variant of external merge sort. The main decision is not whether mapping is possible, but whether the algorithm still gives the operating system predictable sequential access.