Skip to content

LSM Tree Compaction Sort

Sorting through hierarchical merging in Log Structured Merge Trees using compaction across levels.

LSM tree compaction sort organizes sorting as a sequence of merges across levels in a Log Structured Merge (LSM) tree. Data is inserted in batches, stored as sorted runs, and periodically merged into larger sorted structures.

Sorting emerges as a byproduct of compaction. Each level maintains sorted order, and merges ensure global ordering across levels.

Model

Assume:

  • data arrives in batches
  • memory buffer of size MM
  • disk organized into levels L0,L1,L_0, L_1, \ldots
  • each level stores sorted runs

Each level is larger than the previous by a factor TT (size ratio).

Core Idea

  1. Accumulate records in memory
  2. Sort and flush as a run to disk
  3. Merge runs across levels when a level exceeds capacity
  4. Continue until data stabilizes

Each compaction step merges sorted runs into a larger sorted run.

Algorithm

lsm_sort(stream):
    mem = empty buffer
    levels = []

    for record in stream:
        insert into mem

        if mem is full:
            run = sort(mem)
            add_to_level(levels, 0, run)
            clear mem

    flush remaining mem
    compact all levels

    return merge_all_levels(levels)

Compaction:

add_to_level(levels, i, run):
    ensure level i exists

    append run to levels[i]

    if size(levels[i]) exceeds threshold:
        merged = merge(levels[i])
        clear levels[i]
        add_to_level(levels, i + 1, merged)

Example

Suppose memory holds 3 elements and input is:

[8,3,5,1,9,2,7] [8, 3, 5, 1, 9, 2, 7]

Step 1: Flush runs

  • [8, 3, 5] → [3, 5, 8] → level 0
  • [1, 9, 2] → [1, 2, 9] → level 0
  • [7] → [7] → level 0

Step 2: Compaction

Level 0 has multiple runs:

merge:

[3,5,8],[1,2,9],[7][1,2,3,5,7,8,9] [3,5,8], [1,2,9], [7] \rightarrow [1,2,3,5,7,8,9]

store at level 1

Correctness

Each flushed run is sorted. Compaction merges sorted runs using a standard merge procedure that preserves order. Each level stores data in sorted form.

When all levels are merged or queried with a merge procedure, the resulting sequence is globally sorted.

No elements are lost or duplicated because compaction only merges existing runs.

Complexity

Let:

  • NN be total elements
  • BB be block size
  • TT be size ratio between levels

I/O cost:

O(NBlogTNM) O\left(\frac{N}{B} \log_T \frac{N}{M}\right)

Each element may be rewritten multiple times during compaction.

CPU cost:

O(Nlogk) O(N \log k)

per merge, where kk is number of runs merged.

Tradeoffs

parametereffect
larger TTfewer levels, more expensive compactions
smaller TTmore levels, cheaper compactions
aggressive compactionbetter read performance
lazy compactionbetter write throughput

When to Use

LSM compaction sort is appropriate when:

  • data arrives continuously
  • write throughput is more important than immediate sorting
  • storage is append-friendly
  • eventual sorted order is acceptable
  • system resembles a database or log store

It is widely used in key-value stores and storage engines.

Comparison

methodbehavior
external merge sortbatch sorting
buffer tree sortbatched insertion
LSM compactioncontinuous merge and sort

Implementation Sketch

import heapq

def merge_runs(runs):
    heap = []
    iters = [iter(r) for r in runs]

    for i, it in enumerate(iters):
        try:
            heapq.heappush(heap, (next(it), i))
        except StopIteration:
            pass

    result = []

    while heap:
        val, i = heapq.heappop(heap)
        result.append(val)
        try:
            heapq.heappush(heap, (next(iters[i]), i))
        except StopIteration:
            pass

    return result
def lsm_sort(data, mem_size):
    levels = []
    mem = []

    def add(level, run):
        while len(levels) <= level:
            levels.append([])
        levels[level].append(run)

        if len(levels[level]) > 2:
            merged = merge_runs(levels[level])
            levels[level] = []
            add(level + 1, merged)

    for x in data:
        mem.append(x)
        if len(mem) >= mem_size:
            add(0, sorted(mem))
            mem = []

    if mem:
        add(0, sorted(mem))

    return merge_runs([run for level in levels for run in level])

Notes

LSM tree compaction reframes sorting as a continuous process. Instead of sorting once, data is gradually organized through merges. This approach trades extra write cost for high ingestion throughput and scalability.