3.6 External-Memory and Database Structures
B-tree indexes, LSM trees, SSTables, write-ahead logs, compaction strategies, cache-oblivious structures, inverted indexes, and external sort/merge.
30 notes
B-tree indexes, LSM trees, SSTables, write-ahead logs, compaction strategies, cache-oblivious structures, inverted indexes, and external sort/merge.
Persistence, functional structures, succinct and compressed encodings, geometric indexing, concurrent and lock-free designs, external-memory structures, and probabilistic data structures.
K-way merge using a tournament tree that stores winners at internal nodes and repeatedly updates the winning path.
Efficient k-way merge using a tournament tree that stores losers to reduce comparisons and improve external merge performance.
External-memory bucket sort that partitions records into ordered buckets stored on disk and sorts each bucket separately.
External-memory sorting algorithm that uses sampling to partition data into balanced buckets, then sorts each bucket independently.
External-memory radix sort that processes large datasets by distributing records into digit-based buckets across multiple passes.
External sorting algorithm that performs run generation and merging concurrently across multiple processors and disks.
Sorting method that keeps data in memory until a memory limit is reached, then spills sorted runs to external storage and merges them.
Sorting method that divides input into bounded-size chunks, sorts each chunk independently, and then combines the chunks.
External sorting method that uses memory-mapped files so large data can be accessed through virtual memory while still sorting in chunks and merging runs.
External sorting algorithm that distributes sorted runs across sequential storage tapes and repeatedly merges them.
External sorting technique that uses a heap to produce initial sorted runs longer than the available memory.
The sorting stage used before a sort-merge join when one or both inputs are not already ordered by the join key.
External sorting procedure used by database systems for ORDER BY, GROUP BY, DISTINCT, and sort-merge joins.
Sorting through hierarchical merging in Log Structured Merge Trees using compaction across levels.
Sorting method that builds a B-tree directly from sorted data using bulk loading to minimize I/O operations.
External-memory sorting method that inserts records into a buffered tree and extracts them in sorted order.
Cache oblivious sorting algorithm that achieves optimal memory transfer complexity using recursive multiway merging structures called funnels.
Sorting techniques that achieve good cache behavior without explicitly knowing cache size or block size.
Sorting techniques that explicitly use knowledge of cache size and block size to minimize memory hierarchy cost.
External-memory sorting method that partitions data into key ranges, sorts each range, and concatenates the sorted partitions.
External sorting method that merges sorted runs through staged levels so output from one level feeds the next.
External sorting algorithm that repeatedly merges groups of k sorted runs until one final sorted run remains.
External sorting algorithm that distributes sorted runs unevenly across files so repeated merges need fewer empty files and less redistribution.
External sorting algorithm that performs run generation and a single multiway merge using limited memory.
Sort data that does not fit in memory by dividing it into sorted runs and merging them using external storage.
Generate long initial runs for external sorting using replacement selection with disk-based streams.
Sort a stream in bounded chunks by buffering records, sorting each chunk, and emitting sorted runs or partially ordered output.
Generate long sorted runs from a stream using a heap, commonly used in external sorting.