15.24 Case Studies
You have learned the mechanics of divide and conquer: ```text split solve recursively
15.24 Case Studies
Problem
You have learned the mechanics of divide and conquer:
split
solve recursively
combine
But in real systems, problems rarely announce themselves as textbook recurrences. You may see a slow batch job, an expensive query engine, a data-processing task, or a geometric search problem. The challenge is recognizing which part of the work can be decomposed and which part must be combined carefully.
How do divide-and-conquer techniques appear in practical systems?
Solution
Study complete cases where the decomposition changes the design.
A useful case study should identify:
- The original bottleneck
- The recursive split
- The combine step
- The correctness argument
- The complexity improvement
- The engineering tradeoffs
The following cases show how the same pattern appears in sorting, analytics, search, geometry, arithmetic, and parallel processing.
Case Study 1: External Merge Sort
Problem
A dataset is too large to fit in memory.
Example:
input file: 1 TB
available memory: 8 GB
Sorting it with an in-memory algorithm is impossible.
Divide-and-Conquer Design
Split the input into chunks that fit in memory.
chunk 1
chunk 2
chunk 3
...
Sort each chunk independently.
Write each sorted run back to disk.
Then repeatedly merge sorted runs.
sorted run 1
sorted run 2
sorted run 3
...
↓
multiway merge
↓
sorted output
Combine Step
The combine step is a k-way merge.
Use a min-heap containing the current smallest value from each run.
type Item struct {
Value int
RunID int
}
At each step:
- Pop the smallest item.
- Write it to output.
- Read the next value from the same run.
- Push it into the heap.
Complexity
If the total input size is (n), CPU work is roughly:
$$ O(n\log k) $$
for a k-way merge.
But the dominant cost is usually disk I/O.
The real optimization target is sequential reads and writes, not just comparison count.
Engineering Notes
Use large buffers.
Avoid random disk access.
Compress intermediate runs if I/O is the bottleneck.
Tune the number of merge streams to available memory and file descriptor limits.
External merge sort is a practical reminder that divide and conquer is also about organizing data movement.
Case Study 2: Counting Out-of-Order Events
Problem
A logging system receives events with timestamps.
You want to measure how disordered the stream is.
Example:
timestamps:
[100, 105, 102, 110, 101]
An inversion represents an out-of-order pair:
earlier position has later timestamp
Divide-and-Conquer Design
Use inversion counting.
Split the timestamp array.
Recursively count disorder inside each half.
During merge, count cross-boundary inversions.
Combine Step
When merging sorted halves:
left[i] > right[j]
means right[j] is out of order relative to every remaining element in left.
Count:
mid - i
inversions at once.
Complexity
Brute force:
$$ O(n^2) $$
Divide and conquer:
$$ O(n\log n) $$
Engineering Notes
Use a 64-bit counter.
Keep the original event records if you need to diagnose actual disorder pairs.
Use the count as a metric:
0 perfectly ordered
near n(n-1)/2 almost reverse ordered
For streaming systems, compute this in windows.
Case Study 3: Top-K Candidate Selection
Problem
A recommendation system scores millions of candidates.
You need only the top 100.
Sorting all candidates costs:
$$ O(n\log n) $$
but full ordering is unnecessary.
Divide-and-Conquer Design
Use Quickselect to partition around the top-k boundary.
After partitioning, the best (k) candidates are on one side of the array.
Then sort only that small side if ordered output is required.
Combine Step
There is no merge step.
The partition step is the key operation.
It places one pivot in final rank position and determines which side contains the desired rank.
Complexity
Expected selection:
$$ O(n) $$
Sorting the selected prefix:
$$ O(k\log k) $$
Total:
$$ O(n+k\log k) $$
This is much better than:
$$ O(n\log n) $$
when (k) is small.
Engineering Notes
If candidates arrive as a stream, a heap may be better:
$$ O(n\log k) $$
If the full batch is available, Quickselect is often faster.
Use three-way partitioning when many candidates share the same score.
Document whether ties are arbitrary or stable.
Case Study 4: Nearest Facility Lookup
Problem
Given a set of facility locations, find the closest pair or detect whether any two facilities are too close.
This appears in:
warehouse planning
cell tower placement
retail site analysis
collision detection
A brute-force distance check compares every pair.
Divide-and-Conquer Design
Use closest-pair divide and conquer.
Sort by x-coordinate.
Split at the median.
Solve each side recursively.
Check cross-boundary pairs in a strip.
Combine Step
The strip is sorted by y-coordinate.
Each point is compared only with a constant number of following points.
This makes the combine step linear.
Complexity
Brute force:
$$ O(n^2) $$
Divide and conquer:
$$ O(n\log n) $$
Engineering Notes
Use squared distances for comparison.
Normalize coordinates if precision varies.
Handle duplicate points early.
For repeated nearest-neighbor queries, a spatial index such as a k-d tree may be more appropriate.
Closest-pair is best for batch analysis.
Case Study 5: Large Integer Arithmetic
Problem
A symbolic mathematics system needs to multiply large integers.
Grade-school multiplication performs:
$$ O(n^2) $$
digit or limb operations.
Divide-and-Conquer Design
Use Karatsuba multiplication beyond a size threshold.
Split both numbers into high and low halves.
Replace four recursive products with three.
Combine Step
Use the identity:
middle = (a+b)(c+d) - ac - bd
Then recombine:
high product
middle product
low product
with digit or limb shifts.
Complexity
Grade-school:
$$ O(n^2) $$
Karatsuba:
$$ O(n^{1.585}) $$
Engineering Notes
Do not use Karatsuba for tiny inputs.
Use limb arrays rather than decimal strings.
Normalize carries and borrows carefully.
Benchmark the cutoff.
For very large inputs, FFT or number-theoretic transforms may be better.
Case Study 6: Batched Range Queries
Problem
An analytics job receives many range queries:
How many values <= x appear in array[l..r]?
Answering each query by scanning the range is too slow.
Divide-and-Conquer Design
Build a merge-sort tree.
Each node represents an index range and stores its values sorted.
A query decomposes into (O(\log n)) segment tree nodes.
Each node answers with binary search.
Combine Step
The build step merges sorted child arrays.
The query step sums results from covered nodes.
Complexity
Build:
$$ O(n\log n) $$
Query:
$$ O(\log^2 n) $$
Batch of (q) queries:
$$ O(n\log n+q\log^2 n) $$
Engineering Notes
Memory usage is (O(n\log n)).
Coordinate compression may reduce storage.
For static arrays with many quantile queries, consider wavelet trees.
For mutable arrays, a merge-sort tree becomes harder to update efficiently.
Case Study 7: Parallel Sorting
Problem
A service must sort large arrays faster by using multiple CPU cores.
A single-threaded algorithm underuses hardware.
Divide-and-Conquer Design
Use parallel merge sort or parallel quick sort.
At high recursion levels, spawn tasks for independent subproblems.
Below a threshold, switch to sequential sorting.
Combine Step
For merge sort, merging can remain sequential or become parallel.
For quick sort, partition first, then recursively sort partitions in parallel.
Complexity
Total work remains:
$$ O(n\log n) $$
Parallel execution reduces wall-clock time when enough independent work exists.
Engineering Notes
Do not spawn a task for every recursive call.
Use depth and size cutoffs.
Avoid shared mutable output.
Benchmark against the standard library sort.
Memory bandwidth may limit speedup.
Case Study 8: Polynomial Convolution
Problem
A signal-processing system must convolve two long sequences.
Naive convolution costs:
$$ O(n^2) $$
Divide-and-Conquer Design
Use FFT.
Convert coefficient representation into value representation.
Multiply pointwise.
Use inverse FFT to return to coefficients.
Combine Step
The recursive FFT combines even and odd parts through butterfly operations.
Complexity
FFT-based convolution:
$$ O(n\log n) $$
Engineering Notes
Pad inputs to avoid circular convolution.
Use complex FFT for approximate numeric work.
Use NTT for exact modular integer convolution.
Round carefully when converting back to integers.
A Common Decision Flow
When facing a new system problem, ask:
| Question | Meaning |
|---|---|
| Can the input be split cleanly? | Look for array halves, tree children, matrix blocks, time intervals, or value ranges. |
| Are subproblems independent? | Recursive calls should not require constant synchronization. |
| Is the combine step cheap? | Linear or near-linear combine steps are usually acceptable. |
| Can one side be discarded? | Selection and search problems often recurse into one branch. |
| Is preprocessing allowed? | Offline query methods rely on seeing the whole workload. |
| Is memory movement the bottleneck? | Cache-oblivious decomposition may help. |
| Is the input adversarial? | Randomization or deterministic pivot guarantees may matter. |
This flow prevents blind application of a familiar algorithm.
Choosing Between Similar Techniques
Sort vs Select
Use sorting when complete order is needed.
Use selection when only a rank or top-k boundary is needed.
Online vs Offline Queries
Use online structures when answers must be immediate.
Use offline divide and conquer when all queries are known and total throughput matters.
Merge Sort vs Quick Sort
Use merge sort when stability and worst-case predictability matter.
Use quick sort or introsort when in-place operation and practical speed matter.
Recursive vs Iterative
Use recursion for clarity and structural decomposition.
Use iterative forms when stack depth, allocation, or overhead becomes a problem.
Cache-Aware vs Cache-Oblivious
Use cache-aware blocking when targeting known hardware.
Use cache-oblivious recursion when portability and hierarchical locality matter.
Common Mistakes
Treating Textbook Algorithms as Drop-In Components
Production workloads add constraints:
memory
latency
streaming
parallelism
precision
failure recovery
Adapt the algorithm to the system.
Ignoring the Combine Cost
Many promising decompositions fail because the combine step is too expensive.
Overlooking Data Movement
For large datasets, memory and disk movement often dominate CPU operations.
Forgetting Workload Shape
Top-k, full sort, streaming top-k, and approximate top-k are different problems.
Missing the Simpler Baseline
A highly optimized standard-library routine may beat a custom divide-and-conquer implementation.
Benchmark before replacing simple code.
Discussion
Case studies make divide and conquer less abstract. The same structure appears in external sorting, analytics, selection, geometry, arithmetic, query engines, and signal processing. The differences lie in what gets divided and how results are combined.
The central design question is always practical: does the decomposition remove enough work, share enough preprocessing, or improve enough locality to justify the added complexity? When the answer is yes, divide and conquer can turn an infeasible job into an ordinary one. When the answer is no, the recursive structure may only add overhead.
Good algorithm engineering starts with a baseline, identifies the bottleneck, chooses a decomposition that attacks that bottleneck, and validates the result with tests and benchmarks. Divide and conquer is one of the strongest tools for that workflow because it exposes structure at multiple scales.