Maintain the median of a sliding window using two heaps while the window moves over a stream.
Rolling median sort maintains the median value over a sliding window. Instead of fully sorting the window, it keeps just enough structure to extract the median efficiently.
You use it when median or percentile queries are needed over recent data.
Problem
Given a stream ( x_1, x_2, \dots ) and window size ( w ), compute the median of:
[ [x_{t-w+1}, \dots, x_t] ]
after each step.
Idea
Maintain two heaps:
- a max heap for the lower half
- a min heap for the upper half
The heaps are balanced so that:
- sizes differ by at most 1
- all elements in the max heap are less than or equal to those in the min heap
The median is:
- top of max heap if sizes differ
- average of both tops if sizes are equal
Algorithm
rolling_median(stream, w):
low = max_heap()
high = min_heap()
window = queue()
for x in stream:
window.push_back(x)
if low empty or x <= low.top():
low.push(x)
else:
high.push(x)
rebalance(low, high)
if length(window) > w:
old = window.pop_front()
remove(old, low, high)
rebalance(low, high)
output median(low, high)Rebalancing
rebalance(low, high):
if size(low) > size(high) + 1:
high.push(low.pop())
elif size(high) > size(low):
low.push(high.pop())Example
Stream:
[ [1, 3, 2, 6, 4] ]
Window size:
[ w = 3 ]
Medians:
| step | window | median |
|---|---|---|
| 1 | [1] | 1 |
| 2 | [1, 3] | 2 |
| 3 | [1, 3, 2] | 2 |
| 4 | [3, 2, 6] | 3 |
| 5 | [2, 6, 4] | 4 |
Correctness
The heaps partition the window into two halves:
- all elements in the lower heap are less than or equal to those in the upper heap
- sizes remain balanced
Therefore, the median is always at the boundary between the two heaps.
Complexity
| operation | cost |
|---|---|
| insert | ( O(\log w) ) |
| delete expired value | ( O(\log w) ) |
| median query | ( O(1) ) |
Space:
[ O(w) ]
Deletion Handling
Removing arbitrary elements from heaps requires auxiliary structures such as:
- lazy deletion with hash maps
- indexed heaps
mark_for_removal(x):
deleted[x] += 1When to Use
Use rolling median when:
- median over recent data is required
- full sorting is unnecessary
- window size is moderate
- real-time analytics is needed
It is widely used in monitoring, finance, and signal processing systems.