Skip to content

Rolling Median Sort

Maintain the median of a sliding window using two heaps while the window moves over a stream.

Rolling median sort maintains the median value over a sliding window. Instead of fully sorting the window, it keeps just enough structure to extract the median efficiently.

You use it when median or percentile queries are needed over recent data.

Problem

Given a stream ( x_1, x_2, \dots ) and window size ( w ), compute the median of:

[ [x_{t-w+1}, \dots, x_t] ]

after each step.

Idea

Maintain two heaps:

  • a max heap for the lower half
  • a min heap for the upper half

The heaps are balanced so that:

  • sizes differ by at most 1
  • all elements in the max heap are less than or equal to those in the min heap

The median is:

  • top of max heap if sizes differ
  • average of both tops if sizes are equal

Algorithm

rolling_median(stream, w):
    low = max_heap()
    high = min_heap()
    window = queue()

    for x in stream:
        window.push_back(x)

        if low empty or x <= low.top():
            low.push(x)
        else:
            high.push(x)

        rebalance(low, high)

        if length(window) > w:
            old = window.pop_front()
            remove(old, low, high)
            rebalance(low, high)

        output median(low, high)

Rebalancing

rebalance(low, high):
    if size(low) > size(high) + 1:
        high.push(low.pop())
    elif size(high) > size(low):
        low.push(high.pop())

Example

Stream:

[ [1, 3, 2, 6, 4] ]

Window size:

[ w = 3 ]

Medians:

stepwindowmedian
1[1]1
2[1, 3]2
3[1, 3, 2]2
4[3, 2, 6]3
5[2, 6, 4]4

Correctness

The heaps partition the window into two halves:

  • all elements in the lower heap are less than or equal to those in the upper heap
  • sizes remain balanced

Therefore, the median is always at the boundary between the two heaps.

Complexity

operationcost
insert( O(\log w) )
delete expired value( O(\log w) )
median query( O(1) )

Space:

[ O(w) ]

Deletion Handling

Removing arbitrary elements from heaps requires auxiliary structures such as:

  • lazy deletion with hash maps
  • indexed heaps
mark_for_removal(x):
    deleted[x] += 1

When to Use

Use rolling median when:

  • median over recent data is required
  • full sorting is unnecessary
  • window size is moderate
  • real-time analytics is needed

It is widely used in monitoring, finance, and signal processing systems.