Quantile Summary

Quantile Summary maintains approximate quantiles of a stream using bounded memory. Instead of storing all values, it stores a compressed representation that allows queries such as median, percentiles, and rank estimates.

The goal is to answer queries of the form:

\text{find } x \text{ such that } rank(x) \approx \phi n

for a given quantile $\phi \in (0,1)$ .

Problem

Given a stream

x_1, x_2, \dots, x_n

support queries:

quantile(phi):
    return approximate value with rank near phi * n

with guaranteed error bounds, using limited memory.

Data Structure

A quantile summary stores a sorted list of tuples:

(value, g, delta)

value: observed data point
g: minimum gap to previous value
delta: maximum additional slack

These parameters bound the possible rank range of each stored value.

Algorithm

Insert

insert(x):
    find position i where value[i] <= x < value[i+1]

    insert (x, 1, floor(2 * epsilon * n)) at position i

    compress if needed

Compress

Merge adjacent entries when their combined uncertainty stays within bounds.

compress():
    for i from end to beginning:
        if g[i] + g[i+1] + delta[i+1] <= floor(2 * epsilon * n):
            merge entries i and i+1

Query

Scan accumulated ranks to find the value whose rank interval covers the target.

quantile(phi):
    r = floor(phi * n)
    running = 0

    for each entry:
        running += g
        if running + delta >= r:
            return value

Error Guarantee

For parameter $\epsilon$ , the summary ensures:

|rank(x) - \phi n| \le \epsilon n

This gives a deterministic bound on rank error.

Example

Let the stream be:

[7, 2, 9, 4, 3]

A quantile summary might store a small subset such as:

[(2, \cdot), (4, \cdot), (7, \cdot)]

Querying for $\phi = 0.5$ returns an approximate median close to $4$ .

Correctness

Each tuple represents a range of possible ranks. The algorithm maintains invariants that bound the total uncertainty for each value.

Insertion adds a new value with minimal gap. Compression merges entries only when the resulting uncertainty remains within the allowed error bound.

Thus the summary always respects the rank error guarantee.

Complexity

operation	cost
insertion	$O(\log m)$ amortized
query	$O(m)$
memory	$O\left(\frac{1}{\epsilon} \log (\epsilon n)\right)$

Here $m$ is the number of stored entries.

When to Use

Use Quantile Summary when:

the data is streaming
approximate percentiles are sufficient
memory must be bounded
deterministic error guarantees are required

For simpler implementations with similar guarantees, use Greenwald Khanna. For probabilistic alternatives, use sketches such as t-digest.

Implementation

class QuantileSummary:
    def __init__(self, epsilon):
        self.epsilon = epsilon
        self.summary = []
        self.n = 0

    def insert(self, x):
        import bisect

        self.n += 1
        pos = bisect.bisect_left([v for v, _, _ in self.summary], x)

        if pos == 0 or pos == len(self.summary):
            delta = 0
        else:
            delta = int(2 * self.epsilon * self.n)

        self.summary.insert(pos, [x, 1, delta])
        self.compress()

    def compress(self):
        i = len(self.summary) - 2
        while i >= 0:
            v1, g1, d1 = self.summary[i]
            v2, g2, d2 = self.summary[i + 1]

            if g1 + g2 + d2 <= int(2 * self.epsilon * self.n):
                self.summary[i + 1][1] += g1
                self.summary.pop(i)

            i -= 1

    def quantile(self, phi):
        target = int(phi * self.n)
        running = 0

        for v, g, d in self.summary:
            running += g
            if running + d >= target:
                return v

        return None