Skip to content

Weighted Median

Select an element whose total weight on each side is at most half of the total weight.

Weighted Median generalizes the ordinary median by assigning a nonnegative weight to each element. Instead of balancing the number of elements on both sides, it balances the total weight.

An element is a weighted median if the total weight of smaller elements is at most half of the total weight, and the total weight of larger elements is also at most half.

Problem

Given pairs

(x1,w1),(x2,w2),,(xn,wn) (x_1, w_1), (x_2, w_2), \dots, (x_n, w_n)

where each wi0w_i \ge 0, find a value xmx_m such that:

xi<xmwiW2 \sum_{x_i < x_m} w_i \le \frac{W}{2}

and

xi>xmwiW2 \sum_{x_i > x_m} w_i \le \frac{W}{2}

where:

W=i=1nwi W = \sum_{i=1}^{n} w_i

Algorithm

The simple method sorts by value, then scans cumulative weight.

weighted_median(items):
    sort items by value
    W = total weight
    prefix = 0

    for each (x, w) in sorted items:
        prefix = prefix + w
        if prefix >= W / 2:
            return x

A linear-time version can be built from selection and partitioning, but sorting is usually simpler and sufficient.

Example

Let:

[(2,1),(4,3),(7,2),(9,4)] [(2, 1), (4, 3), (7, 2), (9, 4)]

The total weight is:

W=10 W = 10

Half of the total weight is:

5 5

Cumulative weights after sorting:

valueweightcumulative
211
434
726
9410

The first cumulative weight at least 55 occurs at value 77, so the weighted median is 77.

Correctness

After sorting, all values before the current item are smaller and all values after it are larger. The scan returns the first item whose cumulative weight reaches at least half of the total weight.

At that point, the weight of smaller elements is less than or equal to W/2W/2, because the previous prefix was below W/2W/2. The weight of larger elements is also less than or equal to W/2W/2, because the current prefix has reached at least W/2W/2.

Therefore the returned value satisfies the weighted median conditions.

Complexity

methodtimespace
sort and scanO(nlogn)O(n \log n)O(1)O(1) or O(n)O(n)
selection basedO(n)O(n) expectedO(1)O(1) possible

The sorting method is easier to implement and less error prone.

When to Use

Use Weighted Median when observations have unequal importance. Common examples include weighted statistics, facility location, robust aggregation, voting, load balancing, and percentile calculations with sample weights.

Implementation

def weighted_median(items):
    items = sorted(items, key=lambda p: p[0])
    total = sum(w for _, w in items)

    prefix = 0
    for x, w in items:
        prefix += w
        if prefix >= total / 2:
            return x

    raise ValueError("empty input")
package main

import "sort"

type WeightedItem struct {
	Value  int
	Weight float64
}

func WeightedMedian(items []WeightedItem) (int, bool) {
	if len(items) == 0 {
		return 0, false
	}

	sort.Slice(items, func(i, j int) bool {
		return items[i].Value < items[j].Value
	})

	total := 0.0
	for _, item := range items {
		total += item.Weight
	}

	prefix := 0.0
	for _, item := range items {
		prefix += item.Weight
		if prefix >= total/2 {
			return item.Value, true
		}
	}

	return 0, false
}