Select an element whose total weight on each side is at most half of the total weight.
Weighted Median generalizes the ordinary median by assigning a nonnegative weight to each element. Instead of balancing the number of elements on both sides, it balances the total weight.
An element is a weighted median if the total weight of smaller elements is at most half of the total weight, and the total weight of larger elements is also at most half.
Problem
Given pairs
where each , find a value such that:
and
where:
Algorithm
The simple method sorts by value, then scans cumulative weight.
weighted_median(items):
sort items by value
W = total weight
prefix = 0
for each (x, w) in sorted items:
prefix = prefix + w
if prefix >= W / 2:
return xA linear-time version can be built from selection and partitioning, but sorting is usually simpler and sufficient.
Example
Let:
The total weight is:
Half of the total weight is:
Cumulative weights after sorting:
| value | weight | cumulative |
|---|---|---|
| 2 | 1 | 1 |
| 4 | 3 | 4 |
| 7 | 2 | 6 |
| 9 | 4 | 10 |
The first cumulative weight at least occurs at value , so the weighted median is .
Correctness
After sorting, all values before the current item are smaller and all values after it are larger. The scan returns the first item whose cumulative weight reaches at least half of the total weight.
At that point, the weight of smaller elements is less than or equal to , because the previous prefix was below . The weight of larger elements is also less than or equal to , because the current prefix has reached at least .
Therefore the returned value satisfies the weighted median conditions.
Complexity
| method | time | space |
|---|---|---|
| sort and scan | or | |
| selection based | expected | possible |
The sorting method is easier to implement and less error prone.
When to Use
Use Weighted Median when observations have unequal importance. Common examples include weighted statistics, facility location, robust aggregation, voting, load balancing, and percentile calculations with sample weights.
Implementation
def weighted_median(items):
items = sorted(items, key=lambda p: p[0])
total = sum(w for _, w in items)
prefix = 0
for x, w in items:
prefix += w
if prefix >= total / 2:
return x
raise ValueError("empty input")package main
import "sort"
type WeightedItem struct {
Value int
Weight float64
}
func WeightedMedian(items []WeightedItem) (int, bool) {
if len(items) == 0 {
return 0, false
}
sort.Slice(items, func(i, j int) bool {
return items[i].Value < items[j].Value
})
total := 0.0
for _, item := range items {
total += item.Weight
}
prefix := 0.0
for _, item := range items {
prefix += item.Weight
if prefix >= total/2 {
return item.Value, true
}
}
return 0, false
}