17.11 Rectangle Union Area
Given a set of axis-aligned rectangles, compute the total area covered by their union.
17.11 Rectangle Union Area
Problem
Given a set of axis-aligned rectangles, compute the total area covered by their union.
The rectangles may overlap. Overlapping regions should be counted once, not once per rectangle.
For example:
+---------+
| |
| +---------+
| | | |
+-----|---+ |
| |
+---------+
The total area is not:
area(rectangle 1) + area(rectangle 2)
because the shared region would be counted twice.
This problem appears in graphics, map processing, VLSI layout, CAD systems, spatial databases, occupancy grids, and image-analysis pipelines.
Solution
Use a sweep line over the x-axis.
Each rectangle contributes two vertical events:
left edge: add its y-interval
right edge: remove its y-interval
As the sweep line moves from one x-coordinate to the next, the active y-intervals describe the vertical coverage at that horizontal slice.
For each adjacent pair of x-events:
area contribution =
covered y-length × width between x-events
The main loop is:
sort vertical events by x
active y-intervals = empty
previous x = first event x
for each x event group:
width = x - previous x
area += covered_y_length(active intervals) × width
apply all events at x
previous x = x
This reduces a two-dimensional union problem to repeated one-dimensional interval coverage.
Rectangle Representation
Use half-open rectangles:
[left, right) × [bottom, top)
This convention avoids double-counting shared boundaries. Boundaries have zero area, so the area result is unchanged, but event handling becomes cleaner.
from dataclasses import dataclass
@dataclass(frozen=True)
class Rectangle:
left: float
bottom: float
right: float
top: float
Normalize or reject invalid rectangles:
def validate_rectangle(rect):
if rect.left > rect.right:
raise ValueError("left must be <= right")
if rect.bottom > rect.top:
raise ValueError("bottom must be <= top")
A rectangle with zero width or zero height contributes no area.
Event Construction
Each rectangle produces two events.
@dataclass(frozen=True)
class Event:
x: float
kind: int
bottom: float
top: float
Use:
kind = +1
for a left edge, and:
kind = -1
for a right edge.
def rectangle_events(rectangles):
events = []
for rect in rectangles:
validate_rectangle(rect)
if rect.left == rect.right or rect.bottom == rect.top:
continue
events.append(Event(rect.left, 1, rect.bottom, rect.top))
events.append(Event(rect.right, -1, rect.bottom, rect.top))
events.sort(key=lambda event: event.x)
return events
Simple Implementation
This version keeps the active intervals in a list and recomputes the covered y-length at each event group. It is not the fastest version, but it is compact and useful as a reference implementation.
@dataclass(frozen=True)
class Interval:
start: float
end: float
def merge_intervals(intervals):
if not intervals:
return []
intervals = sorted(intervals, key=lambda interval: interval.start)
merged = [intervals[0]]
for current in intervals[1:]:
previous = merged[-1]
if current.start <= previous.end:
merged[-1] = Interval(
previous.start,
max(previous.end, current.end)
)
else:
merged.append(current)
return merged
def covered_length(intervals):
total = 0
for interval in merge_intervals(intervals):
total += interval.end - interval.start
return total
Now the sweep:
def rectangle_union_area(rectangles):
events = rectangle_events(rectangles)
if not events:
return 0
active = []
area = 0
previous_x = events[0].x
i = 0
while i < len(events):
x = events[i].x
width = x - previous_x
if width != 0:
area += covered_length(active) * width
while i < len(events) and events[i].x == x:
event = events[i]
interval = Interval(event.bottom, event.top)
if event.kind == 1:
active.append(interval)
else:
active.remove(interval)
i += 1
previous_x = x
return area
This implementation is easy to audit. The active list contains the y-intervals of all rectangles currently crossed by the sweep line.
Example
Rectangles:
A = [0, 4) × [0, 3)
B = [2, 6) × [1, 4)
Individual areas:
A: 4 × 3 = 12
B: 4 × 3 = 12
Overlap:
[2, 4) × [1, 3)
Overlap area:
2 × 2 = 4
Union area:
12 + 12 - 4 = 20
The sweep-line algorithm reaches the same result by processing x-events:
x = 0: add [0,3)
x = 2: add [1,4)
x = 4: remove [0,3)
x = 6: remove [1,4)
Between x = 0 and x = 2:
covered y-length = 3
width = 2
area += 6
Between x = 2 and x = 4:
active intervals = [0,3), [1,4)
covered y-length = 4
width = 2
area += 8
Between x = 4 and x = 6:
covered y-length = 3
width = 2
area += 6
Total:
6 + 8 + 6 = 20
Grouping Events by X
Events at the same x-coordinate must be processed together.
This matters because the area between:
x = current
and itself is zero.
Processing events one by one at the same coordinate can still work, but it is easier to introduce bugs. Grouping makes the invariant explicit:
Before applying events at x, active describes the coverage over the horizontal slab:
previous_x <= X < x
After applying events at x, active describes the coverage for the next slab.
Faster Implementation with Coordinate Compression
The simple implementation recomputes merged y-intervals repeatedly. For large inputs, use coordinate compression and a segment tree.
The idea is:
- Collect all unique y-coordinates from rectangle bottoms and tops.
- Sort them.
- The elementary y-spans are adjacent coordinate pairs.
- Maintain how many active rectangles cover each elementary span.
- Track total covered y-length.
Example y-coordinates:
0, 1, 3, 4
Elementary spans:
[0,1), [1,3), [3,4)
A segment tree can update a y-range in:
O(log n)
and report total covered length in:
O(1)
after each update.
Segment Tree Sketch
A production implementation usually stores:
cover count
covered length
left coordinate
right coordinate
for each node.
Rule:
if cover count > 0:
covered length = right coordinate - left coordinate
else:
covered length = sum of children covered lengths
This lets the tree represent both full coverage and partial coverage.
The sweep then becomes:
for each event group at x:
area += tree.covered_length × (x - previous_x)
apply y-range updates to tree
Complexity:
O(n log n)
where n is the number of rectangles.
Correctness Argument
Between two consecutive x-event coordinates, the active set does not change. Therefore, the union of rectangles over that vertical slab is exactly:
covered y-interval union × slab width
The algorithm computes this area for every slab and adds the contributions.
Because all changes to the active set happen at rectangle left or right edges, and every such edge appears as an event, the sweep accounts for every region covered by at least one rectangle.
Overlapping rectangles do not cause double-counting because y-intervals are merged before measuring covered length.
Complexity Analysis
For the simple implementation:
2n events
sorting events: O(n log n)
covered length recomputation: O(n log n) per event group
worst-case total: O(n² log n)
This version is useful for clarity and moderate input sizes.
For the segment-tree implementation:
event sorting: O(n log n)
each update: O(log n)
total: O(n log n)
Space usage:
O(n)
for events, compressed coordinates, and the segment tree.
Common Pitfalls
Counting Overlap Twice
Summing individual rectangle areas is incorrect when rectangles overlap.
Forgetting to Group Same-X Events
Events at the same x-coordinate should be processed as one group.
Removing the Wrong Interval
If duplicate rectangles exist, active.remove(interval) removes one matching interval. That is usually correct, but custom objects must compare properly.
Ignoring Zero-Area Rectangles
Rectangles with zero width or height should be skipped.
Confusing Boundary Semantics
Use half-open rectangles internally. It keeps edge-touching cases simple and avoids accidental double-counting.
Using the Simple Version for Large Inputs
The list-based implementation is easy to read but can be too slow. Use a segment tree when input size matters.
Discussion
Rectangle union area is a canonical sweep-line problem. The geometric insight is simple: as the sweep line moves across x, only rectangle edges change the active vertical coverage. The implementation challenge is maintaining the union length of active y-intervals efficiently.
For small and medium inputs, recomputing merged intervals is often sufficient and easy to test. For large inputs, coordinate compression plus a segment tree gives the expected O(n log n) behavior. This pattern recurs throughout computational geometry: sweep one dimension, maintain coverage in the other.