LeetCode 4: Median of Two Sorted Arrays

Problem Restatement

We are given two sorted arrays:

nums1
nums2

Their sizes are m and n.

We need to return the median of the two arrays as if both arrays were merged into one sorted array.

The required time complexity is:

O(log(m + n))

So we should not solve this by fully merging the arrays. The LeetCode statement requires a logarithmic-time solution.

Input and Output

Item	Meaning
Input	Two sorted integer arrays `nums1` and `nums2`
Output	The median of all values from both arrays
Required time	`O(log(m + n))`
Important detail	The arrays are already sorted

Example function shape:

def findMedianSortedArrays(nums1: list[int], nums2: list[int]) -> float:
    ...

Examples

Example 1:

nums1 = [1, 3]
nums2 = [2]

The merged array would be:

[1, 2, 3]

There are three elements, so the median is the middle element:

Output:

2.00000

Example 2:

nums1 = [1, 2]
nums2 = [3, 4]

The merged array would be:

[1, 2, 3, 4]

There are four elements, so the median is the average of the two middle elements:

(2 + 3) / 2 = 2.5

Output:

2.50000

First Thought: Merge Both Arrays

Because both arrays are sorted, the simplest idea is to merge them like the merge step in merge sort.

After merging, the problem becomes easy.

If the total length is odd, return the middle value.

If the total length is even, return the average of the two middle values.

Code:

class Solution:
    def findMedianSortedArrays(
        self,
        nums1: list[int],
        nums2: list[int]
    ) -> float:
        merged = []
        i = 0
        j = 0

        while i < len(nums1) and j < len(nums2):
            if nums1[i] <= nums2[j]:
                merged.append(nums1[i])
                i += 1
            else:
                merged.append(nums2[j])
                j += 1

        while i < len(nums1):
            merged.append(nums1[i])
            i += 1

        while j < len(nums2):
            merged.append(nums2[j])
            j += 1

        total = len(merged)
        mid = total // 2

        if total % 2 == 1:
            return float(merged[mid])

        return (merged[mid - 1] + merged[mid]) / 2

Problem With Merging

This solution is correct, but it takes linear time.

Metric	Value
Time	`O(m + n)`
Space	`O(m + n)`

The problem explicitly asks for O(log(m + n)) time, so full merging does not satisfy the requirement.

Key Insight

The median splits a sorted array into two halves.

For example:

[1, 2, 3, 4, 5, 6]

The left half is:

[1, 2, 3]

The right half is:

[4, 5, 6]

For the split to be correct:

The left half must contain the correct number of elements.
Every value in the left half must be less than or equal to every value in the right half.

We do not need to build the merged array.

We only need to find where to cut nums1 and where to cut nums2.

The cuts form a partition.

nums1: left part | right part
nums2: left part | right part

Together, both left parts should contain half of the total elements.

Together, both right parts should contain the rest.

Partition Idea

Assume we cut nums1 at index i.

That means:

nums1[0:i]      goes to the left half
nums1[i:]       goes to the right half

Then we must cut nums2 at index j so that the total left side has the correct size.

Let:

total = len(nums1) + len(nums2)
half = (total + 1) // 2

Then:

j = half - i

Now we only need to check whether the partition is valid.

The boundary values are:

Name	Meaning
`A_left`	Biggest value on the left side of `nums1`
`A_right`	Smallest value on the right side of `nums1`
`B_left`	Biggest value on the left side of `nums2`
`B_right`	Smallest value on the right side of `nums2`

The partition is valid when:

A_left <= B_right
B_left <= A_right

If both are true, all values on the combined left side are less than or equal to all values on the combined right side.

Why Binary Search Works

We binary search the cut position i in the smaller array.

If:

A_left > B_right

then we took too many elements from nums1.

The cut in nums1 is too far right.

So we move left.

If:

B_left > A_right

then we took too few elements from nums1.

The cut in nums1 is too far left.

So we move right.

Because each step discards half of the possible cut positions, the search is logarithmic.

Visual Walkthrough

Use:

nums1 = [1, 3]
nums2 = [2]

We always binary search the smaller array, so swap them:

A = [2]
B = [1, 3]

Total length:

total = 3
half = 2

Try cutting A after one element:

A: [2] | []
B: [1] | [3]

Boundary values:

A_left = 2
A_right = infinity

B_left = 1
B_right = 3

Check:

A_left <= B_right   means   2 <= 3   true
B_left <= A_right   means   1 <= infinity   true

So the partition is valid.

Because total length is odd, the median is the largest value on the left side:

max(2, 1) = 2

Algorithm

Let A be the smaller array and B be the larger array.

Set total = len(A) + len(B).
Set half = (total + 1) // 2.
Binary search the cut position i in A.
Compute j = half - i.
Read the four boundary values:
- A_left
- A_right
- B_left
- B_right
If the partition is valid, compute the median.
If A_left > B_right, move the search left.
Otherwise, move the search right.

Correctness

The algorithm searches for a partition where the combined left side has exactly half elements.

The value of j is chosen from half - i, so every tested partition has the correct left-side size.

The partition condition checks the only two cross-array cases that can break sorted order:

A_left <= B_right
B_left <= A_right

Inside each array, order is already guaranteed because both arrays are sorted.

So when both inequalities hold, every element on the combined left side is less than or equal to every element on the combined right side.

At that point:

If the total length is odd, the median is the largest value on the left side.

If the total length is even, the median is the average of:

largest value on the left side
smallest value on the right side

The binary search update is also correct.

When A_left > B_right, the cut in A is too far right, so the algorithm moves left.

When B_left > A_right, the cut in A is too far left, so the algorithm moves right.

Therefore the algorithm finds the correct partition and returns the correct median.

Complexity

Metric	Value	Why
Time	`O(log(min(m, n)))`	Binary search runs on the smaller array
Space	`O(1)`	Only a few variables are stored

This satisfies the required O(log(m + n)) time bound.

Implementation

class Solution:
    def findMedianSortedArrays(
        self,
        nums1: list[int],
        nums2: list[int]
    ) -> float:
        A = nums1
        B = nums2

        if len(A) > len(B):
            A, B = B, A

        m = len(A)
        n = len(B)

        total = m + n
        half = (total + 1) // 2

        left = 0
        right = m

        while left <= right:
            i = (left + right) // 2
            j = half - i

            A_left = A[i - 1] if i > 0 else float("-inf")
            A_right = A[i] if i < m else float("inf")

            B_left = B[j - 1] if j > 0 else float("-inf")
            B_right = B[j] if j < n else float("inf")

            if A_left <= B_right and B_left <= A_right:
                if total % 2 == 1:
                    return float(max(A_left, B_left))

                left_max = max(A_left, B_left)
                right_min = min(A_right, B_right)

                return (left_max + right_min) / 2

            if A_left > B_right:
                right = i - 1
            else:
                left = i + 1

        return 0.0

Code Explanation

We first make sure A is the smaller array:

if len(A) > len(B):
    A, B = B, A

This keeps the binary search range small and avoids invalid partition ranges.

Then:

total = m + n
half = (total + 1) // 2

half is the number of elements that should appear on the combined left side.

The binary search range is:

left = 0
right = m

A cut at 0 means no elements from A go left.

A cut at m means all elements from A go left.

For each cut:

i = (left + right) // 2
j = half - i

i cuts A.

j cuts B.

Then we read the boundary values:

A_left = A[i - 1] if i > 0 else float("-inf")
A_right = A[i] if i < m else float("inf")

The infinities handle empty sides.

If the cut is before the first element, there is no left value, so we use negative infinity.

If the cut is after the last element, there is no right value, so we use positive infinity.

The same idea applies to B.

The partition is valid when:

if A_left <= B_right and B_left <= A_right:

For odd total length:

return float(max(A_left, B_left))

For even total length:

return (max(A_left, B_left) + min(A_right, B_right)) / 2

If A_left is too large:

right = i - 1

Otherwise:

left = i + 1

Testing

def run_tests():
    s = Solution()

    assert s.findMedianSortedArrays([1, 3], [2]) == 2.0
    assert s.findMedianSortedArrays([1, 2], [3, 4]) == 2.5
    assert s.findMedianSortedArrays([], [1]) == 1.0
    assert s.findMedianSortedArrays([2], []) == 2.0
    assert s.findMedianSortedArrays([0, 0], [0, 0]) == 0.0
    assert s.findMedianSortedArrays([1, 2, 3], [4, 5, 6, 7]) == 4.0
    assert s.findMedianSortedArrays([1, 4, 7], [2, 3, 5, 6]) == 4.0
    assert s.findMedianSortedArrays([-5, -3, -1], [-2]) == -2.5

    print("all tests passed")

run_tests()

Test	Why
`[1, 3]`, `[2]`	Standard odd-length example
`[1, 2]`, `[3, 4]`	Standard even-length example
`[]`, `[1]`	One array is empty
`[2]`, `[]`	Other array is empty
`[0, 0]`, `[0, 0]`	Duplicate values
`[1, 2, 3]`, `[4, 5, 6, 7]`	Arrays do not overlap
`[1, 4, 7]`, `[2, 3, 5, 6]`	Interleaved arrays
Negative values	Median can be negative or fractional