A clear explanation of finding the longest contiguous subarray with equal numbers of 0 and 1 using prefix sums and a hash map.
Problem Restatement
We are given a binary array nums.
Each element is either:
0 or 1We need to return the maximum length of a contiguous subarray that contains an equal number of 0s and 1s.
The official problem asks for the maximum length of a contiguous subarray with an equal number of 0 and 1.
Input and Output
| Item | Meaning |
|---|---|
| Input | A binary array nums |
| Output | Maximum length of a valid contiguous subarray |
| Valid subarray | Has equal number of 0s and 1s |
| If none exists | Return 0 |
Function shape:
class Solution:
def findMaxLength(self, nums: List[int]) -> int:
...Examples
Example 1:
nums = [0, 1]The whole array has one 0 and one 1.
So the answer is:
2Example 2:
nums = [0, 1, 0]The valid subarrays of length 2 are:
[0, 1]
[1, 0]Both have one 0 and one 1.
So the answer is:
2Example 3:
nums = [0, 1, 1, 1, 1, 1, 0, 0, 0]One longest valid subarray is:
[1, 1, 1, 0, 0, 0]Its length is:
6So the answer is:
6First Thought: Check Every Subarray
A direct approach is to try every subarray.
For each starting index, extend the ending index and count how many 0s and 1s appear.
If the counts are equal, update the best length.
from typing import List
class Solution:
def findMaxLength(self, nums: List[int]) -> int:
best = 0
n = len(nums)
for left in range(n):
zeros = 0
ones = 0
for right in range(left, n):
if nums[right] == 0:
zeros += 1
else:
ones += 1
if zeros == ones:
best = max(best, right - left + 1)
return bestThis works, but it checks too many subarrays.
Problem With Brute Force
There are O(n^2) possible subarrays.
The constraint allows nums.length up to 10^5, so a quadratic solution is too slow.
We need to detect balanced subarrays in one pass.
Key Insight
Convert the problem into a prefix sum problem.
Treat:
| Original value | Transformed value |
|---|---|
0 | -1 |
1 | +1 |
Now a subarray has equal numbers of 0s and 1s exactly when its transformed sum is 0.
For example:
nums = [0, 1, 0, 1]becomes:
[-1, +1, -1, +1]The sum is 0, so the whole array is balanced.
Now use prefix sums.
If the same prefix sum appears at two different indices, then the subarray between those indices has sum 0.
That means it has equal numbers of 0s and 1s.
Example of the Prefix Sum Trick
For:
nums = [0, 1, 0]Track count, where:
0 adds -1
1 adds +1| Index | Value | Count |
|---|---|---|
| -1 | start | 0 |
| 0 | 0 | -1 |
| 1 | 1 | 0 |
| 2 | 0 | -1 |
The count 0 appears at index -1 and index 1.
So the subarray from index 0 to index 1 has equal 0s and 1s.
Its length is:
1 - (-1) = 2The count -1 appears at index 0 and index 2.
So the subarray from index 1 to index 2 is also balanced.
Its length is:
2 - 0 = 2Algorithm
Maintain a running count:
count += 1 for value 1
count -= 1 for value 0Use a hash map:
first_index = {0: -1}This stores the first index where each count appeared.
Then scan the array.
For each index i:
- Update
count. - If
counthas appeared before:- the subarray between the first occurrence and
iis balanced - update the best length
- the subarray between the first occurrence and
- Otherwise:
- store this index as the first occurrence of
count
- store this index as the first occurrence of
We store only the first occurrence because it gives the longest possible subarray for future matches.
Correctness
After converting 0 to -1 and 1 to +1, a subarray has equal numbers of 0s and 1s exactly when its transformed sum is 0.
The running count is the prefix sum of this transformed array.
If the same prefix sum appears at indices j and i, then the sum between j + 1 and i is:
count[i] - count[j] = 0So that subarray has equal numbers of 0s and 1s.
The algorithm checks every index and compares the current prefix sum with its earliest previous occurrence.
Using the earliest occurrence gives the longest balanced subarray ending at the current index.
By taking the maximum over all indices, the algorithm finds the maximum possible length.
Therefore the returned value is correct.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | The array is scanned once |
| Space | O(n) | The hash map can store up to 2n + 1 count values |
Implementation
from typing import List
class Solution:
def findMaxLength(self, nums: List[int]) -> int:
first_index = {0: -1}
count = 0
best = 0
for i, num in enumerate(nums):
if num == 0:
count -= 1
else:
count += 1
if count in first_index:
best = max(best, i - first_index[count])
else:
first_index[count] = i
return bestCode Explanation
Initialize the hash map with count 0 at index -1:
first_index = {0: -1}This handles balanced subarrays that start at index 0.
Track the running balance:
count = 0Track the best length found so far:
best = 0For each number:
if num == 0:
count -= 1
else:
count += 1If we have seen this count before, the subarray between the old index and the current index is balanced:
if count in first_index:
best = max(best, i - first_index[count])If the count has not appeared before, store its first index:
else:
first_index[count] = iThe first index is kept because it gives the largest length later.
Testing
def test_find_max_length():
s = Solution()
assert s.findMaxLength([0, 1]) == 2
assert s.findMaxLength([0, 1, 0]) == 2
assert s.findMaxLength([0, 1, 1, 1, 1, 1, 0, 0, 0]) == 6
assert s.findMaxLength([0, 0, 0]) == 0
assert s.findMaxLength([1, 1, 1]) == 0
assert s.findMaxLength([0, 0, 1, 0, 0, 0, 1, 1]) == 6
assert s.findMaxLength([1, 0, 1, 0, 1, 0]) == 6
print("all tests passed")Test meaning:
| Test | Why |
|---|---|
[0, 1] | Smallest balanced array |
[0, 1, 0] | Two possible balanced subarrays |
| Longer mixed case | Checks nontrivial maximum |
| All zeros | No balanced subarray |
| All ones | No balanced subarray |
| Repeated prefix count | Checks first-index logic |
| Entire array balanced | Confirms full-length answer |