LeetCode 275: H-Index II

Problem Restatement

We are given an integer array citations.

The array is sorted in ascending order.

Each value:

citations[i]

is the number of citations received by one paper.

Return the researcher’s h-index.

The h-index is the maximum value h such that the researcher has at least h papers with at least h citations each.

The required runtime is logarithmic, so we need binary search. The official examples include [0,1,3,5,6] -> 3 and [1,2,100] -> 2.

Input and Output

Item	Meaning
Input	Sorted ascending array `citations`
Output	The h-index
Goal	Find the largest valid `h`
Required time	`O(log n)`

Example function shape:

def hIndex(citations: list[int]) -> int:
    ...

Examples

Example 1:

citations = [0, 1, 3, 5, 6]

There are 5 papers.

If h = 3, then the last three papers have citation counts:

[3, 5, 6]

All three have at least 3 citations.

If h = 4, then we would need four papers with at least four citations, but only two papers have at least four citations.

Answer:

Example 2:

citations = [1, 2, 100]

There are two papers with at least two citations:

[2, 100]

There are not three papers with at least three citations.

Answer:

First Thought: Linear Scan

Since the array is sorted, we can scan from left to right.

At index i, the number of papers from i to the end is:

n - i

If:

citations[i] >= n - i

then there are at least n - i papers with at least n - i citations.

The first index where this becomes true gives the h-index.

class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)

        for i, c in enumerate(citations):
            h = n - i

            if c >= h:
                return h

        return 0

This is correct, but it runs in O(n).

Problem With Linear Scan

The problem explicitly asks for logarithmic time.

So we need to use the sorted order more aggressively.

The condition:

citations[i] >= n - i

has a monotonic shape.

At earlier indices, citations[i] is smaller and n - i is larger.

At later indices, citations[i] is larger and n - i is smaller.

So the condition changes from false to true at most once.

That is exactly a binary search pattern.

Key Insight

For an index i, define:

h = n - i

This means there are h papers from index i through the end.

Because the array is sorted ascending, all papers from index i onward have at least citations[i] citations.

So if:

citations[i] >= h

then these h papers all have at least h citations.

This makes h a valid h-index candidate.

We want the earliest index where this condition becomes true.

Then the answer is:

n - index

Algorithm

Let n = len(citations).
Binary search over indices 0..n - 1.
At each mid, compute:
```
h = n - mid
```
If:
```
citations[mid] >= h
```
then mid is feasible, so search left for an earlier feasible index.
Otherwise, search right.
When binary search ends, left is the first feasible index.
Return:
```
n - left
```

If no index is feasible, left == n, so the answer becomes 0.

Walkthrough

Use:

citations = [0, 1, 3, 5, 6]

Start:

n = 5
left = 0
right = 5

We use a half-open search interval [left, right).

First middle:

mid = 2
h = 5 - 2 = 3
citations[2] = 3

Since:

3 >= 3

index 2 is feasible.

Search left:

right = 2

Next middle:

mid = 1
h = 5 - 1 = 4
citations[1] = 1

Since:

1 < 4

index 1 is not feasible.

Search right:

left = 2

Now:

left == right == 2

The first feasible index is 2.

Return:

n - left = 5 - 2 = 3

Correctness

For each index i, there are exactly n - i papers from i to the end of the sorted array.

Because the array is sorted ascending, every paper from index i onward has at least citations[i] citations.

Therefore, if citations[i] >= n - i, then the researcher has at least n - i papers with at least n - i citations. So n - i is a valid h-index candidate.

The condition is monotonic. When it becomes true at some index, it remains true for later indices because citation counts do not decrease while n - i decreases.

Binary search finds the first index where the condition is true.

This first feasible index gives the largest valid h-index, because earlier indices correspond to larger h values. If no feasible index exists, the h-index is 0.

Therefore the algorithm returns the correct h-index.

Complexity

Metric	Value	Why
Time	`O(log n)`	Binary search over the sorted array
Space	`O(1)`	Only a few variables are used

Implementation

class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)

        left = 0
        right = n

        while left < right:
            mid = (left + right) // 2
            h = n - mid

            if citations[mid] >= h:
                right = mid
            else:
                left = mid + 1

        return n - left

Code Explanation

We search in the half-open interval:

[left, right)

Initialize it to the whole array:

left = 0
right = n

At each index mid, the possible h-index is:

h = n - mid

If the current paper has enough citations to support that h:

if citations[mid] >= h:

then this index works, and maybe an earlier index also works.

So we move left:

right = mid

Otherwise, this index cannot support its candidate h, so we search to the right:

left = mid + 1

At the end, left is the first feasible index.

The answer is:

return n - left

If no index works, then left == n, and the function returns 0.

Testing

def run_tests():
    s = Solution()

    assert s.hIndex([0, 1, 3, 5, 6]) == 3
    assert s.hIndex([1, 2, 100]) == 2
    assert s.hIndex([0]) == 0
    assert s.hIndex([100]) == 1
    assert s.hIndex([0, 0, 0]) == 0
    assert s.hIndex([1, 1, 1]) == 1
    assert s.hIndex([4, 4, 4, 4]) == 4
    assert s.hIndex([0, 1, 4, 5, 6]) == 3

    print("all tests passed")

run_tests()

Test meaning:

Test	Why
`[0, 1, 3, 5, 6]`	Official example
`[1, 2, 100]`	Official example
`[0]`	Single paper with zero citations
`[100]`	H-index capped by paper count
All zero	No positive h-index
All one	H-index is one
All high enough	H-index equals `n`
Mixed sorted values	Confirms binary search boundary