LeetCode 274: H-Index

Problem Restatement

We are given an integer array citations.

Each value:

citations[i]

is the number of citations received by the researcher’s ith paper.

Return the researcher’s h-index.

The h-index is the maximum value h such that the researcher has at least h papers with at least h citations each.

The constraints are 1 <= citations.length <= 5000 and 0 <= citations[i] <= 1000. The official examples include [3,0,6,1,5] -> 3 and [1,3,1] -> 1.

Input and Output

Item	Meaning
Input	An array `citations`
Output	The researcher’s h-index
Goal	Find the largest valid `h`
Condition	At least `h` papers have at least `h` citations each

Example function shape:

def hIndex(citations: list[int]) -> int:
    ...

Examples

Example 1:

citations = [3, 0, 6, 1, 5]

Sort descending:

[6, 5, 3, 1, 0]

Check possible h-index values:

`h`	Need	Result
`1`	At least 1 paper with at least 1 citation	Yes
`2`	At least 2 papers with at least 2 citations	Yes
`3`	At least 3 papers with at least 3 citations	Yes
`4`	At least 4 papers with at least 4 citations	No

Answer:

Example 2:

citations = [1, 3, 1]

Sort descending:

[3, 1, 1]

There is at least one paper with at least one citation.

There are not two papers with at least two citations.

Answer:

Example 3:

citations = [0, 0, 0]

No paper has at least one citation.

Answer:

First Thought: Try Every h

A direct solution is to try every possible value of h from 0 to n.

For each h, count how many papers have at least h citations.

class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)
        best = 0

        for h in range(n + 1):
            count = 0

            for c in citations:
                if c >= h:
                    count += 1

            if count >= h:
                best = h

        return best

This works, but it repeatedly scans the whole array.

Problem With Brute Force

There are up to n possible values of h.

For each value, we scan all papers.

That gives:

O(n^2)

We can do better by sorting.

Key Insight

After sorting citations in descending order, the position tells us how many papers have at least that many citations.

For example:

[6, 5, 3, 1, 0]

At index 0, we have seen 1 paper.

At index 1, we have seen 2 papers.

At index 2, we have seen 3 papers.

If the citation count at index i is at least i + 1, then there are at least i + 1 papers with at least i + 1 citations.

So we scan the sorted list and keep increasing the answer while this condition holds.

Algorithm

Sort citations in descending order.
Initialize h = 0.
For each citation count c:
- If c >= h + 1, increase h.
- Otherwise stop.
Return h.

Walkthrough

Use:

citations = [3, 0, 6, 1, 5]

After sorting:

[6, 5, 3, 1, 0]

Start:

h = 0

Read 6.

6 >= 1

So:

h = 1

Read 5.

5 >= 2

So:

h = 2

Read 3.

3 >= 3

So:

h = 3

Read 1.

1 < 4

Stop.

Return:

Correctness

After sorting citations in descending order, the first i + 1 papers are the i + 1 most cited papers.

When we see a citation count c at index i, if:

c >= i + 1

then every previous paper has at least c citations or more, so the first i + 1 papers each have at least i + 1 citations. Therefore i + 1 is a valid h-index candidate.

If the condition fails at index i, then the i + 1th most cited paper has fewer than i + 1 citations. Since all later papers have no more citations, there cannot be i + 1 papers with at least i + 1 citations.

Thus the largest value reached by the scan is exactly the h-index.

Complexity

Metric	Value	Why
Time	`O(n log n)`	Sorting dominates
Space	`O(1)` or `O(n)`	Depends on the sorting implementation

Implementation

class Solution:
    def hIndex(self, citations: list[int]) -> int:
        citations.sort(reverse=True)

        h = 0

        for c in citations:
            if c >= h + 1:
                h += 1
            else:
                break

        return h

Code Explanation

Sort from most cited to least cited:

citations.sort(reverse=True)

The variable h stores the largest valid h-index found so far:

h = 0

For each citation count:

for c in citations:

we check whether this paper can support increasing the h-index by one:

if c >= h + 1:
    h += 1

If it cannot, the later papers cannot help because they have no more citations.

So we stop:

else:
    break

Alternative: Counting Solution

Since the h-index cannot be larger than n, citation counts above n can be grouped together.

Create buckets:

bucket[0], bucket[1], ..., bucket[n]

where:

bucket[i] counts papers with exactly i citations for i < n
bucket[n] counts papers with at least n citations

Then scan from n down to 0.

class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)
        bucket = [0] * (n + 1)

        for c in citations:
            if c >= n:
                bucket[n] += 1
            else:
                bucket[c] += 1

        papers = 0

        for h in range(n, -1, -1):
            papers += bucket[h]

            if papers >= h:
                return h

        return 0

This version runs in:

Metric	Value
Time	`O(n)`
Space	`O(n)`

Testing

def run_tests():
    s = Solution()

    assert s.hIndex([3, 0, 6, 1, 5]) == 3
    assert s.hIndex([1, 3, 1]) == 1
    assert s.hIndex([0, 0, 0]) == 0
    assert s.hIndex([100]) == 1
    assert s.hIndex([0]) == 0
    assert s.hIndex([4, 4, 4, 4]) == 4
    assert s.hIndex([10, 8, 5, 4, 3]) == 4
    assert s.hIndex([25, 8, 5, 3, 3]) == 3

    print("all tests passed")

run_tests()

Test meaning:

Test	Why
`[3, 0, 6, 1, 5]`	Official example
`[1, 3, 1]`	Small unsorted input
All zero	H-index is zero
Single highly cited paper	H-index capped by paper count
Single zero-citation paper	No valid positive h-index
All papers have enough citations	H-index equals `n`
Descending input	Already sorted case
Mixed high citations	Confirms h-index cap logic