Skip to content

LeetCode 274: H-Index

A clear explanation of the H-Index problem using sorting, then an optimized counting approach.

Problem Restatement

We are given an integer array citations.

Each value:

citations[i]

is the number of citations received by the researcher’s ith paper.

Return the researcher’s h-index.

The h-index is the maximum value h such that the researcher has at least h papers with at least h citations each.

The constraints are 1 <= citations.length <= 5000 and 0 <= citations[i] <= 1000. The official examples include [3,0,6,1,5] -> 3 and [1,3,1] -> 1.

Input and Output

ItemMeaning
InputAn array citations
OutputThe researcher’s h-index
GoalFind the largest valid h
ConditionAt least h papers have at least h citations each

Example function shape:

def hIndex(citations: list[int]) -> int:
    ...

Examples

Example 1:

citations = [3, 0, 6, 1, 5]

Sort descending:

[6, 5, 3, 1, 0]

Check possible h-index values:

hNeedResult
1At least 1 paper with at least 1 citationYes
2At least 2 papers with at least 2 citationsYes
3At least 3 papers with at least 3 citationsYes
4At least 4 papers with at least 4 citationsNo

Answer:

3

Example 2:

citations = [1, 3, 1]

Sort descending:

[3, 1, 1]

There is at least one paper with at least one citation.

There are not two papers with at least two citations.

Answer:

1

Example 3:

citations = [0, 0, 0]

No paper has at least one citation.

Answer:

0

First Thought: Try Every h

A direct solution is to try every possible value of h from 0 to n.

For each h, count how many papers have at least h citations.

class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)
        best = 0

        for h in range(n + 1):
            count = 0

            for c in citations:
                if c >= h:
                    count += 1

            if count >= h:
                best = h

        return best

This works, but it repeatedly scans the whole array.

Problem With Brute Force

There are up to n possible values of h.

For each value, we scan all papers.

That gives:

O(n^2)

We can do better by sorting.

Key Insight

After sorting citations in descending order, the position tells us how many papers have at least that many citations.

For example:

[6, 5, 3, 1, 0]

At index 0, we have seen 1 paper.

At index 1, we have seen 2 papers.

At index 2, we have seen 3 papers.

If the citation count at index i is at least i + 1, then there are at least i + 1 papers with at least i + 1 citations.

So we scan the sorted list and keep increasing the answer while this condition holds.

Algorithm

  1. Sort citations in descending order.
  2. Initialize h = 0.
  3. For each citation count c:
    • If c >= h + 1, increase h.
    • Otherwise stop.
  4. Return h.

Walkthrough

Use:

citations = [3, 0, 6, 1, 5]

After sorting:

[6, 5, 3, 1, 0]

Start:

h = 0

Read 6.

6 >= 1

So:

h = 1

Read 5.

5 >= 2

So:

h = 2

Read 3.

3 >= 3

So:

h = 3

Read 1.

1 < 4

Stop.

Return:

3

Correctness

After sorting citations in descending order, the first i + 1 papers are the i + 1 most cited papers.

When we see a citation count c at index i, if:

c >= i + 1

then every previous paper has at least c citations or more, so the first i + 1 papers each have at least i + 1 citations. Therefore i + 1 is a valid h-index candidate.

If the condition fails at index i, then the i + 1th most cited paper has fewer than i + 1 citations. Since all later papers have no more citations, there cannot be i + 1 papers with at least i + 1 citations.

Thus the largest value reached by the scan is exactly the h-index.

Complexity

MetricValueWhy
TimeO(n log n)Sorting dominates
SpaceO(1) or O(n)Depends on the sorting implementation

Implementation

class Solution:
    def hIndex(self, citations: list[int]) -> int:
        citations.sort(reverse=True)

        h = 0

        for c in citations:
            if c >= h + 1:
                h += 1
            else:
                break

        return h

Code Explanation

Sort from most cited to least cited:

citations.sort(reverse=True)

The variable h stores the largest valid h-index found so far:

h = 0

For each citation count:

for c in citations:

we check whether this paper can support increasing the h-index by one:

if c >= h + 1:
    h += 1

If it cannot, the later papers cannot help because they have no more citations.

So we stop:

else:
    break

Alternative: Counting Solution

Since the h-index cannot be larger than n, citation counts above n can be grouped together.

Create buckets:

bucket[0], bucket[1], ..., bucket[n]

where:

  • bucket[i] counts papers with exactly i citations for i < n
  • bucket[n] counts papers with at least n citations

Then scan from n down to 0.

class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)
        bucket = [0] * (n + 1)

        for c in citations:
            if c >= n:
                bucket[n] += 1
            else:
                bucket[c] += 1

        papers = 0

        for h in range(n, -1, -1):
            papers += bucket[h]

            if papers >= h:
                return h

        return 0

This version runs in:

MetricValue
TimeO(n)
SpaceO(n)

Testing

def run_tests():
    s = Solution()

    assert s.hIndex([3, 0, 6, 1, 5]) == 3
    assert s.hIndex([1, 3, 1]) == 1
    assert s.hIndex([0, 0, 0]) == 0
    assert s.hIndex([100]) == 1
    assert s.hIndex([0]) == 0
    assert s.hIndex([4, 4, 4, 4]) == 4
    assert s.hIndex([10, 8, 5, 4, 3]) == 4
    assert s.hIndex([25, 8, 5, 3, 3]) == 3

    print("all tests passed")

run_tests()

Test meaning:

TestWhy
[3, 0, 6, 1, 5]Official example
[1, 3, 1]Small unsorted input
All zeroH-index is zero
Single highly cited paperH-index capped by paper count
Single zero-citation paperNo valid positive h-index
All papers have enough citationsH-index equals n
Descending inputAlready sorted case
Mixed high citationsConfirms h-index cap logic