# LeetCode 274: H-Index

## Problem Restatement

We are given an integer array `citations`.

Each value:

```python
citations[i]
```

is the number of citations received by the researcher's `i`th paper.

Return the researcher's h-index.

The h-index is the maximum value `h` such that the researcher has at least `h` papers with at least `h` citations each.

The constraints are `1 <= citations.length <= 5000` and `0 <= citations[i] <= 1000`. The official examples include `[3,0,6,1,5] -> 3` and `[1,3,1] -> 1`.

## Input and Output

| Item | Meaning |
|---|---|
| Input | An array `citations` |
| Output | The researcher's h-index |
| Goal | Find the largest valid `h` |
| Condition | At least `h` papers have at least `h` citations each |

Example function shape:

```python
def hIndex(citations: list[int]) -> int:
    ...
```

## Examples

Example 1:

```python
citations = [3, 0, 6, 1, 5]
```

Sort descending:

```python
[6, 5, 3, 1, 0]
```

Check possible h-index values:

| `h` | Need | Result |
|---:|---|---|
| `1` | At least 1 paper with at least 1 citation | Yes |
| `2` | At least 2 papers with at least 2 citations | Yes |
| `3` | At least 3 papers with at least 3 citations | Yes |
| `4` | At least 4 papers with at least 4 citations | No |

Answer:

```python
3
```

Example 2:

```python
citations = [1, 3, 1]
```

Sort descending:

```python
[3, 1, 1]
```

There is at least one paper with at least one citation.

There are not two papers with at least two citations.

Answer:

```python
1
```

Example 3:

```python
citations = [0, 0, 0]
```

No paper has at least one citation.

Answer:

```python
0
```

## First Thought: Try Every h

A direct solution is to try every possible value of `h` from `0` to `n`.

For each `h`, count how many papers have at least `h` citations.

```python
class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)
        best = 0

        for h in range(n + 1):
            count = 0

            for c in citations:
                if c >= h:
                    count += 1

            if count >= h:
                best = h

        return best
```

This works, but it repeatedly scans the whole array.

## Problem With Brute Force

There are up to `n` possible values of `h`.

For each value, we scan all papers.

That gives:

```python
O(n^2)
```

We can do better by sorting.

## Key Insight

After sorting citations in descending order, the position tells us how many papers have at least that many citations.

For example:

```python
[6, 5, 3, 1, 0]
```

At index `0`, we have seen `1` paper.

At index `1`, we have seen `2` papers.

At index `2`, we have seen `3` papers.

If the citation count at index `i` is at least `i + 1`, then there are at least `i + 1` papers with at least `i + 1` citations.

So we scan the sorted list and keep increasing the answer while this condition holds.

## Algorithm

1. Sort `citations` in descending order.
2. Initialize `h = 0`.
3. For each citation count `c`:
   - If `c >= h + 1`, increase `h`.
   - Otherwise stop.
4. Return `h`.

## Walkthrough

Use:

```python
citations = [3, 0, 6, 1, 5]
```

After sorting:

```python
[6, 5, 3, 1, 0]
```

Start:

```python
h = 0
```

Read `6`.

```python
6 >= 1
```

So:

```python
h = 1
```

Read `5`.

```python
5 >= 2
```

So:

```python
h = 2
```

Read `3`.

```python
3 >= 3
```

So:

```python
h = 3
```

Read `1`.

```python
1 < 4
```

Stop.

Return:

```python
3
```

## Correctness

After sorting citations in descending order, the first `i + 1` papers are the `i + 1` most cited papers.

When we see a citation count `c` at index `i`, if:

```python
c >= i + 1
```

then every previous paper has at least `c` citations or more, so the first `i + 1` papers each have at least `i + 1` citations. Therefore `i + 1` is a valid h-index candidate.

If the condition fails at index `i`, then the `i + 1`th most cited paper has fewer than `i + 1` citations. Since all later papers have no more citations, there cannot be `i + 1` papers with at least `i + 1` citations.

Thus the largest value reached by the scan is exactly the h-index.

## Complexity

| Metric | Value | Why |
|---|---|---|
| Time | `O(n log n)` | Sorting dominates |
| Space | `O(1)` or `O(n)` | Depends on the sorting implementation |

## Implementation

```python
class Solution:
    def hIndex(self, citations: list[int]) -> int:
        citations.sort(reverse=True)

        h = 0

        for c in citations:
            if c >= h + 1:
                h += 1
            else:
                break

        return h
```

## Code Explanation

Sort from most cited to least cited:

```python
citations.sort(reverse=True)
```

The variable `h` stores the largest valid h-index found so far:

```python
h = 0
```

For each citation count:

```python
for c in citations:
```

we check whether this paper can support increasing the h-index by one:

```python
if c >= h + 1:
    h += 1
```

If it cannot, the later papers cannot help because they have no more citations.

So we stop:

```python
else:
    break
```

## Alternative: Counting Solution

Since the h-index cannot be larger than `n`, citation counts above `n` can be grouped together.

Create buckets:

```python
bucket[0], bucket[1], ..., bucket[n]
```

where:

- `bucket[i]` counts papers with exactly `i` citations for `i < n`
- `bucket[n]` counts papers with at least `n` citations

Then scan from `n` down to `0`.

```python
class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)
        bucket = [0] * (n + 1)

        for c in citations:
            if c >= n:
                bucket[n] += 1
            else:
                bucket[c] += 1

        papers = 0

        for h in range(n, -1, -1):
            papers += bucket[h]

            if papers >= h:
                return h

        return 0
```

This version runs in:

| Metric | Value |
|---|---|
| Time | `O(n)` |
| Space | `O(n)` |

## Testing

```python
def run_tests():
    s = Solution()

    assert s.hIndex([3, 0, 6, 1, 5]) == 3
    assert s.hIndex([1, 3, 1]) == 1
    assert s.hIndex([0, 0, 0]) == 0
    assert s.hIndex([100]) == 1
    assert s.hIndex([0]) == 0
    assert s.hIndex([4, 4, 4, 4]) == 4
    assert s.hIndex([10, 8, 5, 4, 3]) == 4
    assert s.hIndex([25, 8, 5, 3, 3]) == 3

    print("all tests passed")

run_tests()
```

Test meaning:

| Test | Why |
|---|---|
| `[3, 0, 6, 1, 5]` | Official example |
| `[1, 3, 1]` | Small unsorted input |
| All zero | H-index is zero |
| Single highly cited paper | H-index capped by paper count |
| Single zero-citation paper | No valid positive h-index |
| All papers have enough citations | H-index equals `n` |
| Descending input | Already sorted case |
| Mixed high citations | Confirms h-index cap logic |