# LeetCode 275: H-Index II

## Problem Restatement

We are given an integer array `citations`.

The array is sorted in ascending order.

Each value:

```python
citations[i]
```

is the number of citations received by one paper.

Return the researcher's h-index.

The h-index is the maximum value `h` such that the researcher has at least `h` papers with at least `h` citations each.

The required runtime is logarithmic, so we need binary search. The official examples include `[0,1,3,5,6] -> 3` and `[1,2,100] -> 2`.

## Input and Output

| Item | Meaning |
|---|---|
| Input | Sorted ascending array `citations` |
| Output | The h-index |
| Goal | Find the largest valid `h` |
| Required time | `O(log n)` |

Example function shape:

```python
def hIndex(citations: list[int]) -> int:
    ...
```

## Examples

Example 1:

```python
citations = [0, 1, 3, 5, 6]
```

There are `5` papers.

If `h = 3`, then the last three papers have citation counts:

```python
[3, 5, 6]
```

All three have at least `3` citations.

If `h = 4`, then we would need four papers with at least four citations, but only two papers have at least four citations.

Answer:

```python
3
```

Example 2:

```python
citations = [1, 2, 100]
```

There are two papers with at least two citations:

```python
[2, 100]
```

There are not three papers with at least three citations.

Answer:

```python
2
```

## First Thought: Linear Scan

Since the array is sorted, we can scan from left to right.

At index `i`, the number of papers from `i` to the end is:

```python
n - i
```

If:

```python
citations[i] >= n - i
```

then there are at least `n - i` papers with at least `n - i` citations.

The first index where this becomes true gives the h-index.

```python
class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)

        for i, c in enumerate(citations):
            h = n - i

            if c >= h:
                return h

        return 0
```

This is correct, but it runs in `O(n)`.

## Problem With Linear Scan

The problem explicitly asks for logarithmic time.

So we need to use the sorted order more aggressively.

The condition:

```python
citations[i] >= n - i
```

has a monotonic shape.

At earlier indices, `citations[i]` is smaller and `n - i` is larger.

At later indices, `citations[i]` is larger and `n - i` is smaller.

So the condition changes from false to true at most once.

That is exactly a binary search pattern.

## Key Insight

For an index `i`, define:

```python
h = n - i
```

This means there are `h` papers from index `i` through the end.

Because the array is sorted ascending, all papers from index `i` onward have at least `citations[i]` citations.

So if:

```python
citations[i] >= h
```

then these `h` papers all have at least `h` citations.

This makes `h` a valid h-index candidate.

We want the earliest index where this condition becomes true.

Then the answer is:

```python
n - index
```

## Algorithm

1. Let `n = len(citations)`.
2. Binary search over indices `0..n - 1`.
3. At each `mid`, compute:
   ```python id="vs7w3t"
   h = n - mid
   ```
4. If:
   ```python id="otaw9h"
   citations[mid] >= h
   ```
   then `mid` is feasible, so search left for an earlier feasible index.
5. Otherwise, search right.
6. When binary search ends, `left` is the first feasible index.
7. Return:
   ```python id="psavz3"
   n - left
   ```

If no index is feasible, `left == n`, so the answer becomes `0`.

## Walkthrough

Use:

```python
citations = [0, 1, 3, 5, 6]
```

Start:

```python
n = 5
left = 0
right = 5
```

We use a half-open search interval `[left, right)`.

First middle:

```python
mid = 2
h = 5 - 2 = 3
citations[2] = 3
```

Since:

```python
3 >= 3
```

index `2` is feasible.

Search left:

```python
right = 2
```

Next middle:

```python
mid = 1
h = 5 - 1 = 4
citations[1] = 1
```

Since:

```python
1 < 4
```

index `1` is not feasible.

Search right:

```python
left = 2
```

Now:

```python
left == right == 2
```

The first feasible index is `2`.

Return:

```python
n - left = 5 - 2 = 3
```

## Correctness

For each index `i`, there are exactly `n - i` papers from `i` to the end of the sorted array.

Because the array is sorted ascending, every paper from index `i` onward has at least `citations[i]` citations.

Therefore, if `citations[i] >= n - i`, then the researcher has at least `n - i` papers with at least `n - i` citations. So `n - i` is a valid h-index candidate.

The condition is monotonic. When it becomes true at some index, it remains true for later indices because citation counts do not decrease while `n - i` decreases.

Binary search finds the first index where the condition is true.

This first feasible index gives the largest valid h-index, because earlier indices correspond to larger `h` values. If no feasible index exists, the h-index is `0`.

Therefore the algorithm returns the correct h-index.

## Complexity

| Metric | Value | Why |
|---|---|---|
| Time | `O(log n)` | Binary search over the sorted array |
| Space | `O(1)` | Only a few variables are used |

## Implementation

```python
class Solution:
    def hIndex(self, citations: list[int]) -> int:
        n = len(citations)

        left = 0
        right = n

        while left < right:
            mid = (left + right) // 2
            h = n - mid

            if citations[mid] >= h:
                right = mid
            else:
                left = mid + 1

        return n - left
```

## Code Explanation

We search in the half-open interval:

```python
[left, right)
```

Initialize it to the whole array:

```python
left = 0
right = n
```

At each index `mid`, the possible h-index is:

```python
h = n - mid
```

If the current paper has enough citations to support that `h`:

```python
if citations[mid] >= h:
```

then this index works, and maybe an earlier index also works.

So we move left:

```python
right = mid
```

Otherwise, this index cannot support its candidate `h`, so we search to the right:

```python
left = mid + 1
```

At the end, `left` is the first feasible index.

The answer is:

```python
return n - left
```

If no index works, then `left == n`, and the function returns `0`.

## Testing

```python
def run_tests():
    s = Solution()

    assert s.hIndex([0, 1, 3, 5, 6]) == 3
    assert s.hIndex([1, 2, 100]) == 2
    assert s.hIndex([0]) == 0
    assert s.hIndex([100]) == 1
    assert s.hIndex([0, 0, 0]) == 0
    assert s.hIndex([1, 1, 1]) == 1
    assert s.hIndex([4, 4, 4, 4]) == 4
    assert s.hIndex([0, 1, 4, 5, 6]) == 3

    print("all tests passed")

run_tests()
```

Test meaning:

| Test | Why |
|---|---|
| `[0, 1, 3, 5, 6]` | Official example |
| `[1, 2, 100]` | Official example |
| `[0]` | Single paper with zero citations |
| `[100]` | H-index capped by paper count |
| All zero | No positive h-index |
| All one | H-index is one |
| All high enough | H-index equals `n` |
| Mixed sorted values | Confirms binary search boundary |