H-Index
Problem
Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper, return compute the researcher's h-index. According to the definition of h-index on Wikipedia: A scientist has an index h if h of their n papers have at least h citations each, and the other n − h papers have no more than h citations each. If there are several possible values for h, the maximum one is taken as the h-index.
citations = [3, 0, 6, 1, 5]3def h_index(citations):
n = len(citations)
buckets = [0] * (n + 1)
for c in citations:
buckets[min(c, n)] += 1
count = 0
for h in range(n, -1, -1):
count += buckets[h]
if count >= h:
return h
return 0
function hIndex(citations) {
const n = citations.length;
const buckets = new Array(n + 1).fill(0);
for (const c of citations) buckets[Math.min(c, n)]++;
let count = 0;
for (let h = n; h >= 0; h--) {
count += buckets[h];
if (count >= h) return h;
}
return 0;
}
class Solution {
public int hIndex(int[] citations) {
int n = citations.length;
int[] buckets = new int[n + 1];
for (int c : citations) buckets[Math.min(c, n)]++;
int count = 0;
for (int h = n; h >= 0; h--) {
count += buckets[h];
if (count >= h) return h;
}
return 0;
}
}
int hIndex(vector<int>& citations) {
int n = citations.size();
vector<int> buckets(n + 1, 0);
for (int c : citations) buckets[min(c, n)]++;
int count = 0;
for (int h = n; h >= 0; h--) {
count += buckets[h];
if (count >= h) return h;
}
return 0;
}
Explanation
The h-index is the largest h such that at least h papers each have >= h citations. You could sort the citations, but here we get the answer in linear time using a clever bucket (counting-sort) idea.
The key insight: the h-index can never exceed n, the number of papers. So any citation count larger than n is just as useful as n. We make n + 1 buckets and, for each paper, do buckets[min(c, n)] += 1 — capping big counts at n.
Then we sweep h downward from n to 0, adding up papers as we go. The running count after adding buckets[h] equals the number of papers with at least h citations. The first time count >= h, that h is the answer — and because we scan high-to-low, it is automatically the maximum.
Example: citations = [3, 0, 6, 1, 5] with n = 5. The buckets fill so that at h = 3 the running count of papers with >= 3 citations reaches 3 (the papers with 3, 5, and 6). Since 3 >= 3, the h-index is 3.
No sorting is required, so the whole thing runs in time proportional to the number of papers.