Saturday, October 31, 2020

H-Index

Problem: Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index. The h-index is defined as the maximum value of h such that the given author/journal has published h papers that have each been cited at least h times.

Example:

Input: citations = [10,9,8,7,3]
Output: 4 
Explanation: [10,9,8,7,4] means the researcher has 5 papers in total and each of them had 
received 10, 9, 8, 7, 4 citations respectively.   Since the researcher has 4 papers with at least 4 citations each and the                 remaining one has only 3 citations i.e. less than 4, his h-index is 4.


Approach: One way to solve this problem is to just use standard sorting but it will take nlogn time. We can use count sort here to make it more efficient but what about the values which are more than then length of the array? If we see we really don't care about it as all those values will come in the bucket indexed at length as here we need to find the index where number of citations are less than or equal to that index.

Now we can have a reverse loop and we keep adding the bucket's values into a variable 'total' and we just need to check if total <= curr_index. If that's the case than we have our answer.

That's all!


Implementation in C#:

Approach1: Using Sorting:

    public int HIndex(int[] citations)
    {
        int length = citations?.Length ?? 0;
        if (length == 0)
        {
            return 0;
        }
        Array.Sort(citations);
        int hIndex = 0;
        for (int i = 0; i < length; ++i)
        {
            if (citations[i] >= length - i)
            {
                hIndex = Math.Max(hIndex, Math.Min(citations[i], length - i));
            }
        }
        return hIndex;
    }

Approach2: Using Count Sort:

    public int HIndex(int[] citations)
    {
        int length = citations?.Length ?? 0;
        if (length == 0)
        {
            return 0;
        }
        int[] count = new int[length + 1];
        foreach (var citation in citations)
        {
            if (citation > length)
            {
                ++count[length];
            }
            else
            {
                ++count[citation];
            }
        }
        int total = 0;
        for (int i = length; i >= 0; --i)
        {
            total += count[i];
            if (total >= i)
            {
                return i;
            }
        }
        return 0;
    }

Complexity: O(nlogn) for approach 1 and O(n) for approach 2.

No comments:

Post a Comment