The Hirsch index - a play on numbers or a true appraisal of academic output?

Citation metrics have rapidly gained importance in today's landscape and are being increasingly utilized as a yardstick in making several important decisions regarding academic funding and appointments. The impact factor has traditionally been the metric most often employed in this regard. However, the emergence of the Hirsch index has provided an alternative to the impact factor. The h-index, despite its flaws, continues to gain acceptance and popularity in the medical community. Several medical journals have evaluated and endorsed the use of the h-index. However, it must be interpreted with all of its limitations in mind.


Introduction
Over the years, different citation indices and bibliometric parameters have been formulated to measure the academic output and scholarly activity of researchers and academicians across the globe. The importance of these metrics springs from the fact that various agencies award grants and funds on the basis of these indices. Thus they influence the amount, structure and orientation of research allocation and endorsement [1]. In a dynamic world with limited resources [2], the importance of such bibliometric measures cannot, therefore, be denied. These indices are also used as a foundation for conferring academic awards, hierarchical promotions, tenures, fellowships, salary increments, recruitments and leadership positions [2][3][4][5]. These indices are surrogate markers of scientific output [6] and ultimately demonstrate how "efficient" and "effective" a journal is [7]. They can also be considered a self-regulated effort to enforce some measure of quality control in a rapidly growing industry [6].
Citation metrics heavily influence the trends of subscription for various scientific journals and serve as a yardstick for authors when they deliberate about where to submit their next best scientific works [7]. According to Ogden et al, "the attraction of having a simple single number to judge complex issues is too great" [7]. Despite their potential shortcomings, some of these citation metrics continue to garner immense momentum and acceptance in the scientific world.

Impact factor
Traditionally, the citation metric most often employed in medical circles to evaluate a journal's standing is the impact factor (IF). The IF, first proposed by Eugene Garfield and Irving Sher in 1963 [3], is published by Thomson Reuters on an annual basis in the Journal Citation Reports (JCR) [1]. IFs were originally derived to ease the selection of journals for the Science Citation Index (SCI) [7]. The IF of a journal for a particular year is calculated as "a ratio of the number of research papers cited from that journal to the total number of "citable" papers it has published, using the time bracket of previous two years" [4]. As such, several factors influence the IF including, but not limited to, the type of article, the discipline of research and the number of "citable" articles a journal publishes [3]. Table 1 lists the potential problems with IF usage in the contemporary medical landscape [3,4,7,8,9]. The IF has been criticized because of its propensity for "biased selection" [10,11]. Although having an over-simplified measure such as the IF lends convenience to bioscientists, the situation is akin to a stock exchange where major shareholders, or highly cited medical journals, wield a disproportionate degree of influence over the rest of the investors in the industry [10].
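The ratio described above can be illustrated with a minimal sketch. It assumes the commonly used two-year form of the IF (citations received in the target year by items from the two preceding years, divided by the "citable" items published in those two years). The function name and the journal figures are hypothetical and are not taken from the JCR.

```python
def impact_factor(citations_received: int, citable_items: int) -> float:
    """Two-year impact factor of a journal for a given target year.

    citations_received: citations made in the target year to items the
        journal published in the two preceding years.
    citable_items: number of "citable" items the journal published in
        those same two years.
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations_received / citable_items


# Hypothetical journal: 150 citable items over the two preceding years,
# which together drew 450 citations in the target year.
example_if = impact_factor(450, 150)
print(example_if)  # 3.0
```

The sketch also makes one of the pitfalls visible: the result depends entirely on what is counted as "citable" in the denominator, which is decided by the publisher of the metric rather than by the journal.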

The h-index
In order to circumvent these pitfalls associated with the usage of IF, the h-index was proposed as a bibliometric tool for individual authors, research groups or journals [7]. Most frequently, the h-index is utilized as an indicator of "individual scientific achievement" [9] because a "book should not be judged by its cover" [12].
Physicist Jorge E. Hirsch at the University of California, San Diego, first explained the calculation of the h-index in 2005 in his seminal publication as follows: "a scientist has the index h if h of his or her N_p papers have at least h citations each and the remainder of the papers (N_p - h) have ≤ h citations each, where N_p is the number of papers published over n years" [2]. For example, if a scientist has an h-index of 70, it follows that he or she has published at least 70 papers with at least 70 citations each. Hirsch also listed some caveats that need to be considered when using the h-index (Table 2) [2,9]. Using the h-index as a guidepost, Hirsch also proposed the following criteria for faculty promotions at major research institutions: (a) advancement to associate professor/tenure at h ≈ 12, (b) advancement to full professor at h ≈ 18, and (c) fellowship or membership of major academic or professional societies may be granted at h ≥ 15 [2]. Scholars can calculate their h-index using databases and search engines such as the ISI Web of Science [13], Elsevier's Scopus [14] or Google Scholar [15].
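Hirsch's definition translates directly into a short computation. The following is a minimal sketch, assuming the only input is a list of per-paper citation counts. The function name and the sample counts are hypothetical, and databases such as Web of Science, Scopus or Google Scholar may report different values for the same author because they index different sources.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break  # every later paper has even fewer citations
    return h


# Hypothetical author with seven papers.
papers = [25, 8, 5, 3, 3, 1, 0]
print(h_index(papers))  # 3
```

In this example three papers have at least three citations each, but there are not four papers with at least four citations, so h = 3. The 25 citations of the top paper contribute no more to the index than 3 would have.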

Table 1 Pitfalls of using impact factor [3,4,7,8,9]
Pitfall | Detail
Transparency issues | Thomson Scientific decides what is a "citable" research item. Books, book chapters and conference proceedings are not included.
Ambiguity in calculation | "The definition for citing items is broader than for cited items". Also, the calculation is difficult to reproduce.
Shortcomings of 2-year temporal bracket for citations | Citation fluctuations make the inclusion of citations limited to the previous two years unreliable.
No consideration of "citation half-life" in calculating IF ǂ | Citations change as the article and the journal itself mature.
Over-representation | Medical literature in the English language, by a particular publisher, from a particular geographic region (e.g. North America), about a particular subject (e.g. basic sciences) or of a particular type (e.g. a review) is disproportionately represented and cited.
Homonymy, Synonymy | Many authors share the same name, or one article has many variants. Articles in which the author's name is misspelled will be missed.
Delays | Delays in registration of citations and delays in the peer-review and publication process are detrimental for journals of disciplines with long turn-over times.
Abridged referencing | Many journals restrict the number of articles that can be cited by authors. Important research may potentially be disregarded and not cited.
Miscellaneous | "Gift" authorships, self-citations and "flattery" citations
ǂ IF = Impact Factor

Table 2 Caveats with the use of h-index outlined by Hirsch [2,9]
# | Caveats

In a second paper by Hirsch in 2007, the h-index was shown to be a better predictor of future scientific performance and productivity when compared to other measures such as total number of citations, mean citations per paper and total number of papers [9]. Additionally, Hirsch argued that the h-index is fashioned in such a way that it predicts individual cumulative achievement better than other indices in a differential manner, as is the case for a paper with multiple coauthors with varying levels of seniority. In his view, indices such as Egghe's g index, Jin et al.'s AR index and Kosmulski's h(2) index are inferior because they indiscriminately credit a highly cited paper to all its co-authors [9]. In contrast to the championing of the h-index by Hirsch, Lehmann et al [16] reported the superiority of the mean number of citations per paper over the h-index [9]. Similarly, Honekopp et al [12] showed that the IF outperforms the h-index when predicting the future citations of an article. Another issue with the h-index that merits discussion is the inability of the metric to differentiate between scientists who publish but whose work goes "uncited" and researchers who publish little to begin with [6]. The same can be considered as an advantage of the h-index in that the authors with "practically" similar research profiles have similar h-indices [6].
Several journals have now evaluated the value of the h-index as a viable citation metric including Retrovirology [17,18], Journal of Neurosurgery [19][20][21], European Archives of Oto-Rhino-Laryngology [22], Academic Radiology [23], Breast Cancer Research and Treatment [24], Harvard Review of Psychiatry [25] and Urology [26]. These studies have shown that the h-index is a robust metric to evaluate scientific research in these disciplines.
The h-index has been described as being inherently biased towards more seasoned researchers with long-spanning career trajectories [27,28]. It appears to favor researchers who publish in fields with high citation frequencies [1]. Gender also appears to have an impact on the h-index via its influence on productivity [27,28]. For developing countries, the additional problem posed in the computation of the h-index is that many authors publish heavily in local and national journals which may not be tracked by or indexed in common scientific search engines or databases [29]. This may distort the calculation of the h-index for these researchers as these local journals may have very limited "citability" and circulation [29]. There is thus a need to make the journals in developing countries more streamlined [5]. Until then, it seems unfair to compare the scientific achievements of a scientist based in a developing country with those of a scientist in the developed world solely by the use of the h-index [5].
Although the h-index is considered to perform well when comparing scientists within the same scientific category, there are exceptions to the rule, particularly when subcategories exist within a category [5]. Like the IF, this index also does not consider books, book chapters or conference proceedings in its computation [5]. The h-index is also not sensitive to changes in academic performance: because it is cumulative, it can never decrease [30].
An additional measure, the e-index, has also been introduced more recently to provide more information about a researcher's published works [31,32]. Dodson believed that the h-index underestimates the actual number of citations by as much as 50% [31,33]. The e-index refers to the "further impact" that scientists have over and above the h-index. This is the "excess" citations that a researcher has in addition to those that have been used to compute the h-index [31,32]. To counter the effect of the author's age on the h-index, an m-score has been proposed which is calculated as "the h-index divided by the number of years since the first publication" [6,28]. In addition, a "mentoring index" has also been suggested [18] to give due and weighted credit to both senior and junior scientists who are the mentors and mentees respectively.
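Both complementary measures can be sketched in a few lines. The m-score below follows the definition quoted above. For the e-index, the text only describes "excess" citations, so this sketch assumes the usual formulation in which e is the square root of the citations received by the h most cited papers beyond the h × h needed to reach the h-index. That formula, the function names and the sample data are assumptions for illustration.

```python
import math


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites < rank:
            break
        h = rank
    return h


def e_index(citations):
    """Square root of the excess citations in the h most cited papers (assumed formula)."""
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    # Citations in the top h papers beyond the h*h that the h-index accounts for.
    excess = sum(ranked[:h]) - h * h
    return math.sqrt(excess)


def m_score(citations, first_publication_year, current_year):
    """h-index divided by the number of years since the first publication."""
    years = current_year - first_publication_year
    if years <= 0:
        raise ValueError("current_year must be later than first_publication_year")
    return h_index(citations) / years


# Hypothetical author: seven papers, first publication in 2010, assessed in 2016.
papers = [25, 8, 5, 3, 3, 1, 0]
print(round(e_index(papers), 3))    # 5.385
print(m_score(papers, 2010, 2016))  # 0.5
```

Here the h-index of 3 uses only 9 of the 38 citations held by the top three papers. The remaining 29 are the excess that the e-index reports, while the m-score spreads the same h-index over a six-year career.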

Future directions
There is a need to find a balance between the quantitative and qualitative aspects of bibliometric indices. Additionally, there is a need to recognize the importance of expert opinion and review panels in deciding the scientific value of an article in addition to the numeric value assigned through a bibliometric index [34]. Despite its myriad advantages, a single index cannot, for all realistic purposes, be the ultimate authority on the scientific integrity and quality of scientific research [8]. More and more importance is being attached to bibliometric indices now and we need a modus operandi that is fair, honest, sophisticated, up-to-date, systematic and effective [1,10].
The h-index is certainly a step in the right direction. However, the verdict on the superiority of the h-index over other metrics is not unanimous at the moment, although positive trends can be well appreciated. This disputation invites further studies which may be best directed at the evaluation of the original h-index instead of derivation of further variants that will only add to the complexity of the problem [35]. Such studies should also provide a measure of concentration like the Gini coefficient or the Herfindahl index [35].
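As an illustration of what such concentration measures capture, the following minimal sketch computes a Gini coefficient and a Herfindahl index over an author's per-paper citation counts. Applying them at the level of individual papers is an assumption made here for illustration, since the text does not specify the unit of analysis, and the function names and data are hypothetical.

```python
def herfindahl(citations):
    """Sum of squared citation shares: 1/n if spread evenly, 1.0 if one paper holds all."""
    total = sum(citations)
    if total <= 0:
        raise ValueError("at least one citation is required")
    return sum((c / total) ** 2 for c in citations)


def gini(citations):
    """Gini coefficient of the citation distribution: 0.0 means perfectly even."""
    total = sum(citations)
    if total <= 0:
        raise ValueError("at least one citation is required")
    ranked = sorted(citations)  # ascending order
    n = len(ranked)
    # Rank-weighted sum over the ascending ranking.
    weighted = sum(rank * c for rank, c in enumerate(ranked, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Two hypothetical authors, each with four papers and 40 citations in total.
even = [10, 10, 10, 10]
skewed = [0, 0, 0, 40]
print(gini(even), herfindahl(even))      # 0.0 0.25
print(gini(skewed), herfindahl(skewed))  # 0.75 1.0
```

The two hypothetical authors have the same total citations but h-indices of 4 and 1. The concentration measures make explicit that the second profile rests on a single paper.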
The importance of evaluating and rectifying flaws in current citation practices also needs to be highlighted. In this vein, journals can adopt the practice of random citation auditing [36]. Also, current bibliometric indices such as the h-index attach weight to frequency of citation. However, some authors have proposed that high readership be taken into account instead to gauge the true impact of a research paper [18].

Conclusions
When compared to similar bibliometric indices, the h-index undoubtedly represents an appealing prospect. It has quickly emerged to become a powerful nucleus in the universe of citation metrics that has captured the attention and fascination of researchers and scientists worldwide [37]. It is also being considered as a sustainable alternative to the IF. Although not the quintessential or gold-standard citation metric, the h-index still represents one of the most holistic and realistic indicators of a researcher's credentials. It attempts to accord importance to both the quantity (number of publications) and quality (number of citations) of scientific research [38].
However, at the same time, we need to be mindful that like any other citation metric that may be excessively relied upon in this climate of financial uncertainty [31] and increasing competition [5], it can engender a vicious cycle of "publish or perish", "adapt or perish", "citation game-playing" or "citation coalitions and networks" [5,36,39,40] while also quelling creative bravado [41]. It must, therefore, be interpreted cautiously against the backdrop of all of its limitations [27,37]. At the end of the day, the h-index essentially remains the corollary of complex statistical jargon. It remains a matter of intense debate whether scientific talent can truly be quantified by a numerical entity such as the h-index [41].