Effect of Common Extraneous Citation Optimizing Factors on Journal Impact Indicators

The influence of a research journal is usually assessed in contemporary academia by the Journal Impact Factor (JIF) given in the Journal Citation Reports (JCR) published annually by Clarivate Analytics. JCR also provides the Journal Immediacy Index (JII), an additional citation parameter that indicates the current impact of journals. These citation-based measures are simple arithmetic means of raw citation counts to source publications. It has been argued and empirically tested that three major extraneous citation optimizing factors, i.e., Author self-citation (ASC), Journal self-citation (JSC) and Recitation (RC), can inflate these popular citation-based metrics. This study examines nineteen Scopus-indexed Library and Information Science (LIS) journals to understand the individual as well as unified effects of these three optimizing factors on three popular impact indicators, i.e., 2-year JIF, 3-year JIF and JII. It is found that ASC and JSC have noticeable effects on these impact indicators. Further, it is observed that these impact indicators exhibit very poor correlation among themselves when their values are derived from raw citation counts, though all of them express simple arithmetic mean values. However, modified impact indicators, calculated after excluding citations due to these optimizing factors, exhibit moderate to strong correlation among themselves. It is therefore concluded that a more refined method, one that automatically excludes the effect of these optimizing factors in the derivation of the indicators, may be needed for fair assessment of a journal’s relative impact in scholarly communication.


INTRODUCTION
Evaluation of journals is essential in the contemporary world of formal scholarly communication to delineate a journal's role in the scholarly communication process and to understand its importance in reporting novel ideas and significant findings in a given discipline. In today's academia, journals are preferably assessed by the citation outcomes of the articles published in them. Among citation-based journal evaluation systems, the journal 'Impact Factor' (JIF) devised by Garfield and Sher [1] is undoubtedly the most popular measure and has been widely used for various research administrative purposes, despite the DORA declaration [2], issued almost a decade ago, calling for JIF to be discarded in research evaluation. In practice, the journal impact factor is used in a wide range of decision-making processes by different stakeholders in academia. For instance, scholars competing for publication space and recognition among peers use it for their manuscript submission decisions; librarians use it for serial collection management decisions; editors use the number to improve the quality of their journals, while publishers use it for branding their journals to attract high-quality/citable papers; and academic administrators use it for assessing faculty performance regarding promotion, recruitment, tenure renewal, etc. However, JIF was initially devised for selecting important scientific and technical journals for the 'Science Citation Index' (SCI). [3,4] During his initial years of research for developing SCI, Garfield found that the citation world was inherently skewed and that the majority of citations were received by a small group of journals in a given research field. He therefore used the analogy of a comet to describe the citation world as "the nucleus representing the core journals of a literature and the debris and gas molecules of the tail of the comet representing additional journals that sometimes publish material relevant to the subject". 
[5] Therefore, he sought to devise a size-independent metric to select core journals in a given research field, which culminated in the formulation of the journal 'Impact Factor'. The annual SCI Journal Citation Reports (JCR), which provide JIF based on SCI citation data, were officially launched in 1975. JCR also provides other citation-based information, such as the Journal Immediacy Index (JII), along with JIF. Like JIF, JII is the ratio of the number of citations to the number of citable source items published in a journal, with the difference that both the published items and the citations received by those items are counted for the journal's publishing calendar year. JII is commonly used to understand how rapidly a journal's published items attract the attention of peers in terms of citations. Research administrators and policy makers were quick to recognize the practical utility of citation-based indicators in objective research assessment exercises amidst a rising accountability culture in academia and a societal emphasis on "value for money" in demonstrating the social impact of publicly funded research. In such a bizarre academic environment, the popularity of JIF quickly rose due to its ready availability, timeliness and conceptual simplicity among the alternative citation-based indicators. The use of JIF has gradually become so endemic that even the quality of a research paper is judged by the JIF value of the journal in which it appeared, especially in large-scale research evaluations like the 'Research Excellence Framework' (REF) of the United Kingdom, where huge numbers of papers need to be graded. [6] However, the citations received by an article do not depend upon the intrinsic quality of the article alone. [7] The visibility/exposure of the journal in which an article is published also plays an important role in the citation outcomes of the article. 
[8][9][10] Scientists often cite material to which they are readily exposed, [11] and such papers usually appear at the top of search engine results and often come from high impact journals, as these journals have greater exposure, search engine optimization strategies, etc. [12,13] Garfield [5] himself pointed out that author citation might be influenced by extraneous factors like the visibility, prestige and accessibility of the cited journals. Studies have shown that the more 'Abstracting and Indexing' (A&I) services cover a journal, the higher its probability of being cited, since wider indexing means a wider pool of potential readers. [14][15][16] Papers that are more widely distributed, both in print and online, are likely to become better known and thus have a higher likelihood of being cited. Accordingly, open access articles tend to attract more citations than their closed counterparts. [17][18][19] It has also been found that the availability of an article from multiple open access sources has a positive impact on its citation counts. [20] Hence, alternative web-based approaches to assessing the impact of research articles, going beyond traditional citation-based measures, have gained momentum. It is argued that in the age of netizens, when online venues for scholarly communication are on the rise, 'Altmetrics' measurements such as how many times a research article has been downloaded, viewed, bookmarked or shared on the social web may be useful parameters for assessing the social impact of a paper. [21][22][23] Major reputed publishers like PLOS, Elsevier, Springer-Nature, ACM, etc., already provide article-level usage data. 'Altmetrics' data aggregator websites like Impactstory.com, Altmetrics.com, Plum-X, etc., also provide article-level usage metrics, but these usage data are highly susceptible to manipulation. 
Article-level usage data provided by publishers or 'Altmetrics' aggregators also lack consistency, provenance and verifiability. [24] Altmetrics is thus in an evolving phase and has a long way to go before being seriously considered a reliable tool for the systematic evaluation of research. Therefore, impact indicators based on citation counts are still considered more reliable than 'Altmetrics' scores for evaluating the importance of scientific works.
However, recent investigations of academic publishing have indicated that heightened pressure on researchers and academia has accelerated untoward research/citation practices at an alarming rate. [25,26] As citation-based measures have become the norm for assessing the impact of research publications, it is often observed that the focus of research has shifted significantly from high-quality, ethical and relevant research to producing papers that can attain targeted citation metrics. Thus, the rise of questionable research practices perhaps illustrates Goodhart's law at work in academic publishing, commonly stated as 'when a measure becomes a target, it ceases to be a good measure'. [27] Empirical evidence from a large number of studies indicates that several factors can influence the citation outcomes of a journal. [7,28,29] Some of these factors are inherent to citation-based measures, like the skewness of citation distributions; others are external factors that can be optimized/manipulated to a large extent, like Author self-citation (ASC), Journal self-citation (JSC) and Recitation (RC). [30,31] Among these optimizing factors, JSC is the easiest to track, and excessive JSC often causes a journal to be denied a JIF value for two successive years in the JCR given by Web of Science (WoS, the web version of the expanded SCI) of Clarivate Analytics. This process was started in 2007 by Thomson Reuters (the then publisher of WoS and JCR). As JIF is the most influential metric in determining the academic prestige of a journal, the temptation "to play the system" can be high. [32] So, editors and publishers of journals are sometimes caught up in the tide of citation optimization to leverage the JIF value of their journals. However, not all high JSCs indicate unfair practices by the editors of the associated journals. 
Studies have reasoned that potentially legitimate mechanisms may also trigger an overrepresentation of journal self-citations in the citation oeuvre of a journal. [33,31] Several investigations, however, suggest that JSCs are often intentionally used as a pliable tool to optimize journal citation-based impact metrics. [34,35] Similarly, the role of ASC in citation-based impact metrics is a long-running debate that has been extensively discussed in the published literature. [29,30,31,36] Though ASC is a normal process in research communication, excessive ASC may inflate citation-based metrics. Generally, ASCs appear more frequently in the early years after publication and thus may affect JIF or JII to a great extent. [30,31,37,38] Further, it is argued that ASC serves as an advertisement of an author's previous works and thus has the potential to garner more citations from other researchers. [39,40] So, it is opined that excluding ASC from the total citations received by an article is not sufficient, and that an additional penalty may be imposed in calculating citation impact indicators. [39] Recitation (RC) is another pliable tool that may be exploited by researchers with extensive research connectivity to enhance the citation counts of their papers. [41] Several investigations have already indicated that authors tend to cite the works of authors with whom they are personally acquainted. [42] Cronin [43] contends that such citing behaviour is not at all surprising, as it strengthens bonds between the authors and can generate higher citations through reciprocal exchanges. Studies have indicated that recitations can inflate the citation counts of journals to a great extent. [30,31] Excessive use of any one of these pliable optimizing tools by stakeholders of research publications, such as authors or editors/publishers, can be detected with relative ease. However, an intelligent mix of these tools may take myriad forms, which renders it difficult to detect. 
For instance, the editorial policy of a journal may give additional emphasis to research papers submitted by established and prolific authors over equally or even slightly more meritorious papers submitted by newcomers, on the assumption that papers from established and prolific authors may attract more citations, both self-citations and foreign citations. Prolific and established authors who work on a topic and aim to publish their results as a series usually cite their own previous works to delineate the trajectory of their research. Thus, they have a higher probability of generating self-citations. As established authors are usually highly networked and have a good number of followers, their papers are likely to generate high recitations and other foreign citations. These will eventually increase the citation counts of journals. Thus, along with legitimate but coercive journal self-citations, a preference for publishing papers of established and prolific authors can boost a journal's citation potential. This kind of intelligent mix cannot be detected or contested.
Though the effects of JSCs and ASCs on citation-based impact measures have at times been studied individually, [33,35,44,45,46] surprisingly, the unified effect of these optimizing factors on the citation counts of a research entity is less explored. More recently, Giri [30] attempted to examine empirically the unified effects of these optimizing factors on the citation counts of journals, using a synchronous publication year and a diachronous citation window. However, popular citation-based metrics of journals like JIF or JII are based on a diachronous publication period (i.e., a publication period spanning multiple calendar years) and a synchronous citation window (i.e., a citation window of one specific calendar year).
Thus, the current study is a step towards a greater understanding of the effects of citation optimizing factors on popular citation-based journal impact measures. Specifically, this study seeks to answer the following research questions: (1) To what extent do the three optimizing factors, i.e., ASC, JSC and RC, exert their individual as well as unified influences on three widely used journal impact measures, viz., JII, 2-year JIF and 3-year JIF? (2) Does excluding these optimizing factors when computing these impact indicators reveal a better association among the indicators?
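The three indicators named in the research questions differ only in which publication years feed a single citation year. A minimal Python sketch of that arithmetic is given here for illustration; the function name `impact_indicator`, the window convention (0 for JII, 2 and 3 for the preceding publication years) and all counts are assumptions made for this example, not data or code from the study.

```python
def impact_indicator(items_by_year, cites_by_pub_year, citation_year, window):
    """Mean citations per item for one synchronous citation year.

    items_by_year     : {publication year: number of citable items}
    cites_by_pub_year : {publication year: citations received in citation_year}
    window            : 0 -> immediacy index (items of citation_year itself);
                        n -> items of the n years preceding citation_year.
    """
    if window == 0:
        years = [citation_year]
    else:
        years = [citation_year - k for k in range(1, window + 1)]
    items = sum(items_by_year.get(y, 0) for y in years)
    cites = sum(cites_by_pub_year.get(y, 0) for y in years)
    return cites / items if items else 0.0


# Hypothetical journal; citation counts are those received during 2014.
items = {2011: 40, 2012: 50, 2013: 50, 2014: 60}
cites_2014 = {2011: 30, 2012: 45, 2013: 35, 2014: 12}

jii = impact_indicator(items, cites_2014, 2014, window=0)   # 12 / 60
jif2 = impact_indicator(items, cites_2014, 2014, window=2)  # 80 / 100
jif3 = impact_indicator(items, cites_2014, 2014, window=3)  # 110 / 140
```

In every case the citation window is the single year 2014; only the set of publication years changes.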

METHODS
The analysis presented in this paper is based on nineteen journals under the subject category of 'Library and Information Science' (LIS) from the Scopus database. The journals are selected through a two-stage selection procedure. In the first stage, the following criteria are adopted to prepare an initial list of Scopus-indexed journals for this study: About 56 journals meet the above criteria. In the second stage, from the journals that qualified in the first stage, nineteen journals representing different ranks, from highest to lowest, are taken for this study. As citation rates differ widely among subjects, only one subject category is chosen to obtain more homogeneous citation data. The synchronous citation window of 2014 is used in this study. The bibliographic data of the articles published in these journals, along with their citations in the observed period, are extracted from the Scopus database. The data are then processed using a spreadsheet. Primary data collection was carried out from January 2017 to December 2018. The list of selected journals is given in Table 1.
Here, the following definitions are adopted for ASC and JSC.
When the intersection of the sets of authors of the citing and cited articles is not empty, the citation is called an ASC. When an article in a journal cites a previous article of the same journal, the citation is referred to here as a JSC. The definition and calculation of diachronic recitations put forward by Howard D. White [47] is used here for 'Recitation'. Disambiguation of authors' names is carried out using tools such as Scopus author ID, author profiles from Google Scholar and ORCID.
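The ASC and JSC definitions above translate directly into set and equality tests. The following Python sketch illustrates them under the assumption that author names have already been disambiguated into unique identifiers; the function names, record layout and sample records are hypothetical. Recitation is not sketched because its calculation follows White [47] and is not restated here.

```python
def is_author_self_citation(citing_author_ids, cited_author_ids):
    """ASC: the citing and cited articles share at least one author."""
    return not set(citing_author_ids).isdisjoint(cited_author_ids)


def is_journal_self_citation(citing_journal, cited_journal):
    """JSC: the citing article appears in the same journal as the cited one."""
    return citing_journal == cited_journal


# Hypothetical citation records (author IDs assumed already disambiguated).
citations = [
    {"citing_authors": {"A1", "A2"}, "cited_authors": {"A2", "A3"},
     "citing_journal": "J1", "cited_journal": "J1"},
    {"citing_authors": {"A4"}, "cited_authors": {"A2", "A3"},
     "citing_journal": "J1", "cited_journal": "J1"},
    {"citing_authors": {"A5"}, "cited_authors": {"A2", "A3"},
     "citing_journal": "J2", "cited_journal": "J1"},
]

# One (is_asc, is_jsc) pair per citation.
flags = [
    (is_author_self_citation(c["citing_authors"], c["cited_authors"]),
     is_journal_self_citation(c["citing_journal"], c["cited_journal"]))
    for c in citations
]
```

A single citation can carry both labels, as the first record does, which is why the combined effect of the factors has to be counted as a union rather than a sum.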

Journal Immediacy Index and Optimizing Factors
JII is widely regarded as a manifestation of the current impact of journals and is extensively used for retrieving popular papers from emerging areas of research in a given discipline. [44] Table 2 provides the citation distributions of the journals for studying JII for the year 2014.
From the data given in Table 2, it is observed that the contributions of ASC and JSC vary from 0 to 100%, whereas the contribution of RC ranges from 0 to 33.33%. As JII considers the synchronic citation window of the current year (i.e., the publishing year of the article), it is quite natural that recitation counts will be very low unless hyper-prolific authors are involved. The percentage figures for JSC show that six journals received more than 40% of their total citations as JSC. Among these six journals, all citations received by the journal JILDDER (i.e., a total of six citations) are JSC, and another journal, HILJ, received 29 of its 30 total citations as JSC. Krauss [48] studied the effect of JSC on JII for 107 JCR-ranked ecological and evolutionary journals and found that JSC contributed about 34% of total citations (TC) in most of these journals. Thus, the present results are broadly in conformity with the earlier findings. Investigation of ASC in the TC of these nineteen journals shows that nine journals received one third or more of their total citations as ASC. As citation counts signify the impact of research, optimization may arise from different actors in a research publication. Therefore, the union of the widely used pliable tools commonly counted as optimizing tools, viz., JSC, ASC and RC, denoted by CIF, may better reveal the combined influence of these factors on the TC of these journals. The percentage data for CIF in Table 2 show that fifteen journals (i.e., about 79% of the sample journals) received one third or more of their citations as CIF citations. As a high CIF share in the TC of a journal indicates low visibility of the journal, it may be argued that high CIF in JII is employed to attract more citations and perhaps for better search engine optimization, as demonstrated in several prior studies. 
[12,13,39] Hence, it may be inferred that the Journal Immediacy Index (JII) is predominantly made up of CIF citations, and thus raw JII fails to reflect a journal's actual visibility.
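Because CIF is described as the union of ASC, JSC and RC, a citation that falls under more than one factor is counted only once in the CIF total. The Python sketch below illustrates that counting; the function name `optimizing_shares`, the boolean record layout and the six sample citations are assumptions for illustration and do not reproduce Table 2.

```python
def optimizing_shares(citations):
    """Count and percentage share of each factor and of their union (CIF).

    citations: list of dicts with boolean keys 'asc', 'jsc' and 'rc'.
    """
    total = len(citations)
    counts = {
        "ASC": sum(c["asc"] for c in citations),
        "JSC": sum(c["jsc"] for c in citations),
        "RC": sum(c["rc"] for c in citations),
        # Union: a citation counts once even if several flags are set.
        "CIF": sum(c["asc"] or c["jsc"] or c["rc"] for c in citations),
    }
    shares = {k: (100.0 * v / total if total else 0.0)
              for k, v in counts.items()}
    return counts, shares


# Six hypothetical citations to one journal.
sample = [
    {"asc": True,  "jsc": True,  "rc": False},
    {"asc": False, "jsc": True,  "rc": False},
    {"asc": False, "jsc": False, "rc": False},
    {"asc": True,  "jsc": False, "rc": False},
    {"asc": False, "jsc": False, "rc": True},
    {"asc": False, "jsc": False, "rc": False},
]
counts, shares = optimizing_shares(sample)
```

In this sample the individual counts add up to five, while the union contains four citations, so summing the separate percentages would overstate the combined share.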
Further, the JII values and JII ranks for the journals studied are given in Table 3. HILJ has the highest JII rank, with a value of 0.909, while ALIS has the lowest rank, with a JII value of 0.114, before exclusion of CIF. However, exclusion of citations due to CIF causes an abrupt decline in JII value (reflected in the MJII values) for most of these journals. Three journals, with JII ranks 1, 4 and 18, show a 100% decrease in JII value after exclusion of citations due to CIF. Of the remaining sixteen journals, eleven show a decline in JII value of 40 percent or more.
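The modified index and its percentage decline follow from simple arithmetic once the CIF citations are removed from the numerator. The Python sketch below works through it; the function names are hypothetical, the item count of 33 for HILJ is inferred here from its 30 citations and its JII of 0.909 rather than stated in the text, and the second journal is invented for contrast.

```python
def modified_indicator(total_citations, cif_citations, items):
    """Indicator recomputed after removing citations attributed to CIF."""
    return (total_citations - cif_citations) / items


def percent_decline(raw, modified):
    """Relative drop from the raw to the modified indicator, in percent."""
    return 100.0 * (raw - modified) / raw if raw else 0.0


# HILJ: 30 total citations, all of them CIF (a 100% decline is reported for
# the rank-1 journal). The item count of 33 is an inferred assumption.
hilj_raw = 30 / 33
hilj_mod = modified_indicator(30, 30, 33)
hilj_drop = percent_decline(hilj_raw, hilj_mod)

# A purely hypothetical journal: 20 citations, 8 of them CIF, 50 items.
other_raw = 20 / 50
other_mod = modified_indicator(20, 8, 50)
other_drop = percent_decline(other_raw, other_mod)
```

Under these assumptions the first journal loses its entire immediacy index, while the second retains a value of 0.24 out of 0.4, a decline of about 40 percent.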
Journal of Scientometric Research, Vol 9, Issue 1, Jan-Apr 2020
Both Pearson correlation and Spearman rank correlation are computed to understand the correlation of JII values before and after exclusion of CIF. The coefficients of the Pearson correlation (0.260; two-tailed significance 0.283) and the Spearman rank correlation (0.227; two-tailed significance 0.350) demonstrate that raw JII and MJII are very poorly correlated. Though the face value of JII is widely used to understand the current impact of journal articles, it may not be judicious to use the JII value for any research evaluation purpose, as it may be gamed to a large extent, inflating its value by means of these pliable optimizing tools. These findings are broadly in line with the findings of earlier studies. [48,49] However, they deviate from the findings of Huang and Cathy Lin, [44] whose investigation of 20 environmental science journals revealed that the effect of JSC on JII was relatively small and that little difference existed between the immediacy indexes of the journals before and after exclusion of JSC. The probable reason for this deviation is that their study examined the effect of JSC alone on JII, whereas the present study emphasizes the unified effects of JSC, ASC and RC on JII.
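The two coefficients used above can be illustrated with a small dependency-free Python sketch: Pearson's r on the values themselves, and Spearman's coefficient as Pearson's r on average ranks. The function names and the sample vectors are hypothetical, the study's own computation and software are not described in the text, and significance values are not computed here.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical raw and modified indicator values for five journals.
raw_values = [0.90, 0.50, 0.40, 0.30, 0.10]
modified_values = [0.00, 0.30, 0.20, 0.25, 0.05]
r = pearson(raw_values, modified_values)
rho = spearman(raw_values, modified_values)
```

Spearman's coefficient depends only on rank order, so it is the more natural of the two when the interest lies in how journal rankings shift after exclusion of CIF.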
Optimising Factors and two-year and three-year JIF
Table 4 presents the citation data along with the contribution of optimising factors to the total citation counts of journals for calculating both 2-year JIF (JIF2) and 3-year JIF (JIF3). It is seen that the percentage of RC is higher in JIF2 and JIF3 than in JII. As the distance from the publishing year increases, the probability of a paper being noticed by prolific authors increases, which may lead to higher recitations. Also, the share of RC is expected to be low in the synchronous-year citation counts of a journal, as required in deriving its JII and JIF values, unless a significant number of highly prolific authors preferentially use the journal. This is why the share of RCs in JOI is found to be considerably higher than in the other journals in the present set.
The percentages of JSC given in Table 4 reveal that about six journals in the JIF2 calculation and about four journals in the JIF3 calculation have around twenty-five percent or more of their citations as JSC.
The comparative view of JSC in these two JIF indicators, in conjunction with JII, clearly demonstrates that the JSC percentage tends to decline with the passage of time. Therefore, it may be argued that JSC acts more as a self-promotional avenue for journals, and it may not be advisable to include JSC in journal impact assessment.
The contribution of ASC to the total citation counts of the journals is also quite significant, though its share is smaller than that of JSC. The share of ASC exceeds twenty-five percent of total citation counts in three journals in the JIF2 data and in only one journal in the JIF3 data.
The CIF percentages in Table 4 clearly demonstrate that these optimising factors, in their union, play a role in augmenting the JIF values of these journals, though these factors do not increase the societal impact of a research work in terms of reach and are thus unable to predict the actual visibility of a journal in research communication. It is observed that CIF citations contribute one third or more of total citations in seven journals (i.e., ~37% of the journals) in the JIF2 calculation and in six journals (i.e., ~32% of the journals) in the JIF3 calculation. Thus, the modified JIF values (i.e., MJIF2 or MJIF3), which exclude all citations due to CIF, show a substantial decline from the original JIF values. These results, however, differ to some extent from the findings of Alguliyev and Aliguliyev, [45] who advocated a modified impact factor for journals after excluding JSC only. The deviation perhaps arises from the difference in the number of optimising factors considered in deriving the modified impact factor.
Further, Table 5
Thus, it can safely be inferred from the above statistical viewpoints that the journal citation impact indicator derived after excluding citations due to these common optimizing factors may better reveal a journal's visibility or impact. A similar view is echoed by earlier studies. [36]

CONCLUSION
Ranking journals by two-year JIF value is common and widely used by different stakeholders of the journal publishing world for different purposes. JCR from Clarivate Analytics also includes the JII value along with the JIF value of a journal, whereas Scopus prefers the 3-year JIF value (termed CiteScore). The above study has demonstrated that all these indicators are, to a greater or lesser degree, likely to be affected by the common pliable optimizing tools (i.e., ASC, JSC and RC). It may, therefore, be inferred from these results that these popular citation-based indicators, if calculated after excluding citations due to these optimizing factors, may better reflect the performance of journals in a given discipline. As there are strong disagreements among researchers, academicians and scientometricians regarding the exclusion of citations arising from these optimizing factors, the newly proposed Active Visibility Index, [31] which automatically excludes all these optimizing factors in its derivation, may be a suitable alternative for fair assessment of journals' impact.
positions in one journal. These results thus raise an important question about the bias of these optimizing factors towards 2-year JIF, as it is the most prevalent indicator for assessing journals' impact.
However, as changes in rank order do not distinguish between large and small distances, a closer look at the changes is illustrated using scatter plots of these indicators with linear regression models. These linear models capture well the relationship among these three indicators before and after exclusion of citations due to CIF. Figure 1, along with the coefficients of determination given in Table 6, clearly depicts that relationship