Impact of Self-Citations on Impact Factor: A Study Across Disciplines, Countries and Continents

Purpose. The present study is an attempt to find out the impact of self-citations on Impact Factor (IF) across disciplines. The study examines the number of research articles published across 27 major subject fields covered by SCImago, encompassing as many as 310 sub-disciplines. The study evaluates aspects like the percentage of self-citations in each discipline, the leading self-citing countries and continents, and the impact of self-citation on their IF. Scope. The study is global in nature, as it evaluates the trend of self-citation and its impact on the IF of all the major subject disciplines of the world, along with countries and continents. IF has been calculated for the year 2012 by analyzing the articles published during the years 2010 and 2011. Methodology/Approach. The study is empirical in nature; as such, statistical and mathematical tools and techniques have been employed to work out the distribution across disciplines. The evaluation has been undertaken purely on secondary data retrieved from SCImago Journal and Country Ranking. Findings. Self-citations play a very significant part in inflating IF. All the subject fields under study are influenced by the practice of self-citation, with self-citation percentages ranging from 33.14% to 52.38%. Compared to the social sciences and the humanities, subject fields falling under the purview of pure and applied sciences have a higher number of self-citations but a far lower percentage of them. Upon excluding self-citations, a substantial change was observed in the IF of the subject fields under study, as 18 (66.66%) out of 27 subject fields saw their rankings shift. Variation in rankings based on IF with and without self-citation was observed at subject level, country level, and continental level.


INTRODUCTION
Citation analysis is a subject of significant interest among research scholars across the globe, largely because it is one of the oldest methods of gauging the impact of a research article. Impact Factor (IF) and the Hirsch Index (h-index) are two parameters employed to judge the quality of research articles; both are computed from the number of citations received by articles and are then used to derive the IF or h-index of a journal or an author. Researchers mostly prefer to publish their results in journals with a higher IF. The higher the Impact Factor of a journal, the higher the rate of rejection of articles submitted to it, and thus the better the presumed quality of the articles it publishes.
Metrics like IF and h-index, computed for authors, journals, and so on, are based purely on the citations received by research publications. Given this, it becomes imperative to know how far self-citations play a part in inflating the IF or h-index of journals, or for that matter of authors or countries. In the present study, an attempt has been made to work out the impact of self-citation on IF across disciplines, countries, and continents. The focus is on assessing IF and examining the overall percentage difference in IF with and without self-citation.

Impact Factor
IF is calculated on the basis of citations received by a research publication. The concept was first suggested by Gross and Gross in 1927, who said that counting references can be used to rank scientific journals, but it was Eugene Garfield who in 1955 was instrumental in devising the formula to compute the impact of reputed scientific journals on the basis of citations received by the publications in these journals. Garfield suggested that Impact Factor can be calculated for any given period of time, but to evaluate the current impact of publications, he suggested computing IF from the publications of the two years preceding the year for which the Impact Factor is to be calculated. As per the Web of Knowledge, Journal Impact Factor (JIF) is "the average number of times articles from the journal published in the past two years have been cited in the Journal Citation Report (JCR) year." Garfield was also of the view that current awareness services (CAS) generally play a role in research; as such, research publications catch the attention of potential researchers within a span of two years. Over time the concept of Impact Factor has been extended to various other fields, such as computing the IF of authors, countries, and subject disciplines, and of late the web-IF has started gaining popularity in research circles.
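The two-year definition quoted above can be illustrated with a minimal Python sketch. The function name and the sample figures are hypothetical and serve only to show the arithmetic: citations received in the target year by items from the two preceding years, divided by the number of items published in those two years.

```python
def two_year_impact_factor(citations_in_year, items_prev_year, items_two_years_ago):
    """Two-year Impact Factor for a target year Y.

    citations_in_year: citations received in Y by items published in Y-1 and Y-2
    items_prev_year, items_two_years_ago: items published in Y-1 and in Y-2
    """
    citable_items = items_prev_year + items_two_years_ago
    if citable_items == 0:
        raise ValueError("no items published in the two preceding years")
    return citations_in_year / citable_items


# Hypothetical journal: 150 citations in 2012 to its 60 + 40 items from 2011 and 2010.
print(two_year_impact_factor(150, 60, 40))
```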

Self-Citation
As per the Thomson Reuters Journal Citation Reports (JCR), "A self-citation is a reference to an article from the same journal. Self-citations can make up a significant portion of the citations a journal gives and receives each year." In simpler terms, a journal self-citation is a practice whereby a research work published in a particular journal includes references to research works published previously in the same journal. Farrara and Romero (2013) in their study defined self-citation as follows: "A self-citation $C^{S}_{P \to Q}$ is any citation appearing in a paper $P$ pointing to paper $Q$, whose set of authors are respectively $A[P]$ and $A[Q]$, for which it holds true: $A[P] \cap A[Q] \neq \emptyset$, i.e., the intersection of the sets of authors is not empty."
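The author-based definition amounts to a set-intersection test. The following Python sketch shows one way to express it; the function name and the author names are hypothetical, and it assumes that author names have already been normalized so that the same person is always spelled the same way.

```python
def is_author_self_citation(citing_authors, cited_authors):
    """Return True when the author sets of papers P and Q share at least one member.

    Assumes names are already disambiguated; real data would need author
    identifiers or name normalization before this comparison.
    """
    return not set(citing_authors).isdisjoint(cited_authors)


# Hypothetical author lists for a citing paper P and a cited paper Q.
paper_p = ["A. Author", "B. Writer"]
paper_q = ["B. Writer", "C. Scholar"]
print(is_author_self_citation(paper_p, paper_q))
```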

OBJECTIVES OF THE STUDY
• To understand and examine the trend of self-citation across disciplines and its impact on Impact Factor across disciplines, countries, and continents.
• To reflect on the self-citations of the world's top thirty cited countries of the subjects under study and to determine the variation in their Impact Factor by excluding self-citations.

LITERATURE REVIEW
Some of the earlier studies undertaken in the field of self-citations relevant to the present study have been reviewed hereunder. Tagliacozzo (1977) in his study concerning the practice of self-citation in scientific literature undertook an analysis of 180 research articles and observed that nearly 17% of citations in Plant Physiology and Neurobiology are self-citations. Similarly, Bonzi and Snyder (1990) in a similar kind of study analyzed 120 publications and found an average of 11% self-citations across disciplines, which varied from 16% to 3% among physical and social sciences. These studies indicate the fact that there is nothing new about the practice of self-citation and it has been there for quite long now. Besides, the trend of self-citation varies considerably from subject to subject and as of late the practice has assumed a much larger shape.
Nederh et al. (1993), Moed and Velde (1993) and Leeuwen, Rinia, and Van Raan (1996), in their respective studies concerning the trend of self-citation, found that during the period 1985 to 1994, 29% self-citations were recorded in physics and chemistry each and this percentage was higher during the first year of publications. Van Raan (1998) in his study upheld the practice of self-citations by saying that they cannot be neglected for the fact that such practices allow us to perform corrections. Factually, self-citing an earlier work in a new work eliminates the chances of any missing link in the research work.
White (2001) suggested that there is no need to get carried away by the practice of self-citations as these can be easily identified and can be easily excluded while computing the JIF, as they have a potential effect on the JIF. Accordingly, editors have grown conscious of the growing awareness among authors towards publishing their research results in high IF journals and as such have begun to manipulate the IF of their journals (Jennings, 2001). These studies are a clear indication of the fact that self-citations play a very important role in inflating the IF of journals, or for that matter of individuals.
Hyland (2003) in his study has opined that self-citation accentuates the expertise of a researcher in any given field of activity and perpetuates one's credibility and interpretation of such specific research findings. Surely, the importance of self-citations by no means can be undermined as these citations redirect us to earlier sources over which the present work has been built up to a larger extent, along with authors' or individuals' earlier understandings and contributions in the given field.

Neuberger and Couinsell (2002) and Sevinc (2004), in their respective studies, have reported instances whereby manuscripts submitted for publication were returned to authors with remarks by the editors asking them to add references to relevant articles previously published in that very journal.
All this suggests that journal self-citation is often practiced intentionally, with editors and publishers manipulating the JIF to suit their interests. Gami et al. (2004) raised concerns and criticized the IF as a metric because it is influenced by the practice of self-citation. Kaltenborn and Kuhn (2004) are of the view that there is a growing consciousness among publishers and editors of journals about the importance of Journal Impact Factor (JIF), as researchers prefer to publish their research results in journals which have a higher IF. There is no denying that editors want their journals to have a high Impact Factor, firstly to attract authors and also to present their publication as superior in quality to others. Anseel, Duyck, and Baene (2004), while assessing the impact of self-citations on psychology journals, found that upon adjusting for self-citations, the Impact Factor of journals with higher IF decreased by around 15%, whereas for mid and low IF psychology journals the IF declined by 35% and 45%, respectively.
Frandsen (2007), in a study of journal self-citations and JIF mechanisms, analyzed the impact of self-citations in 32 economics journals and observed that the JIF increases as self-citation increases. Frandsen is also of the view that the self-citing rate and self-cited rate are positively related. Campanario (2011) undertook a study on the impact of citations in general and self-citations in particular on the Impact Factor of journals on a yearly basis. Campanario analyzed 40 different journals for the period 1998 to 2007 and found that self-citations on the whole resulted in a 54% increase and a 42% decrease in the journal Impact Factor. King et al. (2013), in their study of gender and self-citation, analyzed over 1.6 million JSTOR articles published after the 1950s and found that, compared to women, men cite their own earlier research work at a higher rate and that 10% of cited articles are self-cited. The authors also observed that the gender gap in self-citation has increased over the past 50 years, despite a fair number of women joining academia. Of the total references studied, the authors found that 9.4% are self-citations, with molecular biology having the highest rate of self-citation and classical studies the lowest.
Farrara and Romero (2013) worked on the discounted h-index (dh-index) to present a method for mitigating the impact of self-citations when evaluating the impact of journals and authors. In their experiment the authors observed a decrease of 3% to 23% in author h-index and a decrease of 2.5% to 22% in journal h-index. Since both the IF and the h-index are computed on the basis of citations received by a research article, a discounted IF can likewise be worked out for both authors and journals, and this has been computed in the analysis part of the present study.
Most of the research studies reviewed concerning the field of self-citations and their impact on IF have revealed that self-citations play a very significant role in inflating IF.

METHODOLOGY
This study has been undertaken on secondary data retrieved from SCImago Journal and Country Rankings on June 21, 2014, accessible at http://www.scimagojr.com/index.php. The data upon retrieval was in semi-structured form; as such, it was first structured in the desired form given the objectives of the study. The data was structured after retrieving it on a yearly basis for each individual subject, country, and continent. To analyze the data for a continent, each contributing country was assigned to its respective continent by using the world atlas. Basic operations such as percentage, division, and subtraction were used for the computations. Impact Factor has been calculated by adopting the method devised by Garfield, and accordingly the rankings in each table referenced below are based on the Impact Factor. The higher the Impact Factor of a subject discipline, country, or continent, the higher is its ranking. Accordingly, R1 is the ranking calculated with self-citations and R2 is the revised ranking, calculated without self-citations. The difference in IF with and without self-citation is reflected in R2, which in turn gives insight into the impact of self-citations on the Impact Factor.
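The two rankings can be sketched in Python as follows. This is only an illustration of the idea under stated assumptions: the record layout, the function name `rank_by_if`, and the three entities A, B, and C with their figures are all hypothetical and are not taken from the study's tables.

```python
def rank_by_if(records, exclude_self=False):
    """Rank entities by IF (citations / publications), highest IF first.

    With exclude_self=True, self-citations are subtracted before dividing,
    which corresponds to the revised ranking without self-citations.
    """
    def score(record):
        citations = record["tc"] - record["sc"] if exclude_self else record["tc"]
        return citations / record["pubs"]

    ordered = sorted(records, key=score, reverse=True)
    return {record["name"]: position for position, record in enumerate(ordered, start=1)}


# Hypothetical entities: tc = total citations, sc = self-citations, pubs = publications.
toy_records = [
    {"name": "A", "tc": 100, "sc": 60, "pubs": 100},
    {"name": "B", "tc": 90, "sc": 20, "pubs": 100},
    {"name": "C", "tc": 50, "sc": 5, "pubs": 100},
]

r1 = rank_by_if(toy_records)                     # ranking with self-citations
r2 = rank_by_if(toy_records, exclude_self=True)  # ranking without self-citations
print(r1, r2)
```

In this toy data, entity A leads the first ranking but falls to last place in the second, because most of its citations are self-citations.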

Analysis Approach & Tool
The Impact Factor for each individual subject field, country, and continent has been computed from the articles published during the years 2010 and 2011 and the citations received during the year 2012.

Tool
The tool has been applied in all the three tables as per the scheme of things worked out.

RESULTS
The computations of the present study are based on Impact Factor and involve no complex mathematical calculations. The percentages have not been rounded off and hence may show slight variation when summed to 100%. In the above tabulations, the IF of the subject fields under study has been computed with and without self-citations to draw their R1 and R2 rankings based on IF, as per the proposed research tool. At the global level, a total of 1,889,483 citations were received by all the 27 major subject disciplines under study, of which 749,135 (39.64%) are self-citations, constituting more than one third of total citations received. The Multidisciplinary subject field emerged as the field with the highest Impact Factor (0.814), followed by Physics & Astronomy and then Biochemistry, Genetics, & Molecular Biology with IFs of 0.486 and 0.469, respectively. The Arts & Humanities subject field stands at the bottom of the table with the lowest IF (0.054), and the percentage of self-citations varies considerably from one subject field to another.
Arts & Humanities has the highest percentage of self-citations (52.38%), followed by Psychology and Engineering with 51.34% and 50.94%, respectively. Physics & Astronomy has the lowest (33.14%), followed by Multidisciplinary subjects and Medicine with 33.27% and 33.38%, respectively. The percentage of self-citation is drawn in proportion to the total citations received.
Interestingly, Medicine, despite having the third lowest self-citation percentage, has the maximum number of self-citations to its credit, with a 17.5% share (131,147) of all self-citations. Medicine is followed by Biochemistry, Genetics, & Molecular Biology and then Physics & Astronomy, with overall self-citation shares of 12.39% (92,836) and 9.70% (72,718), respectively.
It should be understood that a greater number of self-citations does not necessarily mean a greater percentage of self-citations, and vice versa. But if a subject field has received a greater number of self-citations, that number has a greater impact on its IF. The Impact Factor of a subject field with and without self-citations shows a considerable change: at the gross global level, a 39.67% decline in Impact Factor was recorded for all the subject fields taken together.
Of the total citations received by all the subject fields under study, Medicine receives the largest number and share (392,829, 20.79%), followed by Biochemistry, Genetics, & Molecular Biology (256,761, 13.58%) and Physics & Astronomy (219,376, 11.61%). Fourteen subject fields have citation shares between 1.02% and 7.19%, whereas 10 subject fields have each received less than a 1% share of total citations, with Dentistry at the minimum (2,978, 0.15%).
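The distinction between a field's self-citation percentage and its share of all self-citations can be checked with a short Python sketch. The counts are the ones quoted in the text for three subject fields; the function and variable names are illustrative assumptions only.

```python
GLOBAL_SELF_CITATIONS = 749_135  # self-citations across all 27 subject fields

# (total citations, self-citations) as quoted in the text
FIELDS = {
    "Medicine": (392_829, 131_147),
    "Biochemistry, Genetics, & Molecular Biology": (256_761, 92_836),
    "Physics & Astronomy": (219_376, 72_718),
}


def self_citation_rate(total_citations, self_citations):
    """Self-citations as a percentage of the field's own citations."""
    return self_citations / total_citations * 100


def share_of_all_self_citations(self_citations, global_self_citations=GLOBAL_SELF_CITATIONS):
    """The field's self-citations as a percentage of all self-citations."""
    return self_citations / global_self_citations * 100


for name, (tc, sc) in FIELDS.items():
    rate = self_citation_rate(tc, sc)
    share = share_of_all_self_citations(sc)
    print(f"{name}: rate {rate:.2f}%, share {share:.2f}%")
```

Medicine illustrates the point: roughly one third of its own citations are self-citations, a low rate, yet it contributes the largest single share of all self-citations because of its sheer volume.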
In terms of overall publications share percentage, it
The ranking of the subject fields has shown considerable variation: out of 27 subject disciplines, 18 (66.66%) moved to either a lower or a higher rank.
The subject fields which slumped in their ranking include Neuroscience; Earth & Planetary Sciences; Environmental Sciences; Chemical Engineering; Psychology; Computer Sciences; Engineering; and Social Sciences. Those which improved in their ranking include Immunology & Microbiology; Medicine; Pharmacology, Toxicology & Pharmacy; Health Professions; Nursing; Decision Sciences; Dentistry; Economics, Econometrics & Finance; Veterinary Sciences; and Business Management and Accounting. In the remaining 9 (33.33%) subject fields, no change in ranking was observed.
The Impact Factor of the world's thirty leading cited countries during the year 2012 has been worked out accordingly. Upon computing the Impact Factor of countries, Switzerland, Denmark, and the Netherlands emerged as the countries with the maximum Impact Factor of 0.529, 0.429, and 0.471, respectively, while Russia stands at the bottom of the table with 0.275 IF.
Of the total citations received by countries at the individual level, China has the highest percentage of self-citations (59.15%), followed by the U.S. (56.92%), Iran (50.72%), India (46.03%), and Brazil (38.10%). Israel stands at the bottom of the table with a minimum of 20.68% self-citations.

Note: TC = Total Citations; SC = Self-citations; IF = Impact Factor; R1 = Ranking with self-citations; R2 = Ranking without self-citations.
There is an almost linear but substantial decline in the IF of the world's thirty leading cited countries once IF is computed without self-citations. The trend is equally reflected in Fig. 5, which shows the changes in rankings based on the IF of countries with and without self-citations. On the whole, 22 (73.33%) countries faced variation in their rankings, whereas no change was observed in the ranking of 8 (26.67%) countries. Countries which slipped in their rankings include Belgium, Finland, Norway, the United Kingdom, Germany, Australia, Italy, the United States, Japan, China, and Iran; the countries whose ranking improved include Austria, Israel, Canada, Portugal, Greece, South Korea, Taiwan, Brazil, Turkey, India, and the Russian Federation.
The United States receives the largest share of citations at the global level (445,265, 23.56%), followed by China (144,051, 7.62%) and the United Kingdom (135,967, 7.19%). Of the 30 tabulated countries, 16 have citation shares from 1.13% to 6.56%, and 11 have received less than 1% of citations each.
The United States leads the table with the largest global publications share (1,289,380, 20.83%), followed by China and the United Kingdom with shares of 12.02% (744,309) and 5.84% (361,457), respectively. Singapore figures at the bottom of the table with a publications share of 0.55% (34,499).
The scenario of continents is altogether different and has been computed by taking together the publications and citations of the countries falling under each continent. Europe leads the table of continents with an IF of 0.361, followed by Oceania and North America with IF scores of 0.360 and 0.342, respectively. Asia stands at the bottom of the table with an IF of 0.214. In terms of overall publications share, Europe again leads the table with 36.58% (2,264,069), followed by Asia and North America with 30.49% (1,887,098) and 24.90% (1,541,506), respectively. Africa stands at the bottom of the table with the lowest publications share of 1.97% (122,200).
The IF curves of the continents with and without self-citations show a considerable difference. 91.97% of the total global publications have come from Europe, Asia, and North America, whereas the remaining 8.03% have come from Oceania, South America, and Africa. In terms of variation in IF at the continental level, North America showed the maximum decline of 52.63% in Impact Factor, followed by Asia with 43.92%. The IF of Oceania and Europe declined by 32.77% and 30.74%, respectively, whereas Africa showed the minimum decline of 27.63%.
In terms of citations share at the global level, Europe has received the maximum of 43.33% (818,897), followed by North America and Asia with shares of 27.92% (527,696) and 21.41% (404,725), respectively. Of the total citations received by Africa, 27.42% are self-citations, the lowest percentage among the continents, followed by South America with 30.62% and Europe with 30.84%. North America has the highest percentage of self-citations at 52.50%.
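The continental figures can be recomputed from the citation and publication counts quoted in the text. The Python sketch below does this for Europe and North America, the two continents for which citations, publications, and the self-citation percentage are all stated. The data layout and function name are assumptions for illustration, and the IF without self-citations is derived here from the quoted percentage rather than from raw self-citation counts, which the text does not give.

```python
# Citations received in 2012, publications of 2010-2011, and the self-citation
# share of those citations, as quoted in the text.
CONTINENTS = {
    "Europe": {"citations": 818_897, "publications": 2_264_069, "self_share": 0.3084},
    "North America": {"citations": 527_696, "publications": 1_541_506, "self_share": 0.5250},
}


def if_with_and_without(entry):
    """Return (IF with self-citations, IF without self-citations)."""
    if_with = entry["citations"] / entry["publications"]
    # Removing self-citations scales the numerator by (1 - self_share).
    if_without = if_with * (1 - entry["self_share"])
    return if_with, if_without


for name, entry in CONTINENTS.items():
    with_sc, without_sc = if_with_and_without(entry)
    print(f"{name}: IF with = {with_sc:.4f}, IF without = {without_sc:.4f}")
```

The values with self-citations agree with the reported 0.361 and 0.342 once the study's practice of not rounding off figures is taken into account.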

SUMMARY OF FINDINGS AND DISCUSSION
The practice of self-citation is advocated so long as authors carry forward their earlier work or continue their studies from earlier stages. Self-citation of this kind is regarded as ethical, as it is purposive and justified; if, however, self-citation is carried out with the aim of inflating an individual's or a journal's IF, then surely the practice cannot be regarded as ethical. As highlighted in the introductory section, IF and h-index have become parameters for gauging the impact of a research article; as such, manipulating citations by self-citing one's own articles helps one to inflate both the IF and the h-index. It is evident from the findings that self-citations, at 39.64% of all citations, helped to inflate the IF of the 27 subjects under study from 0.184 to 0.305, an inflation of 65.76%. The impact was felt by 66.66% of subject fields, as 18 out of 27 faced variation in their rankings, with the variation ranging from 33.12% to 53.70%.
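The inflation and decline percentages describe the same gap between the two IF values, measured against different baselines. A minimal Python sketch, with illustrative function names, reproduces both from the global IF values of 0.184 and 0.305 stated above. Because the publication count is the same in both IF values, the exact decline should equal the self-citation share of citations; the small gap between 39.67% and 39.64% presumably arises from working with IF values given to three decimals, which is an assumption rather than something the study states.

```python
def percent_decline(if_with, if_without):
    """Drop in IF when self-citations are removed, relative to the IF with them."""
    return (if_with - if_without) / if_with * 100


def percent_inflation(if_with, if_without):
    """Gain in IF due to self-citations, relative to the IF without them."""
    return (if_with - if_without) / if_without * 100


IF_WITH, IF_WITHOUT = 0.305, 0.184  # global values quoted in the text

print(f"decline:   {percent_decline(IF_WITH, IF_WITHOUT):.2f}%")
print(f"inflation: {percent_inflation(IF_WITH, IF_WITHOUT):.2f}%")
```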
The self-citations received by the subject fields are almost in proportion to the total number of citations received by each field. Arts & Humanities recorded the highest percentage of self-citations (52.38%) during the period of study, whereas Physics & Astronomy recorded the lowest (33.14%). Psychology and Engineering are the other two leading self-citing subject fields, with 51.34% and 50.94% self-citations, respectively. Subject fields which have received a far greater number of self-citations but still show a lower percentage of self-citations include Multidisciplinary subjects (33.27%), Medicine (33.38%), Dentistry (35.35%), Biochemistry, Genetics, & Molecular Biology (36.15%), Immunology and Microbiology (36.25%), and Neuroscience (38.79%).
There is considerable variation in the percentage and number of self-citations which the various subject fields have received. This variation leaves enough scope for more advanced studies of what makes professionals from one science self-cite more than professionals from other sciences. Apparently, the reasons may include the following:
• As is apparent from Table 1 above, the amount of research undertaken in the fields of Social Sciences and Arts & Humanities is far less than that undertaken in the natural and pure sciences, and so is the proportion of citations received by the articles published in the respective sciences. The larger the proportion of research publications available for review, the lesser will be the proportion of self-citations, and vice versa.
• The amount of research undertaken in continents like Africa, Australia, and South America is far less than in Asia, Europe, and North America. For obvious reasons, the lower the rate of research carried out by a country or continent, the lower will be its percentage of self-citation. Accordingly, the higher the rate of research undertaken, the greater are the chances of having the highest percentage of self-citations.
Aksnes (2003), while studying over 45,000 scientific publications published in Norway between 1981 and 1996, found that 36% of the total citations are self-citations, a share with a direct bearing on the number of co-authors of publications.
Research work undertaken in co-authorship may show a greater percentage of self-citations because each co-author of a publication may self-cite that work in his or her next work, thereby increasing the chances of the publication receiving a higher number of self-citations; whereas, for obvious reasons, a single-author work has far fewer chances of receiving a significant number of self-citations. Aksnes also observed that poorly cited papers have a far greater percentage of self-citations than highly cited papers, and that there was no uniformity in self-citation practice across scientific disciplines.
The practice of regional or national self-citation by and large appears unintentional, but it is interesting that the countries which have generally recorded a higher number and percentage of self-citations are either developed countries or the fastest-developing countries. This leaves sufficient scope for advanced study to assess the reasons for greater self-citation in such countries, which is generally attributed to the better research done and the quality of literature available there. This is corroborated by the fact that China recorded 59.15% self-citations in its research publications during the period of study, followed by the U.S. (56.92%), Iran (50.72%), India (46.03%), Brazil (38.10%), Germany (37.53%), Japan (36.91%), Russia (36.73%), Poland (35.56%), Italy (35.03%), the United Kingdom (34.49%), Australia (33.59%), Spain (32.63%), Taiwan (32.06%), South Korea (31.75%), France (26.54%), and many more.
Arguments are also being made about the practice of journal self-citation, as to whether the practice is intentional or unintentional, and hence ethical or unethical. There is no denying that IF metrics for both authors and journals can be easily manipulated through self-citation, which makes such manipulation an unethical practice. Self-citation is nonetheless acceptable in academic circles and is termed ethical if it is undertaken for justified reasons. Researchers are of the view that journal impact is always a consideration among editors and that self-citation can prove a very handy tool.

Limitations
It was practically impossible to work out the citations received in 2012 by the articles published in 2010 and 2011 specifically; hence a uniform pattern was followed for all the subject disciplines under study by taking the total publications for the years 2010 and 2011 and the total citations received by these subjects in the year 2012, rather than only the citations to articles of the period under study, which is what the Impact Factor calculation would otherwise require. Besides, the exhaustiveness of the citation count can be termed a limiting factor. The moment indexing services add new journals to their databases the figures change, which has nevertheless been taken care of by stating the date of data retrieval.

CONCLUSION
It is quite evident from the analysis that self-citations play a very significant role in inflating the IF of subject disciplines, countries, and continents, and the same holds true for the IF scores of authors and journals. As reflected in the rankings with and without self-citations, there is a considerable decline in the Impact Factor of each individual subject field.
The practice of self-citation reflects a mix of positives and negatives: on the one hand, the importance of self-citations cannot be undermined where they are purposive; on the other hand, if self-citation is undertaken only to inflate one's IF or any other impact parameter, then there is surely a need to regulate such unethical practices. Given the evidence that the parameters set to judge the impact of a research publication are being manipulated, change is warranted, either by setting new parameters which may not be easily influenced or by doing away with those aspects which allow such manipulation.
However, researchers across the globe are of the view that editors of journals and reviewers can play a commendable part in curbing unnecessary and unwarranted self-citations by advising authors to remove those journal and author self-citations which are least required or have no relevance to the study. To this effect, Garfield and Welljams-Dorof (1992) opined that excessive self-citations are readily apparent and that they should be removed during the review process of articles.