Analysis of Global Research Trends in Coronaviruses: A Bibliometric Investigation

The virus behind COVID-19 (n-CoV) belongs to a large family of viruses known as ‘coronaviruses’ that cause respiratory and intestinal illness in animals and human beings. The present study attempts to understand the trends in global research on coronavirus-related diseases during the last seven decades and thus seeks to provide an informed assessment of research in this area. The major areas on which this research focused were ‘acute respiratory syndrome’ and the fusion and penetration process of the virus, studied through ‘gastroenteritis virus’ and ‘mouse hepatitis virus’ (MHV). The USA and China were the most productive countries; collaborative research in China was largely intra-national, whereas in the USA it tended to be multinational. This paper elaborates and illustrates some salient trends of research on coronavirus-related diseases in these two most prolific countries. Citation analysis reveals some interesting trends. Although an article received 27.76 citations on average, 10% of citations came from only the top 56 (0.34%) articles, which suggests that only a few articles received global attention. The weak collaboration links between highly cited authors also suggest that collaborative team research is not well established in this field. Research activity in this area can be traced back to the early 1950s. It is not surprising that more intense research is being undertaken now than before, when diseases caused by these viruses were more localised. This gives hope that well-directed research across different countries will provide new pathways for understanding coronavirus-generated diseases, including the present n-CoV, which is an essential prerequisite for developing measures to control coronavirus-associated disease and vaccines for its prevention.


INTRODUCTION
Stephen Morse of Columbia University defines an 'emerging infectious disease' as a 'disease that is rapidly increasing in incidence or geographic range, including previously unrecognized diseases as HIV/AIDS, Ebola hemorrhagic fever or Nipah virus'. The Institute of Medicine of the National Research Council in the US, in its 1992 report entitled Emerging Infections: Microbial Threats to Health in the United States, identified six factors influencing infectious diseases. [1] These are human demographic characteristics and behaviour; technology and industry; economic development and land use; international travel and commerce; microbial adaptation and change; and breakdown of public health measures. The list may not be exhaustive, but it broadly covers the key factors that influence infectious diseases.
According to the World Health Organization (WHO), "Coronaviruses are a large family of viruses that are known to cause illness ranging from the common cold to more severe diseases such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS)". [2] Although the Novel Coronavirus disease (COVID-19) is an infectious disease caused by a new coronavirus introduced to humans (WHO, 2020), the history of coronaviruses can be traced back to 1931. Earlier studies claimed that coronaviruses (CoV) were first reported in turkeys in the United States in 1951. The disease was known as "blue comb disease, mud fever, transmissible enteritis and coronaviral enteritis" [3] during that period. Another similar outbreak occurred in the winter of 1957, but it was caused by an influenza virus, not a coronavirus. In fact, no coronavirus that could infect humans was discovered until 1965. In that year, Tyrrell and Bynoe first isolated such a virus, obtained from the respiratory tract of an adult with a common cold, in human embryonic tracheal organ culture and named it B814. Two prominent human coronavirus diseases have been identified: the first is Middle East Respiratory Syndrome (MERS), which is largely prevalent in Saudi Arabia and among visitors to that area, and the second is Severe Acute Respiratory Syndrome (SARS), mainly prevalent in China and among travellers from there. [4] Although it was earlier identified that SARS-CoV and MERS-CoV first circulated in bats before being transmitted via intermediate hosts to humans, no evidence has yet been found that SARS-CoV-2 (COVID-19) follows the same route. [5] The new coronavirus disease, designated COVID-19 and caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), was first reported on 31 December 2019 from Wuhan city in Hubei province, China. The outbreak statistics of SARS-CoV-2 since 21 January 2020 are available on the WHO website.
As the data displayed on the Dashboard indicate, the growth of the coronavirus outbreak during its initial phase was negligible. In February, growth remained stable until 13 February 2020, when more than 15,000 cases were reported. From March onwards, however, the rate of spread was high throughout the world, with the highest number of cases, some fifty thousand, noted on 26 March 2020. On 31 March 2020, the outbreak was most acute in six countries: Italy, the United States, China, the United Kingdom, Spain and Germany. Figure 1 shows the global map of the COVID-19 outbreak as indicated by WHO. As Figure 1 makes clear, the virus had a huge impact across the globe, irrespective of whether the affected country was developed, developing, or under-developed.
The paper attempts to examine the extent of the global research done so far in the domain of coronaviruses and to reflect on where and why we are lagging behind. Is comprehensive scientific and technological information available from the recent past that deals with the dynamics and evolution of research on coronaviruses or any similar infectious disease?

Literature Review
Bibliometrics has become a heterogeneous field [6] for assessing national and international research; it evaluates research performance in order to identify future research priorities, funding sources and interdisciplinary collaboration. [7] It also provides a resource for policy-makers implementing necessary prophylactic measures [8] or disbursing aid for health issues and awarding research grants. [9] Although there exist some global bibliometric studies on various infectious diseases [10] including "Zika virus, [11] Ebola virus disease, [10] yellow fever disease, [12] dengue, [13] Malaria, [14] leishmaniasis [15] and influenza," [16] there is a lack of comprehensive study in the domain of coronaviruses.
A few earlier bibliometric studies have been conducted on the outbreaks of MERS and SARS. Chiu, Huang and Ho [17] conducted a bibliometric study using the Science Citation Index and found that thirty-two percent of the total share was published as new features, followed by 25% as editorial materials and 22% as articles. The USA dominated with 30% of the total share of publications, followed by Hong Kong (24%). Similarly, Zyoud [18] conducted a bibliometric study to analyse the global research trends of MERS-CoV using the Scopus database. The study found that a total of 883 MERS-CoV research publications were published by 92 countries/territories and that the USA was again the largest producer in that domain, with 319 articles published over four years, followed by the Kingdom of Saudi Arabia (113 articles). The publications received an average of 9.01 citations each and had an h-index of 48.
Bibliometric analysis has also been applied to the present pandemic, and a considerable body of literature has come out. Various insights emerging from these studies inform the present research. Bhattacharya and Singh [a] have pointed out that "the alarming spread of the disease, challenge to control its spread and its grave health consequences, lack of vaccine or effective drug among others have prompted researchers to actively work on various aspects of this disease". This has led to the accumulation of a huge volume of research in a short period of time. [19,20] Drawing on the Dimensions database, the study conducted by Bhattacharya and Singh [b] identified the most influential papers, the key knowledge base and the major topics in research related to COVID-19. An interesting aspect of their study was the attempt to capture society's perception by discerning key topics that are trending online. The study observes that China and the USA drive and direct this research globally and are also actively collaborating with other countries. In another study, Lie et al. [21] used PubMed and Embase to characterize the growth of medical literature on COVID-19 between January 1 and March 24, using an evidence map and bibliometric analysis to elicit cross-sectional and longitudinal trends and to find research gaps. The results show that early COVID-19 medical literature originated mainly from Asia and focussed on clinical features and diagnosis of the disease. Issues such as the pathophysiology of COVID-19 in different body systems and the use of artificial intelligence were absent. They also found that the median submission-to-publication duration was 8 days (interquartile range: 4-16). Chahrour, Assi and Bejjani [22] found that from December 2019 until March 18, 2020, 564 publications on COVID-19 were indexed in the PubMed database and the World Health Organization (WHO) database for publications. These articles were written by researchers from 39 different countries, constituting 24% of all affected countries. China produced the highest research output, with 377 publications (67%).
All these studies have primarily captured the trends and insights of research covering COVID-19. The present study, however, attempts to undertake a comprehensive bibliometric study of research covering coronavirus disease. The results of the study may be useful to address the knowledge gap in coronaviruses research and to produce a resource pool and thus accelerate study of clinical, epidemiological, diagnostic and therapeutic aspects of the present and future infectious diseases.
[a,b] https://arxiv.org/ftp/arxiv/papers/2004/2004.10878.pdf

OBJECTIVES
The current study was designed to examine the global research trend in 'Coronaviruses' publications in terms of (i) growth of literature; (ii) prominent sub-domains of coronavirus research; (iii) prominent organizations or countries engaged in the research and (iv) productivity of the most relevant journals.

METHODOLOGY
In order to evaluate the global research trend in coronaviruses research, a bibliometric analysis was performed using information indexed in the Web of Science (WoS) Core Collection and PubMed. The searches in both databases were conducted on the same day, 30 March 2020, to avoid the bias that may arise from database updates. All publications except errata were considered. Despite its known errors and limitations, WoS is considered one of the most reliable and comprehensive databases for bibliometric study, and it indexes a wide range of high-impact journals. [23] PubMed, on the other hand, is known worldwide for its wide coverage of medical science. [24] Globally relevant literature was searched in both databases without time limits. The search was restricted to titles in both databases, because the precision of a title-specific search is assumed to be much higher than that of a topic search. [25] The bibliometric records of the WoS output (13,329 publications) were downloaded as tab-delimited files, while the PubMed results (14,521 publications) were exported as a CSV file. The outputs of the two databases were then merged and, after eliminating duplicates, all unique records were kept in a separate file. A total of 16159 unique publications on human coronavirus (hereafter coronaviruses) were selected for final analysis. Visualization techniques were used to project the collaborative patterns that exist in the literature. The collaboration network was visualized using VOSviewer (ver. 1.6.15), an open-access tool developed at Leiden University "for construction and visualization of bibliometric network" c.
Journal of Scientometric Research, Vol 9, Issue 2, May-Aug 2020
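The merge-and-deduplicate step described above can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say how duplicates were identified, so matching on DOI or on a normalized title is a guess, and the field names `title` and `doi` as well as the sample records are hypothetical.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_unique(wos_records, pubmed_records):
    """Merge two record lists, dropping records already seen.

    Assumption: a record is a duplicate if its DOI or its normalized
    title matches an earlier record. The paper does not state the rule
    that was actually used.
    """
    seen_dois, seen_titles = set(), set()
    unique = []
    for record in list(wos_records) + list(pubmed_records):
        doi = (record.get("doi") or "").strip().lower()
        title = normalize_title(record.get("title") or "")
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # already present from an earlier record
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(record)
    return unique


# Hypothetical records standing in for the WoS and PubMed exports.
wos = [
    {"title": "A novel coronavirus associated with severe acute respiratory syndrome",
     "doi": "10.1000/example.1"},
    {"title": "Mouse hepatitis virus replication", "doi": ""},
]
pubmed = [
    {"title": "A Novel Coronavirus Associated with Severe Acute Respiratory Syndrome.",
     "doi": ""},
    {"title": "Transmissible gastroenteritis virus in swine",
     "doi": "10.1000/example.2"},
]
unique_records = merge_unique(wos, pubmed)
print(len(unique_records))
```

Matching on either identifier lets a record that carries a DOI in one database still be recognized when the other database lists it by title only.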

RESULTS
Analysis of the extracted data showed that most of the literature was published in the form of journal articles (11272 or 70%), followed by Reviews (1134), Editorial Materials (445), Proceedings (438) and Clinical Trials (309). English was the leading language, accounting for 94% of the literature, which suggests that researchers aimed at wide dissemination of their findings to the global community. Table 1 and Figure 2 display the global research output on coronaviruses over the last seven decades.

Growth of Literature
Of the total 16159 publications from the last seventy years, almost 10% (1612 publications) were published in the first three months of 2020 alone. This was followed by 4.88% (790 publications) in 2015 and 4.84% (783 publications) in 2004. Overall, half of the publications were published in the last 10 years, i.e. 2010 to 2020, and the remaining half in the preceding sixty years. The exceptional growth in 2020 is to be expected, as SARS-CoV-2 (COVID-19) has become a major concern for every nation seeking to develop treatments for COVID-19 or tools and techniques to prevent its outbreak and spread. However, it is important to note that although other coronaviruses such as SARS-CoV and MERS-CoV also caused human casualties in 2004 and 2014, research in this domain did not attract much attention from global researchers in the subsequent years.

Subject analysis by Title of Publication
To identify the major subdomains in which most research has been conducted so far, the analysis was performed at the macro and micro levels. At the macro level, the sub-domains of coronaviruses research were observed by analysing the titles of the published literature. Figure 3 depicts the map of significant title terms that appeared during 1949-2020. The keyword map is based on 22590 keywords, which are grouped into 9 clusters. Of these, the four main clusters are indicated in colours: red, green, yellow and blue. The size of a bubble indicates the frequency of a term, and its colour indicates the cluster of related terms to which it belongs. The thickness of a line indicates the strength of the link between two terms, and the distance between terms indicates their interrelatedness. There were altogether 31189 significant terms in the 16159 publications. With the minimum number of occurrences of a term set to 10, 1097 terms met the threshold. By default, VOSviewer selects the 60% most relevant of these terms. Therefore, Figure 3 displays 658 terms.
The ten terms that appeared most often in document titles are hepatitis virus (597 times), COVID (462 times), gastroenteritis virus (336 times), mers-cov (315 times), novel coronavirus (301 times), child (296 times), respiratory virus (225 times), protease (218 times), pneumonia (206 times) and inhibitor (201 times). Few or no terms related to vaccines, medicine, or isolation of virus strains occurred in titles in significant numbers. It is interesting to note that Wuhan, the epicentre of COVID-19, was not a significant keyword in titles till March 31, 2020. Only 51 occurrences of this keyword were observed in titles, and almost all of them were published from January 2020 to March 2020.
It is well understood that coronaviruses cause respiratory and enteric diseases in human beings and animals. The epithelial cells of the digestive and respiratory tracts are the primary sites of coronavirus replication. Therefore, research on the hepatitis virus or gastroenteritis virus may be of great interest to researchers seeking to understand how different genetic subsets of coronaviruses gain entry to these cells and whether they use similar or different strategies of fusion and penetration.
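The two-stage term selection described above (an occurrence threshold followed by keeping the 60% most relevant terms) can be illustrated with a short sketch. The relevance scores below are invented placeholders, since VOSviewer computes its own scores internally, and rounding to the nearest integer is an assumption about how the cut-off is applied.

```python
def select_terms(term_counts, relevance, min_occurrences=10, keep_fraction=0.6):
    """Keep terms meeting the occurrence threshold, then the most relevant share.

    `relevance` is a hypothetical mapping of term -> relevance score;
    it stands in for the score VOSviewer calculates itself.
    """
    eligible = [term for term, n in term_counts.items() if n >= min_occurrences]
    n_keep = round(len(eligible) * keep_fraction)
    ranked = sorted(eligible, key=lambda term: relevance[term], reverse=True)
    return ranked[:n_keep]


# Title-term counts reported in the paper, plus one made-up rare term.
counts = {"hepatitis virus": 597, "covid": 462, "protease": 218,
          "pneumonia": 206, "inhibitor": 201, "rare term": 3}
# Made-up relevance scores, for illustration only.
scores = {"hepatitis virus": 0.9, "covid": 1.4, "protease": 1.1,
          "pneumonia": 0.5, "inhibitor": 0.7, "rare term": 2.0}

selected = select_terms(counts, scores)
print(selected)

# The same arithmetic applied to the paper's figures: 60% of 1097 terms.
print(round(1097 * 0.6))
```

With the paper's numbers, 60% of the 1097 terms that met the threshold comes to 658.2, which matches the 658 terms displayed in Figure 3.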

Subject analysis by Co-occurrence of Keywords
The keywords of an article indicate the core content of its topic. [26] In the next step, micro-level terms in the subject were identified by analysing the keywords of the published literature. According to the VOSviewer manual, "each link has a strength, represented by a positive numerical value. The higher this value is, the stronger the link will be. The total link strength indicates the number of publications in which two keywords occur together" c.
The bibliographic data show that 21895 keywords are associated with the publications. The co-occurrence threshold of keywords was set to 5, which yielded 3064 keywords in VOSviewer. As indicated in Figure 4a, all the keywords are grouped into five clusters (red, green, blue, yellow and purple) representing the sub-domains of the concept 'coronaviruses'. It is to be noted here that similar
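A minimal sketch of how keyword co-occurrence links can be counted is given below. It assumes simple full counting, where each publication adds 1 to the link between every pair of its keywords; VOSviewer also offers other counting and normalization options that are not reproduced here, and the keyword lists are invented examples.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_links(keyword_lists):
    """Count, for each keyword pair, the publications in which both occur."""
    links = Counter()
    for keywords in keyword_lists:
        unique = sorted({k.lower() for k in keywords})
        for pair in combinations(unique, 2):
            links[pair] += 1
    return links


def total_link_strength(links):
    """Sum the strengths of all links attached to each keyword."""
    totals = Counter()
    for (first, second), strength in links.items():
        totals[first] += strength
        totals[second] += strength
    return totals


# Hypothetical keyword lists for three publications.
papers = [
    ["coronavirus", "SARS", "pneumonia"],
    ["coronavirus", "SARS"],
    ["coronavirus", "MERS-CoV"],
]
links = cooccurrence_links(papers)
totals = total_link_strength(links)
print(links[("coronavirus", "sars")])
print(totals["coronavirus"])
```

Applying a co-occurrence threshold such as the value of 5 used in this study would amount to dropping keywords whose occurrence count falls below that value before the links are drawn.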

Co-authorship analysis
Co-authorship analysis of Authors
An analysis of the 16159 publications in this domain showed that research on 'coronaviruses' is predominantly collaborative, whether intra-national or international. Only 9.33% of the articles were published by a single author. To identify the pattern of collaboration, we used the network visualization method. This function was applied to investigate the cooperation pattern among authors, organizations and countries that published articles on the subject. Figure 5 represents the co-authorship network of the 45639 authors of the 16159 publications.
To keep author names unique across publications, a digital identity strategy was used. Of the total authors, 3123 met the threshold of a minimum of 5 publications. To keep the author graph legible, only the top 500 authors were selected. In Figure 5 the collaboration network is shown by 23 Table 2, the total link and link strength are displayed for highly productive authors vs. highly cited authors.
As indicated in the table, none of the five highly productive authors except Yuen, Kwok-Yung is among the highly cited authors. However, highly productive authors have strong collaboration networks and consequently higher link strengths. Authors with a higher number of citations, by contrast, have worked with different sets of co-authors in most of their publications and therefore do not appear in the co-authorship networks.

Co-authorship analysis of Organizations and Countries
Based on bibliographic data extracted from both WoS and PubMed, the co-authorship visualization map of organizations and countries is shown in Figure 6. On the other hand, with a minimum of 5 publications per country as the criterion, 87 of the total 139 countries met the threshold. In Figure 6b, the different scientific groups are distinguished by the colour of the bubbles. As indicated in Figure,

Bibliographic Coupling
In bibliometric research, research collaborations are frequently depicted through a citation network. The citation network can be established by showing either bibliographic coupling or co-citation networks between related documents, authors, or keywords. It is assumed that two authors who share references or citations are more closely related and have more similar research interests. Bibliographic coupling, introduced by Kessler, is a "measure to establish a similarity of relationship between documents, institutions, authors". [27] When two works (a, b) both refer to a third work (c), a network among the works (abc) is established. This network provides deep insight into scientific activities and reveals how an author, journal or organization is related to a domain, from which the focus as well as the size of the research community can be inferred. A bibliographic coupling network can be constructed and visualized through VOSviewer.
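The two relatedness measures mentioned above can be sketched on a toy citation graph. The document labels and reference sets below are hypothetical: coupling strength is taken as the number of references two documents share, and the co-citation count as the number of documents that cite both members of a pair. Any normalization that VOSviewer applies is left out.

```python
from collections import Counter
from itertools import combinations


def coupling_strength(references):
    """Bibliographic coupling: number of references shared by two documents.

    `references` maps each citing document to the set of works it cites.
    """
    strength = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            strength[(a, b)] = shared
    return strength


def cocitation_counts(references):
    """Co-citation: number of documents that cite both works of a pair."""
    counts = Counter()
    for cited in references.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Toy data: works a and b both cite c, as in the (a, b, c) example above.
refs = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "f": {"e"},
}
coupling = coupling_strength(refs)
cocited = cocitation_counts(refs)
print(coupling)
print(cocited[("c", "d")])
```

Here works a and b are coupled with strength 2 because they share the references c and d, while c and d are co-cited twice because both a and b cite them together.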
The aim of using VOSviewer visualization for journals was to identify the journals most strongly related (focal) to coronaviruses research and the related journals of this domain. Of the total 1695 sources in which coronaviruses research was reported, the top 5 journals, with their articles, citations and link strength, are shown in Table 3. The coupling network of all 169 journals is shown in Figure 7.
Of the total citations received by the 16159 articles, 10% came from only the top 56 (0.34%) articles, which suggests that the citation rate is mainly driven by a few frequently cited articles. Almost 30% of articles received fewer than 10 citations and about 12% of articles had received no citation at all by the end of March 2020. These findings are consistent with an earlier study that showed a close relationship between IF and citations and that "the most cited articles are usually published in journals that are on the top of the IF list". [28] However, highly productive authors do not necessarily show high link strength, because link strength depends not only on citations but also on how often authors collaborate with the same co-authors.
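The concentration of citations in a few articles, as reported above, can be measured with a simple share calculation. The sketch below uses an invented list of citation counts rather than the study's data; the function names are illustrative only.

```python
def top_share(citations, k):
    """Fraction of all citations received by the k most cited articles."""
    ranked = sorted(citations, reverse=True)
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


def uncited_fraction(citations):
    """Fraction of articles that received no citation at all."""
    return sum(1 for c in citations if c == 0) / len(citations) if citations else 0.0


# Hypothetical citation counts for eight articles.
sample = [100, 50, 10, 5, 0, 0, 3, 2]
print(top_share(sample, 1))
print(uncited_fraction(sample))
```

Applied to the full dataset, `top_share` with k = 56 would correspond to the 10% share quoted for the top 56 articles.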

Co-citation network of documents
In bibliographic coupling, the relatedness of two documents is established based on the number of references the two documents share. In co-citation, however, the relatedness is determined by the number of times the two documents are cited together. This analysis is important in identifying pairs of highly cited papers. Bibliographic coupling is a retrospective perspective, whereas co-citation is a forward-looking one. The co-citation network of documents as projected in Figure 8 indicates that the article, "A novel coronavirus associated with severe acute respiratory syndrome" by Ksiazek, TG et al. , a dramatic increase in research on coronaviruses was observed. This time, a marked increase in publication output was observed within the first three months of 2020, when the COVID-19 outbreak took place and almost 10% of the total literature was published. This irregular growth suggests that research on coronaviruses has remained incidence-based and has not been of broad interest in the past 60 years, not even after the outbreaks of SARS-CoV or MERS-CoV. It is likely that, owing to the lower mortality or non-pandemic nature of the spread in earlier cases, researchers and research organizations did not show as regular an interest as in the case of other priority diseases. Such irregular attention has also been seen with other infectious diseases. For instance, only 43 articles on Ebola virus were published in 2013, before the Ebola outbreak in West Africa, but the number increased to more than 600 articles in 2014. [29] Continuity in research activities, however, is essential for the study of any infectious disease in order to minimize the risk of infection. Furthermore, coronaviruses have become a worldwide public health concern, as researchers from almost 132 countries are collaborating with one another in this domain.
In general, greater collaboration is an effective strategy to increase citation rates; global collaboration and sharing of research outcomes are also necessary to develop the correct drug in the shortest possible time or to implement the correct strategy for controlling any pandemic outbreak effectively, thereby reducing morbidity and mortality. Furthermore, the country-collaboration network showed that the two mega-clusters or spheres of coronaviruses studies comprised collaboration pathways chiefly among authors from the USA and China. Alliances between developing and developed countries are rare, as is the case in several other scientific areas. [30] Among researchers in China, collaboration pathways were largely intra-national. In contrast, collaborations by authors in the USA, England, the Netherlands, Taiwan, Japan, India, etc. tended to be multinational, which is more valuable for the epidemiological control of pathogens. The absence of collaboration pathways in African nations is consistent with the low number of publications from these countries. Intra-national and international collaborations between developed and developing nations could provide opportunities for the division of labour as well as resources for the search for an appropriate vaccine and effective medication for the treatment of any such infectious disease, including COVID-19.
The results of this study show that the USA and China have had primary roles in coronaviruses research for the last few decades. It is not surprising that the majority of relevant publications are attributed to the USA or China, because the USA has the largest number of scientific research institutions, the largest number of P3 and P4 biosafety laboratories and the largest investment in scientific research. China, on the other hand, houses over 3.61 million licensed physicians and is the birthplace of the current pandemic. [31] However, it is notable that Italy, the UK and Iran have produced less output and collaborated less than the former two. No licensed vaccine for the prevention of any CoV (except, to a limited extent, for MERS) has come out, and the treatment options too are very limited. Thus, most preventive measures are directed at reducing the risk of infection. As history can attest, humanity has been surviving epidemics with improved outcomes; it is therefore recommended that increased research in all countries be made a priority, to better understand the pathogenic characteristics of the virus and to find proper therapeutic modalities.

CONCLUSION
The present bibliometric analysis revealed a rapid expansion in research activities related to coronaviruses over the first few months of 2020, driven by COVID-19. The most frequently mentioned keywords and research areas associated with coronaviruses studies reflect the research hotspots of the last few years. As indicated, research was mostly related to symptoms caused by this virus, such as acute respiratory infection; to the behaviour of this virus in nonhuman entities, such as mouse hepatitis virus and murine coronavirus; and to extra-intestinal studies on isolation of the pathogen, such as protein-wall, sequence, transmission etc. These findings reveal that the most pressing health issues among researchers were related to coronavirus-induced gastroenteritis, extraintestinal infections and co-infections with other pathogens, and to efforts to understand the structural architecture of the pathogen's cell-wall. However, important topics such as strain-based identification, including detection of pathogenic and non-pathogenic strains, and the sensitivity of the pathogens to various drugs, which are necessary for infection management, were not adequately addressed and were not evident from the bibliometric analyses. Interestingly, we observed only a single study, entitled Immunogenicity of Different Forms of Middle East Respiratory Syndrome S Glycoprotein, which dealt with a MERS coronavirus vaccine. Newer research themes, such as the molecular and genomic-level studies necessary to clarify the pathogenesis of coronavirus infection, were not visible in this study. Future research is needed to answer significant questions such as which particular strains of the microorganisms are pathogenic and how to differentiate pathogenic variants from non-pathogenic ones.
Looking at the collaboration pattern of the published literature, it was observed that the highly productive authors have strong, or rather fixed, collaboration networks, whereas the highly cited authors do not have any fixed set of co-authors and consequently exhibit weak link strength. The few distinct collaborations among scientists evident in the present study may indicate that a considerable number of scientists still do not participate in research teams as actively as expected.
outbreak, otherwise, the domain of coronavirus never received the attention of researchers in the past seventy years in the way that other infectious diseases like HIV/AIDS did. During the last few decades, researchers have more frequently undertaken research in areas such as symptoms caused by this virus (e.g. acute respiratory infection) or the fusion and penetration process of coronaviruses (hepatitis virus, gastroenteritis virus), while little research has been done on domains like antibiotics and pathogenesis. Research on important areas such as strain-based identification, including detection of pathogenic and non-pathogenic strains, pharmacokinetic properties, pharmacodynamic characteristics, or the sensitivity of the pathogens to various drugs, which are necessary for infection management, has been inadequate and is not evident in the bibliometric analyses. Research output from high-income countries was greater than that from low- and middle-income countries, and there was limited collaboration with developing countries. A better understanding of the clinical features and epidemiology of such infectious diseases is needed in countries with a high infection rate. Furthermore, efforts to develop effective strategies for improving epidemic prevention measures must continue through strengthened international collaboration.
These findings reveal the importance of bibliometric methods for understanding global trends in research on coronaviruses. Thus, this study provides helpful insights for researchers in this domain, including medical virologists, policy decision-makers and academics.

ACKNOWLEDGEMENT
The author would like to thank Prof. Archana Kumar, Department of English, BHU, for copy-editing. The author also thanks the Journal for its extensive editorial interventions at different stages, which improved the paper.