Introduction: analyzing a pandemic through bibliometric analysis

In December 2019, a new virus of the family Coronaviridae, currently called SARS-CoV-2 (formerly 2019-nCoV), was identified as the cause of an outbreak in which tens of thousands of cases, with relevant associated mortality, have been reported. On January 30, 2020, the International Health Regulations Emergency Committee declared the outbreak a Public Health Emergency of International Concern (PHEIC). On March 11, 2020, the World Health Organization declared it a pandemic. Since then, the COVID-19 outbreak has been a global health emergency.

Effective control requires not only reinforcing containment measures, but also promoting basic research on the disease, trying to anticipate it, and developing treatments and vaccines that prevent new scenarios of generalized contagion. It is also crucial to acquire as much knowledge as possible about the SARS-CoV-2 virus, as well as about the clinical-healthcare and public health measures that can be incorporated immediately to protect people’s health and contain the spread of COVID-19.

Bibliometric studies on coronaviruses are few. The first, by Chiu et al. (2004), collected 256 articles indexed in the Science Citation Index (SCI) in the period March–July 2003 and analyzed traditional indicators (authorship, collaboration, journals, language, document type, organization, times cited, etc.). Since then, other similar studies have been published, such as analyses of 3202 papers on SARS in Medline (Yang and Yang 2005), 2874 articles on SARS indexed in SCI in the period 1998–2008 (Kostoff and Morse 2011), 443 articles on MERS published between 2012 and 2015 in PubMed (Wang et al. 2016), 883 articles about MERS between 2012 and 2015 in Scopus (Zyoud 2016), and 8619 articles in Scopus on various infectious pathologies (Sweileh 2017). More recently, a short letter to the editor (Bonilla-Aldana et al. 2020) provided a very succinct description of 18,158 articles on coronavirus in SCI, Scopus, and PubMed. Larivière et al. (2020), for their part, have emphasized the importance of “opening” all scientific literature, and of paying attention to articles not written in English, to facilitate the rapid advancement of research in an emergency such as the present one. According to their data, 13,818 articles on coronaviruses have been published since the late 1960s in WoS, but more than half are closed access.

From a wider perspective, Zhang et al. (2020) measure the response of academia to six international public health emergencies since 2000 (SARS, influenza A(H1N1), Ebola, Zika, COVID-19) and present a preliminary analysis of the academic response to COVID-19 based on 3069 publications indexed in WoS, PubMed and CNKI up to 9 April 2020. Although the sources and time coverage are different, in the Results section we compare some of their results with ours.

Bibliometric techniques have also been applied to press coverage of coronaviruses. The first such study, by Chan et al. (2003), examined the treatment of SARS in six newspapers (from China, Hong Kong, Canada, and Europe) and compared the presence of news in the press with the official figures of dead and infected people provided by WHO. It was followed by others of a similar nature focused on the press in China (Hong 2007), the USA (Bomlitz and Brezis 2008) and South Korea (You and Ju 2010).

In general, the bibliometric studies on coronaviruses published so far extract data mainly from WoS, Scopus, and Medline. They are based on few articles and carry out a traditional bibliometric analysis (authors, collaborations, countries, organizations, journals, etc.). In our case, in addition to compiling a greater number of articles, we emphasize two aspects that support the rapid advancement of science: international collaboration (previously studied) and the availability of content in open access (a newer indicator in this type of study).

The open availability of scientific production on coronaviruses is, beyond being a desirable phenomenon, an indicator that open science is a reality that can be put into practice when needed. Open Access (OA) is an essential instrument, based on new strategies for sharing knowledge and working cooperatively with digital technologies, and it shows that researchers are achieving results that would not be possible without the combination of collaborative networks and technological tools (Belli et al. 2019). The spread of the OA movement in several countries, exemplified by the growth of regional and national initiatives such as the creation of OA digital journal libraries and the establishment of supportive governmental policies (Abadal et al. 2015; Minniti et al. 2018), provides evidence of the significant role OA is playing in reducing the scientific gap between countries and improving their participation in the so-called “global knowledge commons” (Chan et al. 2020).

Open science is an approach based on collaborative work, openness, and transparency in all stages of research, including not only publication but also data collection, peer review, and assessment. In this way, open science aims to bring research closer to society. It entails a radical transformation in the way scientific research is conducted and assessed. Although the term includes various components, such as open research data, open access to publications, open peer review, citizen science, and new metrics, we focus on two characteristics: international collaboration and open access to publications. The main benefits and opportunities of open science include transparency (all phases can be monitored), greater effectiveness (rapid knowledge transfer helps avoid duplication and increases the speed of research), reproducibility (which helps prevent fraud) and greater output.

The main aim of the present paper is to analyze scientific publications on coronavirus listed in Web of Science with respect to two main characteristics of open science: international collaboration and open access to publications. The data obtained will be useful for assessing coronavirus publications worldwide and their principal characteristics. This article explores the capacity of researchers to generate scientific knowledge about an international health emergency, and their global capacity to collaborate with one another. To advance knowledge of this pandemic and to control its effects, international collaboration among researchers is essential, as is open and immediate access to scientific publications. The paper also identifies the countries with the most scientific publications, their international scientific collaboration, their impact, and the percentage of these publications that are openly accessible.

Methodology

We carried out a general search in all databases available on the Web of Science (WoS) platform and a deeper bibliometric analysis of recent coronavirus scientific publications indexed in its Core Collection. The general search covers all records present in all databases, while the Core Collection search covers the period from 2001 to 2020. We also consider the overall trend from 1948 to 2020 to map the number of publications, but the specific analysis addresses both the last two decades and the last two years, in order to compare them with what is happening at present. The bibliometric analysis of the WoS database aims at mapping the use and distribution of OA in articles and cited references, among countries and organizations.

The main reason we focused our mapping on WoS is the availability and structuring of the cited references offered by its Core Collection and, especially, the information provided about the accessibility of each record (“Open Access Indicator” field). In addition, given that a large part of its collection consists of journals with restricted access, it seemed interesting to see how many of the articles are currently openly available.

Data

The data cover all documents matching the following search expression (Topic Search: “CORONAVIRUS OR COVID-19 OR SARS-COV-2 OR 2019-nCoV”) retrieved from Web of Science (from Clarivate Analytics). We downloaded the records manually (500 records at a time), in two datasets retrieved on March 20th and 23rd respectively, with the following search limits:

Data source: Web of Science platform

Year Published: no limit defined (1900–2020)

Document Type: (DT = Article OR DT = Proceedings Paper OR DT = Review)

Databases: KCI-Korean Journal Database, Russian Science Citation Index and SciELO Citation Index (we filtered the regional indexes and discarded the others, like Derwent Innovations and Current Contents Connect)

Data source: Web of Science Core Collection

Year Published: (PY = 2001 OR PY = 2002 OR … OR PY = 2020)

Document Type: (DT = Article OR DT = Proceedings Paper OR DT = Review OR DT = Early Access)

Collection Citation Indexes: SCI-EXPANDED, SSCI, A&HCI

Variables

The article was taken as the unit of analysis, and its references were used to calculate the percentage of citations to OA journals and the Price Index (the percentage of citations to articles published within the previous 5 years).

In order to compare the recent publications about COVID-19 with those on coronaviruses in general published in the last two decades, we considered two periods: 2001–2020 and 2019–2020.

The bibliometric analysis was focused on the following characteristics of the citing articles, considering the information of various bibliographic fields presented in WoS:

Dataset: all databases

  1. Database: the first three characters of the “Accession Number” (UT) were used to identify the source database; records whose UT begins with “WOS” were matched against the following dataset, in order to identify the intersection between both.

Dataset: core collection

  2. Status of the article (closed access/open access): the “Open Access Indicator” (OA) field identifies whether the article is openly accessible or not;

  3. Articles’ open access type: the same “Open Access Indicator” (OA) field (Footnote 1), considering the different OA types, aggregated as follows: Gold-DOAJ (journals in the Directory of Open Access Journals, DOAJ), Gold-Other, Bronze (free access) or Green (Published or Accepted, not differentiated);

  4. Research funding: we checked whether research funding was mentioned in the “Funding Agency and Grant Number” (FU) field;

  5. International collaboration or not: based on the “Author Address” (C1) field, an article with authors from more than one country was classified as international collaboration;

  6. Affiliation country and organization: using the “Author Address” (C1) field, each country and organization was identified.
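The record-level variables listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the function names are hypothetical, the C1 layout (bracketed author lists followed by comma-separated address parts ending in the country, with US addresses ending in a state, ZIP code and "USA") is an assumption about the export format, and any non-empty OA or FU value is treated as "open access" or "funded".

```python
import re

def source_database(ut):
    """Return the 3-character database prefix of an Accession Number (UT)."""
    return ut.strip()[:3].upper()

def is_open_access(oa_field):
    """Assumption: any non-empty Open Access Indicator means open access."""
    return bool(oa_field and oa_field.strip())

def has_funding(fu_field):
    """Assumption: any non-empty FU field counts as a funding mention."""
    return bool(fu_field and fu_field.strip())

def countries_from_c1(c1):
    """Extract the set of countries from an Author Address (C1) string.

    Assumed layout: '[Author; Author] Org, Dept, City, Country; [Author] ...'
    """
    # Remove bracketed author lists first, since they may contain ';'
    without_authors = re.sub(r"\[[^\]]*\]", "", c1)
    countries = set()
    for address in without_authors.split(";"):
        address = address.strip().rstrip(".")
        if not address:
            continue
        last = address.split(",")[-1].strip()
        # Assumed US pattern: last part looks like 'IA 52242 USA'
        if last.endswith("USA"):
            last = "USA"
        countries.add(last)
    return countries

def is_international(c1):
    """An article is international if its authors come from more than one country."""
    return len(countries_from_c1(c1)) > 1
```

For example, under these assumptions an address string with one author group in "Peoples R China" and another ending in "IA 52242 USA" yields two countries and is classified as an international collaboration, while two addresses in the same country are not.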

Then we processed the information from the “Cited References” (CR) field of the Core Collection dataset: we separated the individual cited references, discarded those that were not journal articles, and split each cited reference into its component pieces of information. An algorithm (Mugnaini et al. 2004) was applied to match the string corresponding to the cited journal, considering all the variants written by the authors, in order to identify whether the journal was present in DOAJ. Finally, we identified the references that were no more than 5 years older than the citing article in order to calculate their percentage (Price Index).
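The two reference-based indicators can be illustrated with a minimal sketch. It assumes that cited references are separated by ";" and follow an "Author, Year, Journal, ..." layout, and it uses a naive exact match on normalized journal titles against a hypothetical set of OA journals. It does not reproduce the matching algorithm of Mugnaini et al. (2004), and all function names are illustrative.

```python
import re

YEAR = re.compile(r"(1[89]\d{2}|20\d{2})")

def normalize(title):
    """Upper-case a journal title and drop punctuation and extra spaces."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", title.upper())
    return " ".join(cleaned.split())

def parse_references(cr_field):
    """Return (year, journal) pairs for references that look like journal articles.

    Assumed layout per reference: 'Author, Year, Journal, Volume, Page'.
    References without a year in second position are discarded.
    """
    refs = []
    for ref in cr_field.split(";"):
        parts = [p.strip() for p in ref.split(",")]
        if len(parts) >= 3 and YEAR.fullmatch(parts[1]):
            refs.append((int(parts[1]), normalize(parts[2])))
    return refs

def price_index(citing_year, cr_field, window=5):
    """Percentage of references no more than `window` years older than the citing article."""
    refs = parse_references(cr_field)
    if not refs:
        return None
    recent = sum(1 for year, _ in refs if citing_year - year <= window)
    return 100.0 * recent / len(refs)

def oa_citation_share(cr_field, oa_journals):
    """Percentage of references whose normalized journal title is in `oa_journals`."""
    refs = parse_references(cr_field)
    if not refs:
        return None
    known = {normalize(j) for j in oa_journals}
    hits = sum(1 for _, journal in refs if journal in known)
    return 100.0 * hits / len(refs)
```

With four parsed references dated 2003, 2016, 2019 and 2020 and a citing article from 2020, three references fall within the 5-year window, so the Price Index is 75%.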

Data analysis

We performed a mainly descriptive statistical analysis in order to explore the performance of the most prolific countries and organizations, paying particular attention to the last 2 years.

Records were analyzed separately with the VOSviewer software (Van Eck and Waltman 2016), drawing a network of links among countries and organizations to identify the most prominent countries and organizations and the strongest links in the network. The reasons for choosing VOSviewer to build and represent large networks from bibliometric data over other tools, such as Gephi or Pajek (Van Eck and Waltman 2016), are its display quality, the option of depicting the density of links, and the possibility of generating overlay maps by adding batches of data. These features are relevant to our research. Similar criteria can be observed in the studies of Vargas-Quesada et al. (2017) and Belli and Baltà (2019), who also prefer VOSviewer (version 1.6.15) over other software tools.

It should be noted that the WoS data were modified in order to unify and group country names in the C1 and RP fields. Specifically, the denomination “Peoples R China” was replaced by “China”, and the joint denomination “United Kingdom” was added to the records of Scotland, Wales, North Ireland, and England. Also, for regional comparison, the name “European Union” was added after the names of the 27 countries that are currently part of it, to which we added the United Kingdom despite its departure from the EU on 31 January 2020.
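The harmonization of country names described above could be expressed as a small lookup step. The sketch below is an assumption about how such a step might look, not the authors' actual processing; the function names are hypothetical, and the spelling of the EU member names is assumed to match the way they appear in the address fields.

```python
# Constituent parts of the UK as named in the text above
UK_PARTS = {"England", "Scotland", "Wales", "North Ireland"}

# The 27 EU member states after 31 January 2020 (spelling is an assumption)
EU_27 = {
    "Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czech Republic",
    "Denmark", "Estonia", "Finland", "France", "Germany", "Greece", "Hungary",
    "Ireland", "Italy", "Latvia", "Lithuania", "Luxembourg", "Malta",
    "Netherlands", "Poland", "Portugal", "Romania", "Slovakia", "Slovenia",
    "Spain", "Sweden",
}

def harmonize(country):
    """Map a raw country name to its unified denomination."""
    if country == "Peoples R China":
        return "China"
    if country in UK_PARTS:
        return "United Kingdom"
    return country

def region_labels(countries):
    """Return harmonized names, adding 'European Union' for EU-27 members and the UK."""
    names = {harmonize(c) for c in countries}
    if names & (EU_27 | {"United Kingdom"}):
        names.add("European Union")
    return names
```

Adding "European Union" as an extra label, rather than replacing the member state's name, keeps both the national and the regional comparisons possible from the same record.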

Results

Global distribution of articles and the challenge of information retrieval

The general search in the All Databases dataset returned 19,435 records, of which 18,875 were articles published in journals. A small part of the production (1.37%) was found in regional databases (citation indexes from Korea, Russia and Ibero-America, “SciELO”).

Most of the production came from the WoS Core Collection, in which we differentiate three types of records, distributed across two databases. One database is the Emerging Sources Citation Index (ESCI; 265 records, or 1.40%), which contains journals of regional importance and in emerging scientific fields that have not yet been accepted into the Core Collection. This means that no Impact Factor is calculated for ESCI journals, which is a considerable disadvantage for authors from countries where evaluation processes give special importance to this indicator.

A second type comprises the records that are also retrieved by the same search expression in the Core Collection dataset, that is to say, the intersection between both datasets (13,244 records, or 70.17% of the articles). Finally, there are records that are present in the Core Collection but cannot be retrieved with our search expression in that database (Footnote 2), representing 27.06%.

Figure 1 reveals differences over time, mainly after 2003, when both sets of articles grow, but those retrieved from both WoS sources grow more markedly, which suggests that the terms in our search expression had become established. That was the year just after the first SARS crisis began: the queried terms retrieved 56.43% of the articles in the Core Collection before then (1945–2002), but 77.75% afterward.

Fig. 1 Distribution of articles according to year of publication and whether or not they were retrieved from both WoS sources, 1945–2020

These differences should be taken into account when investigating coronavirus research in WoS, but we decided to focus on the 10,026 articles retrieved from the Core Collection (including their 456,133 cited references) from the last two decades. One reason is simply that their authors used the search terms in the title and/or abstract of their articles; the other is the completeness of the metadata when downloading from the Core Collection interface, compared to the data delivered by the WoS platform (Footnote 3).

Scientific collaboration and impact

Our objective is to analyze the composition by countries and organizations of publications on coronavirus in the period 2001–2020 and in the most recent period, January 2019–March 2020, thus covering the SARS crisis of 2003 (2002–2004) and that of 2019–2020. In this way, we can capture the most prolific countries and the composition of organizations around coronavirus research, considering their productivity, their degree of international collaboration and the number of citations accumulated.

Countries

The network map generated by VOSviewer for the 10,026 documents by country for the period 2001–2020 (Fig. 2) offers a fairly clear image of three main clusters. The first is the USA, in green, involved in 34.22% of the documents. It is followed by China, in yellow, with 25.24%. The third is the cluster formed by the different European countries, in red, where the contributions of Germany (6.74%), the UK (6.35%), the Netherlands (5.52%) and France (4.25%) stand out above the rest. To these we must add a fourth cluster made up of different Asian countries such as Japan (4.59%), South Korea (4.02%) and Taiwan (3.76%), among others.

Fig. 2 Network visualization by country, WoS Core Collection data, 2001–2020

Looking at the links between countries, which are based on co-authorship, we can identify certain recurring relationships. The high interaction between European countries stands out, as does their preferential relationship with the USA over China. In fact, the different European countries and the USA appear as preferential partners in the network, with each other and with the rest of the countries, for the period 2001–2020. Since the map covers two decades of publications, we understand that its composition suggests a relatively established structure of leadership and recurring relationships in research on coronaviruses.

To capture the characteristics of the publications related to COVID-19, we then focus on the period between January 2019 and the first twelve weeks of 2020. Since the search strategy uses the different names of the virus, a large part of the 2019 documents presumably does not refer to the recent pandemic, which began in November and December of that year in the Chinese city of Wuhan and spread through different areas of China, Asia, Europe and the rest of the world during the first quarter of 2020. Despite this, we consider this sample to be related to, and representative of, the first research outputs of the second wave of SARS.

The map generated by VOSviewer (Fig. 3) for the 917 documents of the period 2019–2020 shows fewer countries involved than those accumulated over the last two decades and a somewhat different distribution of the clusters. Compared to the 2001–2020 map (Fig. 2), China is involved in 32.50% of the documents and gains a more central position in the network, in close relationship with the USA, which participates in 29.44%. Again, both countries are the main producers of documents and the main relationship nodes in the network, together with a group of European countries led by the United Kingdom (5.67%), Germany (5.45%), France (5.13%), and the Netherlands (4.47%). There are also two differentiated groups of countries whose main partner is the USA but which do not maintain relations with each other. One of them, in green, is led by Canada, involved in 5.02% of the documents, and also includes Italy (3.27%), India (2.73%) and Iran (0.98%). The other is formed by Saudi Arabia, with an important contribution of documents (7.85%), Egypt (2.94%) and Lebanon (0.44%). We can also identify a representation of the Asian cluster, shown in blue in the 2001–2020 network map (Fig. 2), in which South Korea, involved in 7.42% of the documents, gains prominence over Japan, involved in 2.73%.

Fig. 3 Network visualization by country, WoS Core Collection data, 2019–2020

For both periods we consider the number of documents contributed by each country, the number of citations accumulated by these documents, and the total link strength (TLS), that is, the sum of all co-authorship links with other countries generated by a country's documents. A higher TLS in proportion to the number of documents indicates greater international co-authorship in the publication of scientific documents and a greater recurrence of international collaboration in a given country's research. Conversely, a lower TLS value suggests a greater weight of domestic scientific production.
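To make the indicator concrete, the following sketch computes a total link strength and a links-per-document ratio from a list of documents, each represented by the set of countries of its authors. It assumes full counting (every pair of co-authoring countries in a document adds one unit to their link), which is our reading of the definition above rather than a confirmed description of the VOSviewer settings used in the study; all names are illustrative.

```python
from collections import Counter
from itertools import combinations

def link_strengths(documents):
    """Count, for each pair of countries, the documents they co-authored."""
    links = Counter()
    for countries in documents:
        for a, b in combinations(sorted(set(countries)), 2):
            links[(a, b)] += 1
    return links

def total_link_strength(documents):
    """Sum, for each country, the strengths of all its links (TLS)."""
    tls = Counter()
    for (a, b), strength in link_strengths(documents).items():
        tls[a] += strength
        tls[b] += strength
    return tls

def links_per_document(documents):
    """TLS divided by the number of documents in which each country appears."""
    doc_counts = Counter(c for countries in documents for c in set(countries))
    tls = total_link_strength(documents)
    return {c: tls[c] / n for c, n in doc_counts.items()}

# Toy sample: two international documents and two domestic ones
sample = [
    {"USA", "China"},
    {"USA", "China", "Germany"},
    {"China"},
    {"USA"},
]
ratios = links_per_document(sample)
```

In this toy sample, Germany appears in a single document linked to two countries, so its ratio (2.0) is higher than that of the USA or China (1.0 each), whose domestic documents add to the denominator without adding links.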

For the 2001–2020 period (Table 1), the TLS in proportion to the number of documents is especially low in countries such as Japan, South Korea, Taiwan and Brazil, with 0.46, 0.35, 0.28 and 0.38 links per document and 35.87%, 25.81%, 19.89% and 31.79% of documents resulting from international collaboration, respectively. At the other extreme, we find especially high values in Egypt, with 1.77 links per document and 94.62% of documents resulting from international collaboration; Switzerland, which hosts WHO’s headquarters, with 1.76 links per document and 80.26% of international collaboration; and Saudi Arabia, whose organizations have been characterized in recent years by the hiring of foreign research personnel who keep their original affiliations (Bhattacharjee 2011; Messerly 2014), with 1.31 links per document and 71.39% of documents resulting from international collaboration.

Table 1 Number of documents, total link strength, citations and percentage of international collaboration per country—2001–2020

The countries that accumulate a greater number of links, the most connected in the network analyzed above, also tend to accumulate a greater number of citations. The high proportion of citations in relation to the number of documents stands out in countries such as the Netherlands, Canada and Switzerland, which in turn stand out for a large number of accumulated links. In any case, it should be kept in mind that this indicator only serves as a gateway to a detailed study of the specific publications that accumulate the most citations. It would be fallacious to assume that citations are distributed equally across the documents published.

The USA appears as the country with the most documents contributed to the sample (3431), the most accumulated citations (121,219) and the highest total link strength (1957). China appears in second place, with 2531 documents that accumulate 69,852 citations and generate a total of 1133 links. Both countries present figures above the rest of the countries in the sample, although their size and composition make them hardly comparable to the rest. In this sense, if we compare their figures with those of the group of 28 European Union countries, the latter ranks second, after the USA and above China, with 2863 documents that accumulate 100,109 citations and generate a total of 1636 links.

A previous study of MERS-CoV research publications from 2012 to 2015 in the Scopus database (Zyoud 2016) found that the Netherlands produced the greatest proportion of publications with international research collaboration (72.7%), followed by the UK (71%) and Germany (69.1%), out of the total number of publications for each country. In our study, which covers a wider timespan, these countries also present high values (62.03%, 64.67% and 63.91% respectively), and they are among the countries contributing most to the sample, behind the USA and China (Table 1). Despite this, the data for the set of EU countries present values more similar to those of the USA and China. This is explained by the lower level of international collaboration in the remaining countries of the European Union and by the lower proportions that Germany, the United Kingdom, the Netherlands or France present on their own. Similarly, it is to be assumed that the level of international collaboration is not evenly distributed across the different US states or Chinese regions. Comparatively, China has a lower percentage of documents resulting from international collaboration (33%) and lower ratios of links (0.45) and citations (27.6) per document than the USA, with 41% of internationally co-authored documents, 0.57 links and 35.33 citations per document, and the EU, also with 41% of documents co-authored with non-EU countries, 0.57 links and 34.97 citations per document.
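The per-document ratios quoted above follow directly from the totals reported for the 2001–2020 sample (documents, total link strength and citations, out of 10,026 documents). The short check below simply redoes that arithmetic; the variable and function names are illustrative.

```python
TOTAL_DOCUMENTS = 10026  # Core Collection sample, 2001-2020

# (documents, total link strength, citations) as reported in the text
figures = {
    "USA": (3431, 1957, 121219),
    "China": (2531, 1133, 69852),
    "EU-28": (2863, 1636, 100109),
}

def ratios(documents, tls, citations):
    """Return links per document, citations per document and share of the sample (%)."""
    return (
        round(tls / documents, 2),
        round(citations / documents, 2),
        round(100 * documents / TOTAL_DOCUMENTS, 2),
    )

summary = {name: ratios(*values) for name, values in figures.items()}
for name, (links, cites, share) in summary.items():
    print(f"{name}: {links} links/doc, {cites} citations/doc, {share}% of sample")
```

The results reproduce the values in the text: 0.57 links and 35.33 citations per document for the USA, 0.45 and 27.6 for China, and 0.57 and 34.97 for the EU, with the USA and China accounting for 34.22% and 25.24% of the sample.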

Comparing these figures for the last two decades with the results of previous studies of the first SARS crisis, the same countries are among the most productive, with their positions varying according to the database. For instance, the USA, Canada, the United Kingdom, and Germany accounted for 63% of publications about SARS in the Science Citation Index (SCI) database from the beginning of the outbreak, in March, until July 8, 2003. The USA had about 30% of the total share, followed by Hong Kong with 24% (Chiu et al. 2004). Another example with a short time-span, September 2002 to August 2004, and a sample of 1646 documents from the MEDLINE database, places China as the largest contributor with 48% of the total share, followed by the USA with 15% (Yang et al. 2005). A wider study of 2874 papers retrieved from the SCI database between 2003 and 2008 shows China as the largest contributor, followed by the USA. China increases its number of publications over the years, moving closer to the first position and leading the public health response (Kostoff and Morse 2011).

Our study covers a more extensive time-span (2001–March 2020). We show that the countries with the highest number of publications around the two SARS crises, China and the USA, are also the two largest contributors to SARS-related publications over the whole period. Kostoff and Morse (2011) also suggest that the countries hit hardest by SARS increase their number of publications on the topic. Accordingly, the most recent part of our sample time-span points specifically to the first publications related to the outbreak of the second SARS crisis (COVID-19).

If we relate the number of documents contributed to the total number of links generated with other countries, we see proportionally higher values than in the sample for 2001–2020: the average rises from 0.96 links per document in the total period to 1.16 links for the 20 countries with the largest contribution of documents to the sample in each period. There is thus a higher level of international collaboration in the 2019–2020 period, with 64.84% of internationally co-authored documents, than in the 2001–2020 period, with 53.46%. The most notable exception is China, which falls from 32.75 to 27.52% of the documents (from 0.44 to 0.38 links per document). Japan increases its share of internationally co-authored documents from 35.87 to 40% in 2019–2020, although its ratio decreases from 0.41 to 0.24 links per document, meaning that fewer countries are involved. Italy, on the other hand, increases from 0.66 to 0.76 links per document but keeps around 40% of international documents in both periods, so more countries are linked in the most recent publications. Other countries with a low proportion of links over the whole period show a slight rise in 2019–2020, such as South Korea, which goes from 0.34 to 0.54 links per document and from 25.81% to 29.41% of documents resulting from international collaboration, and Brazil, which goes from 0.38 to 0.43 links per document and from 31.79% to 34.78% of international documents.

As in the period 2001–2020, in the period 2019–2020 (Table 2) there is some correspondence between a higher total link strength and the number of citations collected by country, although the citation counts are lower because the works were published recently. China is a notable exception: despite a proportionally low value of 0.45 links per document and 27.52% of internationally co-authored documents, it accumulates a greater number (733) and proportion (2.46) of citations than the other countries.

Table 2 Number of documents, total link strength, citations and percentage of international collaboration per country—2019–2020

The 2019–2020 sample shows even sharper differences in these ratios when we compare China, the USA and the countries of the European Union as a whole. The number of citations in proportion to the documents contributed is higher for China and for the USA (1.91 citations and 0.88 links per document, with 56.82% of international collaboration) than for the European Union (1.23 citations and 0.91 links per document, with 55.96% of international collaboration). In these three specific cases, the proportion of citations accumulated per document is not related to the number of links generated or to the percentage of international collaboration for documents dated in the 2019–2020 period.

Our explanatory hypothesis for this significant difference in citations is, on the one hand, that the COVID-19 pandemic first hit China and that the data treated in our study reflect the first studies carried out. On the other hand, the sample very likely includes works that have served as references for further research, such as publications of Chinese origin describing the person-to-person transmission of COVID-19 (Chan et al. 2020).

We understand these data as a snapshot of the first impact of the COVID-19 pandemic on international scientific research, to be compared in the future with the total number of records for 2020, which are expected to show a greater number of accumulated citations and a larger number and wider distribution of links between countries.

Organizations

We found it interesting to add each organization’s country to see how the network changed when compared to the rest. This allows us to map which countries have the most collaborative organizations working on coronavirus and when they started. In addition, the number of collaborations lets us explore whether individual organizations decided to work on this topic or whether it was a country-level decision, and whether different organizations from the same country do the same. We can also observe whether, in international collaborations, organizations assume a central role because of political decisions in their countries or on the initiative of their researchers. Some of the organizations in these networks are supranational or depend on science policy decisions from different countries.

In Fig. 4, the network shows the main organizations that collaborated in the scientific production from 2001 to 2020. We can observe the total production (the size of the node) and the number of collaborations with other organizations (the size of the link). The network is divided into several clusters of different colors. The cluster in the middle of the map consists of organizations with a central role in collaborations. Many of them are based in the United States and, as we can observe, collaborate with different parts of the world, mainly with the European Union and China. The red cluster contains Chinese organizations. They have a large number of publications, as the nodes show, but they do not have a central role in scientific collaboration.

Fig. 4 Network visualization by organization, WoS Core Collection data, 2001–2020

On the periphery of this network, we can observe how several organizations form small clusters of scientific collaboration. At the bottom of the map, for example, the pink cluster is composed of Korean universities with a medium number of publications and very active collaboration among themselves, but little cooperation with organizations from other countries. The same applies to the yellow cluster, composed of Brazilian and Italian organizations, with the difference that they work in a more isolated way.

Table 3 shows in more detail what the previous network introduced. The University of Hong Kong and the Chinese Academy of Sciences have the largest numbers of publications and high total link strengths. Many of these publications were produced in collaboration between the two, which is why their nodes appear very close in the network. These two organizations also collaborate with other centers, but they do not play a central role like that of the organization in second position, the Centers for Disease Control and Prevention of the United States. This research organization has a high number of documents and a high total link strength, with many of its links involving different international organizations, which is why it sits in the middle of the map. The situation is similar for the Ministry of Health of Saudi Arabia: it is close to the middle of the map and ranks third in total link strength. It is the first non-Chinese organization with more scientific collaborations.

Table 3 Number of documents, total link strength, citations and percentage of international collaboration per organization—2001–2020

The University of Hong Kong leads in the number of documents (456), in total link strength (572) and, as Table 3 shows, in the number of citations from 2001 to 2020. Its leading position, with 26,101 citations, is evident: the figure is more than double that of the Centers for Disease Control and Prevention (10,583 citations) and the Chinese Academy of Sciences (10,500 citations), which occupy the second and third positions, respectively, with similar numbers of citations. The first European organization is the University of Utrecht, which has a high number of documents (189) and citations (8102). This organization is close to the middle of the previous network (Fig. 4), which means that it also has a high number of scientific collaborations with other international organizations. Table 3 suggests that the impact of these organizations is closely related to their number of publications and collaborations. It is not surprising that the University of Hong Kong, the leading organization of the last 20 years, is the world benchmark for coronavirus research.

In this study, it was important to compare the results of the last two decades with those of the last 15 months (2019 and the first 12 weeks of 2020), the period in which COVID-19 became a global pandemic, with more than a million infections and several thousand deaths by April 2020. The interests of researchers and organizations have shifted towards studying the coronavirus and finding a solution.

In this network (Fig. 5), the clusters of Chinese organizations (blue, pink, and red) have moved to the middle of the map. This means that the number of collaborations has increased and that research on COVID-19 is now distributed across many Chinese centers rather than concentrated in a few, as in the last two decades. Moreover, the role of these Chinese organizations has grown in relation to the rest of the world, with several Chinese clusters connected to various other clusters. The position of the US cluster has also changed with respect to the previous map: US organizations have moved to a more peripheral zone and have lost their leading place in the middle. We can also observe that more US organizations have appeared in these last 2 years with stronger collaboration with Chinese organizations, a situation that did not occur previously.

Fig. 5 Network visualization per organization, WoS Core Collection data, 2019–2020

The most relevant finding of this comparison is that China has increased its number of publications, which are now distributed across different scientific organizations and not only a few centers. In these last months, eleven Chinese organizations are leading the area with several publications (Table 4). This affects the dominant role of the University of Hong Kong and the Chinese Academy of Sciences, which still occupy the first two positions but whose numbers of publications are not so high compared with the other organizations. As for collaborations with foreign organizations, the trend has moved towards more collaboration than in the last two decades. Outside China, a South Korean organization, Seoul National University, appears in the fourth position, and the Ministry of Health of Saudi Arabia has a relevant role in collaborations. The first European university is the University of Utrecht, confirming its leading role in the global trend (2001–2020); it also stands out for its cooperation with foreign organizations. As observed in the previous network, US organizations have completely disappeared from the first positions. The first one that appears, the National Institute of Allergy and Infectious Diseases, is in the twelfth position.

Table 4 The number of documents, total link strength, citations and percentage of international collaboration per organization—2019–2020

Mapping the movement to open access by coronavirus publications and references

In this section, our primary objective is to contribute to the mapping of OA coronavirus publications and their cited references, observing how OA evolves over the period. To this end, it is important to analyse whether the growth of OA in coronavirus research is associated with funding and whether it is associated with international collaboration. Another aspect of interest is the obsolescence of the literature that supports the research under consideration: we hypothesise that cited references become more recent as productivity rises in times of epidemic/pandemic. Finally, we explore how the most prolific countries move forward and widen access to coronavirus scientific knowledge. The cited references, in turn, indicate whether the greater availability of openly accessible literature is actually being used to advance the research itself.

Table 5 shows the number and percentage of articles in OA. The percentage grows almost every year, especially in 2020, when it grows 22.4% in relation to the previous year. Comparing the percentages of the different OA types, we see that Bronze predominates in the first 3 years, after which Green takes the lead until 2011, and again in 2013. In 2012 and from 2014 to 2019, Gold-DOAJ is the preferred type; in the present year it shares the top positions with Bronze, which actually takes the lead. This huge increase is due to the announcement by many commercial publishers that they would give free access to COVID-19 and coronavirus-related articles published in their subscription journals. It is also worth highlighting the current prominence of DOAJ journals in widening access to coronavirus scientific knowledge. At the end of this section, we compare the relative importance of the other types against Gold-DOAJ for the different countries.

Table 5 Annual distribution of articles according to whether or not they are in OA journals, and percentage in OA (distinguishing the OA types), 2001–2020

Figure 6 presents the temporal evolution of OA in the coronavirus literature, as well as its obsolescence. The total number of articles shows some variation, decreasing between one epidemic and the next. The joint analysis of OA among cited references and articles (Fig. 6a) shows that, from 2010 to 2019, the percentage of OA increases almost constantly, from ~55% to ~75% among articles and from ~10% to ~25% among cited references. In the preceding interval, its variation was somewhat erratic. In the current year, it improves significantly in publications but decreases in cited references. Figure 6b breaks down the percentage of OA articles by funded versus non-funded research and by internationally collaborated versus non-collaborated research. Publications that acknowledge funding are more often OA than those that do not, especially in the 3 years with the largest gap between the two (~37%): 2004, 2007 and 2011. As observed in Table 5, the Green type grows through these years while Bronze drops. After that, DOAJ-Gold takes the lead until 2019, and the gap in the percentage of OA articles between funded and non-funded research progressively narrows. To a lesser extent, the presence of foreign collaborators is also associated with OA publication (the average gap between 2010 and 2014 is around 15%). Finally, it is important to highlight that the lines converge in 2020, when the differences among categories vanish and the percentage of OA reaches its maximum value, surpassing 90%.

Fig. 6 Annual distribution of: a average percentage of OA in cited references versus the percentage of OA in articles; b percentage of OA in articles by funding or collaboration; c average percentage of OA in cited references by funding or collaboration; and d average percentage of Price Index (references) by funding or collaboration, 2001–2020. Note: in order to guarantee consistent percentages for the cited references of articles, we have discarded articles with fewer than 10 references (corresponding to ~3%)

Figure 6c, d focus on the cited references. The former measures the average percentage of OA in relation to the presence (or absence) of funding and shows no significant difference between the trends. The presence of foreign co-authors, by contrast, shows some positive difference, meaning that more cited references are to OA journals, mainly in the 2012–2019 period. However, we have observed that references to OA journals are scarce, mainly until the middle of the period, after which they leave the level of 10% and reach about 25% in 2019. If the greater, recent and growing availability of articles influences this tendency among cited references, then the references should be getting younger, that is, the Price Index should be increasing. As pointed out by Larivière et al. (2007), the scarcity of scientific production implied a decrease in the Price Index (as they observed for the two world wars). Conversely, Fig. 6d shows three peaks that coincide with periods when scientific production about the epidemics was high: one in 2004, following the SARS crisis; a second between 2014 and 2017, after MERS-CoV; and a third now, during the SARS-CoV-2 pandemic. In this specific case, we call attention to the increase in cited references to OA journals by non-funded articles.
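To make the obsolescence indicator concrete, the following minimal sketch computes a Price Index per article and averages it over a set of articles. It assumes the common definition of the Price Index as the percentage of cited references published no more than 5 years before the citing article; the exact window used in this study is not restated here, so `window` is a parameter. The threshold of 10 references follows the note to Fig. 6. Function names and toy values are hypothetical.

```python
def price_index(publication_year, reference_years, window=5):
    """Percentage of references at most `window` years older than the article.

    The 5-year default is an assumption of this sketch (a common
    convention), not a value taken from the study.
    """
    if not reference_years:
        raise ValueError("at least one dated reference is required")
    recent = sum(1 for year in reference_years
                 if publication_year - year <= window)
    return 100.0 * recent / len(reference_years)


def mean_price_index(articles, min_references=10, window=5):
    """Average Price Index over articles with enough references.

    `articles` is an iterable of (publication_year, reference_years)
    pairs. Articles with fewer than `min_references` references are
    discarded. Returns None when no article qualifies.
    """
    values = [price_index(year, refs, window)
              for year, refs in articles
              if len(refs) >= min_references]
    return sum(values) / len(values) if values else None


# Hypothetical toy articles, not data from the study.
toy_articles = [
    (2020, [2020, 2019, 2018, 2016, 2015, 2014, 2010, 2005, 2003, 2003]),
    (2020, [2019, 2004, 2003]),  # fewer than 10 references: discarded
]
print(mean_price_index(toy_articles))
```

A rising value of this average over the years corresponds to the peaks described for Fig. 6d, since it means that a larger share of the cited literature is recent.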

Considering the scientific production of the 20 most prolific countries in each of the periods, we have compared the average percentage of OA in cited references with the percentage of OA in articles. In order to analyse the changes in both OA variables by country, we ran a cluster analysis to group countries with similar behavior profiles, considering the relative growth between the periods in each variable (Fig. 7). The changes between periods for all countries in each cluster are presented below (Table 6), where the other variables can also be analyzed: international collaboration, funding and Price Index.
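The grouping step can be sketched as follows. The clustering algorithm is not specified above, so average-linkage agglomerative clustering on Euclidean distance is used here purely as an assumption, and the country labels and percentages are hypothetical placeholders rather than values from Table 6.

```python
from itertools import combinations
from math import dist


def relative_growth(earlier, recent):
    """Relative growth between periods, in percent of the earlier value."""
    return 100.0 * (recent - earlier) / earlier


def agglomerative_clusters(points, n_clusters):
    """Average-linkage agglomerative clustering of labelled 2-D points.

    `points` maps a label to an (x, y) tuple. The linkage criterion is
    an assumption of this sketch, not the method of the study.
    """
    clusters = [[label] for label in points]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(dist(points[a], points[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return [sorted(cluster) for cluster in clusters]


# Hypothetical shares: ((OA articles earlier, recent), (OA references earlier, recent)).
toy_shares = {
    "Country A": ((50.0, 75.0), (10.0, 20.0)),
    "Country B": ((60.0, 93.0), (10.0, 19.0)),
    "Country C": ((80.0, 72.0), (20.0, 22.0)),
    "Country D": ((50.0, 48.0), (20.0, 23.0)),
}
growth = {
    country: (relative_growth(*articles), relative_growth(*references))
    for country, (articles, references) in toy_shares.items()
}
clusters = agglomerative_clusters(growth, n_clusters=2)
print(clusters)
```

With these placeholder values, the two countries whose OA shares grow strongly in both variables end up in one group and the two with little or negative growth in the other, which mirrors the kind of profiles shown in Fig. 7.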

Fig. 7 Distribution of countries, in each period and cluster, according to average percentage of OA in cited references versus percentage of OA in articles. Note: in order to guarantee consistent percentages for the cited references of articles, we have discarded articles with fewer than 10 references (corresponding to ~3%)

Table 6 Cluster of countries and change between periods, in the percentage of articles and cited references, according to different variables

The seven OA clusters are identified in the scatter plot by markers with different geometric shapes, while the periods are differentiated by different shades of the same color (Fig. 7). In general, we observe that in the last 2 years the countries grow in both variables, with the exception of Belgium (alone in cluster 2), whose percentage of OA articles decreases in the recent period. This may be due to its small number of articles in the recent period, which denotes outlier behavior; the same applies to Singapore, Taiwan and Vietnam (for this reason they do not present data bars in Table 6). Taiwan forms the other single-country cluster (7), owing to its remarkable increase in the percentage of cited references in OA journals, which has to be interpreted carefully.

The fourth cluster is formed by Canada and Italy (shades of purple), which increase their percentage by at least 60% in each of the variables. In Table 6 we also observe that Canada increases its percentage of international collaboration (we have already observed the increase in its TLS in Tables 1, 2) and of funding, while Italy increases in the Price Index. It is interesting to compare this cluster with the third one (shades of green), formed by Brazil and Singapore, which present a similar pattern of increase in cited references but a smaller one in articles. However, Singapore improves more in the percentage of international collaboration and funding, while Brazil has the second highest increase in the Price Index (Table 6). The situation is not very different for cluster 6 (shades of red), formed by India, Spain, Switzerland, USA and Vietnam, which differs in its lower increase in the percentage of OA publications in the recent period. In Table 6 we observe that Spain and Switzerland increase their percentage of funded research, and India and USA their percentage of international collaboration.

Cluster 5 (shades of brown) is composed of Egypt and Saudi Arabia, which present the highest percentage of cited references to OA journals but show no increase in the recent period. Egypt decreases its percentage of international collaboration, while both countries modestly increase their percentage of funded research. Their Price Index deserves attention, as it decreases by more than 12%, meaning less attention to recent articles in the recent period. They are probably more focused on the literature about the previous epidemics. Finally, cluster 1 (shades of blue) is the biggest one, with Australia, China, the majority of European countries, Japan and South Korea. These countries increase in both OA variables, tending to improve more in the cited references. As for the other variables (Table 6), we highlight a general increase in international collaboration (except in China, where it declines, as observed in Table 2) and in funding (with the exception of Sweden), while the Price Index increases in few cases (Australia and South Korea; Sweden has to be considered carefully due to its few publications in the recent period).

As we observe, OA plays an important role in all countries and has been receiving more attention in recent years. However, this growth is accompanied by a tendency to cite more recent literature only in the cases of Australia, Brazil, India, Italy and South Korea. One variable whose change between periods correlates with this tendency, albeit negatively, is funding, which increases less in all these cases (except India, where it decreases).

Finally, we focus on the relative importance of the other types against Gold-DOAJ (Fig. 8). Considering that part of the articles in the other types can be openly accessible after payment by the author (Robinson-Garcia et al. 2020), it is important to compare the share of this part of the production with the share in DOAJ journals. The same authors find that the Green type is strongly represented in Europe and North America, while in South America Gold has a comparable level of importance. In Fig. 8, the majority of the European countries, USA and Canada are distributed along the x-axis (which relates to the complete period) with a ratio of at least 1.5 (almost 3.0 for Spain and USA). Japan is the positive outlier in this direction, showing that DOAJ-Gold is not its usual option (its authors prefer to publish in hybrid journals). On the opposite side we find Brazil, Egypt and Saudi Arabia, for which DOAJ-Gold is the favourite choice. China is in the middle, not far from Taiwan and Australia.

Fig. 8 Distribution of countries according to the relative importance of the other types against Gold-DOAJ, with a different period on each axis. Note: the relative indicator consists of the ratio between the share of the other types (Bronze, Green and Other-Gold) and the share of DOAJ-Gold

Considering the y-axis, we highlight those countries that increase the ratio in the recent period, which are Brazil and Italy. In the opposite situation, we find Egypt, Canada, Japan and Spain, whose ratio decreases significantly.
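The relative indicator plotted in Fig. 8 can be reproduced with a few lines. This is a minimal sketch under the assumption that each article is assigned to exactly one OA type, so that the ratio of shares reduces to a ratio of counts; the function name and the counts are hypothetical.

```python
OTHER_TYPES = ("Bronze", "Green", "Other-Gold")


def other_to_doaj_ratio(counts):
    """Ratio between the share of other OA types and the DOAJ-Gold share.

    `counts` maps an OA type to a number of articles. Both shares have
    the same denominator (all articles of the country and period), so
    the denominator cancels out.
    """
    doaj = counts.get("DOAJ-Gold", 0)
    if doaj == 0:
        raise ValueError("ratio undefined without DOAJ-Gold articles")
    other = sum(counts.get(oa_type, 0) for oa_type in OTHER_TYPES)
    return other / doaj


# Hypothetical counts for one country and one period.
toy_counts = {"DOAJ-Gold": 40, "Bronze": 30, "Green": 50,
              "Other-Gold": 20, "Not OA": 60}
ratio = other_to_doaj_ratio(toy_counts)
print(ratio)
```

A ratio above 1 means that the other types together outweigh DOAJ-Gold, as for the countries spread along the x-axis of Fig. 8, while a ratio below 1 indicates that DOAJ-Gold is the dominant route.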

Discussion

We have explored the capacity of researchers to generate scientific knowledge in the face of a health emergency, and their capacity to collaborate with one another on a global problem. We have also observed how different well-established clusters of countries and organizations are maintained over the two periods studied. The United States and China are the major producers of documents in our sample, followed by the European countries, especially the United Kingdom, Germany, the Netherlands and France. At the same time, the leading role of Saudi Arabia, Canada and South Korea should be noted, with a significant number of documents but very different dynamics of international collaboration.

The case of the USA and China deserves attention, since their outputs are proportional to the size of their scientific infrastructure. One must therefore consider the rise of China, which in 2006 took second place in the ranking of articles in WoS (Leydesdorff and Wagner 2009). If we look at the production of these countries in the periods considered in our study, the InCites platform (Clarivate Analytics) shows that the USA accounts for 22.2% of the world share in 2001–2020 and 18.8% in 2019–2020, while for China these percentages are 9.4% and 14.7%, respectively. We therefore note the significant rise of China (57.1%) and the decrease of the USA (−17.3%). On the other hand, when we update our search in WoS on coronaviruses, the USA presents 32.6% for 2001–2020 and 28.4% for 2019–2020 (−12.7%), while China presents 24.8% and 25.2%, respectively (1.7%). Just over 2 months after the previous searches, the USA takes the lead in research on COVID-19. This change of positions is to be expected in a context where science reacts so quickly to face the pandemic. In addition, the proportion of international collaboration is growing in the great majority of countries in the last period, which contrasts with the situation of the last two decades. China is the great exception, with a decreasing share of articles in international collaboration (−16%, as shown in Table 6), although we cannot directly link this change to the impact of the health emergency generated by COVID-19.
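The percentage changes quoted above are relative changes between the share of the whole period and the share of the recent period. As a minimal sketch, assuming the usual formula (recent − earlier) / earlier, the calculation for the updated WoS coronavirus search looks as follows; the function name is hypothetical and the inputs are the rounded shares given in the text.

```python
def relative_change(earlier_share, recent_share):
    """Relative change between two shares, in percent of the earlier one."""
    return 100.0 * (recent_share - earlier_share) / earlier_share


# Rounded shares (%) from the updated WoS search on coronaviruses:
# (share in 2001-2020, share in 2019-2020).
wos_shares = {
    "USA": (32.6, 28.4),
    "China": (24.8, 25.2),
}
changes = {country: relative_change(earlier, recent)
           for country, (earlier, recent) in wos_shares.items()}
for country, change in changes.items():
    print(f"{country}: {change:+.1f}%")
```

Because the inputs are rounded to one decimal place, the results can differ by a few tenths of a percentage point from the reported −12.7% and 1.7%, a difference that is within what such rounding allows.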

When we look closely at the organizations, the picture changes. The organizations that contribute the most documents to the sample are mostly Chinese. The proportion of international collaboration differs widely between organizations in the same country, especially in the case of China and, to a lesser extent, the US. However, if we look at the organizations whose works have the highest numbers of citations, we see that they have high international collaboration rates, and that these rates increase proportionally in organizations from all countries, including Chinese organizations with high contributions to the sample. A comparison with previous studies (Kostoff and Morse 2011; Zyoud 2016; Zhang et al. 2020) shows that new countries have emerged in this network only in the last 2 years. Small and emerging countries have taken advantage of international collaborations to enhance their visibility in scientific scenarios. These new countries have started to connect with other parts of the world, like China, South Korea and Saudi Arabia.

The important role of open science in providing immediate access to scientific publications is increasingly evident. The percentage of open access articles on coronavirus for the period 2001–2020 is 59.2%, but if we focus on 2020 the figure rises to 91.4%, due to the commitment of commercial publishers to the emergency (they have provided free access to their contents). Not by chance, our analysis has shown that fully open journals were the main open access type until 2019. On the other hand, it is worth noting that, with the pandemic, they give way to the Bronze type, that is, open access without a guarantee of permanence, which depends on the goodwill of commercial publishers. These are very relevant figures, never achieved before. Commercial publishers are implicitly assuming that open access ensures the best results and effectiveness. It is important that this practice continue after the COVID-19 pandemic and that it be extended to all topics. Plan S (cOAlition 2019) can be a good strategy, and it may be adopted by a greater number of funding agencies.

It was important to observe that coronavirus research is giving importance to openly accessible scientific literature, since more of its cited references are directed to fully open journals. This indicates that these journals have a growing impact. Therefore, the concern of Plan S about the possible lack of quality open access journals (in certain areas of knowledge), and the consequent responsibility of funders to empower them (cOAlition 2019), must be taken seriously. In addition, this analysis sheds light on the matter, since the great majority of countries are increasing their attention to these journals.

However, the group of countries formed by Australia, Brazil, India, Italy and South Korea, which in the recent period of the pandemic place more emphasis on recent literature, caught our attention. In addition, their latest publications indicate that they neither depend on funding (except Australia) nor have depended much on international collaboration (except India). Such characteristics may result from a clear and almost self-sufficient reaction to the situation.

Limitations

Several limitations affect this study. The first is the restriction to a single database, WoS, which does not include journals in other languages that are indexed in Scopus and SciELO, among others. WoS data sources have some biases (Packer 2011; Mongeon and Paul-Hus 2016; Lucio-Arias et al. 2015), as publications in English, publications in the Natural Sciences, Ecology and Biomedicine, and publications authored by researchers from the Netherlands, the United Kingdom and the USA are overrepresented. In contrast, all other countries and languages are underrepresented (Belli and Baltà 2019).

The second limitation is the limited reliability of OA data in WoS for publications before 2009, because the distinction between OA and non-OA publications was not well defined before that year.

The third limitation is the impossibility of capturing the most recent publications from the last weeks, when the COVID-19 outbreak was declared a pandemic by the World Health Organization. Although the searches were designed to be replicable, it must be underlined that the WoS databases are constantly updated and expanded, not only with the most recent works but also with older ones not previously included. As a consequence, if any researcher reproduces the same searches today, the number of records for these samples and this timespan will increase.

However, we are sure that more bibliometric analyses will be published in this journal in the coming months to follow up this study.

Conclusions

Since the beginning of 2020, when the COVID-19 pandemic created a global health emergency, the scientific community has concentrated its efforts on promoting research on the disease from different parts of the world. Although coronavirus research started many years earlier, only in the last period has the number of scientific publications increased significantly. Through our bibliometric analysis, we have mapped a territory unknown until now, which is useful for determining the evaluation of coronavirus publications in the world and the principal characteristics of these scientific publications.

Observing the trend of coronavirus publications, we consider that science is moving rapidly to find solutions to international health problems, but that society's access to this knowledge is not so quick due to several limitations, such as open access policies and corporate interests. Open science is the best way to face global emergencies such as the COVID-19 pandemic. International collaboration in research and open access to publications, two of its main characteristics, can contribute decisively to obtaining faster and more effective results. This crisis shows the importance of these open science practices. This type of study allows us to imagine and build future scenarios for scientific collaboration, creating a “common vision” and scientific policy recommendations to improve the numbers of scientific collaboration.