The most influential COVID-19 articles: A systematic review

Background: Since December 2019, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative pathogen of coronavirus disease 2019 (COVID-19), has triggered a pandemic that has challenged health care systems around the world. Researchers have studied and published extensively on SARS-CoV-2 and the disease. What is the significance of articles published, shared and cited in the early stages of such a pandemic?
Materials and methods: A systematic literature search covering a time frame of 12 months was performed, and the results were analysed using Principal Component Analysis (PCA) and Multiple Factor Analysis (MFA).
Results: The 100 most cited COVID-19 articles were identified. The majority of these articles were from China (n = 54), followed by the United States of America (USA) (n = 21) and the United Kingdom (UK) (n = 8). All articles were published in high-ranked, peer-reviewed journals, with research focusing on the diagnosis, transmission and therapy of COVID-19. The level of evidence of the 100 most cited COVID-19 articles was, on average, low.
Conclusion: In the early stages of a pandemic, new and innovative research can emerge and be highly cited, regardless of the level of evidence.


Introduction
The onset of the COVID-19 pandemic was not only a test of resilience for the human race; it also put scientists through their paces. Because the virus was novel, there was initially a lack of literature to aid the medical workforce, and it fast became a race for scientists to contribute to the evidence base to guide the management of unwell patients. Newly proposed treatments based on anecdotal evidence were being used across the world; however, policy-makers and those treating patients on the 'front line' were unable to rely on such data alone for assurance that these novel treatments would be best for patient care. Some countries, such as the UK with its NICE guidelines, rely heavily on validated and peer-reviewed evidence to formulate treatment guidelines and regimens.
One of the largest barriers to clinical confidence in hastily published COVID-19 articles is the distinct lack of high-level evidence. Whilst this could largely be due to the lack of time alongside intense pressure to publish research, there may also be a general lack of understanding that results from case studies with small sample sizes cannot be extrapolated to entire populations.
This paper aims to highlight, understand and assess the top 100 most-cited articles published under the topic of COVID-19 through a systematic search using stringent inclusion and exclusion criteria. As shown in the results section, most papers originated from China (n = 54) and the USA (n = 21). Difficulties with the translation of Chinese papers were found to be an issue, although most were published in English. Chinese papers focused on diagnosis, mechanism, transmission and treatment, whilst Western papers focused mainly on transmission and treatment.
Using Principal Component Analysis (PCA) and Multiple Factor Analysis (MFA) of the filtered search results, this systematic review explores the possible correlations between objective metrics including number of citations, citation density, article age, hierarchical evidence level and impact factor. Our findings suggest that pioneering evidence was published and subsequently heavily cited regardless of the level of evidence (mainly levels IV and V). We hope that this review will be of use to those contributing to the evidence base in future time-pressured scenarios such as subsequent novel pathogen emergences.

Materials and methods
The Web of Science and the iSearch COVID-19 portfolio were used to retrieve citation information for published COVID-19 articles.
The Web of Science provides comprehensive citation data for articles published in Medline, Web of Science Core Collection, BIOSIS Citation Index, KCI-Korean Journal Database, Russian Science Citation Index, and SciELO Citation Index [1][2][3]. Topic fields of articles (title, abstract, author's keywords and keywords within a record) were searched for the following keywords:
Only COVID-related articles submitted after 31/12/2019 (first reported COVID-19 case) were included in the study, and the 100 most cited articles were identified and evaluated by two independent reviewers (Fig. 1).
COVID-19 articles were classified and assigned a level of evidence.
The levels of evidence (I-V) were adapted from the National Health and Medical Research Council (NHMRC) and The Centre for Evidence-Based Medicine (CEBM) [4].
Articles were categorized, using LitCovid, by research topic as follows: Clinical Features, Mechanism, Diagnosis, Treatment, Transmission, Prevention, Forecasting and General [5].

Statistical analysis
Statistical analyses were conducted using the R programming language. Normality of data was checked using the Shapiro-Wilk test.
*Any disagreements between the reviewers were resolved via consensus.

Results
Table 1 gives an overview of the top 100 most cited articles on COVID-19. All articles were published in 2020 (100%). The highest number of citations was 18958 and the lowest was 1410. The median age of the articles was 21 months (range 13-24). In terms of levels of evidence, 14 articles were level I, 7 were level II, 12 were level III, 45 were level IV and 22 were level V.
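The evidence-level counts reported above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not the authors' code (the study reports using R and Microsoft Excel); the variable names are hypothetical, and the counts are those given in this section.

```python
# Counts of the top 100 articles per level of evidence, as reported in the Results.
evidence_counts = {"I": 14, "II": 7, "III": 12, "IV": 45, "V": 22}

# The counts should account for all 100 articles.
total = sum(evidence_counts.values())

# Share of articles at the two lowest levels of evidence (IV and V).
lowest_two = evidence_counts["IV"] + evidence_counts["V"]
lowest_two_share = 100 * lowest_two / total

print(f"total articles: {total}")
print(f"levels IV and V: {lowest_two} articles ({lowest_two_share:.0f}%)")
```

The 67% share produced here is the figure compared against other citation-classics reviews in the Discussion.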
Articles published in China concentrated on the diagnosis, mechanism, transmission and treatment of COVID-19. On the other hand, articles published in Europe and the USA mainly focused on the transmission and the treatment of the virus (Table 3).
Principal Component Analysis (PCA) revealed a strong correlation between the number of citations and the citation density. Furthermore, there was also a strong correlation between the age of the article, the level of evidence and the impact factor. There was a significant trend towards increased frequency of citations with age of the article (r = 0.26, p = 0.0004). The number of citations an article had was not significantly associated with the level of evidence (r = 0.152, p = 0.152) (Fig. 4).

Discussion
This systematic review identified the 100 most cited articles on COVID-19 and sought to identify trends within them by applying citation analysis techniques. In late 2019, the COVID-19 pandemic presented one of the greatest challenges of the modern scientific era. With an estimated 503,862 deaths worldwide reported within the first 6 months of 2020 [6], the gravity and urgency of the problem required rapid advancement in knowledge to a degree not previously seen. It is unsurprising that, with the amount of funding and resource invested, great volumes of scientific literature were produced in a relatively short period of time. What is surprising is the degree to which this occurred. Despite the first case of COVID-19 being only
All the articles were published in 2020, with a median article age of 21 months (range 13-24 months). There was a weak but significant association between age and citation number: citations take time to accumulate, and consequently more recently published articles may not yet have achieved sufficient citations to enter the review. The weakness of this relationship is likely a result of the short time frame over which the articles have accrued their citations. The most highly cited article has been cited 18958 times and had a citation density of 790; the median number of citations was 2434.5 (IQR 1989.5-3749.0) and the median citation density was 117.5 (IQR 89.5-185.2). This is particularly impressive as a variety of other citation classics have reported significantly lower median citations despite covering time periods of many years [2,[13][14][15][16]. On average, a journal article will peak in citation density approximately 3 years after publication [17], which presents a potential problem in applying citation analysis to a novel and rapidly evolving field. The strong correlation between citation density and citation number suggests that highly cited articles continue to be cited and may be establishing 'authority' status. Given the ongoing expansion in literature, there is a risk that articles considered powerful by traditional metrics may already be scientifically out of date but not yet past their peak in terms of citation accrual.
Of the articles, 54% originated from China, which is unusual for citation classics reviews. Similar reviews on other topics tend to draw most of their articles from the USA [1,2,[13][14][15]. This is likely explained by the early geographic distribution of cases, which would have granted a significant advantage to China-based labs, resulting in earlier publication and thus citation accrual. Interestingly, the USA provided almost half of the remaining articles, which, allowing for the above explanation, is in keeping with what would be expected. The early geographical distribution of cases may also explain why diagnosis played a significant role in articles from China but not in those from the rest of the world.
Articles representing evidence levels IV and V account for 67% of those identified. Whilst citation classics often include the lowest levels of evidence, it is seldom to this degree. For example, a review of general medical articles found that 38% of articles were drawn from the lowest two levels [1], and another review of GI surgery found 44% [2]. Only 7 RCTs were identified, which is significantly lower than what would have been expected. It must be considered that higher levels of evidence such as RCTs (and systematic reviews of these) can take many months to conduct. It is probable that the lack of high-level evidence, and the overrepresentation of lower levels of evidence, is partially a result of the literature not yet reaching maturity. Another interesting finding of this review is the degree to which high impact factor journals are publishing low levels of evidence. It has been previously shown that in the top three general medical journals (The Lancet, New England Journal of Medicine and Journal of the American Medical Association) the level of evidence represented by an article regarding COVID-19 was significantly lower when compared to both contemporary and historic controls [9].
The main limitation of this review is the time at which it was conducted; this makes comparisons to similar reviews of different topics difficult. Due to the short publication span of the papers, the definition of citation density had to be modified, using a reference period of a month rather than a year. It is likely that as the literature around COVID-19 matures, trends in publications will change. It is possible that in the early stages of an emerging topic, traditional citation metrics may not be the most reliable way of identifying the most influential research in the longer term. Presence on social media may play an important role in the identification of future influential articles; the number of tweets within the first 7 days of a publication has been shown to correlate with high levels of citation [18]. The simple and easily repeatable methods of this review, however, allow for a later comparative review to examine how these trends have changed.
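The modified citation density described above (citations per month since publication, rather than per year) can be illustrated with a short Python sketch. This is not the authors' code; the function name `citation_density` is hypothetical, and the article age of 24 months is an assumption (the upper end of the reported 13-24 month range) chosen because it reproduces the reported density of 790 for the most cited article.

```python
def citation_density(citations: int, age_months: float) -> float:
    """Citations accrued per month since publication (monthly reference period)."""
    if age_months <= 0:
        raise ValueError("age_months must be positive")
    return citations / age_months


# Most cited article: 18958 citations (reported in the Results).
# Assumption: article age of 24 months, the top of the reported age range.
top_density = citation_density(18958, 24)
print(round(top_density))  # 790
```

With the conventional yearly reference period, the same article would have a density of 18958 / 2 = 9479 citations per year, which is why densities from this review are not directly comparable with those in reviews that use the yearly definition.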

Conclusion
This review has collated the 100 most influential COVID-19 papers and assessed trends within them. We have established that in the early phases of a pandemic new and ground-breaking research surfaces regardless of the evidence level and can gain high levels of citation.

FIG. 1. Flow diagram demonstrating the methodology and data extraction.

FIG. 3. Number of publications and citations per country.

TABLE 1. Overview of the top 100 cited COVID-19 articles (* next to rank number indicates systematic review).
Following the Shapiro-Wilk normality check, the distribution of a parameter was characterised by the median and interquartile range. The Kendall rank correlation coefficient was used to measure the ordinal association between two values. Multiple Factor Analysis (MFA) was used to analyse quantitative variables simultaneously. A p-value of <0.05 was considered statistically significant. Microsoft Excel software was used for descriptive statistical analysis.
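The descriptive and rank-based statistics named above can be sketched in a few lines. The Python example below is illustrative only: the authors worked in R, the source does not state which variant of Kendall's coefficient was used (the tie-corrected tau-b is assumed here), and the function names and toy data are hypothetical and not taken from the review. No p-value is computed.

```python
from itertools import combinations
from math import sqrt
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3) for a sequence of numbers."""
    # statistics.quantiles uses the 'exclusive' method by default, so the
    # quartiles can differ slightly from the defaults of other packages.
    q1, _, q3 = quantiles(values, n=4)
    return median(values), q1, q3


def kendall_tau_b(x, y):
    """Kendall rank correlation, tau-b variant (assumed; corrects for ties)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both denominator terms
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Hypothetical toy data, NOT values from the review.
age_months = [13, 15, 18, 20, 24]
citations = [1500, 2100, 1900, 3200, 2800]

tau = kendall_tau_b(age_months, citations)
print(f"tau-b = {tau:.2f}")  # tau-b = 0.60
print(median_iqr(citations))
```

A rank-based coefficient suits citation counts because they are heavily skewed; the result depends only on the ordering of the articles, not on the size of the gaps between them.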

TABLE 2. Journals in which top 100 cited COVID-19 articles were published with accompanying journal metrics.

TABLE 3. Article topic fields.