Super-spreaders of novel coronaviruses that cause SARS, MERS and COVID-19: a systematic review

Purpose: Most index cases with novel coronavirus infections transmit disease to just one or two other individuals, but some individuals “super-spread”: they infect many secondary cases. Understanding common factors that super-spreaders may share could inform outbreak models, and be used to guide contact tracing during outbreaks.
Methods: We searched in MEDLINE, Scopus, and preprints to identify studies about people documented as transmitting pathogens that cause SARS, MERS, or COVID-19 to at least nine other people. We extracted data to describe them by age, sex, location, occupation, activities, symptom severity, any underlying conditions, and disease outcome, and undertook quality assessment for outbreaks published by June 2021.
Results: The most typical super-spreader was a male age 40+. Most SARS or MERS super-spreaders were very symptomatic, the super-spreading occurred in hospital settings and frequently the individual died. In contrast, COVID-19 super-spreaders often had very mild disease and most COVID-19 super-spreading happened in community settings.
Conclusions: SARS and MERS super-spreaders were often symptomatic, middle- or older-age adults who had a high mortality rate. In contrast, COVID-19 super-spreaders tended to have mild disease and were any adult age. More outbreak reports should be published with anonymized but useful demographic information to improve understanding of super-spreading, super-spreaders, and the settings in which super-spreading happens.


Introduction
During the COVID-19 pandemic, the role of super-spreaders in the transmission of the disease became a widely discussed topic within the scientific community and in the media. Throughout 2020, the media reported COVID-19 super-spreading in fitness classes, religious worship, skiing trips, and public transport. It has long been noted that for a variety of diseases, 20% of the host population has the potential to cause 80% of transmission (20/80 rule) [1]. Fully understanding the role of super-spreaders could enable more effective containment of disease outbreaks and more accurate modeling of epidemics. Studies have described super-spreading events for SARS [2,3], MERS [4,5], and COVID-19 [6,7]. However, to date, there has been little focus on the characteristics of the individual super-spreader and what factors might make an individual into a super-spreader [8][9][10]. By compiling details from super-spreading events across three novel coronavirus (nCoV) outbreaks, we aimed to assess whether there are any common factors between super-spreaders.
This information may be used to guide contact tracing during outbreaks, aid early identification of super-spreaders, and prevent super-spreading, therefore helping to reduce total transmission and reduce harm from future pandemics [9][10][11].
Reasons offered to explain why only some cases become superspreaders can be grouped into four categories: transmission pathways suited to that pathogen (e.g., droplet, fecal-oral, contact with bodily fluids, etc.), biological/individual factors (e.g., age, sex, ethnicity, viral load, duration of shedding), behavioral factors (e.g., type of work, number of contacts), and environmental factors (e.g., air circulation in workplace) [12]. We undertook this review to research attributes of individuals believed to be index patients in superspreading events of nCoVs that cause any of MERS, SARS, or COVID-19, using evidence published by June 2021. We were interested in specific individual, behavioral and environmental factors that were likely to be most available in outbreak reports: transmission setting, sex, age, ethnicity, occupation, how many cases were in the cluster, severity of symptoms, disease course outcomes, and comorbidities.

Methods
We defined a "super-spreader" as an index case that was described in the scientific literature as linked to at least nine secondary infections with eligible viruses. We chose nine as the threshold to be indisputably much higher than the commonly cited effective reproductive numbers (between 2 and 5) for the nCoV diseases: SARS, MERS, and COVID-19 [13][14][15]. There is no formal consensus on how to define super-spreaders relative to the typical reproductive number for a stated pathogen [16]. Super-spreading was defined as "above the average number of secondary cases" for SARS [15]. Others suggested that super-spreaders should be defined as the 1% of index cases that generated the most secondary infections [9,17,18]. We acknowledge that other definitions of "super-spreader" have merits.
We were only interested in transmission between humans outside of deliberate experiments. We refer to "super-spreading events" as events where super-spreading is believed to have happened. Events ranged from a small birthday party to the annual Hajj pilgrimage in size and scope.
There was ambiguity in primary studies about how many secondary infections were directly linked to each index case. Different studies identified different numbers of secondary cases for the same index case, sometimes fewer than nine. We tried to be inclusive about this eligibility criterion: if at least one peer-reviewed assessment identified a minimum of nine secondary infections, we included the individual as a super-spreader. Because other sources describing the same index case may have suggested fewer secondary cases, we report the full range of counts identified in the relevant literature for each index case, even where some of these counts were below nine.
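The inclusion rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Report` record, its field names, and the example values are hypothetical and do not come from the review's extraction sheets.

```python
from dataclasses import dataclass
from typing import List, Tuple

THRESHOLD = 9  # minimum secondary infections, as stated in the Methods


@dataclass
class Report:
    """One published assessment of an index case (hypothetical structure)."""
    source: str           # label for the publication (illustrative only)
    secondary_cases: int  # secondary infections attributed to the index case
    peer_reviewed: bool


def is_super_spreader(reports: List[Report]) -> bool:
    """Include the index case if any peer-reviewed report reaches the threshold."""
    return any(r.peer_reviewed and r.secondary_cases >= THRESHOLD for r in reports)


def secondary_case_range(reports: List[Report]) -> Tuple[int, int]:
    """Full range of reported counts, which may include values under nine."""
    counts = [r.secondary_cases for r in reports]
    return min(counts), max(counts)


# Made-up example: two reports disagree about the same index case.
example = [Report("study A", 7, True), Report("study B", 11, True)]
print(is_super_spreader(example), secondary_case_range(example))
```

Keeping the eligibility test separate from the reported range mirrors the text: eligibility depends on the highest credible count, while the reported range keeps every published estimate.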
We undertook a structured search of scientific bibliographic databases using the search terms in Box 1. Note that the search engines returned partial matches for the phrases in this search; that is, searching for SARS would match any of SARS (the disease), SARS-CoV (original name of the virus that causes SARS), SARS-CoV-1 (the new name of the virus that causes SARS) or SARS-CoV-2 (the virus that causes COVID-19). Searches were completed in June 2021. The search results were combined into a single database and de-duplicated. We chose not to use data from public-access inventories of apparent nCoV super-spreader events because we wanted to confine our efforts to the most validated evidence available to date, using evidence that had been collected by applying criteria that we understood, that were designed to have minimal bias, and that we could describe transparently. This review has PROSPERO registration CRD42020190596.

Screening
Eligible publications were published in 2002 or later. Our latest search date was June 18, 2021. We considered publications in all languages if we could translate them into English or Spanish. Two independent screeners applied eligibility criteria, with a third person deciding if no consensus could be reached. Studies had to be chosen by at least two reviewers to go to full-text review. Full-text review confirmed eligibility criteria. The study had to describe infection(s) confirmed by a reverse transcription-polymerase chain reaction test (rt-PCR), cell culture or clinical presentation, and known exposure history. Eligibility criteria were:
• The study must describe one of these three nCoV diseases: MERS, SARS, COVID-19.
• Study design: almost any, but case or case-cluster studies were preferred. Preprints or gray literature from credible sources could be used and could be supported by sources such as public statements by individuals themselves, interviews, and press releases.

Extraction and synthesis
One investigator undertook initial extractions which were checked by another researcher, with differences resolved by discussion or a third opinion. We tabulated and descriptively summarized this information separately for each disease. The extracted information was grouped by virus and/or linked disease (MERS, SARS, COVID-19). Pooled summaries were only undertaken if there were at least 20 super-spreaders found for any disease. No pooling of data beyond descriptive statistics (such as median age or percentage female) was attempted for smaller groups. We expected that much information would be heterogeneous and need to be reported as stated by the authors. Throughout the results, we often report raw numbers (such as numbers of persons found in a specific age group) but note that because the COVID-19 pandemic was much bigger than SARS or MERS outbreaks, the results should be considered relative to all cases (proportionately) for each specific disease.

Box 1
Phrases used for scientific bibliographic database searches.

Activities when super-spreading
We reported the activities that super-spreaders seem to have been doing when spreading infection (such as being an inpatient, working in a call center, attending a fitness class, etc.).

Setting
We determined where super-spreaders had their impact, in settings such as households, hospitals, places of worship, etc. Additionally, the country was reported, as was a city or other regional information when available. For reporting purposes, the setting was grouped with activities when super-spreading.

Age and sex of the super-spreader
Age (in years) and sex were reported.

Ethnicity and occupation
We knew from preliminary searches that this information is largely unavailable, but report the information we found narratively.

Secondary cases
We recorded the direct count of first-generation cases and took this to equal the number of people that each super-spreader was believed to transmit to. This was expressed as individual case counts. This estimate was sometimes described as a range, as counts for index cases occasionally varied across different sources. We report all available information.

Symptoms and disease outcomes
Outcomes (especially mortality rates) for the super-spreaders were summarized in a simple narrative format.

Underlying conditions
We recorded narratively which underlying conditions the superspreader individuals were reported to have (such as diabetes or heart disease). There is a hypothesis that super-spreaders are more likely (than the general population) to be people with compromised immune systems [19]. Conversely, there was also widespread public perception early in the COVID-19 pandemic [20] that super-spreaders might often be nearly or completely asymptomatic. We collected data that might address either hypothesis.

Deviations from protocol
We intended to report on tertiary and onward transmission counts, broken into percentiles or median estimates. We extracted this information but found that it was reported with such variability and inconsistency that it was not possible to summarize, and is not elaborated upon further. A post-hoc sensitivity analysis was undertaken.

Quality assessment
We designed a customized quality assessment of the primary studies (Table 1). This checklist was designed to assess the credibility of sources and the rigor of their contact tracing methods and transmission chain reconstructions. We defined studies with six "Yes" answers as "Most credible," studies with four or five "Yes" answers as "Fairly credible" and those with three or fewer "Yes" answers as "Less credible."
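The rating bands above amount to a simple mapping from the number of "Yes" answers to a credibility label. The sketch below assumes the checklist in Table 1 has six questions (inferred from six being the top score); the function name is illustrative.

```python
N_QUESTIONS = 6  # assumption: Table 1 contains six yes/no questions


def credibility_rating(yes_count: int) -> str:
    """Map the number of 'Yes' answers to the rating bands given in the text."""
    if not 0 <= yes_count <= N_QUESTIONS:
        raise ValueError(f"yes_count must be between 0 and {N_QUESTIONS}")
    if yes_count == 6:
        return "Most credible"
    if yes_count >= 4:  # four or five "Yes" answers
        return "Fairly credible"
    return "Less credible"  # three or fewer "Yes" answers


for score in range(N_QUESTIONS + 1):
    print(score, credibility_rating(score))
```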

Additional analyses and interpretations
We examined the information collected to attempt to establish whether people identified as super-spreaders of nCoVs "could be anyone" or tend to fit a profile, such as being late middle age males or especially socially connected individuals. We were interested in whether super-spreading depended on certain types of people or the situations they were in. This information was only interpreted narratively but with respect to relevant literature or statistics such as the age distribution of people in the country where the superspreader lived.
We did not collect information on the viral load carried by identified super-spreaders because this information is not systematically reported. However, we collected information to address the hypotheses that super-spreaders tend to be especially asymptomatic or especially severely ill. The severity of the disease was indicated by mortality outcomes. Survival rates for super-spreaders were calculated for each disease. The survival rates of patients who are super-spreaders were compared to publicly available survival rates of known MERS, SARS, or COVID-19 cases.
A sensitivity analysis was undertaken. We considered if broad findings about age profile, sex balance, settings, or symptomatic status were especially different (segregated by disease) for individuals linked to a higher number of secondary cases (specified as at least 12 in all reports) and where the evidence base was at least fairly credible, that is, quality scores ≥4. We compare these subgroup findings narratively to results from the main analysis.
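A minimal sketch of how the sensitivity subgroups could be selected is given below. The `IndexCase` record and its fields are hypothetical, and the two criteria are kept as separate filters because the text does not make fully explicit whether they were applied jointly or one at a time.

```python
from dataclasses import dataclass
from typing import Callable, List

MIN_SECONDARY = 12  # "at least 12 in all reports"
MIN_QUALITY = 4     # "at least fairly credible", that is, quality score >= 4


@dataclass
class IndexCase:
    """One super-spreader record (hypothetical structure)."""
    disease: str
    secondary_counts: List[int]  # counts from every report on this index case
    quality_score: int           # number of "Yes" answers in the checklist


def high_count(case: IndexCase) -> bool:
    """True only if every report attributes at least 12 secondary cases."""
    return min(case.secondary_counts) >= MIN_SECONDARY


def credible(case: IndexCase) -> bool:
    """True if the evidence base is at least fairly credible."""
    return case.quality_score >= MIN_QUALITY


def subgroup(cases: List[IndexCase], disease: str,
             keep: Callable[[IndexCase], bool]) -> List[IndexCase]:
    """Select one disease's index cases that satisfy a single criterion."""
    return [c for c in cases if c.disease == disease and keep(c)]


# Made-up records for illustration only.
cases = [
    IndexCase("SARS", [12, 20], 5),
    IndexCase("SARS", [9, 14], 6),
    IndexCase("COVID-19", [30], 3),
]
print(len(subgroup(cases, "SARS", high_count)), len(subgroup(cases, "SARS", credible)))
```

Using the minimum of the reported counts enforces the "in all reports" condition: an index case with one low estimate is excluded from the high-count subgroup even if another report gives a large number.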

Results
The search procedure is illustrated in Figure 1. Counts of super-spreaders identified for each disease were: COVID-19, 76; SARS, 29; MERS, 14. There were many more COVID-19 super-spreading individuals documented, partly due to the much wider spread of COVID-19, and long pandemic duration, starting in 2020, as well as better ascertainment methods available in 2020 compared to earlier years and greater numbers of papers published around this disease compared to MERS and SARS. The data summaries are discussed narratively and also presented in a condensed format in Table 2 (with sensitivity analysis). Figure 2 shows information about the distribution of quality scores for the individuals identified. The quality score data are shown as percentages with each score, to make it possible to compare the quality of evidence associated with each disease, in spite of the different case counts (many more for COVID-19). Altogether, 30% (36/119) of super-spreader individuals had epidemiologic descriptions that scored 6 in the quality assessment exercise (most credible rating). Proportionately, MERS super-spreaders had the highest number of "most credible" ratings (7/14, 50%). Around 28% (8/29) of SARS super-spreaders had epidemiologic evidence that was most credible (scores ≥6); this proportion was similar for COVID-19 super-spreaders (26% [21/76]). Proportionately, more COVID-19 outbreak reports were good or better quality than those for SARS or MERS, which is reassuring in light of much commentary about poor quality research published during the COVID-19 pandemic [21].

MERS
We provide a narrative summary because only 14 eligible MERS super-spreader individuals were identified. See Table S1 in Supplemental Material for MERS super-spreader details. Eight index cases were in Saudi Arabia, five in South Korea, and one in the UAE. These transmissions happened from 2014 to 2017. The five South Korean super-spreaders were linked to the May-July 2015 MERS outbreak.

Activities when super-spreading and settings
All MERS super-spreading was linked to clinical settings, usually from hospital inpatients admitted for their recognized respiratory illness, although day patients who attended the clinic for other reasons (e.g., receiving dialysis) were also identified. Most secondary cases were other hospital patients and healthcare workers.

Age and sex
Index case age and sex were available for 12 individuals, who had a median age of 48 years (range 23-75 years, interquartile range (IQR) 38-60.5). There were 2 females and 10 males.

Ethnicity and occupation
Traits such as ethnicity and occupation were mostly not reported. Two super-spreaders were stated to be Korean, one was Yemeni, and the patient in UAE was described as "ex-patriate." Occupation was described for three MERS super-spreaders in the Middle East: police storage room staff, camel butcher, and factory worker. Most secondary cases were health care workers.

Secondary case counts
Estimated counts of secondary cases caused by these 14 index patients ranged in published studies (sometimes quite preliminary) from 2 to 89. The largest counts of identified secondary cases and total people in epidemiological clusters were in South Korea.

Note: n = number of studies that contributed any useable data to any of age, sex etc., but often did not provide useable data for all of these fields. All non-clinical settings (including care homes or prisons, for instance) were treated as community settings; persons who were linked to transmission in both community and clinical settings were treated as "mixed" settings.

Figure 2.
Quality assessment scores for epidemiologic information on nCoV super-spreaders. Note: Questions in Table 1 were used to determine the quality assessment scores shown in Figure 2. nCoV, novel coronavirus.

Symptom severity and outcomes
No MERS super-spreaders were described as asymptomatic; specific symptoms of severe illness were documented for nine super-spreaders among whom pneumonia was the single most common diagnosis (n = 5). No information was given on the survival or otherwise of two MERS super-spreaders. Of the remaining 12 super-spreaders, 4 survived to discharge and 8 were recorded as deceased from MERS (mortality rate 67%). The crude case fatality rate for MERS has been reported as 34.8% [22].

Underlying conditions
Two of the MERS super-spreaders were investigated for comorbidities but reported to have no underlying health conditions; information was not provided for two others. Among the 10 super-spreaders with documented underlying conditions, a range of conditions was mentioned, including hypertension and kidney disease. Diabetes mellitus was the most commonly mentioned comorbidity (4 of the 14 super-spreaders).

SARS
See Table S2 in Supplemental Material for a summary of extracted SARS super-spreader details.

Activities when super-spreading and settings
Almost all transmissions occurred while patients were in clinical settings (usually when they were admitted to the hospital for treatment related to their SARS, but also when being treated as outpatients, visiting other inpatients, or transiting between hospitals). Some transmission in the community was deemed likely for five super-spreaders. Air travel was identified as the transmission location for one patient.

Age and sex
Index case sex was available for 27 individuals. There were 7 females and 20 males. Index case age was available for 23 individuals, who had a median age of 54 years (range 22-91 years, IQR 42-70).

Ethnicity and occupation
Ethnicity or national origin was only reported for nine SARS super-spreaders: five were described as Chinese, one Malay, one Filipino, one Asian American, and one of "non-Asian" descent. There was no evidence that ethnic representation was different from ethnic distribution in the origin areas. Occupation was mostly not reported. Five individuals had occupational links to clinical settings: an ambulance driver, clinical professor, nurse, hospital laundry worker, and family physician. There were two business people, one seafood merchant and one vegetable seller. A patient aged 73-74 was described as retired. As in large MERS outbreaks, where occupational data were available for secondary cases, the secondary cases tended to be health professionals. Other occupations listed among the secondary cases were taxi drivers and market stall traders.

Secondary case counts
Estimated counts of secondary cases generated by these 29 index patients ranged from 4 to 51. Most cases were in the Far East (China, Hong Kong, and Taiwan) plus Canada.

Symptom severity and outcomes
No SARS super-spreaders were described as asymptomatic. Information was available on symptom severity for 22 SARS super-spreaders. Altogether, 18 were described as having a fever; symptoms of respiratory illness were reported for 17. No information was given on mortality for 13 SARS super-spreaders. Of the remaining 16 patients, 3 survived to discharge and 13 were recorded as deceased from SARS (mortality rate 81.3%). SARS mortality rates are sex- and age-dependent and have been estimated at 26% for Chinese patients age ≥80 years [15].

Underlying conditions
There was no information about possible underlying health conditions in 16 of the 29 SARS super-spreaders. Two were described as having unremarkable or "previously healthy" histories. Underlying health conditions were reported for 11 patients. Of these 11, 8 had cardiovascular conditions and 6 had forms of diabetes.

COVID-19
In total, 76 eligible super-spreaders of COVID-19 were found. Of these, 38 were in East or Southeast Asia, 15 were in Europe, 14 were in North America, and 9 were elsewhere in the world. The USA contributed 12 super-spreaders to the database, the largest count from a single territory. The majority of super-spreader events (n = 61) reported in this study occurred before the end of May 2020. See Table S3 in Supplemental Material for a summary of extracted COVID-19 super-spreader details.

Activities when super-spreading and settings
Information on activities that were happening when superspreading events happened was available for 56 individuals. In contrast to the MERS and SARS outbreaks, most super-spreading COVID-19 index patients were active in the community at the time that they were index patients, not receiving medical care. Just 12 of the super-spreaders were linked to clinical settings, either as patients or healthcare workers interacting with patients and colleagues. Other activities or settings linked to super-spreading were social/business in nature (n = 22), religious worship (n = 7), fitness or sport (n = 5), education (n = 4), public transport (n = 3) and residential settings (one each of prison, summer camp, and long-term care facility).

Age and sex
Index case age was mostly unavailable; where age data were available this was predominantly in Asian countries. Approximate or specific age information was provided for 28 individuals, who had a median age of 56 years (range 18-85 years). Among the 36 cases where sex was reported, there were 23 males and 13 females. All 13 super-spreaders whose sex was reported as female were from countries in East and Southeast Asia (where males were equally represented, n = 12). This sex imbalance for COVID-19 super-spreaders is likely an artifact of a widespread lack of reporting index patient sex rather than evidence of underlying sex disparity in representation.

Ethnicity and occupation
Ethnicity was rarely reported (only for six individuals). Where ethnicity was reported it was typically described by nationality: "American" (n = 3), "British" (n = 2), and Korean (n = 1). Occupation was mostly not reported but was described for 26 individuals (34%). There were six health care workers, four fitness instructors, four school staff, three religious leaders, two retired individuals, two businesspeople, and one person each described as an office worker, summer camp staff, sales representative, pilot, and meat processing plant worker. One person may be plausibly inferred as working as a church organist or piano player [23].

Secondary case counts
Estimated counts of secondary cases ranged from 8 to 78. The index cases with the largest counts of secondary infections were in Jordan, South Korea, Myanmar, Hong Kong, and the USA.

Symptom severity and outcomes
Symptom status was recorded for 50 individuals. Seven were described as asymptomatic at the time of super-spreading. Three were described as entirely asymptomatic, while four subsequently developed symptoms [24][25][26][27][28][29]. Additionally, 13 super-spreaders were described as pre-symptomatic and three as mildly symptomatic. Other individuals had symptoms at the time of their super-spreading. No information on survival was available for 41 index cases. Three super-spreaders were reported to have died, while 32 were reported to have survived. This suggests a case fatality rate for super-spreaders of about 8.8%, which is 3.27 times higher than the all-age COVID-19 case fatality rate of 2.7% suggested by a pooled analysis of data available in the first 9 months of the COVID-19 pandemic [30].
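The crude fatality proportions quoted for super-spreaders of the three diseases follow from the counts of deaths and survivors among index cases with a known outcome. The sketch below recomputes them from the counts stated in this review; the function name is illustrative. With 3 deaths and 32 survivors the COVID-19 proportion comes to roughly 8.6%, so the quoted 8.8% would correspond to a denominator of 34, which is an inference here rather than something stated in the text.

```python
def crude_fatality(deaths: int, survivors: int) -> float:
    """Deaths divided by index cases with a known outcome."""
    known = deaths + survivors
    if known == 0:
        raise ValueError("no index cases with a known outcome")
    return deaths / known


# (deaths, survivors) among super-spreaders with known outcome, from the text.
known_outcomes = {
    "MERS": (8, 4),
    "SARS": (13, 3),
    "COVID-19": (3, 32),
}

for disease, (deaths, survivors) in known_outcomes.items():
    share = crude_fatality(deaths, survivors)
    print(f"{disease}: {deaths}/{deaths + survivors} = {share:.3f}")
```

Index cases with unknown survival are left out of the denominator, so these are proportions among documented outcomes only and are sensitive to incomplete reporting.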

Underlying conditions
Individual underlying health status was rarely reported for COVID-19 super-spreader individuals. It is likely that in some cases this lack of information means the absence of underlying conditions, but we cannot be sure without explicit statements. Just four super-spreaders were reported to have underlying conditions, while one was reported to have had a previous "record of good health."
The lack of specific age data on most super-spreader individuals resulted in small country-level datasets, making a robust comparison with age distribution in each country untenable. The country that contributed the most specific age data was China. We, therefore, undertook a China-only comparison with the age-distribution estimates within the Chinese population of 2003 (SARS) or 2020 (COVID-19) (see Supplemental Fig. S4). This limited comparison suggests that older individuals (age 40+) were over-represented among super-spreaders of both SARS and COVID-19 relative to these population estimates. These data suggest that in China individuals more prone to super-spreading are relatively older, at least age 40+.

Sensitivity analysis
In our sensitivity analysis, we considered the age distribution, sex balance, transmission settings, and symptomatic status of superspreader individuals for each disease, where the subgroups were defined by those individuals linked to at least 12 secondary cases in all reports, or where the evidence base had quality score ≥4. The results are in Table 2. There was a divergence from the main analysis findings, in that females slightly dominated the COVID-19 superspreaders linked to ≥12 secondary cases; but this result was not confirmed for the most credibly documented COVID-19 superspreaders (where males strongly dominated, as they did for all SARS and MERS subgroups). The median age was about 55 in these subgroups (compared to 48, 54, or 56 in all-data findings). Almost all settings for subsets of SARS or MERS super-spreaders were clinical, but community settings were the large majority for COVID-19 superspreading. All subset SARS or MERS index cases were symptomatic, but about 30% of COVID-19 index cases in these subgroups were presymptomatic or never developed symptoms. Overall, this sensitivity analysis has results that agree with and reinforce conclusions that may be drawn from the results gained by analyzing the larger dataset.

Discussion
We produced a dataset that would provide empirical parameters for anyone attempting to model the role of super-spreading individuals related to nCoVs. The data we provide help to indicate which persons (by age or sex category, at least) in a population seem to be most likely to become super-spreaders. This information could help indicate which persons in a specific population (with known age/sex distribution and contact patterns) might contribute to epidemic growth. However, no one chooses to be highly infectious, and in no way do we mean to imply stigma on any persons with such demographic traits.
Observable mortality rates for super-spreaders of all the nCoVs we examined were much higher than mortality statistics for the same diseases. However, we note that mortality outcomes for COVID-19 index cases were particularly incompletely reported. The count of COVID-19 super-spreaders who died was only three, so we may have an inaccurate picture of how often COVID-19 super-spreaders tend to die from the illness.
For MERS and SARS, super-spreaders tended to be quite ill. No asymptomatic super-spreaders were described for MERS and SARS, in spite of the prevalence of asymptomatic or mild infection among MERS cases being 26.9% [22]. This may be because for MERS and SARS only symptomatic people were tested. This contrasts with COVID-19, where super-spreaders who were completely asymptomatic throughout their disease course, or who were pre-symptomatic at the time that they transmitted the disease to others, were commonly reported. A recent meta-analysis found at least one-third of all COVID-19 cases were estimated to be truly asymptomatic [31]. This high prevalence of undetected infection severely challenged COVID-19 control. However, we acknowledge that biases in ascertainment may have affected epidemiologic reporting of symptom severity, especially for COVID-19. Individuals who had social contact may have been motivated to describe themselves incorrectly as asymptomatic, due to legal proscriptions against social contact for people with COVID-19 symptoms in their jurisdictions at the time of their super-spreading.
Only two individuals aged under 20 years old were identified as super-spreaders for any of these nCoVs (both had COVID-19). One of these was age 18; the other was described as a teenage staff member (so can be inferred to be close to, or to have already reached, adult age). In addition, not many super-spreaders were aged 19-39. Because younger people tend to have much higher social contact rates than older individuals [32], more youthful super-spreaders might have been expected. As asymptomatic infection of MERS [33,34] is more common in younger people, and droplet spread of infectious respiratory disease tends to closely correlate with symptom severity [35], this might explain why we found relatively few super-spreaders under the age of 40 for MERS. However, given that COVID-19 asymptomatic infection is also most common in younger people [36] and that many COVID-19 super-spreaders were asymptomatic, it is surprising there were not more younger COVID-19 super-spreaders. For SARS, it is unclear why there are fewer younger super-spreaders. There is little information on how asymptomatic SARS infection varies with age, although a study in Singapore indicates almost no difference in age between symptomatic and asymptomatic cases [37]. There is no clear explanation as to why most nCoV super-spreaders were aged over 40.
We tried to identify super-spreaders who generated at least nine secondary infections. However, the counts of secondary infections from individual index cases varied in the epidemiological studies. We did not aspire to evaluate which studies were most accurate (in the absence of all primary epidemiological data). Identifying "super-spreaders" is challenging when there is much uncertainty about the true count of secondary cases.
There was insufficient information about occupation to link specific lines of work to super-spreading. There was also very limited information about patient ethnicity. There were some large differences in settings or traits for MERS or SARS super-spreaders compared to COVID-19 index patients. MERS and SARS super-spreaders were much more likely to have been spreading in clinical settings, whilst COVID-19 spreading appeared to occur more commonly in the community. Alongside this, MERS and SARS super-spreaders were more likely to experience severe symptoms, while non-symptomatic super-spreading from COVID-19 index patients was common. These findings reiterate the value of high awareness of asymptomatic COVID-19 infection.

Limitations
Ascertainment bias affected our efforts. Universal screening of all MERS and SARS contacts rarely happened, in contrast to much broader testing of all COVID-19 contacts. Therefore, it is unsurprising that we collected data about more COVID-19 super-spreaders. Even when a super-spreading individual is detected within a surveillance system, a decision is made whether or not to publish as a case study (or academic publication) and it is unknown what biases this introduces. Case studies may be more likely for diseases perceived to be rare, which might explain why relatively more studies on MERS and SARS were produced. It is also possible that super-spreaders who are considered unusual or have already generated publicity are preferentially published. We have reported attributes, such as symptom severity, as described in the original studies. It may be that some attributes have not been reported faithfully. When extracting data from individual publications the details presented varied greatly between studies. More complete reporting would be desirable.
To systematically collect data about super-spreading individuals we had to create consistent criteria to define them. Slightly different criteria would have identified a different group of people. We did not collect data that would allow a sensitivity analysis using lower eligibility thresholds (such as only 3 or 5 secondary cases). After our searches and analysis were completed, a useful database of infection trees was published at outbreaktrees.ecology.uga.edu [38]; it could be used to support confirmatory and replicated analyses of nCoV super-spreading events using different definitions of a "super-spreading individual."
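To illustrate how the choice of threshold changes who is counted as a super-spreader, the following is a minimal sketch of such a sensitivity analysis. It is not part of our analysis: the index cases, their labels and their secondary-case counts are entirely hypothetical, and the names `IndexCase`, `eligible` and `sensitivity` are invented for this example. Each index case carries a minimum and maximum count because sources can disagree on the number of secondary cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexCase:
    label: str
    min_secondary: int  # lowest secondary-case count reported across sources
    max_secondary: int  # highest secondary-case count reported across sources


def eligible(cases, threshold, conservative=True):
    """Return labels of index cases that meet the eligibility threshold.

    conservative=True uses the lowest reported count;
    conservative=False uses the highest reported count.
    """
    selected = []
    for case in cases:
        count = case.min_secondary if conservative else case.max_secondary
        if count >= threshold:
            selected.append(case.label)
    return selected


def sensitivity(cases, thresholds=(3, 5, 9)):
    """Map each threshold to (n eligible by lowest count, n eligible by highest count)."""
    return {
        t: (len(eligible(cases, t, True)), len(eligible(cases, t, False)))
        for t in thresholds
    }


# Hypothetical index cases, invented purely for illustration.
CASES = [
    IndexCase("A", 3, 3),
    IndexCase("B", 5, 8),
    IndexCase("C", 7, 12),
    IndexCase("D", 9, 9),
    IndexCase("E", 21, 33),
]

if __name__ == "__main__":
    for threshold, (low, high) in sensitivity(CASES).items():
        print(f"threshold >= {threshold}: {low} (lowest counts), {high} (highest counts)")
```

In this invented example, hypothetical case C is eligible at a threshold of nine only when the highest reported count is used, which shows how both the threshold and the uncertainty in secondary-case counts can alter the group identified.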
The COVID-19 pandemic is fast moving and our findings are relevant only to the time period investigated. Our latest searches were in June 2021 and all our super-spreaders were active in 2020; data about super-spreading linked to the Delta and Omicron variants of SARS-CoV-2 would not have been included.

Conclusion
A key question is what makes individuals super-spreaders [8][9][10]. We demonstrate that this question is challenging to answer because few studies on super-spreading individuals have been published and because the details reported about index cases are inconsistent between studies. One solution would be to encourage the publication of more outbreak reports with suitably anonymized, consistent descriptions of the index and secondary patients. Since early 2020, detailed contact tracing databases have been assembled in many jurisdictions to facilitate COVID-19 control: these may form a rich resource for identifying super-spreaders. Should stronger data emerge, they could be incorporated into disease modeling or be used to guide intensive contact tracing in outbreak situations.
For all three diseases, we found that being male and being aged 40+ were the traits most typical of super-spreaders, while people under age 18 years were unlikely to be identified as super-spreaders. However, characteristics varied between the coronaviruses. Most super-spreading from MERS or SARS was observed in clinical environments, while COVID-19 super-spreading happened predominantly in community settings. Generally, MERS and SARS super-spreaders were highly symptomatic, had underlying health conditions and had poor disease outcomes. In contrast, many individuals who super-spread COVID-19 were observed to have mild or no symptoms at the time of their high transmission rate, and where survival status was documented, the majority survived the infectious period.

Approval to use the data to undertake the research
This was a secondary analysis of data in the public domain and therefore ethics approval was not required for the study.

Author contributions
JB and IRL conceived the research. JB designed and ran the searches of bibliographic databases. All authors screened scientific studies and resolved differences by discussion or by consulting another author. IRL and NRJ designed the quality assessment, which was undertaken by JB, NRJ and FCDH. JB designed the database. JB and IRL coordinated the project. All authors extracted data. JB summarized the dataset with descriptive statistics. JB and FCDH wrote the first draft and all authors revised for content.

Declaration of Competing Interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Iain Lake reports that he was the grant recipient for financial support for all authors, as provided by the National Institute for Health Research.

Acknowledgments
This study was funded by the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Emergency Preparedness and Response at King's College London in partnership with the UK Health Security Agency (UK HSA) and collaboration with the University of East Anglia. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR, UEA, the Department of Health or UK HSA.

Appendix
Supplemental Material, Tables S1-S3 and Fig. S4. NOTES for all tables: Quality scores are based on count of "Yes" answers to quality checklist shown as Table 1. #secondaries means the number of direct secondary cases caused by the specified index case, according to the group of related articles that describe this index case; hence, the answer can be a range if multiple sources have different estimates for the number of secondary cases linked to the index patient. The year is 2003 in all SARS cases except if noted otherwise. ? : information not found.