Heterogeneity in transmissibility and shedding SARS-CoV-2 via droplets and aerosols

Background: Which virological factors mediate overdispersion in the transmissibility of emerging viruses remains a long-standing question in infectious disease epidemiology. Methods: Here, we use systematic review to develop a comprehensive dataset of respiratory viral loads (rVLs) of SARS-CoV-2, SARS-CoV-1 and influenza A(H1N1)pdm09. We then comparatively meta-analyze the data and model individual infectiousness by shedding viable virus via respiratory droplets and aerosols. Results: The analyses indicate heterogeneity in rVL as an intrinsic virological factor facilitating greater overdispersion for SARS-CoV-2 in the COVID-19 pandemic than A(H1N1)pdm09 in the 2009 influenza pandemic. For COVID-19, case heterogeneity remains broad throughout the infectious period, including for pediatric and asymptomatic infections. Hence, many COVID-19 cases inherently present minimal transmission risk, whereas highly infectious individuals shed tens to thousands of SARS-CoV-2 virions/min via droplets and aerosols while breathing, talking and singing. Coughing increases the contagiousness, especially in close contact, of symptomatic cases relative to asymptomatic ones. Infectiousness tends to be elevated between 1 and 5 days post-symptom onset. Conclusions: Intrinsic case variation in rVL facilitates overdispersion in the transmissibility of emerging respiratory viruses. Our findings present considerations for disease control in the COVID-19 pandemic as well as future outbreaks of novel viruses. Funding: Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant program, NSERC Senior Industrial Research Chair program and the Toronto COVID-19 Action Fund.


eLife digest
To understand how viruses spread, scientists look at two things. One is, on average, how many other people each infected person spreads the virus to. The other is how much variability there is in the number of people each person with the virus infects. Some viruses, like the 2009 influenza H1N1, a new strain of influenza that caused a pandemic beginning in 2009, spread fairly uniformly, with many people with the virus infecting around two other people. Other viruses, like SARS-CoV-2, the one that causes COVID-19, are more variable. About 10 to 20% of people with COVID-19 cause 80% of subsequent infections, which may lead to so-called superspreading events, while 60-75% of people with COVID-19 infect no one else.
Learning more about these differences can help public health officials create better ways to curb the spread of the virus. Chen et al. show that differences in the concentration of virus particles in the respiratory tract may help to explain why superspreaders play such a big role in transmitting SARS-CoV-2, but not the 2009 influenza H1N1 virus. Chen et al. reviewed and extracted data from studies that measured how much virus is present in people infected with either SARS-CoV-2, a similar virus called SARS-CoV-1 that caused the SARS outbreak in 2003, or 2009 influenza H1N1. Chen et al. found that as the variability in the concentration of the virus in the airways increased, so did the variability in the number of people each person with the virus infects. Chen et al. further used mathematical models to estimate how many virus particles individuals with each infection would expel via droplets or aerosols, based on the differences in virus concentrations from their analyses. The models showed that most people with COVID-19 infect no one because they expel little, if any, infectious SARS-CoV-2 when they talk, breathe, sing or cough. Highly infectious individuals, on the other hand, have high concentrations of the virus in their airways, particularly in the first few days after developing symptoms, and can expel tens to thousands of infectious virus particles per minute. By contrast, a greater proportion of people with 2009 influenza H1N1 were potentially infectious but tended to expel relatively little infectious virus when they talk, sing, breathe or cough.
These results help explain why superspreaders play such a key role in the ongoing pandemic. This information suggests that to stop this virus from spreading it is important to limit crowd sizes, shorten the duration of visits or gatherings, maintain social distancing, talk in low volumes around others, wear masks, and hold gatherings in well-ventilated settings. In addition, contact tracing can prioritize the contacts of people with high concentrations of virus in their airways.

Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread globally, causing the coronavirus disease 2019 (COVID-19) pandemic with more than 129.2 million infections and 2.8 million deaths (as of 1 April 2021) (Dong et al., 2020). For respiratory virus transmission, airway epithelial cells shed virions to the extracellular fluid before atomization (from breathing, talking, singing, coughing and aerosol-generating procedures) partitions them into a polydisperse mixture of particles that are expelled to the ambient environment. Aerosols (≤100 μm) can be inhaled nasally, whereas droplets (>100 μm) tend to be excluded (Prather et al., 2020; Roy and Milton, 2004). For direct transmission, droplets must be sprayed ballistically onto susceptible tissue (Liu et al., 2017a). Hence, droplets predominantly deposit on nearby surfaces, potentiating indirect transmission. Aerosols can be further categorized based on typical travel characteristics: short-range aerosols (50-100 μm) tend to settle within 2 m; long-range ones (10-50 μm) often travel beyond 2 m based on emission force; and buoyant aerosols (≤10 μm) remain suspended and travel based on airflow profiles for minutes to many hours (Liu et al., 2017a; Wei and Li, 2015). Although proximity has been associated with infection risk for COVID-19, studies have also suggested that long-range airborne transmission occurs conditionally (Hamner et al., 2020; Lu et al., 2020a; Park et al., 2020).
While the basic reproductive number has been estimated to be 2.0-3.6 (Li et al., 2020a), transmissibility of SARS-CoV-2 is highly overdispersed (dispersion parameter k, 0.10-0.58), with numerous instances of superspreading (Hamner et al., 2020; Lu et al., 2020a; Park et al., 2020) and few cases (10-20%) causing many secondary infections (80%) (Bi et al., 2020; Endo et al., 2020; Laxminarayan et al., 2020). Similarly, few cases drive the transmission of SARS-CoV-1 (k, 0.16-0.17) (Lloyd-Smith et al., 2005), whereas influenza A(H1N1)pdm09 transmits more homogeneously (k, 7.4-14.4) (Brugger and Althaus, 2020; Roberts and Nishiura, 2011), despite both viruses spreading by contact, droplets and aerosols (Cowling et al., 2013; Yu et al., 2004). Although understanding the determinants of viral overdispersion is crucial towards characterizing transmissibility and developing effective strategies to limit infection, mechanistic associations for k remain unclear. As an empirical estimate, k depends on myriad extrinsic (behavioral, environmental and intervention) and host factors. Nonetheless, k remains similar across distinct outbreaks for a virus (Lloyd-Smith et al., 2005), suggesting that intrinsic virological factors mediate virus overdispersion.
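To make the role of the dispersion parameter k concrete, the following minimal sketch assumes the standard negative-binomial offspring model (mean R0, dispersion k) that is commonly used to define k. It computes the expected share of cases that infect no one. The function name and the shared reproductive number of 2.5 are illustrative assumptions, not values fitted in this study.

```python
def prob_no_secondary_cases(r0: float, k: float) -> float:
    """P(Z = 0) under a negative-binomial offspring distribution.

    Assumes secondary cases Z ~ NegBin(mean=r0, dispersion=k), for which
    P(Z = 0) = (1 + r0 / k) ** (-k). Smaller k means more overdispersion.
    """
    return (1.0 + r0 / k) ** (-k)


# Hypothetical shared R0 so that only k differs between the scenarios.
ASSUMED_R0 = 2.5

for label, k in [
    ("SARS-CoV-2, k = 0.10", 0.10),
    ("SARS-CoV-2, k = 0.58", 0.58),
    ("A(H1N1)pdm09, k = 7.4", 7.4),
]:
    share = prob_no_secondary_cases(ASSUMED_R0, k)
    print(f"{label}: {share:.1%} of cases expected to infect no one")
```

Under this assumed model, lowering k while holding R0 fixed raises the share of cases with zero secondary infections, which is the qualitative pattern described for SARS-CoV-2 relative to A(H1N1)pdm09.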
Here, we investigated how intrinsic case variation in respiratory viral loads (rVLs) facilitates overdispersion in SARS-CoV-2 transmissibility. By systematic review, we developed a comprehensive dataset of rVLs from cases of COVID-19, SARS and A(H1N1)pdm09. Using comparative meta-analyses, we found that heterogeneity in rVL was associated with overdispersion among these emerging infections. To assess potential sources of case heterogeneity, we analyzed SARS-CoV-2 rVLs across age and symptomatology subgroups as well as disease course. To interpret the influence of heterogeneity in rVL on individual infectiousness, we modeled likelihoods of shedding viable virus via respiratory droplets and aerosols.

Association of overdispersion with heterogeneity in rVL
We hypothesized that individual case variation in rVL facilitates the distinctions in k among COVID-19, SARS and A(H1N1)pdm09. For each study in the systematic dataset, we used specimen measurements (viral RNA concentration in a respiratory specimen) to estimate rVLs (viral RNA concentration in the respiratory tract) (Materials and methods). To investigate the relationship between k and heterogeneity in rVL, we performed a meta-regression using each contributing study (Figure 2-figure supplement 1), which showed a weak, negative association between the two variables (meta-regression slope t-test: p=0.038, Pearson's r = −0.26).
Using contributing studies with low risk of bias, meta-regression (Figure 2) showed a strong, negative association between k and heterogeneity in rVL for these three viruses (meta-regression slope t-test: p<0.001, Pearson's r = −0.73). In this case, each unit increase (one log10 copies/ml) in the standard deviation (SD) of rVL changed log(k) by −1.41 (95% confidence interval [CI]: −1.78 to −1.03), suggesting that broader heterogeneity in rVL facilitates greater overdispersion in the transmissibility of SARS-CoV-2 than of A(H1N1)pdm09. To investigate mechanistic aspects of this association, we conducted a series of analyses on rVL and then modeled the influence of heterogeneity in rVL on individual infectiousness.

Meta-analysis and subgroup analyses of rVL
We first compared rVLs among the emerging infections. We performed a random-effects meta-analysis (Figure 2-figure supplement 2), which approximated the expected rVL when encountering a COVID-19, SARS or A(H1N1)pdm09 case during the infectious period. This showed that the expected rVL of SARS-CoV-2 was comparable to that of SARS-CoV-1 (one-sided Welch's t-test: p=0.111) but lower than that of A(H1N1)pdm09 (p=0.040).
We also performed random-effects subgroup analyses for COVID-19 (Figure 3), which showed that expected SARS-CoV-2 rVLs were similar between pediatric and adult cases (p=0.476) and comparable between symptomatic/presymptomatic and asymptomatic infections (p=0.090). Since these meta-analyses had significant between-study heterogeneity among the mean estimates (Cochran's Q test: p<0.001 for each meta-analysis), we conducted risk-of-bias sensitivity analyses; meta-analyses of low-risk-of-bias studies continued to show significant heterogeneity (Figure 3-figure supplements 1-5).
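As a rough illustration of how a random-effects pooled mean and Cochran's Q can be obtained, the sketch below uses the DerSimonian-Laird estimator. The choice of estimator is an assumption made for illustration, since the text here does not name the one used. The study means and variances are hypothetical placeholders, not data from the systematic dataset.

```python
def random_effects_pool(means, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    means: per-study mean rVL (log10 copies/ml)
    variances: per-study variance of the mean (squared standard error)
    Returns (pooled_mean, pooled_se, tau_squared, cochran_q).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed_mean = sum(w * m for w, m in zip(weights, means)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (m - fixed_mean) ** 2 for w, m in zip(weights, means))
    df = len(means) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * m for w, m in zip(re_weights, means)) / sum(re_weights)
    pooled_se = (1.0 / sum(re_weights)) ** 0.5
    return pooled, pooled_se, tau2, q


# Hypothetical study-level inputs, for illustration only.
example_means = [5.8, 6.9, 6.2, 7.4]
example_variances = [0.04, 0.09, 0.02, 0.12]
pooled, se, tau2, q = random_effects_pool(example_means, example_variances)
print(f"pooled mean = {pooled:.2f}, SE = {se:.2f}, tau^2 = {tau2:.2f}, Q = {q:.1f}")
```

A large Q relative to its degrees of freedom signals between-study heterogeneity, which is the situation that motivated the risk-of-bias sensitivity analyses.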

Distributions of rVL
We next analyzed rVL distributions. For all three viruses, rVLs best conformed to Weibull distributions. At the 90th case percentile (cp) throughout the infectious period, the estimated rVL was 8.91 (95% CI: 8.83-9.00) log10 copies/ml for SARS-CoV-2, whereas it was 8.62 (8.47-8.76) log10 copies/ml for A(H1N1)pdm09 (Figure 4-figure supplement 3). The SD of the overall rVL distribution for SARS-CoV-2 was 2.04 log10 copies/ml, while it was 1.45 log10 copies/ml for A(H1N1)pdm09, showing that heterogeneity in rVL was indeed broader for SARS-CoV-2.
Figure 1 source data: Source data 1. Search strategy used for MEDLINE. Source data 2. Search strategy used for EMBASE. Source data 3. Search strategy used for Cochrane Central. Source data 4. Search strategy used for Web of Science Core Collection. Source data 5. Search strategy used for medRxiv and bioRxiv.
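For readers who want to see how a case percentile and an SD follow from a fitted Weibull distribution, here is a minimal sketch using the closed-form Weibull quantile and moments. The shape and scale values are hypothetical placeholders, since the fitted parameters are not given in this section.

```python
import math


def weibull_quantile(p: float, shape: float, scale: float) -> float:
    """Value below which a fraction p of cases fall: scale * (-ln(1 - p))^(1/shape)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def weibull_mean_sd(shape: float, scale: float):
    """Mean and standard deviation of a two-parameter Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    mean = scale * g1
    sd = scale * math.sqrt(g2 - g1 * g1)
    return mean, sd


# Hypothetical parameters on the log10 copies/ml scale (not the fitted values).
SHAPE, SCALE = 3.5, 7.0
rvl_90th_cp = weibull_quantile(0.90, SHAPE, SCALE)
mean_rvl, sd_rvl = weibull_mean_sd(SHAPE, SCALE)
print(f"90th cp = {rvl_90th_cp:.2f}, mean = {mean_rvl:.2f}, SD = {sd_rvl:.2f} log10 copies/ml")
```

With fitted parameters in place of the placeholders, the same two functions would give the 90th cp estimates and the SDs compared in the text.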

SARS-CoV-2 kinetics during respiratory infection
To analyze the influences of disease course, we delineated individual SARS-CoV-2 rVLs by days from symptom onset (DFSO) and fitted the mean estimates to a mechanistic model for respiratory virus kinetics (Figure 4D and Materials and methods). The outputs indicated that, on average, each productively infected cell in the airway epithelium shed SARS-CoV-2 at 1.33 (95% CI: 0.74-1.93) copies/ml day⁻¹ and infected up to 9.25 susceptible cells (Figure 4-figure supplement 4). The turnover rate for infected epithelial cells was 0.71 (0.26-1.15) days⁻¹, while the half-life of SARS-CoV-2 RNA before clearance from the respiratory tract was 0.21 (0.11-2.75) days. By extrapolating the model to an initial rVL of 0 log10 copies/ml, the estimated incubation period was 5.38 days, which agrees with epidemiological findings. Conversely, the expected duration of shedding was 25.1 DFSO. Thus, SARS-CoV-2 rVL increased exponentially after infection, peaked around 1 DFSO along with the proportion of infected epithelial cells (Figure 4-figure supplement 5) and then diminished exponentially.
To evaluate case heterogeneity across the infectious period, we fitted distributions for each DFSO (Figure 4E), which showed that high SARS-CoV-2 rVLs also increased from the presymptomatic period, peaked at 1 DFSO and then decreased towards the end of the first week of illness. For the 90th cp at 1 DFSO, the rVL was 9.84 (95% CI: 9.17-10.56) log10 copies/ml, an order of magnitude greater than the overall 90th cp estimate. High rVLs between 1 and 5 DFSO were elevated above the expected values from the overall rVL distribution (Figure 4-figure supplement 3). At −1 DFSO, the 90th cp rVL was 8.30 (6.88-10.02) log10 copies/ml, while it was 7.93 (7.35-8.56) log10 copies/ml at 10 DFSO. Moreover, heterogeneity in rVL remained broad across the infectious period, with SDs of 1.83-2.44 log10 copies/ml from −1 to 10 DFSO (Figure 4-figure supplement 2H-S).
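Two of the kinetic quantities above lend themselves to a short worked conversion. The sketch below assumes first-order (exponential) clearance, under which a half-life maps to a rate constant as ln(2)/half-life, and it shifts times measured from infection onto the DFSO axis using the estimated incubation period. The helper names are illustrative.

```python
import math

HALF_LIFE_DAYS = 0.21    # SARS-CoV-2 RNA half-life reported in the text
INCUBATION_DAYS = 5.38   # estimated incubation period reported in the text


def clearance_rate_per_day(half_life_days: float) -> float:
    """First-order clearance rate constant implied by a half-life (assumed kinetics)."""
    return math.log(2.0) / half_life_days


def days_post_infection_to_dfso(days_post_infection: float,
                                incubation_days: float = INCUBATION_DAYS) -> float:
    """Shift a time since infection onto the days-from-symptom-onset axis."""
    return days_post_infection - incubation_days


print(f"clearance rate = {clearance_rate_per_day(HALF_LIFE_DAYS):.2f} per day")
for t in (1.54, 3.17):
    print(f"{t} days after infection = {days_post_infection_to_dfso(t):+.2f} DFSO")
```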

Likelihood that droplets and aerosols contain virions
Towards analyzing the influence of heterogeneity in rVL on individual infectiousness, we first modeled the likelihood of respiratory particles containing viable SARS-CoV-2. Since rVL is an intensive quantity, the volume fraction of virions is low and viral partitioning coincides with atomization, we used Poisson statistics to model likelihood profiles. To calculate an unbiased estimator of partitioning (the expected number of viable copies per particle), our method multiplied rVL estimates with particle volumes during atomization and an assumed viability proportion of 0.1% in equilibrated particles (Materials and methods).
Figure 3. Subgroup analyses of SARS-CoV-2 respiratory viral load (rVL) during the infectious period. Random-effects meta-analyses comparing the expected rVLs of adult (≥18 years old) COVID-19 cases with pediatric (<18 years old) ones (top) and symptomatic/presymptomatic infections with asymptomatic ones (bottom) during the infectious period. Quantitative rVLs refer to virus concentrations in the respiratory tract. Case types: hospitalized (H), not admitted (N), community (C), adult (A), pediatric (P), symptomatic (S), presymptomatic (Ps) and asymptomatic (As).
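The partitioning estimator described in this section can be sketched in a few lines. The code assumes spherical particles, takes the diameter as the size at atomization, and uses the 0.1% viability proportion stated in the text. The function names are illustrative, and the full procedure is specified in Materials and methods.

```python
import math

VIABILITY_PROPORTION = 0.001  # 0.1% viability assumed in the text


def particle_volume_ml(diameter_um: float) -> float:
    """Volume of a spherical particle in ml (1 cubic micrometre = 1e-12 ml)."""
    return math.pi / 6.0 * diameter_um ** 3 * 1e-12


def expected_viable_copies(rvl_log10: float, diameter_um: float,
                           viability: float = VIABILITY_PROPORTION) -> float:
    """Poisson mean: rVL (copies/ml) x particle volume (ml) x viability proportion."""
    return 10.0 ** rvl_log10 * particle_volume_ml(diameter_um) * viability


def prob_n_virions(n: int, poisson_mean: float) -> float:
    """Poisson likelihood that a particle carries exactly n viable copies."""
    return math.exp(-poisson_mean) * poisson_mean ** n / math.factorial(n)


# Example: 90th cp rVL at 1 DFSO (9.84 log10 copies/ml) across particle sizes.
for diameter in (1.0, 10.0, 50.0, 100.0):
    lam = expected_viable_copies(9.84, diameter)
    print(f"d = {diameter:>5.1f} um: mean = {lam:.3g}, P(no virion) = {prob_n_virions(0, lam):.3f}")
```

Because the Poisson mean scales with the cube of the diameter, the likelihood of carrying a viable copy falls steeply for smaller particles at a given rVL.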
COVID-19 cases with high rVLs, however, expelled particles with considerably greater likelihoods of carrying viable copies (Figure 5A).

Shedding SARS-CoV-2 via respiratory droplets and aerosols
Using the partitioning estimates in conjunction with published profiles of the particles expelled by respiratory activities (Figure 5-figure supplement 2), we next modeled the rates at which talking, singing, breathing and coughing shed viable SARS-CoV-2 across particle size (d_a) (Figure 5C-F). Singing shed virions more rapidly than talking based on the increased emission of aerosols. Voice amplitude, however, had a significant effect on aerosol production, and talking loudly emitted aerosols at similar rates to singing. Each of these respiratory activities expelled aerosols at greater rates than droplets, but particle size correlated with the likelihood of containing virions according to our model. Talking, singing and coughing expelled virions at comparable proportions via droplets (55.6-59.4%) and aerosols (40.6-44.4%), whereas breathing did so predominantly within aerosols (Figure 5G). Moreover, short-range aerosols mediated most of the virions (79.2-81.9%) shed via aerosols while talking normally and coughing. In comparison, while singing, or talking loudly, buoyant (14.5%) and long-range (17.5%) aerosols carried a larger proportion of the virions shed via aerosols (Figure 5G).
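The way particle counts and per-particle partitioning combine into a shedding rate can be sketched as a sum over size bins. The emission profile below is a hypothetical placeholder rather than one of the published profiles used in the study, and the 100 μm aerosol/droplet cutoff follows the definition given in the Introduction.

```python
import math


def expected_viable_copies(rvl_log10: float, diameter_um: float,
                           viability: float = 0.001) -> float:
    """Poisson mean of viable copies in one spherical particle (volume in ml)."""
    volume_ml = math.pi / 6.0 * diameter_um ** 3 * 1e-12
    return 10.0 ** rvl_log10 * volume_ml * viability


# Hypothetical profile: (diameter at atomization in um, particles emitted per minute).
HYPOTHETICAL_PROFILE = [(5.0, 1000.0), (30.0, 50.0), (75.0, 10.0), (150.0, 2.0)]


def shedding_by_size(rvl_log10, profile):
    """Virions per minute contributed by each size bin."""
    return [(d, rate * expected_viable_copies(rvl_log10, d)) for d, rate in profile]


def aerosol_droplet_split(rvl_log10, profile, cutoff_um=100.0):
    """Fractions of shed virions carried by aerosols (<= cutoff) and droplets (> cutoff)."""
    per_bin = shedding_by_size(rvl_log10, profile)
    total = sum(v for _, v in per_bin)
    aerosol = sum(v for d, v in per_bin if d <= cutoff_um)
    return aerosol / total, 1.0 - aerosol / total


total_rate = sum(v for _, v in shedding_by_size(9.84, HYPOTHETICAL_PROFILE))
aerosol_frac, droplet_frac = aerosol_droplet_split(9.84, HYPOTHETICAL_PROFILE)
print(f"total = {total_rate:.3g} virions/min, aerosols = {aerosol_frac:.1%}, droplets = {droplet_frac:.1%}")
```

In this simplified form the aerosol/droplet split depends only on the emission profile, not on rVL, because rVL multiplies every size bin equally.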

Influence of heterogeneity in rVL on individual infectiousness
To interpret how heterogeneity in rVL influences individual infectiousness, we modeled total SARS-CoV-2 shedding rates (over all particle sizes) for each respiratory activity (Figure 5H, Figure 5-figure supplement 3). Between the 1st and the 99th cps, the estimates for a respiratory activity spanned ≥8.48 orders of magnitude on each DFSO; cumulatively from −1 to 10 DFSO, they spanned 11.0 orders of magnitude. Hence, many COVID-19 cases inherently presented minimal transmission risk, whereas highly infectious individuals shed considerable quantities of SARS-CoV-2.
Figure 3 (continued). Specimen types include sputum (Spu). Dashes denote case numbers that were not obtained. Box sizes denote weighting in the overall estimates. Between-study heterogeneity was assessed using the p-value from Cochran's Q test and the I² statistic. One-sided Welch's t-tests compared expected rVLs between the COVID-19 subgroups (non-significance, p>0.05). The online version of this article includes the following figure supplement(s) for Figure 3: Figure supplement 1. Risk-of-bias sensitivity analysis of between-study heterogeneity for SARS-CoV-2 respiratory viral load (rVL) during the estimated infectious period. Figure supplement 2. Risk-of-bias sensitivity analysis of between-study heterogeneity for SARS-CoV-1 respiratory viral load (rVL) during the estimated infectious period. Figure supplement 3. Risk-of-bias sensitivity analysis of between-study heterogeneity for A(H1N1)pdm09 respiratory viral load (rVL) during the estimated infectious period. Figure supplement 4. Risk-of-bias sensitivity analysis of between-study heterogeneity for SARS-CoV-2 respiratory viral load (rVL) for adult COVID-19 cases during the estimated infectious period. Figure supplement 5. Risk-of-bias sensitivity analysis of between-study heterogeneity for SARS-CoV-2 respiratory viral load (rVL) for symptomatic/presymptomatic COVID-19 cases during the estimated infectious period.
For the 98th cp at 1 DFSO, singing expelled 313 (95% CI: 37.5-3158) virions/min to the ambient environment, talking emitted 293 (35.1-2664) virions/min, breathing exhaled 1.54 (0.18-15.5) virions/min and coughing discharged 249 (29.8-25111) virions/cough; these estimates were approximately two orders of magnitude greater than those for the 85th cp. For the 98th cp at −1 DFSO, singing shed 14.5 (0.15-4515) virions/min and breathing exhaled 7.13 × 10⁻² (7.20 × 10⁻⁴ to 220.2) virions/min. The estimates at 9-10 DFSO were similar to these presymptomatic ones (Figure 5H, Figure 5-figure supplement 3B). As indicated by comparable mean rVLs (Figure 3) and heterogeneities in rVL (Figure 4B, C), adult, pediatric, symptomatic/presymptomatic and asymptomatic COVID-19 subgroups presented similar distributions for shedding virions through these activities.
Figure 5 (caption). For higher numbers of virions, some likelihood curves were omitted to aid visualization. When the likelihood for zero virions approaches 0%, particles are expected to contain at least one viable copy. (C-F) Rate that the mean and 98th cp COVID-19 cases at 1 DFSO shed viable SARS-CoV-2 by talking, singing, breathing or coughing over particle size. (G) Relative contributions of droplets and aerosols to shedding virions for each respiratory activity (left). Relative contribution of buoyant, long-range and short-range aerosols to shedding virions via aerosols for each respiratory activity (right). (H) Case heterogeneity in the total shedding rate (over all particle sizes) of virions via singing across the infectious period. Earlier presymptomatic days were excluded based on limited data. Data range between the 1st and 99th cps. Lines and bands represent estimates and 95% confidence intervals, respectively, for estimated likelihoods or Poisson means. The online version of this article includes figure supplements for Figure 5.
We also compared the influence of case variation on individual infectiousness between A(H1N1)pdm09 and COVID-19. Aerosol spread accounted for approximately half of A(H1N1)pdm09 transmission events (Cowling et al., 2013), and the 50% human infectious dose for aerosolized influenza A virus is approximately 1-3 virions in the absence of neutralizing antibodies (Fabian et al., 2008). Based on the model, 62.9% of A(H1N1)pdm09 cases were infectious (shed ≥1 virion) via aerosols within 24 hr of talking loudly or singing (Figure 5-figure supplement 4A), and the estimate was 58.6% within 24 hr of talking normally and 22.3% within 24 hr of breathing. In comparison, 48.0% of COVID-19 cases shed ≥1 virion via aerosols in 24 hr of talking loudly or singing (Figure 5-figure supplement 4C). Notably, only 61.4% of COVID-19 cases shed ≥1 virion via either droplets or aerosols in 24 hr of talking loudly or singing (Figure 5-figure supplement 4D). While the human infectious dose of SARS-CoV-2 by any exposure route remains unelucidated, it must be at least one viable copy. Thus, at least 38.6% of COVID-19 cases were expected to present negligible risk to spread SARS-CoV-2 through either droplets or aerosols in 24 hr. The proportion of potentially infectious cases further decreased as the threshold increased: 55.8, 42.5 and 25.0% of COVID-19 cases were expected to shed ≥2, ≥10 and ≥100 virions, respectively, in 24 hr of talking loudly or singing during the infectious period.
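To relate a per-minute shedding rate to a threshold such as "≥1 virion", the sketch below converts a rate into an expected count over an exposure window and, treating shedding as a Poisson process, into the probability of reaching a given count. This is an illustrative reading under that assumption, and the thresholding procedure used for the percentages above may differ.

```python
import math


def expected_virions(rate_per_min: float, minutes: float) -> float:
    """Expected number of virions shed over an exposure window."""
    return rate_per_min * minutes


def prob_at_least(n: int, mean: float) -> float:
    """Poisson probability of shedding at least n virions given the expected count."""
    if mean <= 0.0:
        return 1.0 if n <= 0 else 0.0
    # Sum P(0..n-1) in log space to avoid overflow for large means.
    cdf = sum(math.exp(-mean + i * math.log(mean) - math.lgamma(i + 1)) for i in range(n))
    return max(0.0, 1.0 - cdf)


# Rates for the 98th cp quoted in the text (virions/min while breathing).
for label, rate in [("1 DFSO", 1.54), ("-1 DFSO", 7.13e-2)]:
    mean_10_min = expected_virions(rate, 10.0)
    print(f"breathing at {label}: expected {mean_10_min:.3g} virions in 10 min, "
          f"P(>=1) = {prob_at_least(1, mean_10_min):.3f}")
```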

Discussion
This study provided systematic analyses of several factors characterizing SARS-CoV-2 transmissibility. First, our results indicate that broader heterogeneity in rVL facilitates greater overdispersion for SARS-CoV-2 than A(H1N1)pdm09. They suggest that many COVID-19 cases infect no one (Bi et al., 2020; Endo et al., 2020; Laxminarayan et al., 2020) because they inherently present minimal transmission risk via respiratory droplets or aerosols, although behavioral and environmental factors may further influence risk. Meanwhile, highly infectious cases can shed tens to thousands of SARS-CoV-2 virions/min, especially between 1 and 5 DFSO, potentiating superspreading events. The model estimates, when corrected to copies rather than virions, align with recent clinical findings for exhalation rates of SARS-CoV-2. In comparison, a greater proportion of A(H1N1)pdm09 cases are infectious but shed virions at low rates, which concurs with more uniform transmission and few superspreading events observed during the 2009 H1N1 pandemic (Brugger and Althaus, 2020; Roberts and Nishiura, 2011). Moreover, our analyses suggest that heterogeneity in rVL may be generally associated with overdispersion for viral respiratory infections. In this case, rVL distribution can serve as an early correlate for transmission patterns, including superspreading, during outbreaks of novel respiratory viruses. When considered jointly with contact-tracing studies, this provides epidemiological triangulation on k: heterogeneity in rVL indirectly estimates k via an association, whereas contact tracing empirically characterizes transmission chains to estimate k but is limited by incomplete or incorrect recall of contact events by cases. When transmission is highly overdispersed, targeted interventions may disproportionately mitigate infection, with models showing that focused control efforts on the most infectious cases outperform random control policies (Lloyd-Smith et al., 2005).
Second, we analyzed SARS-CoV-2 kinetics during respiratory infection. While heterogeneity remains broad throughout the infectious period, rVL tends to peak at 1 DFSO and be elevated for 1-5 DFSO, coinciding with the period of highest attack rates observed among close contacts. These results indicate that transmission risk tends to be greatest near, and soon after, illness rather than in the presymptomatic period, which concurs with large tracing studies (6.4-12.6% of secondary infections from presymptomatic transmission; Wei et al., 2020) rather than early temporal models (~44%). Furthermore, our kinetic analysis suggests that, on average, SARS-CoV-2 reaches diagnostic concentrations 1.54-3.17 days after respiratory infection (−3.84 to −2.21 DFSO), assuming assay detection limits of 1-3 log10 copies/ml, respectively, for nasopharyngeal swabs immersed in 1 ml of transport media.
Third, we assessed the relative infectiousness of COVID-19 subgroups. As a common symptom of COVID-19, coughing sheds considerable numbers of virions via droplets and short-range aerosols. Thus, symptomatic infections tend to be more contagious than asymptomatic ones, providing one reason as to why asymptomatic cases transmit SARS-CoV-2 at lower relative rates, especially in close contact, despite similar rVLs and increased contact patterns. Accordingly, children (48-54% of symptomatic cases present with cough) (Lu et al., 2020b; Team and CDC COVID-19 Response Team, 2020) may be less contagious than adults (68-80%; Team and CDC COVID-19 Response Team, 2020) based on tendencies of symptomatology rather than rVL. Conversely, coughing sheds few virions via smaller aerosols. While singing and talking loudly, highly infectious cases can shed tens to hundreds of SARS-CoV-2 virions/min via long-range and buoyant aerosols.
Our study has limitations. The systematic search found a limited number of studies reporting quantitative specimen measurements from the presymptomatic period, meaning that these estimates may be sensitive to sampling bias. Although additional studies have reported semiquantitative metrics (cycle thresholds), these data were excluded because they cannot be compared on an absolute scale due to batch effects (Han et al., 2021), limiting use in compound analyses. In addition, our models considered virion partitioning during atomization to be a Poisson process, which stochastically associates partitioning with particle volume. Partitioning mechanisms associated with surface area, perhaps such as film bursting (Bird et al., 2010;Johnson and Morawska, 2009), may enrich the quantities of virions in smaller aerosols, based on their surface area-to-volume ratio. As severe COVID-19 is associated with high, persistent SARS-CoV-2 shedding in the lower respiratory tract (Chen et al., 2021) and small particles are typically generated there (Johnson et al., 2011), severe cases may also expel higher quantities of virions via smaller aerosols.
Furthermore, this study considered population-level estimates of the infectious periods, viability proportions and profiles for respiratory particles, which omit individual or environmental variation. Studies differ in their measurements of the emission rates and size distributions of the particles expelled during respiratory activities (Johnson et al., 2011; Schijven et al., 2020). Their characterization methods may prompt these differences, or they may be due to individual variation, including from distinctions in respiratory capacity, especially for young children, and phonetic tendencies (Asadi et al., 2020). Some patients shed SARS-CoV-2 with diminishing viability soon after symptom onset (Wölfel et al., 2020), whereas others produce replication-competent virus for weeks (van Kampen et al., 2021). The proportion of viable SARS-CoV-2 in respiratory particles, and how case characteristics or environmental factors influence it, remains under investigation (Fears et al., 2020; Lednicky et al., 2020; Morris et al., 2020). Cumulatively, these sources of variation may influence the shedding model estimates, further increasing heterogeneity in individual infectiousness.
Taken together, our findings provide a potential path forward for disease control. While talking, singing and coughing, our models indicate that SARS-CoV-2 is shed via droplets (55.6-59.4% of shed virions), short-range aerosols (30.1-34.9%), long-range aerosols (7.7-8.3%) and buoyant aerosols (0.01-6.5%). Transmission, however, requires exposure. For direct transmission, droplets tend to be sprayed ballistically onto susceptible tissue. Aerosols can be inhaled, may penetrate more deeply into the lungs and more easily facilitate superspreading events. However, with short durations of stay in well-ventilated areas, the exposure risk for both droplets and aerosols remains correlated with proximity to infectious cases (Liu et al., 2017a; Prather et al., 2020). Strategies to abate infection should limit crowd numbers and duration of stay while reinforcing distancing, low-voice amplitudes and widespread mask usage; well-ventilated settings can be recognized as lower-risk venues. Coughing can shed considerable quantities of virions, while rVL tends to peak at 1 DFSO and can be high throughout the infectious period. Thus, immediate, sustained self-isolation upon illness is crucial to curb transmission from symptomatic cases. Collectively, our analyses highlight the role of cases with high rVLs in propelling the COVID-19 pandemic. While diagnosing COVID-19, qRT-PCR can also triage contact tracing, prioritizing these patients: for nasopharyngeal swabs immersed in 1 ml of transport media, ≥7.14 (95% CI: 7.07-7.22) log10 copies/ml corresponds to the top 20% of COVID-19 cases for variants before August 2020. Doing so may identify asymptomatic and presymptomatic infections more efficiently, a key step towards mitigation and elimination as the pandemic continues.

Search strategy, selection criteria and data collection
We undertook a systematic review and prospectively submitted the protocol for registration on PROSPERO (registration number, CRD42020204637). Other than the title of this study, we have followed PRISMA reporting guidelines (Moher et al., 2009). The systematic review was conducted according to Cochrane methods guidance (Higgins et al., 2019).
The search included papers that (i) reported positive, quantitative measurements (copies/ml or an equivalent metric) of SARS-CoV-2, SARS-CoV-1 or A(H1N1)pdm09 in human respiratory specimens (endotracheal aspirate [ETA], nasopharyngeal aspirate [NPA], nasopharyngeal swab [NPS], oropharyngeal swab [OPS], posterior oropharyngeal saliva [POS] and sputum [Spu]) from COVID-19, SARS or A(H1N1)pdm09 cases; (ii) reported data that could be extracted from the estimated infectious periods of SARS-CoV-2 (defined as −3 to +10 DFSO for symptomatic cases and 0 to +10 days from the day of laboratory diagnosis for asymptomatic cases), SARS-CoV-1 (defined as 0 to +20 DFSO or the equivalent asymptomatic period) or A(H1N1)pdm09 (defined as −2 to +9 DFSO for symptomatic cases and 0 to +9 days from the day of laboratory diagnosis for asymptomatic cases); and (iii) reported data for two or more cases with laboratory-confirmed COVID-19, SARS or A(H1N1)pdm09 based on World Health Organization (WHO) case definitions. Quantitative specimen measurements were considered after RNA extraction for diagnostic sequences of SARS-CoV-2 (Orf1b, N, RdRp and E genes), SARS-CoV-1 (Orf1b, N and RdRp genes) and A(H1N1)pdm09 (HA and M genes).
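The infectious-period windows in criterion (ii) can be expressed as a small lookup, which may help when checking whether a specimen's sampling day is eligible. This is a sketch rather than the screening tool used in the review. The names are illustrative, and the asymptomatic window for SARS-CoV-1 is assumed to be 0 to +20 days from laboratory diagnosis, as an interpretation of "the equivalent asymptomatic period".

```python
# Inclusive day ranges. Symptomatic days are DFSO; asymptomatic days are
# counted from the day of laboratory diagnosis.
INFECTIOUS_PERIODS = {
    "SARS-CoV-2": {"symptomatic": (-3, 10), "asymptomatic": (0, 10)},
    # Asymptomatic window for SARS-CoV-1 is an assumption (see lead-in).
    "SARS-CoV-1": {"symptomatic": (0, 20), "asymptomatic": (0, 20)},
    "A(H1N1)pdm09": {"symptomatic": (-2, 9), "asymptomatic": (0, 9)},
}


def in_infectious_period(virus: str, case_type: str, day: float) -> bool:
    """Return True if a sampling day falls inside the estimated infectious period."""
    start, end = INFECTIOUS_PERIODS[virus][case_type]
    return start <= day <= end


print(in_infectious_period("SARS-CoV-2", "symptomatic", -3))
print(in_infectious_period("A(H1N1)pdm09", "symptomatic", 10))
```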
Studies were excluded, in the following order, if they (i) studied an ineligible disease; (ii) had an ineligible study design, including those that were reviews of evidence (e.g., scoping, systematic or narrative), did not include primary clinical human data, reported data for less than two cases due to an increased risk of selection bias, were incomplete (e.g., ongoing clinical trials), did not report an RNA extraction step before measurement or were studies of environmental samples; (iii) reported an ineligible metric for specimen concentration (e.g., qualitative RT-PCR or cycle threshold [Ct] values without calibration included in the study); (iv) reported quantitative measurements from an ineligible specimen type (e.g., blood specimens, pooled specimens or self-collected POS or Spu patient specimens in the absence of a healthcare professional); (v) reported an ineligible sampling period (consisted entirely of data that could not be extracted from within the infectious period); or (vi) were duplicates of an included study (e.g., preprinted version of a published paper or duplicates not identified by Covidence). We included data from control groups receiving standard of care in interventional studies but excluded data from the intervention group. Patients in the intervention group are, by definition, systematically different from general case populations because they receive therapies not being widely used for treatment, which may influence virus concentrations. Interventional studies examining the comparative effectiveness of two or more treatments were excluded for the same reason. Studies exclusively reporting semiquantitative measurements (e.g., Ct values) of specimen concentration were excluded as these measurements are sensitive to batch and instrument variation and, without proper calibration, cannot be compared on an absolute scale across studies (Han et al., 2021).
We searched, without the use of filters or language restrictions, the following sources: MEDLINE (via Ovid, 1946 to 7 August 2020), EMBASE (via Ovid, 1974 to 7 August 2020), Cochrane Central Register of Controlled Trials (via Ovid, 1991 to 7 August 2020), Web of Science Core Collection (including Science Citation Index Expanded, 1900 to 7 August 2020; Social Sciences Citation Index, 1900 to 7 August 2020; Arts & Humanities Citation Index, 1975 to 7 August 2020; Conference Proceedings Citation Index -Science, 1990 to 7 August 2020; Conference Proceedings Citation Index -Social Sciences & Humanities, 1990 to 7 August 2020; and Emerging Sources Citation Index, 2015 to 7 August 2020), as well as medRxiv and bioRxiv (both searched through Google Scholar via the Publish or Perish program, to 7 August 2020). We also gathered studies by searching through the reference lists of review articles identified by the database search, by searching through the reference lists of included articles, through expert recommendation (by Eric J. Topol and Akiko Iwasaki on Twitter) and by hand-searching through journals (Nature, Nat. Med., Science, NEJM, Lancet, Lancet Infect. Dis., JAMA, JAMA Intern. Med. and BMJ). A comprehensive search was developed by a librarian, which included subject headings and keywords. The search strategy had three main concepts (disease, specimen type and outcome), and each concept was combined using the appropriate Boolean operators. The search was tested against a sample set of known articles that were pre-identified. The line-by-line search strategies for all databases are included in Figure 1-source data 1, Figure 1-source data 2, Figure 1-source data 3, Figure 1-source data 4, Figure 1-source data 5. The search results were exported from each database and uploaded to the Covidence online system for deduplication and screening.
Two authors independently screened titles and abstracts, reviewed full texts, collected data and assessed risk of bias via Covidence and a hybrid critical appraisal checklist based on the Joanna Briggs Institute (JBI) tools for case series, analytical cross-sectional studies and prevalence studies (Moola et al., 2020; Munn et al., 2019; Munn et al., 2015). To evaluate the sample size in a study, we used the following calculation:

$$n^{*} = \frac{z^{2}\sigma^{2}}{d^{2}} \qquad (1)$$

where $n^{*}$ is the sample size threshold, $z$ is the z-score for the level of confidence (95%), $\sigma$ is the standard deviation (assumed to be 3 log10 copies/ml, one quarter of the full range of rVLs) and $d$ is the marginal error (assumed to be 1 log10 copies/ml, based on the minimum detection limit for qRT-PCR across studies) (Johnston et al., 2019). The hybrid JBI critical appraisal checklist is shown in the Appendix. Studies were considered to have low risk of bias if they met the majority of the items, indicating that the estimates were likely to be correct for the target population. Inconsistencies were resolved by discussion and consensus.  Yang et al., 2011), and data were collected from each study. For preprinted studies that were published as journal articles before the revised submission of this manuscript, we included the citation for the journal article. Descriptive statistics on quantitative specimen measurements were collected from confirmed cases directly if reported numerically or using WebPlotDigitizer 4.3 (https://apps.automeris.io/wpd/) if reported graphically. Individual specimen measurements were collected directly if reported numerically or, when the data were clearly represented, using the tool if reported graphically. We also collected the relevant numbers of cases, types of cases, reported treatments, volumes of transport media, numbers of specimens and DFSO (for symptomatic cases) or day relative to initial laboratory diagnosis (for asymptomatic cases) on which each specimen was taken.
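As a rough illustration of the sample-size check described above, the following sketch assumes the standard margin-of-error form n* = (zσ/d)², a two-sided 95% z-score of 1.96 and rounding up to the next whole case. The function name and the rounding rule are illustrative assumptions, not details taken from the study.

```python
import math


def sample_size_threshold(z: float = 1.96, sigma: float = 3.0, d: float = 1.0) -> int:
    """Smallest whole sample size satisfying n* = (z * sigma / d)^2.

    z     : z-score for the confidence level (1.96 assumed for 95%)
    sigma : assumed standard deviation, in log10 copies/ml
    d     : marginal error, in log10 copies/ml
    """
    n_star = (z * sigma / d) ** 2
    return math.ceil(n_star)  # rounding up is an assumption of this sketch


if __name__ == "__main__":
    # With the assumptions stated in the text (sigma = 3, d = 1)
    print(sample_size_threshold())
```

Under these assumptions the threshold depends only on the ratio σ/d, so halving the tolerated marginal error would quadruple the required number of samples.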
Hospitalized cases were defined as those being tested in a hospital setting and then admitted. Non-admitted cases were defined as those being tested in a hospital setting but not admitted. Community cases were defined as those being tested in a community setting. Symptomatic, presymptomatic and asymptomatic infections were defined as in the study. Based on rare description in contributing studies, paucisymptomatic infections, when described, were included with symptomatic ones. Pediatric cases were defined as those below 18 years of age or as defined in the study. Adult cases were defined as those 18 years of age or higher or as defined in the study.

Calculation of rVLs from specimen measurements
In this study, viral concentrations in respiratory specimens were denoted as specimen measurements, whereas viral concentrations in the respiratory tract were denoted as rVLs. To determine rVLs, each collected quantitative specimen measurement was converted to rVL based on the dilution factor. For example, measurements from swabbed specimens (NPS and OPS) typically report the RNA concentration in viral transport media. Based on the expected uptake volume for swabs (0.128 ± 0.031 ml, mean ± SD) (Warnke et al., 2014) or reported collection volume for expelled fluid in the study (e.g., 0.5-1 ml) along with the reported volume of transport media in the study (e.g., 1 ml), we calculated the dilution factor for each respiratory specimen to estimate the rVL. If the diluent volume was not reported, then the dilution factor was calculated assuming a volume of 1 ml (NPS and OPS), 2 ml (POS and ETA) or 3 ml (NPA) of transport media (Lavezzo et al., 2020; Poon et al., 2004). Unless dilution was reported for Spu specimens, we used the specimen measurement as the rVL (Wölfel et al., 2020). The non-reporting of diluent volume was noted as an element increasing risk of bias in the hybrid JBI critical appraisal checklist. Specimen measurements (based on instrumentation, calibration, procedures and reagents) are not standardized and, as DFSO is typically based on patient recall, there is also inherent uncertainty in these values. While the above procedures (including only quantitative measurements after extraction as an inclusion criterion, considering assay detection limits and correcting for specimen dilution) have considered many of these factors, non-standardization remains an inherent limitation in the variability of specimen measurements.
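The conversion from specimen measurement to rVL can be sketched as follows. This is a minimal example that assumes the dilution factor is the ratio of total volume (specimen plus transport media) to specimen volume; that formula and the function names are assumptions of this sketch, since the text states only that a dilution factor was calculated from these volumes.

```python
import math

# Mean swab uptake volume quoted in the text (Warnke et al., 2014), in ml
SWAB_UPTAKE_ML = 0.128


def dilution_factor(specimen_volume_ml: float, media_volume_ml: float) -> float:
    """Assumed form: (specimen + media) / specimen."""
    return (specimen_volume_ml + media_volume_ml) / specimen_volume_ml


def rvl_log10(specimen_log10: float, specimen_volume_ml: float, media_volume_ml: float) -> float:
    """Convert a specimen measurement (log10 copies/ml) to an rVL (log10 copies/ml)."""
    factor = dilution_factor(specimen_volume_ml, media_volume_ml)
    return specimen_log10 + math.log10(factor)


if __name__ == "__main__":
    # An NPS reading of 5 log10 copies/ml in 1 ml of transport media
    print(round(rvl_log10(5.0, SWAB_UPTAKE_ML, 1.0), 2))
```

Under this assumed form, a swab immersed in 1 ml of media corresponds to a correction of a little under 1 log10 copies/ml.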

Meta-regression of k and heterogeneity in rVL
To assess the relationship between k and heterogeneity in rVL, we performed a univariate meta-regression ($\log k = a \cdot \mathrm{SD} + b$, where $a$ is the slope for association and $b$ is the intercept) between pooled estimates of k (based on studies describing community transmission) for COVID-19 (k = 0.409) (Adam et al., 2020; Tariq et al., 2020; Zhang et al., 2020b; Laxminarayan et al., 2020; Bi et al., 2020; Endo et al., 2020; Riou and Althaus, 2020), SARS (k = 0.165) (Lloyd-Smith et al., 2005) and A(H1N1)pdm09 (k = 8.155) (Brugger and Althaus, 2020; Roberts and Nishiura, 2011) and the SD of the rVLs in contributing studies. Since SD was the metric, we used a fixed-effects model. For weighting in the meta-regression, we used the proportion of rVL samples from each study relative to the entire systematic dataset ($W_i = n_i/n_{\mathrm{total}}$). All calculations were performed in units of log10 copies/ml. As the meta-regression used pooled estimates of k for each infection, it assumed that there was no correlated bias to k across contributing studies. The limit of detection for qRT-PCR instruments used in the included studies did not significantly affect the analysis of heterogeneity in rVL as these limits tended to be below the values found for specimens with low virus concentrations. The meta-regression was conducted using all contributing studies and showed a weak association. Meta-regression was also conducted using studies that had low risk of bias according to the hybrid JBI critical appraisal checklist and showed a strong association. The p-value for association was obtained using the meta-regression slope t-test for $a$, the effect estimate. While there is intrinsic measurement error in virus quantitation, based on the systematic review protocol and study design (as described above), this error should similarly increase heterogeneity in rVL for each virus, and the difference in heterogeneity in rVL between viruses should arise from the viruses.
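The weighted fit of log k against SD can be illustrated with a closed-form weighted least-squares line. This is a sketch only: it assumes that each contributing study supplies one (SD, log k) pair weighted by its share of samples, and the function name is hypothetical. No study data are reproduced here.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = a * x + b.

    x : per-study SD of rVL (log10 copies/ml)
    y : log k assigned to each study (pooled k of the corresponding virus)
    w : weights, e.g. n_i / n_total
    Returns (a, b): slope and intercept.
    """
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


if __name__ == "__main__":
    # Synthetic points lying exactly on y = -2x + 1 (not data from the study)
    a, b = weighted_linear_fit([1, 2, 3, 4], [-1, -3, -5, -7], [0.1, 0.2, 0.3, 0.4])
    print(a, b)
```

A negative slope in such a fit would correspond to greater rVL heterogeneity being associated with smaller k, that is, greater overdispersion.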

Meta-analysis of rVLs
Based on the search design and composition of contributing studies, the meta-analysis overall estimates were the expected SARS-CoV-2, SARS-CoV-1 and A(H1N1)pdm09 rVL when encountering a COVID-19, SARS or A(H1N1)pdm09 case, respectively, during their infectious period. Pooled estimates and 95% CIs for the expected rVL of each virus across their infectious period were calculated using a random-effects meta-analysis (DerSimonian and Laird method). For studies reporting summary statistics in medians and interquartile or total ranges, we derived estimates of the mean and variance and calculated the 95% CIs (Wan et al., 2014). All calculations were performed in units of log10 copies/ml. Between-study heterogeneity in meta-analysis was assessed using Cochran's Q test and the $I^2$ and $\tau^2$ statistics. If significant between-study heterogeneity in meta-analysis was encountered, sensitivity analysis based on the risk of bias of contributing studies was performed. The meta-analyses were conducted using STATA 14.2 (StataCorp LLC, College Station, TX, USA).
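For readers unfamiliar with the DerSimonian and Laird estimator, the pooling step can be sketched in a few lines. This is a generic textbook implementation, not the STATA routine used in the study; the function name is hypothetical and the inputs are per-study mean rVLs with the variances of those means.

```python
import math


def dersimonian_laird(means, variances):
    """Random-effects pooled estimate (DerSimonian and Laird).

    means     : per-study mean rVL (log10 copies/ml)
    variances : per-study variance of the mean (squared standard error)
    Returns (pooled, (ci_low, ci_high), tau2, i2_percent, q).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * m for wi, m in zip(w, means)) / sw
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))  # Cochran's Q
    df = len(means) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * m for wi, m in zip(w_star, means)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2, i2, q


if __name__ == "__main__":
    # Two synthetic studies (not data from the study)
    print(dersimonian_laird([4.0, 6.0], [0.01, 0.01]))
```

When the between-study variance is estimated as zero, the weights reduce to inverse-variance weights and the result coincides with a fixed-effects estimate.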

Age and symptomatology subgroup analyses of SARS-CoV-2 rVLs
The overall estimate for each subgroup was the expected rVL when encountering a case of that subgroup during the infectious period. Studies reporting data exclusively from a subgroup of interest were directly included in the analysis after rVL estimations. For studies in which data for these subgroups constituted only part of its dataset, rVLs from the subgroup were extracted to calculate the mean, variance and 95% CIs. Random-effects meta-analysis was performed as described above. For meta-analyses of pediatric and asymptomatic COVID-19 cases, contributing studies had low risk of bias, and no risk-of-bias sensitivity analyses were performed for these subgroups.

Distributions of rVL
We pooled the entirety of individual sample data in the systematic dataset by disease, COVID-19 subgroups and DFSO. For analyses of SARS-CoV-2 dynamics across disease course, we included estimated rVLs from negative qRT-PCR measurements of respiratory specimens for cases that had previously been quantitatively confirmed to have COVID-19. These rVLs were estimated based on the reported assay detection limit in the respective study. Probability plots and modified Kolmogorov-Smirnov tests used the Blom scoring method and were used to determine the suitability of normal, lognormal, gamma and Weibull distributions to describe the distribution of rVLs for SARS-CoV-2, SARS-CoV-1 and A(H1N1)pdm09. For each virus, the data best conformed to Weibull distributions, which are described by the probability density function

$$f(\nu \mid a, b) = \frac{a}{b}\left(\frac{\nu}{b}\right)^{a-1} e^{-(\nu/b)^{a}} \qquad (2)$$

where $a$ is the shape factor, $b$ is the scale factor and $\nu$ is rVL ($\nu \geq 0$ log10 copies/ml). Weibull distributions were fitted on the entirety of collected individual sample data for the respective category. Since individual specimen measurements could not be collected from all studies, there was a small bias on the mean estimate for each fitted distribution. Thus, for the curves shown in Figure 4B, C, the mean of the Weibull distributions summarized in Figure 4-figure supplement 2 was adjusted to be the subgroup meta-analysis estimate for correction; the SD and distribution around that mean remained consistent.
For each Weibull distribution, the value of the rVL at the $x$th percentile was determined using the quantile function,

$$\nu_x = b\left[-\ln(1 - x)\right]^{1/a} \qquad (3)$$

For cp curves, we used Equation (3) to determine rVLs from the 1st cp to the 99th cp (step size, 1%). Curve fitting to Equation (2) and calculation of Equation (3) and its 95% CI was performed using the Distribution Fitter application in Matlab R2019b (MathWorks, Inc, Natick, MA, USA).
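The case percentile (cp) curves can be reproduced in outline with the closed-form Weibull quantile function. The sketch below assumes the standard two-parameter Weibull form with shape a and scale b, and takes the percentile as a proportion between 0 and 1. The shape and scale values in the usage lines are placeholders, not fitted values from the study.

```python
import math


def weibull_pdf(v: float, shape: float, scale: float) -> float:
    """Two-parameter Weibull density for rVL v >= 0 (log10 copies/ml)."""
    if v < 0:
        return 0.0
    return (shape / scale) * (v / scale) ** (shape - 1) * math.exp(-((v / scale) ** shape))


def weibull_quantile(x: float, shape: float, scale: float) -> float:
    """rVL at case percentile x, with x given as a proportion in (0, 1)."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    return scale * (-math.log(1.0 - x)) ** (1.0 / shape)


def cp_curve(shape: float, scale: float):
    """rVLs from the 1st to the 99th cp in 1% steps."""
    return [weibull_quantile(p / 100.0, shape, scale) for p in range(1, 100)]


if __name__ == "__main__":
    # Placeholder parameters for illustration only
    curve = cp_curve(shape=3.5, scale=7.0)
    print(round(curve[49], 2), round(curve[89], 2))  # 50th and 90th cp
```

Because the quantile function is available in closed form, no numerical inversion of the cumulative distribution is needed.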

Viral kinetics
To model SARS-CoV-2 kinetics during respiratory infection, we used a mechanistic epithelial cell-limited model for the respiratory tract (Baccam et al., 2006), based on the system of differential equations:

$$\frac{dT}{dt} = -\beta T V \qquad (4)$$

$$\frac{dI}{dt} = \beta T V - \delta I \qquad (5)$$

$$\frac{dV}{dt} = pI - cV \qquad (6)$$

where $T$ is the number of uninfected target cells, $I$ is the number of productively infected cells, $V$ is the rVL, $\beta$ is the infection rate constant, $p$ is the rate at which airway epithelial cells shed virus to the extracellular fluid, $c$ is the clearance rate of virus and $\delta$ is the clearance rate of productively infected cells. Using these parameters, the viral half-life in the respiratory tract ($t_{1/2} = \ln 2/c$) and the half-life of productively infected cells ($t_{1/2} = \ln 2/\delta$) could be estimated. Moreover, the cellular basic reproductive number (the expected number of secondary infected cells from a single productively infected cell placed in a population of susceptible cells) was calculated by

$$R_0 = \frac{p \beta T_0}{c \delta} \qquad (7)$$

For initial parameterization, Equations (4)-(6) were simplified according to a quasi-steady state approximation (Ikeda et al., 2016) to

$$\frac{dT}{dt} = -\beta T V \qquad (8)$$

$$\frac{dV}{dt} = r T V - \delta V \qquad (9)$$

where $r = p\beta/c$, for a form with greater numerical stability. The system of differential equations was fitted on the mean estimates of SARS-CoV-2 rVL between -2 and 10 DFSO using the entirety of individual sample data in units of copies/ml. Numerical analysis was implemented using the Fit ODE app in OriginPro 2019b (OriginLab Corporation, Northampton, MA, USA) via the Runge-Kutta method and initial parameters $V_0$, $I_0$ and $T_0$ of 4 copies/ml, 0 cells and $5 \times 10^{7}$ cells, respectively, for the range -5 to 10 DFSO. The analysis was first performed with Equations (8) and (9). These output parameters were then used to initialize final analysis using Equations (4)-(6), where the estimates for $\beta$ and $\delta$ were input as fixed and variable parameters, respectively. The fitted line and its coefficient of determination ($r^2$) were presented. The estimated half-life of SARS-CoV-2 RNA has a skewed 95% CI (Figure 4-figure supplement 4).
As c is in the denominator of the equation for half-life ($t_{1/2} = \ln 2/c$), $t_{1/2}$ is sensitive to c below 1, which is the case for its lower 95% CI (Figure 4-figure supplement 4) and the source of the skew.
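The target cell-limited model can be integrated with a classical fourth-order Runge-Kutta scheme, as in the sketch below. It assumes the standard form of the model from Baccam et al. (2006), with dT/dt = -βTV, dI/dt = βTV - δI and dV/dt = pI - cV. The initial state follows the values given in the text, but the rate constants in PARAMS are hypothetical placeholders chosen only to produce a plausible curve. They are not the fitted estimates from the study, and this is not the OriginPro fitting procedure.

```python
import math

# HYPOTHETICAL rate constants (per day); not the study's fitted values
PARAMS = {"beta": 3e-8, "delta": 0.6, "p": 20.0, "c": 5.0}
# Initial state (T0 cells, I0 cells, V0 copies/ml) as given in the text
STATE0 = (5e7, 0.0, 4.0)


def derivatives(state, beta, delta, p, c):
    """Right-hand side of the model: returns (dT/dt, dI/dt, dV/dt)."""
    T, I, V = state
    infection = beta * T * V
    return (-infection, infection - delta * I, p * I - c * V)


def rk4_step(state, dt, beta, delta, p, c):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = derivatives(state, beta, delta, p, c)
    k2 = derivatives(shifted(k1, dt / 2), beta, delta, p, c)
    k3 = derivatives(shifted(k2, dt / 2), beta, delta, p, c)
    k4 = derivatives(shifted(k3, dt), beta, delta, p, c)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * g + d)
        for s, a, b, g, d in zip(state, k1, k2, k3, k4)
    )


def simulate(days, dt, state0, beta, delta, p, c):
    """Return the list of (T, I, V) states from t = 0 to t = days."""
    trajectory = [state0]
    for _ in range(round(days / dt)):
        trajectory.append(rk4_step(trajectory[-1], dt, beta, delta, p, c))
    return trajectory


def cellular_r0(beta, delta, p, c, T0):
    """Cellular basic reproductive number, p * beta * T0 / (c * delta)."""
    return p * beta * T0 / (c * delta)


def half_life(rate):
    """Half-life for a first-order clearance rate, ln 2 / rate."""
    return math.log(2) / rate


if __name__ == "__main__":
    traj = simulate(15.0, 0.01, STATE0, **PARAMS)
    peak_v = max(state[2] for state in traj)
    print(f"peak rVL ~ {math.log10(peak_v):.1f} log10 copies/ml")
    print(f"viral half-life ~ {half_life(PARAMS['c']):.2f} days")
```

The half-life helper also illustrates the skew noted above: because the rate sits in the denominator, small clearance rates translate into disproportionately long half-lives.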
To estimate the average incubation period, we extrapolated the kinetic model to 0 log10 copies/ml pre-symptom onset. To estimate the average duration of shedding, we extrapolated the model to 0 log10 copies/ml post-symptom onset. Unlike in experimental studies, this estimate for duration of shedding was not defined by assay detection limits. To estimate the average DFSO on which SARS-CoV-2 concentration reached diagnostic levels, we extrapolated the model pre-symptom onset to the equivalent of 1 and 3 log10 copies/ml (chosen as example assay detection limits) in specimen concentration for NPSs immersed in 1 ml of transport media, as described by the dilution factor estimation above. The average time from respiratory infection to reach diagnostic levels was then calculated by subtracting these values from the estimated average incubation period. The extrapolated time for SARS-CoV-2 to reach diagnostic concentrations in the respiratory tract should be validated in tracing studies, in which contacts are prospectively subjected to daily sampling.

Likelihood of respiratory particles containing virions
To calculate an unbiased estimator for viral partitioning (the expected number of viable copies in an expelled particle at a given size), we multiplied rVLs with the volume equation for spherical particles during atomization and the estimated viability proportion, according to the following equation:

$$\lambda = \frac{\pi \rho v_p \gamma \nu d^{3}}{6} \qquad (10)$$

where $\lambda$ is the expectation value, $\rho$ is the material density of the respiratory particle (997 kg/m³), $v_p$ is the volumetric conversion factor (1 ml/g), $\gamma$ is the viability proportion, $\nu$ is the rVL and $d$ is the hydrated diameter of the particle during atomization.
The model assumed the viability proportion was 0.1% as a population-level estimate. For influenza, approximately 0.1% of copies in particles expelled from the respiratory tract represent viable virus (Yan et al., 2018), which is equivalent to one viable copy in 3 log10 copies/ml for rVL or, after dilution in transport media, roughly one in 4 log10 copies/ml for specimen concentration. Respiratory specimens taken from influenza cases show positive cultures for specimen concentrations down to 4 log10 copies/ml. Likewise, for COVID-19 cases, recent reports also show culture-positive respiratory specimens with SARS-CoV-2 concentrations down to 4 log10 copies/ml (Wölfel et al., 2020), including from pediatric (L'Huillier et al., 2020) and asymptomatic (Arons et al., 2020) cases. Moreover, replication-competent SARS-CoV-2 has been found in respiratory specimens taken throughout the respiratory tract (mouth, nasopharynx, oropharynx and lower respiratory tract) (Jeong et al., 2020; Wölfel et al., 2020). Taken together, these considerations suggested that the assumption for viability proportion (0.1%) was suitable to model the likelihood of respiratory particles containing viable SARS-CoV-2. In accordance with the discussion above, the model did not differentiate this population-level viability estimate based on age, symptomatology or sites of atomization. Based on the relative relationship between the residence time of expelled particles before assessment (~5 s) (Yan et al., 2018), we took the viability proportion to be for equilibrated particles.
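Likelihood profiles were determined using Poisson statistics, as described by the probability mass function

$$P(k) = \frac{\lambda^{k} e^{-\lambda}}{k!} \qquad (11)$$

where $k$ is the number of virions partitioned within the particle. For the expectation value $\lambda$, 95% CIs were determined using the variance of its rVL estimate. To determine 95% CIs for likelihood profiles from the probability mass function, we used the delta method, which specifies

$$\mathrm{Var}\left[g(\lambda)\right] \approx \dot{g}(\lambda)^{\mathsf{T}} \, \sigma_D^{2} \, \dot{g}(\lambda) \qquad (12)$$

where $\sigma_D^{2}$ is the covariance matrix of $\lambda$ and $\dot{g}(\lambda)$ is the gradient of $g(\lambda)$. For the univariate Poisson distribution, $\sigma_D^{2} = \lambda$ and

$$\dot{g}(\lambda) = \frac{\lambda^{k-1} e^{-\lambda} (k - \lambda)}{k!} \qquad (13)$$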
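To make the partitioning calculation concrete, the sketch below computes the expected number of viable copies in a single particle and the corresponding Poisson probabilities. It assumes the expectation value takes the form λ = πρ·v_p·γ·ν·d³/6 with ν in copies/ml and d converted from micrometres to centimetres (1 cm³ = 1 ml). The constants follow the values quoted in the text; the function names are illustrative.

```python
import math

RHO_G_PER_ML = 0.997   # material density of the particle (997 kg/m^3 in the text)
V_P_ML_PER_G = 1.0     # volumetric conversion factor from the text
GAMMA = 0.001          # viability proportion (0.1%) assumed in the text


def expected_viable_copies(rvl_log10: float, hydrated_diameter_um: float) -> float:
    """Poisson mean: expected viable copies in one spherical particle."""
    d_cm = hydrated_diameter_um * 1e-4
    volume_ml = math.pi * d_cm ** 3 / 6.0  # sphere volume; 1 cm^3 = 1 ml
    return RHO_G_PER_ML * V_P_ML_PER_G * GAMMA * (10.0 ** rvl_log10) * volume_ml


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a particle contains exactly k virions."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def prob_at_least_one(lam: float) -> float:
    """Probability that a particle contains one or more virions."""
    return 1.0 - poisson_pmf(0, lam)


if __name__ == "__main__":
    # Illustrative inputs: rVL of 8 log10 copies/ml, 100-um hydrated particle
    lam = expected_viable_copies(8.0, 100.0)
    print(f"lambda = {lam:.4f}, P(k >= 1) = {prob_at_least_one(lam):.4f}")
```

Because λ scales with the cube of the diameter, a tenfold smaller particle at the same rVL carries a thousandfold lower expected number of viable copies.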

Rate profiles of particles expelled by respiratory activities
Distributions from the literature were used to determine the rate profiles of particles expelled during respiratory activities. For breathing, talking and coughing, we used data from Johnson et al., 2011.
For singing, we used data from Morawska et al., 2009 for smaller aerosols ($d_a$ < 20 μm) and used the profiles from talking for larger aerosols and droplets based on the oral cavity mechanism from Johnson et al., 2011. Rate profiles (particles/min or particles/cough) were calculated based on the corrected normalized concentration ($dC_n/d\log D_p$, in units of particles/cm³) at each discrete particle size, normalization (32 size channels per decade) for the aerodynamic particle sizer used, unit conversion (cm³ to l) and the sample flow rate (1 l/min). For coughing, the calculation assumed that participants coughed 10 times in the 30-s sampling interval. To determine the corrected normalized concentrations for breathing, we used a particle dilution factor of 4 and evaporation factor of 0.5, consistent with the other respiratory activities in Johnson et al., 2011. Breathing was taken to expel negligible quantities of larger respiratory particles based on the bronchiolar fluid film burst mechanism (Johnson et al., 2011). To account for intermittent breathing while talking and singing, the rate profiles for these activities included the contribution of aerosols expelled by breathing. We compared these rate profiles with those collected from talking loudly and talking quietly from Asadi et al., 2020. In our models, we took the diameter of dehydrated respiratory particles to be 0.3 times the initial size when atomized in the respiratory tract (Johnson et al., 2011; Lieber et al., 2021; Liu et al., 2017b). Equilibrium aerodynamic diameter was calculated by $d_a = d_p (\rho/\rho_0)^{1/2}$, where $d_p$ is the dehydrated diameter, $\rho$ is the material density of the respiratory particle and $\rho_0$ is the reference material density (1 g/cm³). Curves based on discrete particle measurements were connected using the nonparametric Akima spline function.
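The size and rate conversions in this paragraph can be sketched as below. The diameter relations follow the text (dehydrated diameter = 0.3 × hydrated diameter; aerodynamic diameter = dehydrated diameter × (ρ/ρ0)^(1/2)). The rate conversion is one possible reading of the described steps, assuming that the normalized concentration is divided by the 32 channels per decade to give a per-channel concentration, then multiplied by 1000 cm³/l and the 1 l/min flow rate. That reading, and the function names, are assumptions of this sketch.

```python
import math

SHRINK_FACTOR = 0.3        # dehydrated / hydrated diameter ratio from the text
RHO_G_PER_CM3 = 0.997      # particle material density (997 kg/m^3 in the text)
RHO_0_G_PER_CM3 = 1.0      # reference material density from the text
CHANNELS_PER_DECADE = 32   # size channels per decade of the particle sizer
CM3_PER_L = 1000.0
FLOW_L_PER_MIN = 1.0       # sample flow rate from the text


def dehydrated_diameter(hydrated_um: float) -> float:
    """Diameter after evaporation to equilibrium, in micrometres."""
    return SHRINK_FACTOR * hydrated_um


def aerodynamic_diameter(dehydrated_um: float) -> float:
    """Equilibrium aerodynamic diameter, in micrometres."""
    return dehydrated_um * math.sqrt(RHO_G_PER_CM3 / RHO_0_G_PER_CM3)


def particles_per_min(dcn_dlogdp_per_cm3: float) -> float:
    """Assumed conversion from normalized concentration to emission rate."""
    per_channel = dcn_dlogdp_per_cm3 / CHANNELS_PER_DECADE
    return per_channel * CM3_PER_L * FLOW_L_PER_MIN


if __name__ == "__main__":
    d_p = dehydrated_diameter(100.0)
    print(round(d_p, 1), round(aerodynamic_diameter(d_p), 2))
```

Since the particle density is close to the reference density, the aerodynamic and dehydrated diameters differ by well under 1% in this sketch.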

Shedding virions via respiratory droplets and aerosols
To model the respiratory shedding rate across particle size, rVL estimates and the hydrated diameters of particles expelled by a respiratory activity were input into Equation (10), and the output was then multiplied by the rate profile of the activity (talking, singing, breathing or coughing). To assess the relative contribution of aerosols and droplets to mediating respiratory viral shedding for a given respiratory activity, we calculated the proportion of the cumulative hydrated volumetric rate contributed by buoyant aerosols ($d_a$ ≤ 10 μm), long-range aerosols (10 μm < $d_a$ ≤ 50 μm), short-range aerosols (50 μm < $d_a$ ≤ 100 μm) and droplets ($d_a$ > 100 μm) for that respiratory activity. Since the Poisson mean was proportional to cumulative volumetric rate, this estimate of the relative contribution of aerosols and droplets to respiratory viral shedding was consistent among viruses and cps in the model.
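The size-class bookkeeping can be sketched as follows, using the aerodynamic diameter cut-offs of 10, 50 and 100 μm that separate buoyant, long-range and short-range aerosols from droplets. The input format (a list of hydrated diameter, aerodynamic diameter and particle rate per size bin) and the function names are assumptions made for illustration; the values in the usage lines are synthetic.

```python
import math


def size_class(d_a_um: float) -> str:
    """Classify a particle by its equilibrium aerodynamic diameter (um)."""
    if d_a_um <= 10.0:
        return "buoyant aerosol"
    if d_a_um <= 50.0:
        return "long-range aerosol"
    if d_a_um <= 100.0:
        return "short-range aerosol"
    return "droplet"


def volumetric_fractions(profile):
    """Share of the cumulative hydrated volumetric rate in each size class.

    profile: iterable of (hydrated_diameter_um, aerodynamic_diameter_um,
             particles_per_min) tuples, one per discrete size bin.
    """
    totals = {}
    for d_h, d_a, rate in profile:
        volume_rate = rate * math.pi * d_h ** 3 / 6.0  # um^3 per minute
        label = size_class(d_a)
        totals[label] = totals.get(label, 0.0) + volume_rate
    grand_total = sum(totals.values())
    return {label: value / grand_total for label, value in totals.items()}


if __name__ == "__main__":
    # Synthetic bins for illustration only (not measured rate profiles)
    bins = [(10.0, 3.0, 500.0), (100.0, 30.0, 20.0), (400.0, 120.0, 1.0)]
    print(volumetric_fractions(bins))
```

The fractions are taken over hydrated volume because the expected number of virions in a particle is proportional to its volume at the moment of atomization, while the classification uses the equilibrated aerodynamic size that governs travel.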
To determine the total respiratory shedding rate for a given respiratory activity across cp, we determined the cumulative hydrated volumetric rate (by summing the hydrated volumetric rates across particle sizes for that respiratory activity) of particle atomization and input it into Equation (10). Using rVLs and their variances as determined by the Weibull quantile functions, we then calculated the Poisson means and their 95% CIs at the different cps.
To assess the influence of heterogeneity in rVL on individual infectiousness, we first considered transmission of A(H1N1)pdm09 via aerosols (Cowling et al., 2013). The 50% human infectious dose (HID50) of aerosolized A(H1N1)pdm09 was taken to be 1-3 virions (Fabian et al., 2008). To determine the expected time required for an A(H1N1)pdm09 case to shed one virion via aerosols, we took the reciprocal of the Poisson means and their 95% CIs at the different cps of the estimated shedding rates. The expected time required for a COVID-19 case to shed one virion via aerosols or one virion via droplets or aerosols was determined in the same manner.
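The step from a cumulative volumetric rate to a shedding rate, and then to the expected time per virion, can be sketched as below. It assumes the Poisson mean is the product of density, volumetric conversion factor, viability proportion, rVL (in copies/ml) and the cumulative hydrated volume expelled per minute (in ml/min). The volumetric rate in the usage lines is a hypothetical placeholder, not a value from the study.

```python
def shedding_rate_per_min(
    rvl_log10: float,
    hydrated_volume_ml_per_min: float,
    gamma: float = 0.001,        # viability proportion (0.1%) from the text
    rho_g_per_ml: float = 0.997,  # particle density (997 kg/m^3 in the text)
    v_p_ml_per_g: float = 1.0,    # volumetric conversion factor from the text
) -> float:
    """Expected viable virions shed per minute for a respiratory activity."""
    copies_per_ml = 10.0 ** rvl_log10
    return rho_g_per_ml * v_p_ml_per_g * gamma * copies_per_ml * hydrated_volume_ml_per_min


def minutes_per_virion(rate_per_min: float) -> float:
    """Expected time to shed one virion: reciprocal of the Poisson mean rate."""
    if rate_per_min <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate_per_min


if __name__ == "__main__":
    # HYPOTHETICAL volumetric rate of 1e-6 ml/min, rVL of 9 log10 copies/ml
    rate = shedding_rate_per_min(9.0, 1e-6)
    print(f"{rate:.3f} virions/min, {minutes_per_virion(rate):.2f} min per virion")
```

Each 1 log10 difference in rVL between case percentiles changes the expected time per virion by a factor of 10, which is how rVL heterogeneity translates into heterogeneity in individual infectiousness in this kind of model.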

Supplementary files
Transparent reporting form
The following dataset was generated: Author (

Checklist items* *Descriptions of each item are included in the hybrid JBI critical appraisal checklist (Appendix). Y (green), U (yellow) and N (red) represent yes, unclear and no, respectively.