SARS-CoV-2 shedding dynamics across the respiratory tract, sex, and disease severity for adult and pediatric COVID-19

Background: Previously, we conducted a systematic review and analyzed the respiratory kinetics of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Chen et al., 2021). How age, sex, and coronavirus disease 2019 (COVID-19) severity interplay to influence the shedding dynamics of SARS-CoV-2, however, remains poorly understood. Methods: We updated our systematic dataset, collected individual case characteristics, and conducted stratified analyses of SARS-CoV-2 shedding dynamics in the upper (URT) and lower respiratory tract (LRT) across COVID-19 severity, sex, and age groups (aged 0–17 years, 18–59 years, and 60 years or older). Results: The systematic dataset included 1266 adults and 136 children with COVID-19. Our analyses indicated that high, persistent LRT shedding of SARS-CoV-2 characterized severe COVID-19 in adults. Severe cases tended to show slightly higher URT shedding post-symptom onset, but similar rates of viral clearance, when compared to nonsevere infections. After stratifying for disease severity, sex and age (including child vs. adult) were not predictive of respiratory shedding. The estimated accuracy for using LRT shedding as a prognostic indicator for COVID-19 severity was up to 81%, whereas it was up to 65% for URT shedding. Conclusions: Virological factors, especially in the LRT, facilitate the pathogenesis of severe COVID-19. Disease severity, rather than sex or age, predicts SARS-CoV-2 kinetics. LRT viral load may prognosticate COVID-19 severity in patients before the timing of deterioration and should do so more accurately than URT viral load. Funding: Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant, NSERC Senior Industrial Research Chair, and the Toronto COVID-19 Action Fund.

For insight into these questions, we conducted a systematic review on SARS-CoV-2 quantitation from respiratory specimens and developed a large, diverse dataset of viral loads and individual case characteristics. Stratified analyses then assessed SARS-CoV-2 shedding dynamics across the respiratory tract, age, sex, and COVID-19 severity.

Data sources and searches
Our systematic review identified studies reporting SARS-CoV-2 quantitation in respiratory specimens taken during the estimated infectious period (−3 to 10 days from symptom onset [DFSO]) (Wölfel et al., 2020). The systematic review protocol was based on our previous study (Chen et al., 2021a) and was prospectively registered on PROSPERO (registration number, CRD42020204637). The systematic review was conducted according to Cochrane methods guidance (Higgins et al., 2019). PRISMA reporting guidelines were followed (Moher et al., 2009).
Up to November 20, 2020, we searched, without the use of filters or language restrictions, the following sources: MEDLINE (Ovid, 1946 to November 20, 2020), EMBASE (Ovid, 1974 to November 20, 2020), Cochrane Central Register of Controlled Trials (via Ovid, 1991 to November 20, 2020), Web of Science Core Collection (up to November 20, 2020), and medRxiv and bioRxiv (both searched through Google Scholar via the Publish or Perish program, up to November 20, 2020). We also gathered studies by searching through the reference lists of review articles identified by the database search, by searching through the reference lists of included articles, through expert recommendation (by Eric J Topol and Akiko Iwasaki on Twitter) and by hand-searching through journals. A comprehensive search was developed by a librarian (ZP). The line-by-line search strategies for all databases are included in Figure 1-source data 1 to 5. The search results were exported from each database and uploaded to the Covidence online system (research resource identifier, RRID:SCR_016484) for deduplication and screening.

Data extraction and risk-of-bias assessment
Two authors (PZC and NB) independently collected data (specimen measurements taken between −3 and 10 DFSO, specimen type, volume of viral transport media [VTM], and case characteristics, including age, sex, and disease severity) from contributing studies and assessed risk of bias using a modified version of the Joanna Briggs Institute (JBI) tools for case series, analytical cross-sectional studies, and prevalence studies (Moola et al., 2020; Munn et al., 2019; Munn et al., 2015) (shown in the Appendix). Items were judged with responses to data inquiries, if authors responded.
Data were collected for individually reported specimens of known type, with known DFSO, and for COVID-19 cases with known age, sex or severity. Case characteristics were collected directly from contributing studies when reported individually or obtained via data request from the authors. Data from serially sampled asymptomatic cases were included, and the day of laboratory diagnosis was referenced as 0 DFSO (Lavezzo et al., 2020; Wölfel et al., 2020). Based on the modified JBI checklist, studies were considered to have low risk of bias if they met the majority of items and included item 1 (representative sample). Discrepancies were resolved by discussion and consensus. Studies at high or unclear risk of bias typically included samples that were not representative of the target population; did not report the VTM volume used; had non-consecutive inclusion for case series and cohort studies or did not use probability-based sampling for cross-sectional studies; and did not report the response rate.

Respiratory viral load
To enable analyses based on respiratory viral load (rVL, viral RNA concentration in the respiratory tract) and to account for between-study variation in specimen measurements, the rVL for each collected sample was estimated based on the specimen concentration (viral RNA concentration in the specimen) and its dilution factor in VTM. Typically, swabbed specimens (NPS and OPS) report the viral RNA concentration in VTM. Based on the VTM volume reported in the study along with the expected uptake volume for swabs (0.128 ± 0.031 ml, mean ± SD) (Warnke et al., 2014), we calculated the dilution factor for each respiratory specimen and then estimated the rVL. Similarly, liquid specimens (ETA, POS, and Spu) are often diluted in VTM, and the rVL was estimated based on the reported collection and VTM volumes. If the diluent volume was not reported, then VTM volumes of 1 ml (NPS and OPS) or 2 ml (POS and ETA) were assumed (Lavezzo et al., 2020). Unless dilution was reported, Spu specimens were taken as undiluted (Wölfel et al., 2020). The non-reporting of VTM volume was noted as an element increasing risk of bias in the modified JBI critical appraisal checklist. For laboratory-confirmed COVID-19 cases, negative specimen measurements were taken at the reported assay detection limit in the respective study.
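The dilution correction described above can be illustrated with a minimal Python sketch. It assumes that the dilution factor is the total volume (specimen plus VTM) divided by the specimen volume, which is a common convention but is not spelled out in the text; the function names and the example measurement are hypothetical.

```python
import math

# Mean swab uptake volume quoted in the text (Warnke et al., 2014), in ml.
SWAB_UPTAKE_ML = 0.128

def dilution_factor(specimen_ml: float, vtm_ml: float) -> float:
    """Assumed dilution factor: (specimen volume + VTM volume) / specimen volume."""
    if specimen_ml <= 0 or vtm_ml < 0:
        raise ValueError("specimen volume must be > 0 and VTM volume >= 0")
    return (specimen_ml + vtm_ml) / specimen_ml

def rvl_log10(specimen_log10: float, specimen_ml: float, vtm_ml: float) -> float:
    """Estimate rVL (log10 copies/ml) from a specimen concentration (log10 copies/ml)."""
    return specimen_log10 + math.log10(dilution_factor(specimen_ml, vtm_ml))

# Hypothetical swab measurement of 5.0 log10 copies/ml in 1 ml of VTM.
nps_rvl = rvl_log10(5.0, SWAB_UPTAKE_ML, vtm_ml=1.0)

# An undiluted sputum specimen keeps its measured concentration.
spu_rvl = rvl_log10(5.0, specimen_ml=1.0, vtm_ml=0.0)
```

Working in log10 units turns the correction into a simple additive shift, so it can be applied per specimen before any regression or distribution fitting.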

Case definitions
As the severity of clinical manifestations of COVID-19 and case-fatality rates tend to increase from children (aged 0–17 years) to younger adults (aged 18–59 years) to older adults (aged 60 years or older) (Onder et al., 2020; Zheng et al., 2020), the data were delineated by these three age groups. Cases were also categorized by sex.
U.S. National Institutes of Health guidance was used to categorize disease severity as nonsevere or severe (National Institutes of Health, 2021) (Appendix 1-table 1). The nonsevere group included those with asymptomatic infection (individuals who test positive via a molecular test for SARS-CoV-2 and report no symptoms consistent with COVID-19); mild illness (individuals who report any signs or symptoms of COVID-19, including fever, cough, sore throat, malaise, headache, muscle pain, nausea, vomiting, diarrhea, loss of taste and smell, but who do not have dyspnea or abnormal chest imaging); and moderate illness (individuals with clinical or radiographic evidence of LRT disease, fever >39.4 °C or SpO2 ≥94% on room air). The severe group included those with severe illness (individuals who have SpO2 <94% on room air, PaO2/FiO2 <300 mmHg, respiratory rate >30 breaths/min or lung infiltrates >50%) and critical illness (respiratory failure, septic shock, or multiple organ dysfunction).
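As an illustration only, the two-level severity grouping above could be encoded as follows. This is a sketch, not the authors' procedure: the `Case` fields and the `severity` function are hypothetical names, and treating a missing measurement as "criterion not met" is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Case:
    spo2_pct: Optional[float] = None              # SpO2 on room air, %
    pao2_fio2_mmhg: Optional[float] = None        # PaO2/FiO2 ratio, mmHg
    resp_rate_per_min: Optional[float] = None     # breaths per minute
    lung_infiltrates_pct: Optional[float] = None  # % of lung fields involved
    critical: bool = False  # respiratory failure, septic shock, or multiple organ dysfunction

def severity(case: Case) -> str:
    """Return 'severe' if any severe or critical criterion is met, else 'nonsevere'."""
    if case.critical:
        return "severe"
    criteria = [
        case.spo2_pct is not None and case.spo2_pct < 94,
        case.pao2_fio2_mmhg is not None and case.pao2_fio2_mmhg < 300,
        case.resp_rate_per_min is not None and case.resp_rate_per_min > 30,
        case.lung_infiltrates_pct is not None and case.lung_infiltrates_pct > 50,
    ]
    # Assumption: unreported measurements do not count toward severe illness.
    return "severe" if any(criteria) else "nonsevere"
```

Asymptomatic, mild, and moderate cases all fall into the nonsevere group here, mirroring the binary grouping used in the analyses.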

Regression analyses
To assess the respiratory shedding of SARS-CoV-2 and compare age, sex, or severity groups, we analyzed the data via normal linear regression (Hurst et al., 2020; Lucas et al., 2020). Previous studies have shown that SARS-CoV-2 shedding tends to diminish exponentially after 1 DFSO in the URT and, at least, after 4 DFSO in the LRT (Bernheim et al., 2020; Chen et al., 2021a; Wölfel et al., 2020). Although LRT shedding may peak before 4 DFSO, there are limited data near or before symptom onset. Hence, rVLs (in units of log10 copies/ml) between 1 and 10 DFSO for the URT, or 4 and 10 DFSO for the LRT, were fitted using linear regression with interaction:
$$V = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$$
where $V$ represents the rVL, $\alpha$ represents the estimated mean rVL (at 1 DFSO for URT or 4 DFSO for LRT) for the reference group, $X_1$ represents DFSO for the reference group, $X_2$ represents the comparison group, $\beta_1$ represents the effect of DFSO on rVL for the reference group, $\beta_2$ represents the effect of the comparison group on the main effect (mean rVL at 1 DFSO), and $\beta_3$ represents the interaction between DFSO and groups. Regression analyses were offset by DFSO such that mean rVLs at 1 DFSO for URT, or 4 DFSO for LRT, were compared between groups by the main effect (i.e., effect on the intercept in the regression t-test for $\beta_2$). Shedding dynamics were compared between groups by interaction (regression t-test for $\beta_3$). The statistical significance of viral clearance for each group was analyzed using simple linear regression (regression t-test on the slope). Each group in statistical analyses included all rVLs for which the relevant characteristic (LRT or URT, age group, sex, or disease severity) was ascertained at the individual level. Groups with small sample sizes were not compared, as these analyses are more sensitive to potential sampling error.
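To make the roles of the intercept, slope, group effect, and interaction term concrete, the sketch below computes their point estimates in plain Python. It relies on a standard property of least squares: when the comparison group is a binary indicator, the coefficients of the interaction model equal those obtained from fitting each group separately and taking differences. The data are made up and noise-free, the helper names are hypothetical, and the t-tests used in the study are not reproduced.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def interaction_coefficients(reference, comparison, offset):
    """Each argument is a list of (DFSO, rVL in log10 copies/ml) pairs.

    DFSO is shifted by `offset` so the intercept is the mean rVL at that day.
    """
    a_ref, b_ref = fit_line([d - offset for d, _ in reference],
                            [v for _, v in reference])
    a_cmp, b_cmp = fit_line([d - offset for d, _ in comparison],
                            [v for _, v in comparison])
    return {
        "alpha": a_ref,          # mean rVL of the reference group at the offset day
        "beta1": b_ref,          # clearance slope of the reference group
        "beta2": a_cmp - a_ref,  # group difference in mean rVL at the offset day
        "beta3": b_cmp - b_ref,  # group difference in slope (interaction)
    }

# Illustrative URT data, offset at 1 DFSO: same slope, different intercepts.
nonsevere = [(d, 8.0 - 0.5 * (d - 1)) for d in range(1, 11)]
severe = [(d, 8.5 - 0.5 * (d - 1)) for d in range(1, 11)]
coef = interaction_coefficients(nonsevere, severe, offset=1)
```

In this toy case the group effect is nonzero while the interaction is zero, which corresponds to higher shedding with a similar clearance rate.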
Regression models were extrapolated to 0 log10 copies/ml to estimate the total duration of shedding. Some clinical studies report shedding duration based on assay negativity, when the viral RNA concentration in the specimen reaches the detection limit of the assay (often between 1 and 4 log10 copies/ml), and these cases may continue to shed viral RNA. To show the relationship between the two approaches, we used our regression model for URT shedding and estimated the shedding duration to a specimen concentration of 3 log10 copies/ml when sampling was conducted with nasopharyngeal swabs (approximately equivalent to an rVL of 2.1 log10 copies/ml). Then, the estimated mean duration of URT shedding for severe cases was 20.8 (95% CI: 14.5–27.0) DFSO, while it was 20.3 (95% CI: 16.8–23.7) DFSO for nonsevere cases. These values are in line with those reported by studies considering the assay detection limit (Cevik et al., 2021), supporting our regression models, and can be compared with those reported in the body text.
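The extrapolation step amounts to solving the fitted line for the day at which it reaches a chosen endpoint. A minimal sketch, with a hypothetical function name and made-up coefficients rather than the fitted values from this study:

```python
def shedding_duration(mean_rvl_at_offset, slope_per_day, offset_dfso,
                      endpoint_log10=0.0):
    """DFSO at which the line rVL(t) = mean + slope * (t - offset) hits the endpoint."""
    if slope_per_day >= 0:
        raise ValueError("a negative slope (viral clearance) is required")
    return offset_dfso + (endpoint_log10 - mean_rvl_at_offset) / slope_per_day

# Illustrative URT fit: 8.0 log10 copies/ml at 1 DFSO, falling 0.5 log10 per day.
total = shedding_duration(8.0, -0.5, offset_dfso=1)
to_limit = shedding_duration(8.0, -0.5, offset_dfso=1, endpoint_log10=2.0)
```

Raising the endpoint from 0 log10 copies/ml to an assay-like limit shortens the estimated duration, which is the relationship between the two approaches that the paragraph above describes.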
Statistical analyses were performed using OriginPro 2019b (RRID:SCR_014212, OriginLab) and the General Linear regression app. p-Values below 0.05 were considered statistically significant.

Distribution analyses
Previously, our analyses found that SARS-CoV-2 rVLs best conform to Weibull distributions (Chen et al., 2021b). To assess heterogeneity in shedding in this study, rVL data were fitted to Weibull distributions. The Weibull quantile function and Weibull cumulative distribution function were used to estimate the rVL at a case percentile and the percentage of cases at a given rVL, respectively. Each distribution was fitted to groups that included all rVLs for which the relevant characteristic (LRT or URT, age group, sex, or disease severity) was ascertained at the individual level. Distribution fitting was performed using Matlab R2019b (RRID:SCR_001622, MathWorks) and the Distribution Fitter app.
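For readers without Matlab, the two Weibull functions named above have simple closed forms. The sketch below uses the standard two-parameter Weibull (scale and shape); the parameter values are hypothetical and are not the fitted values from this study.

```python
import math

def weibull_cdf(x, scale, shape):
    """Fraction of cases with rVL at or below x."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def weibull_quantile(p, scale, shape):
    """rVL at case percentile p, with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

# Hypothetical fit to rVLs on a given DFSO.
scale, shape = 7.0, 3.5
rvl_90th = weibull_quantile(0.90, scale, shape)   # rVL at the 90th case percentile
share_below = weibull_cdf(rvl_90th, scale, shape) # recovers the 0.90 percentile
```

The quantile function is the inverse of the cumulative distribution function, so the two calls round-trip.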

Prognostication accuracy
The fitted Weibull distributions were used to estimate the accuracy when using URT or LRT rVLs of SARS-CoV-2 as a prognostic indicator for COVID-19 severity. The overlapped area under the curve (AUC) and separated AUC were calculated using the rVL distributions for severe and nonsevere adult COVID-19. These calculations were performed for each DFSO and, separately, for the URT and LRT. The estimated maximal accuracy for prognostication at a given rVL threshold was then estimated by , where AUCseparated represents the AUC that was separated for the nonsevere and severe distributions. The 95% CIs for prognostication accuracy were estimated using the proportional 95% CIs in the respective Weibull cumulative distributions. As the Weibull cumulative distributions estimate the percentage of cases at a given rVL, they were also used to estimate the sensitivity and specificity at a given prognostic threshold of rVL. The cases with rVL lower than the prognostic threshold were predicted to have nonsevere COVID-19, whereas those with rVL above it were predicted to have severe COVID-19. Hence, we used the cumulative distributions for nonsevere and severe adult cases on a DFSO and calculated the true positive, false positive, false negative, and true negative rates across prognostic thresholds of rVL. Sensitivity and specificity were calculated based on these values. These analyses were coded in Matlab R2019b (RRID:SCR_001622, MathWorks) and are available at GitHub (copy archived at swh:1:rev:c96390f98f47f17939f3669c7c8fad96f9603e84, Chen, 2019).
Source data 2. Search strategy used for EMBASE.
Source data 3. Search strategy used for Cochrane Central.
Source data 4. Search strategy used for Web of Science Core Collection.
Source data 5. Search strategy used for medRxiv and bioRxiv.

Overview of contributing studies
The systematic search (Figure 1-source data 1, Figure 1-source data 2, Figure 1-source data 3, Figure 1-source data 4, Figure 1-source data 5) identified 5802 deduplicated results. After screening and full-text review, 26 studies met the inclusion criteria, and data were collected for individually reported specimens of known type and taken on a known DFSO for COVID-19 cases with known age, sex, or severity (Figure 1). From 1402 COVID-19 cases, we collected 1915 quantitative specimen measurements (viral RNA concentration in a respiratory specimen) of SARS-CoV-2 (Table 1) and used them to estimate rVLs (viral RNA concentration in the respiratory tract) (Figure 1-figure supplement 1). For pediatric cases, the search found only nonsevere infections and URT specimen measurements. Appendix 1-
After stratifying adults for disease severity, our analyses showed no significant differences in URT shedding levels or dynamics between sexes or age groups (Figure 2D,E, Figure 2-figure supplement 1). For severe disease, male and female cases had comparable mean rVLs at 1 DFSO (p = 0.326) and rates of SARS-CoV-2 clearance (p for interaction = 0.280). Similarly, for nonsevere illness, male and female cases had no significant difference in mean rVL at 1 DFSO (p = 0.085) or URT dynamics (p for interaction = 0.644). For nonsevere illness, younger and older adults had no significant difference in URT shedding levels at 1 DFSO (p = 0.294) or post-symptom onset dynamics (p for interaction = 0.100). For severe disease, the adult age groups showed similar mean rVLs at 1 DFSO (p = 0.915) and rates of viral clearance (p for interaction = 0.359).
Since cases with severe COVID-19 tend to deteriorate at a median of 10 DFSO (Solomon et al., 2020; Zhou et al., 2020), early differences in shedding may predict disease severity before deterioration. To assess the prognostic utility of URT shedding, we used the rVL distributions of nonsevere and severe adult cases and calculated the AUC that is overlapped or separated (Figure 4B). The greater the separation between these rVL distributions, the greater the ability to differentiate severe COVID-19 from nonsevere illness, and this AUC analysis estimates the maximal accuracy of prognostication (Figure 4C). At each DFSO, these URT distributions were largely overlapped. Moreover, the cumulative distributions of rVL (Figure 4D) estimated poor sensitivity and specificity for prognostication (Figure 4-figure supplement 2). Thus, our data indicated that URT shedding inaccurately predicts COVID-19 severity.
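The overlap idea can be sketched numerically. The example below integrates the pointwise minimum of two Weibull densities to get the overlapped area, takes the separated area as its complement, and converts it to a maximal accuracy. That last conversion (0.5 plus half the separated area) assumes equal weighting of severe and nonsevere cases and is an assumption of this sketch, not necessarily the expression used in the study; the distribution parameters are hypothetical.

```python
import math

def weibull_pdf(x, scale, shape):
    """Two-parameter Weibull density."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def overlapped_auc(params_a, params_b, upper=20.0, steps=20000):
    """Trapezoidal integral of min(pdf_a, pdf_b) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        y = min(weibull_pdf(x, *params_a), weibull_pdf(x, *params_b))
        total += y if 0 < i < steps else y / 2  # half weight at the endpoints
    return total * h

# Hypothetical (scale, shape) pairs for one DFSO.
nonsevere = (6.0, 3.0)
severe = (8.0, 4.0)

overlap = overlapped_auc(nonsevere, severe)
separated = 1.0 - overlap
# Assumption: equal class weights, so accuracy ranges from 0.5 to 1.0.
max_accuracy = 0.5 + separated / 2
```

Fully overlapping distributions give an accuracy of 0.5 (no better than chance), and fully separated ones give 1.0.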
We also assessed the prognostic utility of LRT shedding. We calculated the AUC that is overlapped or separated, which showed greater separation between the LRT distributions of severe and nonsevere cases (Figure 4F). The estimated accuracy for using LRT shedding as a prognostic indicator for COVID-19 severity was up to 81% (Figure 4G). As a resource, the cumulative distributions of LRT shedding (Figure 4H) enable the estimation of the specificity and sensitivity at different prognostic thresholds of LRT rVL. For example, at 5 DFSO, the estimated specificity was 93.3% and the estimated sensitivity was 64.4% at a prognostic threshold of 9.10 log10 copies/ml (Figure 4-figure supplement 3). For 8 DFSO, the estimated specificity and sensitivity were 73.1% and 88.8%, respectively, at a prognostic threshold of 5.95 log10 copies/ml. These estimated specificities and sensitivities agreed with the estimated accuracy for prognostication from their AUC analyses. Taken together, our data indicated that LRT shedding more accurately predicts COVID-19 severity than does URT shedding.
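The threshold rule behind these sensitivity and specificity estimates can be sketched from the cumulative distributions alone: cases above the threshold are predicted severe, so sensitivity is the share of the severe distribution above it and specificity is the share of the nonsevere distribution at or below it. The Weibull parameters and helper names below are hypothetical.

```python
import math

def weibull_cdf(x, scale, shape):
    """Fraction of cases with rVL at or below x."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def sensitivity_specificity(threshold, severe, nonsevere):
    """severe / nonsevere are (scale, shape) pairs of fitted Weibull distributions."""
    sensitivity = 1.0 - weibull_cdf(threshold, *severe)  # severe cases above threshold
    specificity = weibull_cdf(threshold, *nonsevere)     # nonsevere cases at or below it
    return sensitivity, specificity

# Hypothetical LRT distributions on one DFSO.
severe = (8.0, 4.0)
nonsevere = (6.0, 3.0)

# Scan thresholds from 0 to 15 log10 copies/ml for the best combined performance.
thresholds = [t / 10 for t in range(0, 151)]
best = max(thresholds,
           key=lambda t: sum(sensitivity_specificity(t, severe, nonsevere)))
sens, spec = sensitivity_specificity(best, severe, nonsevere)
balanced_accuracy = (sens + spec) / 2
```

If accuracy is read as the mean of sensitivity and specificity, the values reported above for 8 DFSO (73.1% and 88.8%) average to about 81%, in line with the stated maximal accuracy.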

Discussion
Our study systematically developed a dataset of COVID-19 case characteristics and rVLs and conducted stratified analyses on SARS-CoV-2 shedding post-symptom onset. In the URT, we found that adults with severe COVID-19 showed slightly higher rVLs shortly after symptom onset, but similar SARS-CoV-2 clearance rates, when compared with their nonsevere counterparts. After stratifying for disease severity, our analyses showed that sex and age had nonsignificant effects on SARS-CoV-2 shedding for each included analysis (summarized in Appendix 1-table 4). Thus, while sex and age influence the tendency to develop severe COVID-19 (Onder et al., 2020; Tartof et al., 2020; Zhou et al., 2020), we find no such sex dimorphism or age distinction in shedding among cases of similar severity. This includes children, who had nonsevere illness in our study and showed URT shedding post-symptom onset similar to that of adults with nonsevere illness.
Notably, our analyses indicate that high, persistent LRT shedding of SARS-CoV-2 characterizes severe COVID-19 in adults. Previous reports have found prolonged LRT shedding for weeks in critically ill adult patients (Buetti et al., 2020; Huang et al., 2020). Our results provide additional insights into the LRT kinetics of SARS-CoV-2 in adults, particularly soon after symptom onset. They reveal a severity-associated difference in both shedding and clearance in the LRT which begins, at least, at 4 DFSO; our dataset had limited LRT samples before 4 DFSO. Interestingly, our analyses also reveal an early bifurcation between the LRT and URT for severe COVID-19. That is, severe disease is associated with higher rVLs in the LRT than the URT throughout the analyzed period, whereas nonsevere illness shows similar shedding between the LRT and URT. This suggests that the effective immune responses associated with milder COVID-19, including innate, cross-reactive, and coordinated adaptive immunity (Rydyznski Moderbacher et al., 2020; Ng et al., 2020; Pierce et al., 2020; Takahashi et al., 2020), do not significantly inhibit early, or prolonged, SARS-CoV-2 replication in the LRT of severely affected adults. Hence, poorly controlled LRT replication tends to continue, at least, to 10 DFSO, which coincides with the timing of clinical deterioration (median, 10 DFSO) (Solomon et al., 2020; Zhou et al., 2020). Moreover, the bifurcated profiles of LRT shedding concur with the observed severity-associated differences in lung pathology, in which severe cases show hyperinflammation and progressive loss of epithelial-endothelial integrity (Magro et al., 2020; Matheson and Lehner, 2020; Xu et al., 2020b).
Thus, LRT shedding may predict COVID-19 severity, serving as a prognostic factor. As emerging evidence suggests that timing influences the efficacy of anti-SARS-CoV-2 therapies (O'Brien et al., 2021;Weinreich et al., 2021a), early clinical decision making is crucial. A prognostic indicator guides early risk stratification, identifying high-risk individuals before they deteriorate into severe COVID-19. This facilitates the early administration of the efficacious therapies to these patients and may reduce the incidence of severe and fatal COVID-19 (O'Brien et al., 2021;Weinreich et al., 2021a;Weinreich et al., 2021b). Additional studies should further explore the prognostic utility of LRT shedding in clinical settings, including toward improving COVID-19 outcomes.
LRT shedding can be assessed noninvasively. This study predominantly analyzed expectorated sputum, which can be obtained from a deep cough, as the LRT specimen. Since SARS-CoV-2 detection occurs more frequently in expectorated sputum than in URT specimens, including nasopharyngeal swabs (Fajnzylber et al., 2020;Wang et al., 2020;Wölfel et al., 2020), SARS-CoV-2 quantitation from sputum may more accurately diagnose COVID-19 while simultaneously predicting severity. Noninvasively induced sputum presents a potential alternative for patients without sputum production (Lai et al., 2020), although it was not assessed in this study and its prognostic utility remains to be evaluated. Furthermore, our data suggest that sex and age may not significantly influence prognostic thresholds but that the time course of disease may. Prognostication should account for the dynamics of shedding, and both the rVL and DFSO of a sputum specimen should be considered.
While our analyses did not account for virus infectivity, higher SARS-CoV-2 rVL is associated with a higher likelihood of culture positivity, from adults (van Kampen et al., 2021; Wölfel et al., 2020) as well as children (L'Huillier et al., 2020), and a higher transmission risk (Marks et al., 2021). Hence, our results suggest that infectiousness increases with COVID-19 severity, concurring with epidemiological analyses (Sayampanathan et al., 2021). They also suggest that adult and pediatric infections of similar severity have comparable infectiousness, reflecting epidemiological findings on age-based infectiousness (Laxminarayan et al., 2020; Li et al., 2021; Sun et al., 2021). Furthermore, since respiratory aerosols are typically produced from the LRT (Johnson et al., 2011), severe SARS-CoV-2 infections may have increased, and extended, risk for aerosol transmission. As severe cases tend to be hospitalized, this provides one possible explanation for the elevated risk of COVID-19 among healthcare workers in inpatient settings (Nguyen et al., 2020); airborne precautions, such as the use of N95 or air-purifying respirators, should be implemented around patients with COVID-19.
Our study has limitations. First, while our study design systematically developed a large, diverse dataset, there were few severe female cases with LRT specimens and no severe pediatric cases included. Statistical comparisons involving these groups were not conducted. Additional studies should permit these remaining comparisons. Second, our analyses did not account for additional case characteristics, including comorbidities, and their relationships with SARS-CoV-2 kinetics remain unclear. Third, the review found that expectorated sputum was the predominant LRT specimen used for SARS-CoV-2 quantitation, and our analyses on LRT kinetics may not generalize to cases without sputum production. The systematic dataset also consisted largely of hospitalized patients, and our results may not generalize to asymptomatic infections.
In summary, our findings provide insight into the kinetics of SARS-CoV-2 and describe virological factors that facilitate the pathogenesis of severe COVID-19. They show that high, persistent LRT shedding characterizes severe disease in adults, highlighting the potential prognostic utility of SARS-CoV-2 quantitation from LRT specimens. Lastly, each study identified by our systematic review collected specimens before October 2020. As widespread transmission of the emerging variants of concern likely occurred after this date (Davies et al., 2021; Konings et al., 2021; Tegally et al., 2021), our study presents a quantitative resource to assess the effects of their mutations on respiratory shedding levels and dynamics.

Additional files
Supplementary files • Transparent reporting form
Descriptions of each item are included in the modified JBI critical appraisal checklist. Grey, yellow, and red represent yes (Y), unclear (U), and no (N), respectively.
6. Were valid methods used for the identification of the condition?
Many health problems are not easily diagnosed or defined and some measures may not be capable of including or excluding appropriate levels or stages of the health problem. If the outcomes were assessed based on existing definitions or diagnostic criteria, then the answer to this question is likely to be yes. If the outcomes were assessed using observer-reported or self-reported scales, the risk of over- or under-reporting is increased, and objectivity is compromised. Importantly, determine if the measurement tools used were validated instruments as this has a significant impact on outcome assessment validity.
7. Were standard, valid methods used for measurement of the exposure?
The study should clearly describe the method of measurement of exposure. Assessing validity requires that a 'gold standard' is available to which the measure can be compared. The validity of exposure measurement usually relates to whether a current measure is appropriate or whether a measure of past exposure is needed. In this study, standard methods to measure viral load in respiratory specimens are assays quantifying via one of the diagnostic sequences (Orf1b, N, RdRp, and E genes) for SARS-CoV-2.
8. Was the exposure measured in an objective, reliable way for all participants?
The study should clearly describe the procedural aspects of the measurement of exposure as well as factors that can contribute to heterogeneity in measurement. In this study, objective, reliable interpretation of the exposure depends on the use of quantitative calibration; the specification of extraction; determination of the viral load as a standard metric (e.g., copies/ml or equivalent) or in a manner that can be converted to a standard metric; and, if present, specification of the amount of diluent (e.g., viral transport media) used.
9. Was there clear reporting of clinical information of the participants?
There should be clear reporting of clinical information of the participants such as the following information where relevant: disease status, comorbidities, stage of disease, previous interventions/treatment, results of diagnostic tests, etc. In addition, there should be clear reporting of the number and types (asymptomatic, presymptomatic, symptomatic, adult, pediatric, hospitalized, non-admitted, community, etc.) of cases for measurements within the sampling periods of interest. For studies that include data outside of the infectious period, there should be clear reporting of clinical information for participants for the specimen measurements that were collected from within the infectious period.
10. Was statistical analysis appropriate?
As with any consideration of statistical analysis, consideration should be given to whether there was a more appropriate alternate statistical method that could have been used. The methods section of studies should be detailed enough for reviewers to identify which analytical techniques were used and whether these were suitable.

Low
The majority of critical appraisal criteria are met (≥6/10 items) and included item 1 (representative sample). The estimates are likely to be correct for the target population.

High
The majority of critical appraisal criteria are not met (<6/10 items) or did not include item 1 (representative sample). This may impact on the validity and reliability of the estimates. The estimates may not be correct for the target population.

Unclear
The majority of items are unclear. There was insufficient information to assess the risk of bias.