Application of human RNase P normalization for the realistic estimation of SARS-CoV-2 viral load in wastewater: A perspective from Qatar wastewater surveillance

The uncertainty associated with shedding patterns, environmental impacts, and sample processing strategies has greatly influenced the variability of SARS-CoV-2 concentrations in wastewater. This study evaluates a new normalization approach using human RNase P for the realistic estimation of SARS-CoV-2 viral load in wastewater. The outbreak of SARS-CoV-2 variants was monitored during the wave that circulated between February and August 2021. Sewage samples were collected from five major wastewater treatment plants and analyzed to determine the viral loads in the wastewater. SARS-CoV-2 was detected in all samples, and the wastewater Ct values exhibited a trend similar to that of the reported number of new daily positive cases in the country. The infected population was estimated using a mathematical model that compensated for RNA decay due to wastewater temperature and sewer residence time; the model indicated that the number of positive cases circulating in the population declined from 765,729 ± 142,080 to 2,303 ± 464 during the sampling period. Genomic analyses of SARS-CoV-2 in thirty wastewater samples collected between March 2021 and April 2021 revealed that alpha (B.1.1.7) and beta (B.1.351) were among the dominant variants of concern (VOC) in Qatar. The findings of this study imply that the normalization of data allows a more realistic assessment of incidence trends within the population.


Introduction
COVID-19 has spread rapidly and has been detected in over 220 countries worldwide. Globally, as of 7th March 2022, there have been 445,096,612 confirmed cases of COVID-19, including 5,998,301 deaths (WHO, 2021). RT-qPCR-based clinical diagnostics are among the primary methods used by many countries to control this pandemic by tracing and isolating/treating COVID-19 patients and their first-degree contacts (Abu-Raddad et al., 2021a,b). However, clinical monitoring and contact tracing methods are not practical tools for the early detection and prediction of outbreaks. Clinical diagnostics also miss the asymptomatic individuals who represent most of the actual COVID-19 cases. Wastewater-based epidemiology (WBE) is often promoted as a promising tool for early detection of COVID-19 outbreaks by monitoring the presence of SARS-CoV-2 RNA in wastewater. WBE analyses also allow the size of the infected population to be approximated by correlating the amount of SARS-CoV-2 RNA shed in faeces and urine with other hydro-chemical parameters of the wastewater (e.g., flow rate, temperature, ammonia nitrogen). Many studies have reported the presence of SARS-CoV-2 in faeces and other excreta of both symptomatic and asymptomatic individuals (Han et al., 2020b; Tang et al., 2020). Viral loads in the range of $10^{5}$-$10^{8}$ copies/L, $10^{6}$-$10^{11}$ copies/L, and $10^{4}$-$10^{14}$ copies/L were found in urine, fecal, and saliva/sputum samples, respectively (Feng et al., 2021; Frithiof et al., 2020; Han et al., 2020a; Jeong et al., 2020; Kim et al., 2020; Lescure et al., 2020; Pan et al., 2020; Xiao et al., 2020; Yoon et al., 2020).
Several factors can influence RNA concentrations in wastewater, mainly once the virus enters the sewer system, and thereby affect their epidemiological value. This happens mainly through the inactivation of the virus particles by environmental factors (solid particles, organic matter, temperature, pH) and chemical compounds (detergents, disinfectants) in the sewage (Du et al., 2020; Ling et al., 2020; Liu et al., 2020; Xing et al., 2020; Xu et al., 2020). The presence of antagonistic microorganisms could also increase the extent of inactivation of RNA in samples via predation and enzymatic breakdown (Bosch et al., 2006; Hao et al., 2010), and the extent of inactivation is greater in the solid phase (e.g., sludge) than in the liquid phase (Bodzek et al., 2019; Chaudhry et al., 2015). Other major factors are population size and fecal shedding (including shedding pattern, recovery, rate, and load distribution). Despite these large contributing factors, many WBE studies have confirmed the presence of SARS-CoV-2 RNA fragments in wastewater samples, including studies in the Arab Gulf area (Albastaki et al., 2021; Saththasivam et al., 2021), and have successfully demonstrated its correlation with the reported clinical cases.
It has been reported that patients can show prolonged viral shedding in their stool despite testing negative by nasopharyngeal swab (Du et al., 2020; Ling et al., 2020; Liu et al., 2020; Xing et al., 2020; Xu et al., 2020). Reported data indicate that a single infected person may shed between millions and billions of copies of SARS-CoV-2 into wastewater during the peak of infection (Hart and Halden, 2020; Saawarn and Hait, 2021). This may cause an overestimation of the infected population (Schmitz et al., 2021). Therefore, normalization of SARS-CoV-2 Ct values obtained from wastewater samples is essential for accurate data quantification and interpretation, and helps to compensate for the fluctuations of SARS-CoV-2 Ct values in wastewater caused by variable shedding rates among the infected population. Normalization techniques usually use reference genes to exclude the possibility of false negatives that could arise from the presence of inhibitors or the low quality and integrity of RNA samples (Carraturo et al., 2020; Foladori et al., 2020; La Rosa et al., 2020; Larsen and Wigginton, 2020; Race et al., 2020). Normalization is also essential for comparing SARS-CoV-2 levels among sewage samples over time. As human cells contain only one copy of RNase P, which encodes the mRNA moiety for the enzyme, their Ct values correspond to a range of input cell numbers (Albastaki et al., 2021; Saththasivam et al., 2021). Many studies have used only a single reference target for calculating the ratio between the genes of interest and the normalizer (McQuaig et al., 2009). Because housekeeping genes are evaluated in a wide variety of tissues and at different abundances, the geometric mean is preferred over the conventional arithmetic mean for averaging the reference genes. An accurate normalization factor can be obtained by averaging a series of selected housekeeping genes using the geometric mean approach (Vandesompele et al., 2002).
A study conducted by Maidana-Kulesza et al. (2021) used RNase P to standardize SARS-CoV-2 levels and compared them with COVID-19 cases. Although there is no consensus method, normalizing using reference genes could result in better accuracy. Furthermore, the ratio before and after normalization might reflect the viral loss in the sewage system and viral recovery in the laboratory (CDC, 2021). The normalization of raw Ct values has been the subject of several studies. For example, D'Aoust et al. normalized the Ct value against human fecal markers (PMMoV or HF183) and showed improved correlations between the trends of SARS-CoV-2 concentrations and reported clinical cases (D'Aoust et al., 2021; Jafferali et al., 2021). Another study emphasized the importance of normalizing to the PMMoV marker when comparing SARS-CoV-2 concentrations among WWTPs (Wolfe et al., 2021). Although human fecal indicators (PMMoV and CrAssphage) can be detected in raw sewage, it is not easy to correlate their concentration with the wastewater signal of the target pathogen because of the vast difference in signal between them (Holm et al., 2022).
Apart from quantifying viral RNA in wastewater samples, the identification of SARS-CoV-2 variants of concern (VOC) via the WBE approach has recently been gaining traction (Crits-Christoph et al., 2020; Jahn et al., 2021). The growing list of new VOCs since the first wave of infection is a pressing issue (Heijnen et al., 2021). VOCs such as Alpha (B.1.1.7) and Beta (B.1.351) became dominant compared to the original SARS-CoV-2 virus (Wang et al., 2021). Over time, more variants such as Gamma (P.1), Delta (B.1.617.2), and Omicron (B.1.1.529) were added to this category due to their higher infection rate and potential morbidity. Hence, it is important to develop efficient diagnostic tools that can rapidly identify and evaluate the distribution of VOC within the community. WBE could offer a faster route to mapping the distribution of VOC in a particular community using the respective wastewater treatment plants.
Following the promising results obtained during the monitoring of the first wave in Qatar in 2020 (Saththasivam et al., 2021), the current study extends that work to monitor the progression of the second wave of COVID-19 in Qatar by including the following: (i) an analysis of the effect of normalizing SARS-CoV-2 concentration against human RNase P; (ii) an estimation of the infected population that considers RNA decay due to wastewater temperature and sewer residence time; and (iii) a pilot study to sequence SARS-CoV-2 genomic data in wastewater samples and compare it against COVID-19 patient sequencing data collected during the same period.

Wastewater sample collection and pre-processing
One-litre 24-h composite raw wastewater samples were collected weekly from five major wastewater treatment plants (WWTPs) in Qatar. The locations and areas served by these plants are shown in Figure S1. During the period of this study (February to August 2021), a total of 170 wastewater samples were collected and analyzed. The samples were collected in sterile 1 L glass bottles and treated at 56-58 °C for 30 min in a water bath at the collection site to inactivate the virus (Batéjat et al., 2021). The treated samples were then transported in iceboxes to the laboratory within two hours for further processing. All the samples were processed within 24 h of collection. Standard personal protective equipment (PPE) was used throughout the sample collection and processing periods.

Viral precipitation
Samples were processed using the PEG protocol as described previously (Bibby and Peccia, 2013; Saththasivam et al., 2021). Briefly, 25 mL of glycine buffer (0.05 M glycine, 3% Bovine Serum Albumin (BSA)) was added to 200 mL of wastewater to detach the virus genome, which binds to organic matter. The mixture was incubated at 4 °C for 2 h with shaking at 200 rpm. The samples were then centrifuged at 4500 × g for 30 min at 4 °C without braking to remove large debris and bacterial cells. After centrifugation, the supernatant was carefully pipetted into a sterilized container containing PEG 8000 (100 g/L) and NaCl (22.5 g/L), and the mixture was gently homogenized at room temperature. The mixture was incubated overnight at 4 °C with agitation (100 rpm), followed by centrifugation for 90 min (13,000 × g, 4 °C) without braking. After centrifugation, the sample was decanted carefully, and the container was returned to the refrigerated (4 °C) centrifuge. The sample was centrifuged again for 10 min at 13,000 × g and 4 °C with the brake set to 3 (of 9). The resulting virus-containing pellet was eluted in 1 mL DNA/RNA Shield™ (Zymo Research, Irvine CA, USA Cat. No. R1100-250) and stored at −80 °C until further processing. The concentration process was done in duplicate.

RNA extraction, qRT-PCR running of SARS-CoV-2 and normalization
Viral RNA was extracted from 200 µL of the virus-containing pellet using the Quick-RNA Viral Kit (Zymo Research, Irvine CA, USA Cat. No. R1041) according to the manufacturer's protocol. The viral RNA was eluted in 30 µL of nuclease-free water and stored at −80 °C until the detection of SARS-CoV-2 by qRT-PCR. The presence and quantity of SARS-CoV-2 in the extracted viral RNA were assessed by real-time quantitative polymerase chain reaction (qRT-PCR) using the SARS-CoV-2 (2019-nCoV) CDC qPCR Probe Assay Research Use Only (RUO) kit (Integrated DNA Technologies, IDT, Coralville, IA, USA Cat number 10006713) and the Luna Universal Probe One-Step RT-qPCR Kit (New England BioLabs, USA; Cat number E3006E) on an Applied Biosystems 7500 Fast Real-Time PCR instrument (Applied Biosystems, CA, USA). SARS-CoV-2 (2019-nCoV) CDC RUO Plasmid Controls (Integrated DNA Technologies, IDT, Coralville, IA, USA Cat number 10006625) were used as the positive control. All qRT-PCR amplifications were performed in 20 µL reaction mixtures, each containing 2.5 µL nuclease-free water, 1.5 µL of each combined primers and probe mix (Saththasivam et al., 2021), 1 µL Luna WarmStart RT Enzyme Mix (20X), 10 µL Luna Universal Probe One-Step Reaction Mix, and 5 µL extracted viral RNA. Cycling conditions: reverse transcription at 55 °C for 10 min, initial denaturation at 95 °C for 1 min, then 45 cycles of denaturation and extension (10 s and 60 s). Instrument setting: detector (FAM) and quencher (none). Based on the results of the Sketa22 qPCR assay, RNA was successfully extracted and detected from the samples with no significant impact of PCR inhibitors, and the extracts were therefore used for downstream RT-qPCR analysis of SARS-CoV-2. The amplification efficiency (E%) of the CDC N1 (2019-nCOV_N) assay was 95.7%, which is within the range recommended by the MIQE guidelines (90%-110%) (Bustin et al., 2009).
On the other hand, the amplification efficiency (E%) of the CDC N2 (2019-nCOV_N) assay was only 81.26%. All assays had a correlation coefficient ($R^2$) of 1. The slopes of the standard curves for N1 and N2 were −3.4295 and −3.8716, respectively.
The N1 and N2 Y-intercepts were 39.102 and 41.041, respectively. Normalization was performed by correcting the Ct of RNA fragments with respect to the ratio of the sample RNase P Ct value to the geometric mean of all RNase P Ct values (Duchamp et al., 2010):

$$\text{Ct normalized value} = \text{sample SARS-CoV-2 Ct value} \times \frac{\text{sample RNase P Ct value}}{\text{geometric mean of RNase P Ct values}}$$

The geometric mean was more suitable than the conventional arithmetic mean in this study because it accounts for the different abundances and functional classes in various human tissues (Vandesompele et al., 2002).
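The standard-curve figures and the normalization rule reported above can be checked with a short Python sketch. It assumes the usual relation between standard-curve slope and amplification efficiency, E = 10^(−1/slope) − 1, and reads the normalization as the sample Ct multiplied by the ratio of the sample RNase P Ct to the geometric mean of all RNase P Ct values. The function names and the RNase P Ct values are hypothetical and serve only as an illustration.

```python
import math


def amplification_efficiency(slope):
    """Percent amplification efficiency from a standard-curve slope."""
    return (10.0 ** (-1.0 / slope) - 1.0) * 100.0


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_ct(sars_ct, rnasep_ct, all_rnasep_cts):
    """Scale a SARS-CoV-2 Ct by (sample RNase P Ct / geomean of RNase P Cts)."""
    return sars_ct * rnasep_ct / geometric_mean(all_rnasep_cts)


# Slopes reported in the text for the N1 and N2 standard curves.
print(round(amplification_efficiency(-3.4295), 1))
print(round(amplification_efficiency(-3.8716), 1))

# Hypothetical RNase P Ct values for one sampling campaign (illustration only).
rnasep_cts = [27.5, 28.1, 29.0, 28.4]
print(normalize_ct(sars_ct=30.2, rnasep_ct=29.0, all_rnasep_cts=rnasep_cts))
```

Under this reading, a sample whose RNase P Ct is above the campaign geometric mean (less human material recovered) has its SARS-CoV-2 Ct scaled upward, and a sample below the geometric mean has it scaled downward.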

Library preparation and sequencing
Sequencing of selected wastewater samples was carried out at the Genomic Core Laboratory at Weill Cornell Medicine-Qatar (WCM-Q) in three independent sequencing runs. Next-generation sequencing (NGS) library construction was performed using the CleanPlex SARS-CoV-2 Panel (Paragon Genomics, USA; SKU: 918012). The protocol for target enrichment and library preparation followed the manufacturer's instructions. Gel-size selection on a 3% agarose gel was used to prevent the formation of adapter dimers. NGS libraries were quantified using the KAPA Library Quantification Kit (Roche, USA; KK4824). The resulting libraries were normalized, pooled in equimolar amounts, and sequenced on an Illumina MiSeq instrument using a paired-end 150 bp kit (Illumina, USA; MS-102-2002). All procedures followed the manufacturers' protocols. Illumina reads were aligned with BWA-MEM, and SNPs were called with SAMtools as previously described (Abu-Raddad et al., 2020; Al Khatib et al., 2020). The overall deep coverage allowed for calling low-frequency mutations in most samples down to a frequency of average. Reads were mapped against the SARS-CoV-2 reference genome (Wuhan-Hu-1 [GenBank accession numbers NC_045512.2 and MN908947.3]).

Estimation of the infected population -modeling approach
An estimation of the infected population within the zones and districts served by a WWTP can be made using the following equation (Saththasivam et al., 2021):

$$N = \frac{C_{RNA(WWTP)} \times F}{\alpha \times \beta \times (1 - \gamma)}$$

where $C_{RNA(WWTP)}$, $F$, $\alpha$, $\beta$, and $\gamma$ represent the measured RNA concentration at the inlet of the WWTP, the volumetric flow rate of the WWTP (L/day), the fecal load (g/day/person), the fecal shedding (viral copies/g), and the RNA losses in the sewer, respectively. Additional information related to these input parameters can be found in Table 1. Due to the large variations in the $\alpha$ and $\beta$ values, the range of the infected population ($N$) within each WWTP was calculated using either the central limit theorem of probability theory or a Monte-Carlo-Bayesian approach. The latter approach was used when the standard deviations in the infected population ($\delta N$) were significantly large. These calculation methodologies have been described in a previous publication (Saththasivam et al., 2021).
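The propagation of uncertainty in α and β can be illustrated with a plain Monte Carlo sketch in Python. It assumes the back-calculation has the form N = C_RNA × F / (α × β × (1 − γ)), which is consistent with the parameter definitions above and with the (1 − γ) denominator mentioned in the text. It is not the Monte-Carlo-Bayesian procedure of the cited publication: the sampling distributions (uniform for α, log-uniform for β), the parameter ranges, and all input values are hypothetical and are not taken from Table 1.

```python
import random
import statistics


def estimate_infected(c_rna, flow, alpha, beta, gamma):
    """Back-calculate infected persons: N = C_RNA * F / (alpha * beta * (1 - gamma))."""
    return c_rna * flow / (alpha * beta * (1.0 - gamma))


def monte_carlo_infected(c_rna, flow, gamma, alpha_range, log10_beta_range,
                         n_draws=10_000, seed=0):
    """Propagate uncertainty in alpha and beta by simple random sampling.

    alpha is drawn uniformly and beta log-uniformly; both choices are
    illustrative assumptions, not the distributions used in the study.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        alpha = rng.uniform(*alpha_range)                # g/day/person
        beta = 10.0 ** rng.uniform(*log10_beta_range)    # viral copies/g
        draws.append(estimate_infected(c_rna, flow, alpha, beta, gamma))
    return statistics.mean(draws), statistics.stdev(draws)


# Hypothetical inputs: copies/L, L/day, fractional sewer loss, parameter ranges.
mean_n, sd_n = monte_carlo_infected(
    c_rna=1.0e5, flow=1.0e8, gamma=0.3,
    alpha_range=(100.0, 400.0), log10_beta_range=(6.0, 8.0))
print(f"N = {mean_n:,.0f} +/- {sd_n:,.0f}")
```

Because β spans orders of magnitude, the spread of the draws is typically of the same order as the mean, which illustrates why the choice of calculation method depends on how large δN becomes.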
Apart from $\alpha$ and $\beta$, the input parameter $\gamma$ has a significant effect on the estimation of the infected population. RNA losses in wastewater samples can be attributed to adsorption of RNA on stool and solids, half-life decay, degradation due to interaction with other compounds in wastewater, and many other factors. These factors are generally difficult to estimate due to huge variability and lack of sufficient information. In this study, the denominator $(1 - \gamma)$ was specifically used to consider the effect of the half-life of SARS-CoV-2 RNA. As half-life is a function of wastewater temperature and residence time, the degradation of viral RNA with respect to time in wastewater can be approximated as (Hart and Halden, 2020):

$$C_{RNA(WWTP)} = C_{RNA(t_0)} \times \left(\frac{1}{2}\right)^{t_{transit}/t_{1/2}}$$

where $C_{RNA(t_0)}$ = excreted RNA concentration into the sewer, $t_{transit}$ = sewer transit time, and $t_{1/2}$ = half-life of SARS-CoV-2 RNA.
As the half-life of the RNA is also affected by the wastewater temperature, the adjusted half-life can be defined as (Hart and Halden, 2020):

$$t_{1/2} = t_{1/2,ref} \times Q_{10}^{-(T_{WWTP} - T_{ref})/10}$$

where $t_{1/2,ref}$ = reference half-life, $Q_{10}$ = temperature-dependent rate of change, $T_{WWTP}$ = wastewater temperature at the inlet of the WWTP, and $T_{ref}$ = temperature at which the initial half-life was derived.
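The temperature correction and the resulting in-sewer loss can be sketched in a few lines of Python. The sketch assumes a Q10 relation in which the half-life shortens by a factor of Q10 for every 10 °C above the reference temperature, and first-order decay so that the surviving fraction after transit is (1/2)^(t_transit / t_1/2). The decay rate of 0.183 day⁻¹ at 25 °C and Q10 = 2 are the values quoted in this section; the wastewater temperature and transit time in the example are hypothetical.

```python
import math


def half_life_from_decay_rate(k_per_day):
    """Half-life (days) of first-order decay with rate constant k (1/day)."""
    return math.log(2.0) / k_per_day


def adjusted_half_life(t_half_ref, q10, temp_wwtp, temp_ref):
    """Half-life at temp_wwtp, assuming it shrinks by q10 per 10 degC rise."""
    return t_half_ref * q10 ** (-(temp_wwtp - temp_ref) / 10.0)


def surviving_fraction(t_transit, t_half):
    """Fraction of RNA remaining after t_transit (same time unit as t_half)."""
    return 0.5 ** (t_transit / t_half)


# Reference half-life from the decay rate quoted in the text (25 degC).
t_ref = half_life_from_decay_rate(0.183)

# Hypothetical inlet temperature (degC) and sewer transit time (days).
t_half = adjusted_half_life(t_ref, q10=2.0, temp_wwtp=32.0, temp_ref=25.0)
fraction = surviving_fraction(t_transit=0.5, t_half=t_half)
print(f"t_half,ref = {t_ref:.2f} d, t_half = {t_half:.2f} d, surviving = {fraction:.3f}")
```

In this reading, the surviving fraction plays the role of (1 − γ) in the population estimate, so warmer wastewater or longer transit times raise the back-calculated number of infected persons for the same measured concentration.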
Finally, the infected population based on the half-life correction can be defined as:

$$N = \frac{C_{RNA(WWTP)} \times F}{\alpha \times \beta \times \left(\frac{1}{2}\right)^{t_{transit}/t_{1/2}}}$$

The reference half-life $t_{1/2,ref}$ for SARS-CoV-2 was calculated based on a decay rate of 0.183 ± 0.008 day$^{-1}$ at a reference temperature of 25 °C in untreated wastewater, and a temperature-dependent rate of change of $Q_{10}$ = 2 was used to correct the half-life (Hart and Halden, 2020). The estimated infected population obtained from the statistical model cannot be directly compared against the daily reported clinical cases, since the measured $C_{RNA}$ at the inlet of the WWTPs represents the cumulative concentration shed by patients who were infected several days or weeks before each sampling day. In addition, the daily reported clinical cases did not include the untested (often asymptomatic) population who could be COVID-19 positive and shedding RNA via human waste. Hence, in this study, the model output is compared against cumulative clinical data over 30 days, as several studies have confirmed prolonged RNA shedding by SARS-CoV-2 patients. Assuming a typical delay in diagnosis of approximately 10 days, the size of the infected population at any given date was estimated by taking the sum of patients clinically diagnosed on the day of sampling, during the 10 days prior to the sampling date, and during the 19 days after each wastewater sampling date. These cumulative positive cases were further corrected using a diagnosis ratio of 0.2. This ratio was obtained from a compilation of epidemiological evidence including time series of diagnosed cases (Abu-Raddad et al., 2021a,b; Chemaitelly et al., 2021a), a series of seroprevalence studies (Al-Thani et al., 2021; Coyle et al., 2021; Jeremijenko et al., 2021), the volume of testing over time (Abu-Raddad et al., 2021a,b; Chemaitelly et al., 2021a), and a series of mathematical modeling studies that modeled the COVID-19 epidemic in Qatar through its different waves (Ayoub et al., 2021a,b; Seedat et al., 2021).
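The 30-day clinical comparison window can be written as a short Python sketch. It assumes the window is inclusive (10 days before the sampling date, the sampling day itself, and 19 days after) and that the correction consists of dividing the cumulative diagnosed cases by the diagnosis ratio of 0.2, i.e. treating the ratio as the diagnosed fraction of all infections. The function name and the daily case series are hypothetical.

```python
from datetime import date, timedelta


def windowed_infected(daily_cases, sampling_date, days_before=10,
                      days_after=19, diagnosis_ratio=0.2):
    """Cumulative diagnosed cases around a sampling date, scaled to infections.

    daily_cases maps a date to the number of cases diagnosed that day;
    missing days are counted as zero.
    """
    start = sampling_date - timedelta(days=days_before)
    window_days = days_before + days_after + 1  # 10 + 19 + sampling day = 30
    diagnosed = sum(daily_cases.get(start + timedelta(days=i), 0)
                    for i in range(window_days))
    return diagnosed / diagnosis_ratio


# Hypothetical flat series of 100 diagnosed cases per day (illustration only).
cases = {date(2021, 3, 1) + timedelta(days=i): 100 for i in range(60)}
estimate = windowed_infected(cases, date(2021, 3, 21))
print(estimate)
```

A different interpretation of the correction (for example, a time-varying diagnosis ratio) would only change the final division step.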

Detection of SARS-CoV-2 RNA in wastewater
The first SARS-CoV-2 WBE study was launched during the first wave of the pandemic in Qatar in 2020 to monitor the levels of RNA fragments at the inlet of the major municipal wastewater treatment plants (Saththasivam et al., 2021). The findings of that study correlated well with the reported clinical cases, and the developed WBE model provided a good estimation of the infected population. The current study commenced in February 2021 in response to the second wave of infection, when the clinical cases nearly tripled in January 2021. We routinely analyzed raw sewage samples and quantified the concentration of SARS-CoV-2 RNA, $C_{RNA}$, in the samples collected from five WWTPs (Fig. S1).
All sewage samples collected from the WWTPs over the 29 weeks tested positive for both N-gene primers (N1 and N2). As shown in Fig. 1, Ct values obtained for N1 and N2 regions ranged between 23.4-37.3 and 25.9-38.3 respectively. The Ct values measured in the current study were relatively lower than the readings obtained during the first wave of the pandemic in Qatar where the minimum Ct N1 and Ct N2 values were 27.2 and 27.5 respectively. This could indicate a wide variation of the shedding rates by the more dominant VOC.
It can be seen from Fig. 1 that the normalization of SARS-CoV-2 Ct against RNase P led to slightly higher viral Ct values in almost all the samples. No apparent changes were observed in the trendlines before and after normalization in any of the WWTPs. After normalization, the Ct values ranged between (26.72 ± 0.97-37.59 ± 1.18), (25.63 ± 0.75-36.24 ± 1.08), (26.84 ± 1.34-37.08 ± 0.35), (25.47 ± 0.73-36.31 ± 0.51) and (24.08 ± 0.57-39.38 ± 0.49) in WWTP1, WWTP2, WWTP3, WWTP4 and WWTP5, respectively. Noticeable dips in Ct values can be observed in all the WWTPs in the third week of March 2021. These represent the lowest Ct values observed in the entire study. A noticeable increase in the Ct values can be seen thereafter for all the WWTPs until June 2021, followed by a less aggressive reversal in July 2021. The Ct plots of each WWTP shown in Fig. 1 are useful for the local authorities to deploy timely corrective measures targeting the specific regions served by these WWTPs. However, these plots do not provide sufficient information to enable comparisons between the zones/districts served by each WWTP because of the differences in the operating flow rates and capacities of the plants. For direct comparison between WWTPs and against the recorded clinical cases, the computation of viral loads ($C_{RNA}$ × flow rate) is more appropriate. The subsequent section discusses the correlation between clinical cases and the total viral load obtained across these five WWTPs.
It is worth mentioning that there was no significant variation in $C_{RNA}$ between weekly and daily sampling. Figure S2 shows the average distribution of N1-Ct values of composite samples collected between 2nd May 2021 and 5th May 2021 from WWTP1. N1-Ct varied between 30.21 ± 0.34 and 31.34 ± 1.03 with an average of 30.74 ± 0.7, while Ct values for the N2 gene ranged from 29.48 ± 0.50 to 32.45 ± 0.69 with an average of 31.45 ± 1.27. Based on this observation, we continued to use weekly sampling in our study.

Comparison of total viral load against clinical cases
The total viral load of SARS-CoV-2 in the wastewater was computed by summing the viral loads (copies/day) of each WWTP and was then compared against the daily positive clinical cases, as shown in Fig. 2. The viral load of each WWTP was calculated by multiplying the normalized $C_{RNA}$ by the wastewater flow rate. During the period of study, the daily positive clinical cases reached a maximum of 3361 cases on 4th April 2021. The cases then declined continuously until early July 2021, rose slightly between mid-July and mid-August 2021, and declined again until the end of the study. A similar pattern can be observed in the total viral loads. The viral loads in these WWTPs were one to two orders of magnitude higher in March and April 2021, with the maximum viral load of $1.15 \times 10^{15}$ copies/day observed in the samples collected on 21st March 2021. The total viral load in these WWTPs then decreased over the following months, and the lowest reading of $5.70 \times 10^{11}$ copies/day was recorded on 20th June 2021. The viral load increased from early July 2021 until early August 2021, which mirrored the rise in the daily clinical cases in July and August 2021. It is also evident from Fig. 2 (July to August 2021 period) that the increase in the total viral load across the five WWTPs tends to appear a few weeks earlier than the surge in the daily clinical cases, suggesting that monitoring of viral loads in sewage could potentially be used as an early detection tool for SARS-CoV-2 outbreak monitoring.
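The viral load calculation described above amounts to a per-plant product followed by a sum, as the following Python sketch shows. The plant concentrations and flow rates are hypothetical placeholders, not values from this study; concentrations are assumed to be in copies/L and flow rates in L/day so that the result is in copies/day.

```python
def viral_load(c_rna, flow):
    """Viral load of one plant (copies/day) = concentration (copies/L) * flow (L/day)."""
    return c_rna * flow


def total_viral_load(plants):
    """Sum the viral loads over all plants; plants maps name -> (c_rna, flow)."""
    return sum(viral_load(c_rna, flow) for c_rna, flow in plants.values())


# Hypothetical normalized concentrations (copies/L) and flow rates (L/day).
plants = {
    "WWTP1": (2.0e4, 1.0e8),
    "WWTP2": (5.0e4, 2.0e8),
    "WWTP3": (1.0e4, 5.0e7),
    "WWTP4": (3.0e4, 1.5e8),
    "WWTP5": (4.0e4, 5.0e7),
}
print(f"{total_viral_load(plants):.2e} copies/day")
```

Expressing each plant as a load rather than a Ct value is what makes plants with different capacities comparable with one another and with national case counts.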
Normalization of Ct-N1 has profound effects on the estimation of the infected population, especially during the initial period of the study when the viral load in the wastewater was very high (Ct < 26). Although many factors can interfere with the PCR reaction and cause loss of RNA, in this study we observed lower Ct values than expected, especially during the peak period, which led to an overestimation of the infected population in Qatar. As shown in Fig. 3, the corrected values during this high infectivity period were 41%-65% lower post-normalization. The normalization of the SARS-CoV-2 Ct values against RNase P helped to compensate for the viral load fluctuations due to the variable shedding rate in the sewage system. The infected population predicted by the WBE model matches the trend of the 30-day cumulative SARS-CoV-2 positive cases (corrected using a diagnostic ratio of 0.2), as shown in Fig. 4. The effect of the lockdown measures imposed by the local authorities in late March 2021 to control the second wave can be observed from mid-April 2021 onwards, when the model showed a rapid fall in the infected population. After reaching a high estimate of 765,729 ± 142,080 cases on 21st March 2021, the predicted cases declined until late June 2021, with the lowest estimate being 453 ± 272 cases. A momentary increase in the infected population can be observed in July 2021, when 14,246 ± 1,016 cases were estimated based on the samples collected on 26th July 2021. This increase is most likely attributable to travelers returning after the summer break. The revised strategy by the local authorities to increase the quarantine period and impose stringent measures paid off, as the cases started to decline again in August 2021. From Fig. 4, it can be seen that the model underpredicted the size of the infected population for most of the period when compared with the cumulative cases.
This is not surprising, as the viral RNA load in the sewage is one of the key input parameters of the model. As RNA fragments in sewage are fragile, unstable, and easily lost during sample processing, the accurate recovery and quantification of RNA fragments from sewage samples are challenging (Kitajima et al., 2020). Losses of RNA fragments during sewer transit, adsorption on solids/sludge, and variability in the shedding rate of different variants are among the other key factors affecting the quantification of RNA (Kocamemi et al., 2020; Westhaus et al., 2021). As noted by many researchers, the high variability of viral shedding in human waste is another parameter that affects the model prediction. Apart from the modeling issues, the use of a constant diagnostic ratio to account for asymptomatic patients, especially during the low infectivity period, could be one of the reasons for the mismatch between the 30-day cumulative clinical cases and the model. The diagnostic ratio may be higher than 0.2, especially during the later stage of this study, due to increased tracing and awareness in the population.

Variants of concern identification by whole-genome sequencing
The rapid rise in COVID-19 clinical cases and viral loads in wastewater prompted an investigation to profile the variants in the samples collected between 7th March 2021 and 28th April 2021. SARS-CoV-2 whole-genome sequencing was performed by pooling all replicates of the sample from each site to obtain sufficient viral copies. Eight unique genetic markers were targeted to classify the variants. Our analyses resulted in the identification of two SARS-CoV-2 VOC in the samples collected during this period. The detected variants were (i) B.1.1.7 (Alpha) and (ii) B.1.351 (Beta); further details can be found in Table S2. Fig. 5 shows the distribution of these VOC in the five WWTPs. The Alpha wave seemed to plateau in mid-April 2021, after which a decline in dominance can be seen. Similarly, the Beta variant declined in almost all the WWTPs after reaching a peak in mid-March 2021. Clinical testing indicated that alpha and beta were the dominant VOC in Qatar during this period. It is interesting to note that another, unconfirmed variant was detected in wastewater. Weeks later, the clinical data identified two circulating variants from randomly collected clinical samples in Qatar, which were referred to as Delta and another unidentified variant (Abu-Raddad et al., 2021a,b; Benslimane et al., 2021; Chemaitelly et al., 2021a,b; Hasan et al., 2021; Qvgs, 2021; Tang et al., 2021). Nevertheless, more investigation is needed to relate the unconfirmed variant found in wastewater to the ones observed in the clinical study.
This finding highlights that sequencing wastewater samples could help to identify new variants circulating within a population earlier and more quickly. On the other hand, analyses of genomic composition in wastewater are extremely sensitive and challenging, and can be easily compromised during the virus concentration steps. They are also susceptible to the presence of solid particles and of human and bacterial RNAs, which are abundant in wastewater samples (Jahn et al., 2021). Additional work is needed to improve the sensitivity of the analyses, with the primary aim of detecting new genomic variants in wastewater samples.

Conclusion
WBE has provided an early indication of the re-emergence of SARS-CoV-2 in Qatar; however, several parameters have greatly influenced the variability of SARS-CoV-2 concentrations in wastewater, including uncertainty associated with shedding patterns, environmental impacts, and sample processing strategies. This study has used a new normalization approach based on human RNase P for the realistic estimation of SARS-CoV-2 viral load in wastewater. Normalization of Ct-N1 values was found to have a profound effect on the estimation of the infected population, particularly when the viral load in the wastewater was very high (Ct < 26). Infection rates and dynamics of VOC have been identified on a mass scale, which is not feasible with clinical RT-qPCR diagnoses that are based predominantly on symptom presentation and contact tracing. The infected population was estimated by considering the effects of wastewater temperature, sewer residence time, and normalization of viral Ct against human RNase P. Our analysis of the sequencing data from wastewater samples was able to identify the circulating VOC in the community. The findings reflect the importance of this normalization approach in providing a more realistic estimation of the infected population that is more comparable with the reported clinical cases.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.