Use of wastewater surveillance for early detection of Alpha and Epsilon SARS-CoV-2 variants of concern and estimation of overall COVID-19 infection burden

A decline in diagnostic testing for SARS-CoV-2 is expected to delay the tracking of COVID-19 variants of concern and interest in the United States. We hypothesize that wastewater surveillance programs provide an effective alternative for detecting emerging variants and assessing COVID-19 incidence, particularly when clinical surveillance is limited. Here, we analyzed SARS-CoV-2 RNA in wastewater from eight locations across Southern Nevada between March 2020 and April 2021. Trends in SARS-CoV-2 RNA concentrations (ranging from 4.3 log10 gc/L to 8.7 log10 gc/L) matched trends in confirmed COVID-19 incidence, but wastewater surveillance also highlighted several limitations with the clinical data. Amplicon-based whole genome sequencing (WGS) of 86 wastewater samples identified the B.1.1.7 (Alpha) and B.1.429 (Epsilon) lineages in December 2020, but clinical sequencing failed to identify the variants until January 2021, thereby demonstrating that ‘pooled’ wastewater samples can sometimes expedite variant detection. Also, by calibrating fecal shedding (11.4 log10 gc/infection) and wastewater surveillance data to reported seroprevalence, we estimate that ~38% of individuals in Southern Nevada had been infected by SARS-CoV-2 as of April 2021, which is significantly higher than the 10% of individuals confirmed through clinical testing. Sewershed-specific ascertainment ratios (i.e., X-fold infection undercounts) ranged from 1.0 to 7.7, potentially due to demographic differences. Our data underscore the growing application of wastewater surveillance in not only the identification and quantification of infectious agents, but also the detection of variants of concern that may be missed when diagnostic testing is limited or unavailable.


Highlights
• Wastewater surveillance provides critical information for understanding COVID-19 infection burden.
• Sequencing data reveal emergence of Alpha and Epsilon variants of concern in wastewater prior to clinical confirmation.
• Calibrated fecal shedding model estimates 3.7-fold undercount for COVID-19 infections.
• Wastewater can help fill public health surveillance gaps when clinical testing declines.

Editor: Warish Ahmed
Keywords: SARS-CoV-2; COVID-19; Virus; Wastewater; Mutation; Variant

Introduction
As of March 2021 in the United States, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) had resulted in 30 million confirmed COVID-19 cases and 536,000 deaths (https://coronavirus.jhu.edu/map.html, accessed 3/30/21). In up to 70% of all COVID-19 infections, individuals show mild to no symptoms (Buitrago-Garcia et al., 2020; Johansson et al., 2021; Oran and Topol, 2020), highlighting the possibility that the aforementioned case count may significantly underestimate the true number of infections. Diagnostic tests are known to produce false negative results (Kanji et al., 2021; Kucirka et al., 2020; Woloshin et al., 2020), suggesting that infections may also go undetected due to methodological limitations. As the number of individuals seeking diagnostic testing declines and COVID-19 mitigation measures (e.g., capacity limits and mask mandates) are relaxed (Anderson et al., 2020; Hatef et al., 2021), known and emerging SARS-CoV-2 variants of concern (VOCs) and interest (VOIs) have the potential to trigger new waves of infection (Abdelnabi et al., 2021; Abdool Karim and de Oliveira, 2021; Harvey et al., 2021). In the U.S., this has occurred multiple times throughout the COVID-19 pandemic, including the winter of 2020/2021 with the Alpha and Epsilon VOCs, the summer of 2021 with the Delta VOC, and the winter of 2021/2022 with the Omicron VOC.
Since the middle of the 20th century, wastewater surveillance programs have been used to investigate viral infections, to study illicit drug use, and to understand the socioeconomic status of a community based on its food consumption (Choi et al., 2019; Gerba et al., 2018; Matrajt et al., 2020). Early in the COVID-19 pandemic, studies confirmed that infected individuals shed significant quantities of SARS-CoV-2 RNA in fecal matter over a period of days or weeks (Holshue et al., 2020; Wang et al., 2020; Wölfel et al., 2020; Xiao et al., 2020). These findings indicated that pre-symptomatic, asymptomatic, and symptomatic infected individuals could potentially be monitored through the analysis of wastewater samples collected from municipal treatment plants, sewer collection systems, or even individual facilities (Bivins et al., 2020; Hart and Halden, 2020; Kitajima et al., 2020; Medema et al., 2020; Orive et al., 2020). These pooled samples do not rely on individuals to seek diagnostic testing and can potentially be analyzed at lower cost and more quickly than traditional clinical surveillance allows. Consequently, wastewater surveillance was deployed throughout the world starting in early 2020 to track levels of SARS-CoV-2 RNA in sewage.
In addition to polymerase chain reaction (PCR)-based approaches to detect and quantify SARS-CoV-2 RNA, next-generation sequencing (NGS) tools have also been developed for the characterization of SARS-CoV-2 genomes in both clinical and wastewater samples (Crits-Christoph et al., 2021; Fontenele et al., 2021; Nemudryi et al., 2020). Although these tools can achieve high coverage and depth, few studies have examined the feasibility of sequencing SARS-CoV-2 genomes in complex wastewater matrices. This goal is complicated by the fact that variants often include multiple characteristic mutations that must be detected simultaneously and attributed to individual genomes for accurate identification. For example, the Alpha (B.1.1.7) VOC contains 17 key mutations, notably the N501Y mutation and the ΔH69/ΔV70 deletion that are both present in the spike protein. We hypothesized that wastewater surveillance could be a valuable complement to clinical testing for the detection of VOCs. Specifically, concurrent sequencing of clinical and wastewater samples can increase the probability of rapid variant detection and provide further validation of the emergence of viral variants in communities.
Here, we explore three wastewater surveillance themes using samples collected in Southern Nevada: (1) trend analysis comparing SARS-CoV-2 RNA concentrations in wastewater with confirmed COVID-19 case data, (2) amplicon-based WGS for variant characterization and VOC detection, and (3) use of wastewater-based epidemiology to estimate total COVID-19 incidence in the community. Over the first 13 months of the COVID-19 pandemic, we monitored SARS-CoV-2 at seven wastewater treatment facilities across Southern Nevada and observed correlations between confirmed case counts and SARS-CoV-2 concentrations within the individual sewersheds. Over a one-month period, weekly grab sampling at a manhole also detected an ongoing outbreak at a community shelter. Next, we sequenced and analyzed 86 wastewater samples and 575 clinical samples from Southern Nevada during the same time period, ultimately using wastewater to detect the introduction of the Alpha (B.1.1.7) and Epsilon (B.1.429) lineages in the community up to one month prior to clinical confirmation. Finally, by calibrating fecal shedding and wastewater surveillance data to reported seroprevalence, we developed an approach for estimating the total number of infections in individual sewersheds and across Southern Nevada between March 2020 and April 2021.

Sample collection
Wastewater surveillance was performed weekly at seven treatment facilities (described as Facilities 1-7) that collectively encompass the vast majority of the population of the Las Vegas metropolitan area. The average daily flow at the treatment facilities ranged from 0.8 million gallons per day (mgd) to 100 mgd, and the corresponding sewersheds served 16,000 to 872,000 individuals, respectively (Table 1). Due to a combination of factors related to study design and/or practical limitations, sample types included grab primary effluent at Facility 1 (collected at ~10:00 am), 24-h flow-weighted composite influent at Facilities 2-4, and grab influent at Facilities 5-7 (collected at ~8:00 am). Weekly monitoring at these facilities (every Monday) commenced on different dates: March 2020 for Facility 1, August 2020 for Facilities 2-6, and December 2020 for Facility 7. Weekly sampling at Facility 8 (i.e., a community shelter manhole) was conducted during a four-week span from November 23, 2020 through December 14, 2020 and involved collection of three grab samples spaced 5 min apart (collected at ~6:00 am) to generate a manual composite.

Quantification of SARS-CoV-2 in Southern Nevada wastewater
Sample processing and qPCR analysis followed a modification of our previously published protocols. Instead of a combined sample concentration approach with tangential hollow fiber ultrafiltration (HFUF) and centrifugal ultrafiltration, 10-L samples from Facilities 1 and 4 were processed with HFUF alone (REXEED-25S, 30 kDa, Asahi Kasei Medical Co., Japan), and 150-mL samples from all other locations were processed with Centricon centrifugal ultrafiltration (Centricon Plus-70, 100 kDa, Millipore Sigma, Burlington, MA, USA). This was meant to ensure experimental continuity while allowing for additional sample throughput for the sites added in the current study. SARS-CoV-2 RNA concentrations were reported as averages (±1 standard deviation) of duplicate qPCR reactions across four SARS-CoV-2 gene target assays (orf1a, E_Sarbeco, N1, and N2). All concentrations were adjusted for equivalent sample volume and sample-specific recovery of spiked bovine coronavirus (BCoV) (summarized later). All other details were described previously.
Our previous findings demonstrated that wastewater concentrations of SARS-CoV-2 RNA were impacted by a number of factors beyond COVID-19 incidence, including sewershed characteristics (e.g., size and flow rate), sample type, and sample time. The grab primary effluent at Facility 1 consistently underestimates concentrations in the corresponding composite influent and composite primary effluent, presumably because it reflects influent wastewater arriving at the facility between 5:00 am and 6:00 am. To account for this diurnal variability and ultimately provide a more accurate infection estimate using the model described later, the observed concentrations for Facility 1 were increased by a factor of 3.5, the median from the prior study. On the other hand, the mid-morning grab influent samples at Facilities 5 and 6 were not significantly different from split samples of composite influent (p = 0.62 and p = 0.14, respectively) based on paired t-tests (N = 25 each). Therefore, no adjustment for diurnal variability was warranted for those facilities. Facility 7 does not collect composite samples, so it was not possible to conduct a statistical evaluation of diurnal variability to determine whether the mid-morning grab sample was representative of concentrations throughout the day.
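The adjustments described above can be illustrated with a short sketch. This is not the authors' code: it assumes that recovery adjustment means dividing by the fractional BCoV recovery, that averaging is done on linear (not log-transformed) concentrations, and that the reaction values shown are purely hypothetical. Only the factor of 3.5 for Facility 1 comes from the text.

```python
from statistics import mean, stdev

# Diurnal adjustment factor quoted in the text for Facility 1 grab samples.
DIURNAL_FACTOR_FACILITY_1 = 3.5


def adjusted_concentration(raw_gc_per_l, bcov_recovery, diurnal_factor=1.0):
    """Return (mean, standard deviation) in gc/L after adjustment.

    raw_gc_per_l   -- concentrations from individual qPCR reactions (gc/L)
    bcov_recovery  -- fractional recovery of spiked BCoV (assumed divisor)
    diurnal_factor -- multiplier for diurnal variability (1.0 = no adjustment)
    """
    if bcov_recovery <= 0:
        raise ValueError("recovery must be positive")
    adjusted = [c / bcov_recovery * diurnal_factor for c in raw_gc_per_l]
    return mean(adjusted), stdev(adjusted)


# Hypothetical example: duplicate reactions for four assays (eight values).
reactions = [1.0e5, 1.2e5, 0.9e5, 1.1e5, 1.0e5, 1.3e5, 0.8e5, 1.1e5]
avg, sd = adjusted_concentration(
    reactions, bcov_recovery=0.5, diurnal_factor=DIURNAL_FACTOR_FACILITY_1
)
```

For Facilities 5 and 6, where the paired t-tests showed no significant difference from composite samples, the same function would be called with the default `diurnal_factor` of 1.0.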

Detection of variants of concern (VOCs) in Southern Nevada
Sequencing libraries were constructed using the CleanPlex SARS-CoV-2 FLEX Panel from Paragon Genomics per manufacturer's instructions. More than 10 ng of total RNA was processed for first-strand cDNA synthesis. Libraries were sequenced using an Illumina NextSeq 500 sequencer with either mid-output or high-output v2.5 (300 cycles) flow cells. Upon sequencing, Illumina adapter sequences were trimmed from read pairs using cutadapt version 3.2 (Martin, 2011). Sequencing reads were mapped to the SARS-CoV-2 reference genome (NC_045512.2) using bwa mem, version 0.7.17-r1188 (Li, 2013). Paragon Genomics CleanPlex SARS-CoV-2 FLEX tiled-amplicon primers were trimmed from the aligned reads using fgbio TrimPrimers version 1.3.0 in hard-clip mode and variants were called by iVar variants v1.3 (Grubaugh et al., 2019). Genome coverages were calculated using samtools coverage v1.10 (Li et al., 2009).
Variants were called from the aligned samples by searching for each of the 14 non-synonymous mutations, three deletions, and six synonymous mutations commonly observed in B.1.1.7 samples (Table S4). B.1.1.7-positive samples were identified using defined criteria, including observation of multiple sites and prioritization of mutations of functional concern. We first required the detection of the mutation responsible for spike N501Y (23063 A>T). Because N501Y is also found in the unrelated VOC lineages P.1 and B.1.351 (Faria et al., 2021), we also required the detection of the spike ΔH69/ΔV70 deletion (21765-21770 deletion) or receptor binding domain (RBD) mutant A570D (23271 C>A). Using these criteria, we identified seven wastewater samples with N501Y and either ΔH69/ΔV70 or A570D, in addition to many other known B.1.1.7 mutations (Fig. 3A).
We also interrogated the wastewater samples for evidence of the highly transmissible B.1.429 lineage, searching our variant calls for five nonsynonymous mutations in the genome. One additional mutation not described in the original report, N protein T205I, was added to the search because it has been reported as another characteristic mutation of B.1.429 (Bourassa et al., 2021). For B.1.429 assignment in wastewater, we required the detection of the spike mutation L452R (22917 T>G) and either of the additional spike mutations S13I (21600 G>T) or W152C (22018 G>T). Using these criteria, B.1.429 was detected in five wastewater samples (Fig. 3B).
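The minimum assignment criteria for the two lineages reduce to "one required mutation plus at least one of two supporting mutations." The sketch below encodes that logic only; the mutation labels, the `RULES` table, and the `call_lineages` function are hypothetical names chosen for illustration and are not part of the published pipeline, which operated on iVar variant calls.

```python
# Minimum criteria from the text: a required spike mutation plus at least
# one supporting mutation. Labels are illustrative shorthand.
RULES = {
    "B.1.1.7": {
        "required": "S:N501Y",
        "any_of": {"S:del69-70", "S:A570D"},
    },
    "B.1.429": {
        "required": "S:L452R",
        "any_of": {"S:S13I", "S:W152C"},
    },
}


def call_lineages(observed_mutations):
    """Return the set of lineages whose minimum criteria are satisfied."""
    observed = set(observed_mutations)
    calls = set()
    for lineage, rule in RULES.items():
        has_required = rule["required"] in observed
        has_support = bool(rule["any_of"] & observed)
        if has_required and has_support:
            calls.add(lineage)
    return calls


# N501Y alone is not sufficient because it is shared with P.1 and B.1.351.
print(call_lineages({"S:N501Y"}))
print(call_lineages({"S:N501Y", "S:A570D"}))
print(call_lineages({"S:L452R", "S:W152C", "N:T205I"}))
```

A sample carrying only the required mutation yields no call, which mirrors how samples with L452R alone were handled in the wastewater analysis.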

Processing of clinical samples and whole genome sequencing
Clinical samples were collected and confirmed for the presence of SARS-CoV-2 RNA using RT-PCR at the Southern Nevada Public Health Laboratory, as described previously (Hartley et al., 2020; Tillett et al., 2021). WGS on samples collected in December 2020 was performed by the UNLV lab using the CleanPlex SARS-CoV-2 FLEX Panel (Paragon Genomics) on an Illumina NextSeq 500 sequencer. Samples collected from January to April 2021 were sequenced by the Nevada State Public Health Laboratory using a Clear Labs platform; the Clear Labs SARS-CoV-2 test is an automated NGS library preparation and sequencing platform that uses a modified ARTIC v3 library preparation and performs ONT sequencing on a GridION (Oxford Nanopore). Genome consensus sequences were assembled using Medaka via the ARTIC pipeline, and mutations and deletions were identified using NextClade. Sequences were then classified by both Pangolin lineage and major NextStrain clade (Hadfield et al., 2018).

Estimating total COVID-19 infections in Southern Nevada
To develop a total infection estimate for Southern Nevada (see Fig. S1 for a schematic illustrating the overall process), we first performed a numerical integration of the wastewater SARS-CoV-2 concentrations observed at each facility (Table S2), coupled with the corresponding average daily flow rates (Table 1). These total wastewater loads (Table S1) were then divided by the total fecal load per infected individual over the course of an infection. Based on a preliminary fecal shedding model for SARS-CoV-2 (Wölfel et al., 2020), we initially assumed a daily feces production rate of 126 g per person per day and a fecal load of 8.9 log10 gene copies (gc) per gram, which was assumed to decrease steadily over approximately 25 days (Fig. 1). After numerical integration, this resulted in a total fecal load of 11.1 log10 gc over a typical infection period. Modifications to this initial approach are described in the Results section. Note that the trajectory in Fig. 1 represents a simplification of fecal shedding and is primarily meant to provide a 'real-world' interpretation of the calibrated total fecal load. Also, this calibrated shedding parameter actually represents the total amount of SARS-CoV-2 shed through feces, urine, sputum, saliva, etc., but it is reasonable to assume that fecal shedding dominates at the community level (Crank et al., 2022). Importantly, the assumed trajectory and source (i.e., feces) do not impact the overall infection estimates.
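The load-to-infection calculation described above can be sketched as follows. This is an illustrative reconstruction, not the published code: it assumes trapezoidal integration on linear concentrations, a constant average daily flow, and hypothetical sampling data. The function names are invented for this example.

```python
LITERS_PER_GALLON = 3.785411784


def total_load_gc(days, log10_gc_per_l, flow_mgd):
    """Integrate concentration x flow over time (trapezoidal rule).

    days           -- sampling days (e.g., 0, 7, 14 for weekly samples)
    log10_gc_per_l -- observed concentrations in log10 gc/L
    flow_mgd       -- average daily flow in million gallons per day
    """
    flow_l_per_day = flow_mgd * 1e6 * LITERS_PER_GALLON
    load = 0.0
    for i in range(1, len(days)):
        c_prev = 10 ** log10_gc_per_l[i - 1]
        c_curr = 10 ** log10_gc_per_l[i]
        interval = days[i] - days[i - 1]
        load += 0.5 * (c_prev + c_curr) * interval * flow_l_per_day
    return load


def infections_from_load(load_gc, log10_gc_per_infection):
    """Divide the total wastewater load by the total load shed per infection."""
    return load_gc / 10 ** log10_gc_per_infection


# Hypothetical sewershed: constant 6.0 log10 gc/L for two weeks at 1 mgd,
# evaluated with the initial shedding assumption of 11.1 log10 gc/infection.
load = total_load_gc([0, 7, 14], [6.0, 6.0, 6.0], flow_mgd=1.0)
infections = infections_from_load(load, log10_gc_per_infection=11.1)
```

Because the estimate is inversely proportional to the assumed load per infection, any change to the shedding parameter rescales every sewershed estimate by the same factor.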
To account for their shorter monitoring periods, the total SARS-CoV-2 loads for Facilities 2-7 were adjusted by multiplying by the ratio of the total SARS-CoV-2 load observed at Facility 1 since the start of the pandemic to the observed load at Facility 1 corresponding to each facility's monitoring period. This increased the total viral loads by a factor of 1.02 for Facility 4, 1.43 for Facilities 2-3 and 5-6, and 1.94 for Facility 7 (Table S1) to account for the first few months of the pandemic when these facilities were not monitored. This approach assumes the relative COVID-19 infection rates were similar for each sewershed, which appears to be a reasonable assumption based on the comparison of sewershed-specific case data (Fig. S2 and Table S1). For example, the corresponding adjustment ratios based on the confirmed case counts were 1.01, 1.27-1.36, and 1.88 (Table S1), respectively, but we assumed the adjustment ratios derived from the Facility 1 viral loads were still superior based on known limitations with clinical data. Two other modifications were made to the wastewater surveillance datasets: (1) Facility 1 concentrations were increased by a factor of 3.5 to adjust for diurnal variability (described earlier) and (2) the late-December SARS-CoV-2 spike at Facility 7 (8.7 log10 gc/L) was adjusted downward to match the following week's concentration. Including the spike resulted in a seemingly erroneous total infection estimate and poor alignment between modeled and observed wastewater concentrations for Facility 7. The high concentration was assumed to be related to grab sampling coupled with the small size of that sewershed, leading to a presumably non-representative SARS-CoV-2 concentration in that particular sample.
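The monitoring-period adjustment is a simple ratio, sketched below with hypothetical weekly loads in arbitrary units. The function name and data are assumptions made for illustration; the published ratios (1.02, 1.43, and 1.94) come from the actual Facility 1 record in Table S1.

```python
def monitoring_adjustment_ratio(reference_loads, first_monitored_index):
    """Ratio of the reference facility's full-period load to its load
    over the shorter window in which another facility was monitored.

    reference_loads       -- per-interval loads at the reference facility
    first_monitored_index -- index of the first interval that the other
                             facility was also sampled
    """
    total = sum(reference_loads)
    overlap = sum(reference_loads[first_monitored_index:])
    if overlap <= 0:
        raise ValueError("no load observed during the overlap window")
    return total / overlap


# Hypothetical Facility 1 weekly loads; another facility starts in week 3.
facility1_weekly_loads = [1.0, 1.0, 2.0, 4.0, 2.0]
ratio = monitoring_adjustment_ratio(facility1_weekly_loads, first_monitored_index=2)

# Scale the other facility's observed load to cover the unmonitored weeks.
observed_load_gc = 6.0e15  # hypothetical
adjusted_load_gc = observed_load_gc * ratio
```

The ratio is always at least 1 and approaches 1 as the monitoring window approaches the full period, consistent with the small adjustment reported for Facility 4.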
Sewershed-specific populations and case counts were derived from data published by the Southern Nevada Health District (SNHD, 2021). Specifically, zip code-level data were allocated to sewersheds by cross-referencing zip code locations against jurisdictional maps. The vast majority of zip codes in Southern Nevada align by municipality/jurisdiction, thus each zip code can be assigned to a specific facility and sewershed. Sewershed-specific ascertainment ratios (i.e., X-fold infection undercounts) were then calculated as wastewater-derived infection estimates divided by confirmed case counts. These ascertainment ratios were then used to revise a previously published MATLAB (MathWorks, Natick, MA) model for estimating wastewater concentrations based on confirmed case counts (code provided in Text S1). The original model used a different fecal shedding parameter and also assumed a single ascertainment ratio of 2 for all sewersheds (i.e., 50% 'asymptomatic'). The revised code increased daily case counts to adjust for the ascertainment ratio and then entered the corresponding number of infected individuals into a 25-day shedding sequence that followed the aforementioned trajectory (Fig. 1). The total daily SARS-CoV-2 load for each sewershed was then divided by the average daily flow rate for the corresponding wastewater treatment facility to arrive at the expected concentration for each day.
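The forward model described above can be approximated with a short sketch. This is not the MATLAB code from Text S1: it is a Python illustration that substitutes the constant-shedding alternative (7.9 log10 gc/g over 25 days at 126 g of feces per day) for the decaying trajectory, and it assumes shedding begins on the day a case is reported. The function name and inputs are hypothetical.

```python
SHEDDING_DAYS = 25           # assumed duration of shedding per infection
FECES_G_PER_DAY = 126        # daily feces production per person
LOG10_GC_PER_G = 7.9         # constant-shedding alternative from Fig. 1
LITERS_PER_GALLON = 3.785411784


def modeled_concentrations(daily_cases, ascertainment_ratio, flow_mgd):
    """Return the expected wastewater concentration (gc/L) for each day.

    daily_cases         -- confirmed cases reported on each day
    ascertainment_ratio -- infections per confirmed case for the sewershed
    flow_mgd            -- average daily flow in million gallons per day
    """
    gc_per_person_per_day = FECES_G_PER_DAY * 10 ** LOG10_GC_PER_G
    flow_l_per_day = flow_mgd * 1e6 * LITERS_PER_GALLON
    concentrations = []
    for day in range(len(daily_cases)):
        # Everyone infected within the last 25 days is still shedding.
        window = daily_cases[max(0, day - SHEDDING_DAYS + 1): day + 1]
        shedders = ascertainment_ratio * sum(window)
        concentrations.append(shedders * gc_per_person_per_day / flow_l_per_day)
    return concentrations


# Hypothetical input: 10 confirmed cases on day 0, none afterwards.
cases = [10] + [0] * 29
conc = modeled_concentrations(cases, ascertainment_ratio=2.0, flow_mgd=1.0)
```

In this toy example the modeled concentration stays flat for 25 days and then drops to zero; a decaying trajectory would instead weight recent infections more heavily while preserving the same total load per infection.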

Human subjects statement
The University of Nevada Las Vegas Institutional Review Board (IRB) reviewed this project and determined it to be exempt from human subject research according to federal regulations and university policy.

Quantification of SARS-CoV-2 in Southern Nevada wastewater
During the 13-month monitoring period, SARS-CoV-2 RNA was detected in nearly all samples, and recovery-adjusted concentrations ranged from a minimum of 4.3 log10 gc/L for Facility 1 at the onset of the pandemic to a maximum of 8.7 log10 gc/L for Facility 7 during the winter 2020/2021 surge (Table S2). Fig. 2 summarizes the site-specific average BCoV recoveries and also illustrates the relationship between wastewater SARS-CoV-2 concentrations and confirmed COVID-19 case data reported at the zip code level by the Southern Nevada Health District. With the exception of Facility 7, the data demonstrate that trends in wastewater SARS-CoV-2 concentrations align with trends in COVID-19 incidence, even at the sewershed level. Trends were less apparent for Facility 7 (Fig. 2H), presumably due to its sample type (i.e., grab rather than composite), small sewershed size, and shorter monitoring period, all of which make this system more susceptible to short-term fluctuations in SARS-CoV-2 load. However, there was an extreme concentration spike at Facility 7 in late December (8.7 log10 gc/L) that could potentially be explained by the surge in COVID-19 incidence. Limited grab samples were also collected from a manhole serving a community shelter (Facility 8). SARS-CoV-2 RNA was detected in all four weekly samples collected between November 23, 2020 and December 14, 2020, with concentrations steadily increasing from 4.6 log10 gc/L to 6.8 log10 gc/L during this time (Table S2). Subsequent communication with shelter personnel confirmed that a COVID-19 outbreak occurred within the complex during this monitoring period.
[Fig. 1 caption, beginning not recovered] The shedding trajectory was then revised for the current study (upper dashed line) based on the seroprevalence calibration approach. The solid black line illustrates an alternative constant-shedding assumption that achieves the same total viral load over the infection period. After integrating the fecal shedding curves over the duration of an infection (assumed to be 25 days), the models resulted in total SARS-CoV-2 loads of 1.21 × 10^11 gc/infection (original assumption) or 2.42 × 10^11 gc/infection (final assumption).

Whole genome sequencing (WGS) of viral genomes from wastewater: Identification of the Alpha (B.1.1.7) lineage
SARS-CoV-2 VOCs are known to include a series of mutations affecting their virulence and ability to evade antibody response (Deng et al., 2021). Given declining testing rates and the challenges with contact tracing (Becker et al., 2021), we asked whether viral variants could be identified through the analysis of wastewater samples. Using an amplicon-based NGS platform, we performed WGS of the SARS-CoV-2 genome in 86 wastewater samples collected in Southern Nevada between November 30, 2020 and February 22, 2021. Illumina sequencing yielded an average of 2.7 million 2 × 150 base pair reads per wastewater sample (Table S3). Mean aligned genome coverage spanned 29,800 of the 29,903 nucleotides in the SARS-CoV-2 genome, or ~99% (Table S3); mean depth of coverage ranged between 3000- and 37,000-fold, with a median depth of 15,000-fold across all samples (Table S3). In addition, we sequenced three RNA controls, including a B.1.1.7 patient-derived RNA sample, a synthetic B.1.1.7 RNA sample, and a synthetic lineage 19B RNA sample.
We first detected the B.1.1.7 lineage using WGS on December 21, 2020 at Facility 5. This sample satisfied all minimum criteria for B.1.1.7 identification, including the N501Y, ΔH69/ΔV70, and A570D mutations (Fig. 3A), at frequencies between 10 and 27% (Table S5). We additionally detected the spike Y144 deletion and spike single nucleotide variations (SNVs) at P681H, T716I, and S982A. The B.1.1.7 lineage was also detected at Facilities 1, 2, and 6 in samples collected on February 8, 2021, with varying numbers of observed mutations. Seven of the 23 definitional mutations were observed in the Facility 1 sample, all at or below 50% frequency, while 21 and 18 mutations could be identified in the Facility 2 and 6 samples, respectively. On February 15, 2021, B.1.1.7 was detected for a second time in Facility 6 wastewater, this time with nine mutations identified, and for the first time at Facility 7, with seven mutations identified. On February 22, 2021, B.1.1.7 was detected for a second time at Facility 5, with six mutations identified. Among these seven total samples that satisfied all minimum criteria for B.1.1.7 identification, between six and 18 additional B.1.1.7 mutations were observed (Fig. 3A and Table S5).

Identification of the Epsilon (B.1.429) lineage
We first detected this VOC in a sample collected from Facility 8 (i.e., the manhole for the community shelter) on December 14, 2020. In this earliest sample, all six of the defining B.1.429 mutations were observed, at frequencies ranging between 7 and 80% (Table S6). The fact that this VOC was only detected in the last of four samples collected at Facility 8 suggests that the outbreak at the community shelter likely consisted of multiple SARS-CoV-2 variants. On December 28, 2020, B.1.429 was also detected at Facility 5, the same location where B.1.1.7 was first identified. In three additional wastewater samples (Table S6), we detected the spike protein allele of concern L452R as early as December 7, 2020, but without additional variant observations, these samples could not be confidently assigned to B.1.429.

No detection of B.1.351, P.1, and B.1.427 lineages in wastewater
At the time of this study, other VOCs in the United States included B.1.351, P.1, and B.1.427. Therefore, we investigated whether these VOCs could be identified in wastewater samples collected between November 2020 and February 2021. B.1.351 and P.1 bear mutations of particular concern at spike protein N501Y (shared with B.1.1.7) and E484K (23012 G>A; not shared with B.1.1.7). The full descriptions of B.1.351 and P.1 in nucleotide and amino acid consequence are provided in Table S4. The E484K mutation was not detected in the sequenced wastewater samples. As such, our data do not support the detection of B.1.351 or P.1 in Southern Nevada wastewater (as of February 2021). Finally, B.1.427 had also been identified as a VOC and shares many mutations with B.1.429, but B.1.427 includes two additional mutations (Orf1a.S3158T and Orf1b.P976L), neither of which was observed in the Southern Nevada wastewater samples.

Whole genome sequencing (WGS) of viral genomes from clinical samples
We also sequenced SARS-CoV-2 genomes from 823 individuals who tested positive for the virus by qPCR in Southern Nevada between December 2020 and April 2021. Using a Clear Labs WGS sequencing platform, we assembled 575 samples for which 90% of bases were called (>27,000 out of 29,903 bases), mean coverage ranged from 40- to 15,000-fold, and median depth was 730-fold.
Genotyping of SARS-CoV-2 clinical samples collected in December 2020 (n = 13) identified only the major clades 20A, 20C, and 20G (Fig. 4A). By January 2021 (Fig. 4B), clinical sampling reached the greatest depth of the study period (n = 246), resulting in the first six clinical observations of VOC B.1.1.7/20I and 81 samples (33%) from the B.1.427 + 429 lineage. Interestingly, these observations occurred one month after their detection in the corresponding wastewater samples. In February 2021 (Fig. 4C), the proportion of identified B.1.427 + 429 increased to 44% of clinical samples (65/149), B.1.1.7/20I was again observed, and VOI lineage B.1.526 was found in five samples. All major clades observed in January were also found in the February collections. Summed together, VOCs and VOIs comprised approximately half of the genotypes observed in the study area by February 2021 (48%; 73/149). Sampling of clinical specimens was lower in the first two weeks of April 2021 (n = 38; Fig. 4E), but the proportion of VOCs and VOIs increased further (79%; 30/38), representing the majority of all SARS-CoV-2 lineages in Southern Nevada.
As noted in the previous section, B.1.351, P.1, and B.1.427 were not detected in Southern Nevada wastewater during the study period. This is consistent with the clinical data for B.1.351, for which there were no detections. The clinical data did capture a small number of P.1 cases in March 2021 (1%; Fig. 4D) and April 2021 (3%; Fig. 4E), but there was no corresponding wastewater sequencing data for those months. Therefore, P.1 may not have been detected in wastewater simply because it was not present as of February 2021. Alternatively, insufficient sensitivity for wastewater sequencing may have resulted in false negatives, or those infected individuals may have been isolated to septic systems. This further highlights the complementary nature of clinical and wastewater surveillance. Finally, it was not possible to confidently distinguish B.1.429 and B.1.427 in wastewater, potentially due to insufficient sensitivity for the characteristic mutations. However, it is unclear whether distinguishing these particular variants would actually be necessary or beneficial in a public health context.

Estimating total COVID-19 infections in Southern Nevada
Since the onset of the pandemic, multiple wastewater surveillance, seroprevalence, and probability analysis studies have highlighted discrepancies between estimated COVID-19 infections and confirmed cases (Angulo et al., 2021). Due to limited testing capacity early in the pandemic, the high frequency of mildly symptomatic and asymptomatic infections, and the potential for false negative results, confirmed COVID-19 case counts likely underestimate the true number of infections. In fact, studies indicate the actual infection total may have been up to 20 times higher than the confirmed case count in some areas at certain points in the pandemic (Angulo et al., 2021).
By coupling observed wastewater concentrations across all facilities (Fig. 2), wastewater flow rate (Table 1), and an initial assumption of 11.1 log 10 gc/infection (Fig. 1), our preliminary calculations indicated that 73% of Southern Nevada had been infected at some point between March 2020 and April 2021, with three of the seven sewersheds yielding infection ratios (or COVID-19 prevalence) >100%. Because there was no reason to believe that nearly the entire population had been infected one or more times by that point in the pandemic, our initial assumption for fecal shedding was assumed to be too low. To resolve this issue, we modified the total SARS-CoV-2 load by calibrating our wastewater calculations to the U.S. Centers for Disease Control and Prevention (CDC)'s reported seroprevalence for Nevada, which was approximately 25% as of spring 2021 (CDC, 2021; https://covid.cdc.gov/covid-data-tracker/#national-lab). In other words, we increased the total fecal load per infection until the wastewater-derived estimate for total infections in Southern Nevada aligned with the CDC's seroprevalence estimate of 25% (Fig. 1).
Exact calibration to the CDC seroprevalence data resulted in an estimated number of infections for Facility 4 that was slightly less than the confirmed total for that sewershed. Therefore, the fecal shedding estimate had to be higher than the initial assumption of 11.1 log10 gc/infection, but lower than the revised assumption of 11.6 log10 gc/infection. The total SARS-CoV-2 load was then calibrated further to achieve an ascertainment ratio of at least 1.0 for Facility 4. This final calibration resulted in a revised total SARS-CoV-2 load of 11.4 log10 gc/infection, which can be described with a decay approach (initial fecal load = 9.2 log10 gc/g) or by assuming a constant fecal load (7.9 log10 gc/g) over 25 days (Fig. 1). Interestingly, wastewater surveillance efforts in university dormitories suggest that fecal shedding may be as high as 9.8 log10 gc/g, which supports the higher shedding estimate in the current study.
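Because the cumulative SARS-CoV-2 load measured in wastewater is fixed, the estimated number of infections scales inversely with the assumed shedding per infection, so the calibration can be approximated as a shift on the log10 scale. The sketch below illustrates that relationship using the values reported above. The function names are hypothetical, and the inverse-proportionality shortcut is a simplifying assumption rather than a description of the exact procedure used in this study.

```python
import math


def calibrated_shedding(initial_log10, prelim_prevalence, target_prevalence):
    """Shedding (log10 gc/infection) that rescales prevalence to the target."""
    return initial_log10 + math.log10(prelim_prevalence / target_prevalence)


def rescaled_prevalence(prelim_prevalence, initial_log10, new_log10):
    """Prevalence implied by a new shedding assumption for the same load."""
    return prelim_prevalence * 10.0 ** (initial_log10 - new_log10)


# Values from the text: 73% preliminary prevalence at 11.1 log10 gc/infection
shed_seroprevalence = calibrated_shedding(11.1, 0.73, 0.25)  # about 11.57
prevalence_final = rescaled_prevalence(0.73, 11.1, 11.4)     # about 0.37
```

Under this shortcut, matching the 25% seroprevalence requires roughly 11.6 log10 gc/infection, and the final value of 11.4 log10 gc/infection implies a prevalence of roughly 37%. That is close to the ~38% reported in this study for Southern Nevada; the small gap presumably reflects rounding of the reported log10 values.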
Using this modified fecal load, our calculations suggest that ~38% of Southern Nevada was infected at some point between March 2020 and April 2021, with sewershed-specific estimates ranging from 7% for Facility 4 to 59% for Facility 5 (Table 1). In contrast, clinical testing suggests that only 10% of Southern Nevada had been infected during that same time period. The resulting sewershed-specific ascertainment ratios (i.e., wastewater-derived infection estimates divided by confirmed infections) ranged from 1.0 for Facility 4 to 7.7 for Facility 5. These ascertainment ratios were then used in conjunction with the updated fecal shedding trajectory (Fig. 1) to revise a previously published model for estimating wastewater SARS-CoV-2 concentrations based on confirmed case counts. The resulting sewershed-specific modeled concentrations aligned closely with the observed concentrations (Fig. 2), thereby indicating that the calibrated fecal shedding parameter and the calculated ascertainment ratios effectively corrected for the clinical undercount of COVID-19 infections.
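The ascertainment ratio defined above is a simple quotient, and it can be turned around to scale confirmed cases up to estimated infections. A minimal sketch follows; the sewershed labels and counts are invented for illustration and do not correspond to Table 1.

```python
def ascertainment_ratio(estimated_infections, confirmed_infections):
    """Wastewater-derived infection estimate divided by confirmed infections."""
    if confirmed_infections <= 0:
        raise ValueError("confirmed_infections must be positive")
    return estimated_infections / confirmed_infections


# Hypothetical sewersheds: (wastewater-derived estimate, confirmed infections)
sewersheds = {"A": (12_000, 12_000), "B": (90_000, 30_000)}
ratios = {name: ascertainment_ratio(est, conf)
          for name, (est, conf) in sewersheds.items()}

# Scale a week of confirmed cases in sewershed B to estimated infections
estimated_weekly_infections = 400 * ratios["B"]
```

A ratio of 1.0 (sewershed A here) means every estimated infection was confirmed clinically, whereas a ratio of 3.0 implies two unreported infections for each confirmed case.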

Discussion
Molecular tools, including qPCR, ddPCR, and NGS, have proven effective for studying patterns of SARS-CoV-2 transmission in individual buildings, such as dormitories and community shelters, or across entire communities by surveilling large wastewater treatment facilities (Betancourt et al., 2021; Bivins et al., 2020; Crits-Christoph et al., 2021; Gerrity et al., 2021; Hartley et al., 2020; Nemudryi et al., 2020; Tillett et al., 2021). Tracking SARS-CoV-2 VOCs is currently a major priority throughout the world, although efforts are expected to be hindered by declining rates of diagnostic testing. With growing infection levels across more resilient subpopulations (Monod et al., 2021), specifically younger adults and children who are more likely to be mildly symptomatic or completely asymptomatic (Davies et al., 2020), and the emergence of less virulent strains of SARS-CoV-2, we anticipate a widening discrepancy between total infections and confirmed case counts (i.e., increasing ascertainment ratios). This scenario presents an opportunity, or even an urgent need, for communities to implement wastewater surveillance as a complement to clinical surveillance and outbreak mitigation strategies (McClary-Gutierrez et al., 2021).
The emergence of VOCs poses a clear threat to ongoing public health measures, particularly therapeutic strategies and vaccination efforts, considering that viral mutations have the potential to increase transmission and evade protection from natural or vaccine-induced immunity. For example, early in the pandemic, 55% of Pfizer or Moderna vaccine recipients showed reduced antibody titers to the B.1.427/B.1.429 variants in plaque reduction neutralization tests (Deng et al., 2021). In addition, the B.1.427/B.1.429 variants were 4- to 6-fold more resistant to antibodies from prior infection and 2-fold more resistant to antibodies from vaccination (Garcia-Beltran et al., 2021). Given that some monoclonal antibody treatments were shown to be ineffective against the B.1.427/B.1.429 variants, the U.S. government even recommended against distributing Eli Lilly's bamlanivimab to California, Arizona, and Nevada ("Coronavirus Disease, 2019 (COVID-19) Treatment Guidelines"). The U.S. Food and Drug Administration eventually revoked the emergency use authorization for this treatment entirely, due to its reduced efficacy against B.1.427/B.1.429 variants ("Coronavirus Disease, 2019 (COVID-19) Treatment Guidelines").
Our WGS data demonstrate that wastewater surveillance was able to detect the B.1.1.7 and B.1.429 variants in Southern Nevada as early as December 2020, approximately one month earlier than clinical testing. In fact, B.1.1.7 and B.1.429 were detected in the same sewershed within one week of each other, which suggests these variants were likely circulating in that community at the same time. The detection of these variants also coincided with a dramatic surge in confirmed cases in Southern Nevada. With greater sampling depth, these variants may have been detected in clinical samples as well, but sample limitation (only 13 samples were made available for public health surveillance in December 2020) likely hindered their early detection. Therefore, our data demonstrate the value of wastewater surveillance as an early warning system under resource-limited or sample-limited conditions, specifically by providing a pooled sample that effectively increases community sampling depth at significantly reduced costs. With advance notice, public health officials can be better prepared to respond to emerging variants and mitigate their impacts, for example by discontinuing use of ineffective therapeutic strategies. With higher resolution sampling campaigns (e.g., strategic selection of manholes or sewer collector/trunk lines), detection of VOCs in wastewater could potentially identify zip codes for targeted sequencing of clinical samples, to more rapidly identify VOC-infected individuals, and to prioritize contact tracing efforts. Interpretation of wastewater sequencing data can also be improved, considering the wide range of observed mutation frequencies (e.g., <10% to >90%) in individual samples. For example, benchmarking tests can be performed to assess the degree of PCR-induced bias in allele frequency data.
Due to the high costs associated with whole genome sequencing, municipal wastewater surveillance efforts may often be limited to detection and quantification of SARS-CoV-2 by qPCR or ddPCR. Even this more basic application of wastewater surveillance may yield critically important information for public health officials and policy makers. Importantly, facility and sewershed-scale wastewater concentrations provide an unbiased assessment of infection trends and the efficacy of COVID-19 mitigation measures. In resource-limited settings, such as community shelters, nursing homes, or even prisons, wastewater surveillance may also provide an early warning system for disease outbreaks within vulnerable populations.
It is important for public health officials to understand how sewershed characteristics and demographics affect interpretation of wastewater surveillance data, or how these factors might impact transmission. In this study, estimated prevalence (37 ± 3%) and ascertainment ratios (3.4 ± 0.4) were generally similar for the sewersheds represented by primary effluent or composite influent samples and having the largest service areas (i.e., Facilities 1-3). These three sewersheds comprise approximately 85% of the Southern Nevada population and encompass a diversity of demographics, thereby offering a reasonable approximation of community-wide public health conditions. On the other hand, the composite influent samples from Facility 4 resulted in low estimates for prevalence (7%) and ascertainment ratio (1.0), which suggests both a low infection rate and high clinical testing coverage. Facility 4 primarily serves two zip codes with high median household incomes (>$90,000) and older, retirement-age populations (15-20% in the 65-74 age range; https://healthsouthernevada.org, accessed 3/30/21), each of which may have contributed to more favorable public health outcomes. With respect to Facility 7, the wastewater-derived infection estimate supported its relatively low confirmed prevalence, thereby increasing confidence in the clinical surveillance data. This could potentially be explained by the sewershed's more isolated geographic location, which may have helped control the spread of COVID-19 in that area.
Early in a pandemic, wastewater-derived ascertainment ratios can provide an unbiased assessment of health disparities and an opportunity for strategic public health action, including targeted testing (e.g., mobile test sites), public information campaigns, contact tracing, and other mitigation efforts. For example, Facilities 5 and 6 exhibited both high prevalence and high ascertainment ratios based on wastewater surveillance data, potentially highlighting opportunities for targeted intervention. With better characterization of fecal shedding, ascertainment ratios could even be estimated with confidence on a rolling basis to provide public health officials with real-time information with high spatial and temporal resolution.
In later stages of a pandemic, estimating total infection levels by sewershed may provide critical information for strategic vaccine rollouts. Specifically, wastewater-derived infection estimates may provide an indication of how many vaccinations are needed in a given area to achieve herd immunity targets. Our analysis suggests that nearly 850,000 people in Southern Nevada had been infected through mid-April 2021, in contrast with the ~240,000 confirmed infections. Assuming previously infected individuals retain adequate levels of protection (Pilz et al., 2022), herd immunity targets of 70-90% would require vaccinations of an additional 715,000 to 1.3 million SARS-CoV-2 naïve individuals. As of mid-April 2021, approximately 500,000 vaccinations had been completed in Southern Nevada. Moreover, sewershed-specific estimates of prevalence and ascertainment ratio could be used to inform vaccine distribution. Simply in a susceptibility context, confirmation of low prevalence (e.g., Facility 4) would suggest a higher risk of outbreak potential as communities relax mitigation measures. These areas could potentially be prioritized during the early rollout phase. There are other risk factors beyond susceptibility that must be considered, including high density living arrangements (e.g., correctional facilities) and vulnerable populations (e.g., long-term care facilities), but wastewater-derived metrics can at least better inform these complex decisions. Importantly, these recommendations would have to be re-evaluated over time to account for waning immunity and the emergence of new VOCs, particularly those with the ability to evade protection and 'reset' herd immunity targets (e.g., Omicron).
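The vaccination arithmetic described above can be expressed as a short calculation. The sketch below assumes, as in the text, that previously infected individuals count toward the immune fraction, and it additionally assumes that new vaccinations go only to SARS-CoV-2 naïve individuals. The function name and the round-number inputs are hypothetical; they are not intended to reproduce the Southern Nevada figures.

```python
def additional_vaccinations_needed(population, prior_infections, target_fraction):
    """Naive individuals to vaccinate so that infected + vaccinated meets the target."""
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must be between 0 and 1")
    needed = target_fraction * population - prior_infections
    # No further vaccinations are needed once prior infections exceed the target
    return max(0.0, needed)


# Hypothetical community of 1,000,000 with 300,000 estimated prior infections
low = additional_vaccinations_needed(1_000_000, 300_000, 0.70)
high = additional_vaccinations_needed(1_000_000, 300_000, 0.90)
```

Substituting a wastewater-derived infection estimate for a confirmed case count lowers the computed vaccination requirement, which is why the choice of infection estimate matters for rollout planning.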
Due to the lack of robust fecal shedding data in the literature, our study relied on seroprevalence to calibrate our fecal shedding assumptions (Wölfel et al., 2020). This offers a starting point for others to assess COVID-19 incidence based on wastewater surveillance data, but it also highlights the critical need for fecal shedding data for SARS-CoV-2 and for other pathogens of interest in the future. Until those data are available, our study serves as a proof-of-concept for estimating total COVID-19 infections using wastewater-based epidemiology. This study also highlights several additional factors that should be considered when implementing wastewater surveillance for public health decision-making efforts. Grab influent samples proved to be adequate for SARS-CoV-2 detection and general trend analysis, but these samples also appeared to be more susceptible to intermittent spikes in concentration that might be influenced by factors other than COVID-19 incidence (e.g., diurnal variability) (Gerrity et al., 2022). This can have significant implications for wastewater-based epidemiology, particularly for small systems. Based on this experience, our data suggest that wastewater-based epidemiology efforts should rely on 24-hour composite influent samples, if possible, and follow-up sampling whenever a dramatic change in concentration is observed (e.g., the Facility 7 spike). Grab samples may be adequate when monitoring a large sewershed or when collected from a system with high levels of dispersion, such as a sewer collection system with significant mixing or after a primary clarifier.
Looking to the future, two challenges are likely to constrain clinical surveillance of SARS-CoV-2 in the U.S. and around the world: (1) declining rates of reportable diagnostic testing as individuals become unwilling or unable to visit testing sites (Becker et al., 2021) or turn to at-home testing options, and (2) lack of access to archived samples from diagnostic labs (Abbasi, 2021). This poses particular concerns for identification and characterization of variants of concern or interest. Our data suggest that implementation of wastewater surveillance, particularly with a combination of qPCR, ddPCR, and NGS tools, can mitigate the impacts of these constraints and lead to the development of an early warning system for SARS-CoV-2, or other pathogens in the future. Overall, such measures can help determine what response to an outbreak is appropriate and evaluate the progress of mitigation or containment efforts. Our data illustrate how wastewater and clinical analyses can be integrated in a large community to evaluate trends in SARS-CoV-2 infections, estimate ascertainment ratios and assess testing adequacy, and rapidly detect the introduction of VOCs at the facility and community scales.
Van Vo, Richard L. Tillett, and Katerina Papp are co-first authors and contributed equally.
Daniel Gerrity and Edwin Oh are co-corresponding authors.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.