A wastewater-based epidemic model for SARS-CoV-2 with application to three Canadian cities

The COVID-19 pandemic has stimulated wastewater-based surveillance, allowing public health to track the epidemic by monitoring the concentration of the genetic fingerprints of SARS-CoV-2 shed in wastewater by infected individuals. Wastewater-based surveillance for COVID-19 is still in its infancy. In particular, the quantitative link between clinical cases observed through traditional surveillance and the signals from viral concentrations in wastewater is still developing and hampers interpretation of the data and actionable public-health decisions. We present a modelling framework that includes both SARS-CoV-2 transmission at the population level and the fate of SARS-CoV-2 RNA particles in the sewage system after faecal shedding by infected persons in the population. Using our mechanistic representation of the combined clinical/wastewater system, we perform exploratory simulations to quantify the effect of surveillance effectiveness, public-health interventions and vaccination on the discordance between clinical and wastewater signals. We also apply our model to surveillance data from three Canadian cities to provide wastewater-informed estimates for the actual prevalence, the effective reproduction number and incidence forecasts. We find that wastewater-based surveillance, paired with this model, can complement clinical surveillance by supporting the estimation of key epidemiological metrics and hence better triangulate the state of an epidemic using this alternative data source.


Introduction
Wastewater has been used previously for monitoring of a wide range of behavioural, socio-economic and biological markers including: medical and illicit drugs Feng et al. (2018); Zuccato et al. (2008Zuccato et al. ( , 2005; antibiotic and antimicrobial resistance Christou et al. (2017); Rizzo et al. (2013); Laht et al. (2014); and industrial pollutant chemicals Rousis et al. (2017Rousis et al. ( , 2016. Spatial and temporal screening of the wastewater collection system or "sewershed" can provide qualitative and quantitative information on the marker of interest within the population in a given sewer catchment contributing to the wastewater. The wastewater data when used as an index of disease burden can be incorporated into traditional surveillance tools for monitoring disease prevalence that is purposeful, economical and action-oriented for public health . Wastewater-based surveillance (WBS) has also proven to be a low-cost and non-invasive tool for the management of infectious disease pathogens such as norovirus Lun et al. (2018); Fioretti et al. (2018) and poliovirus Asghar et al. (2014);Duintjer Tebbens et al. (2017); Brouwer et al. (2018) where viral concentration in wastewater served to supplement clinical surveillance. Since the start of the COVID-19 pandemic, SARS-CoV-2 RNA has been detected and quantified in sewage in many locations worldwide Naughton (2020) and was employed successfully in correlating the concentration of SARS-CoV-2 in wastewater to clinical cases reported in the sewershed Medema et al. (2020); ; Wurtzer et al. (2020); Peccia et al. (2020); La ; Larsen and Wigginton (2020); Randazzo et al. (2020); D' Aoust et al. (2021); Hata et al. (2021). In some instances of institutional surveillance, the leading wastewater signal (measured as SARS-CoV-2 RNA concentration in wastewater) compared to the clinical reports provided an early sign for the introduction or resurgence of COVID-19 into a community Gibas et al. (2021); Peiser (2020); Paglinawan (2020); D' Aoust et al. (2021) enabling rapid deployment of public health response and mitigation efforts.
Despite numerous successes with wastewater-based surveillance during the pandemic, utilizing wastewater surveillance data as a publichealth tool for quick response remains challenging for some jurisdictions, especially at the municipal level National Institute for Public Health Netherlands (2021); Public Health Ottawa (2021a). A major hurdle is the lack of a quantitative framework to assess and interpret the wastewater data generated and to translate that into public health action Public Health Ontario (2021b); WHO (2020). The common practice is to use the detection of SARS-CoV-2 in wastewater as a signal for COVID-19 (re)introduction in a community and/or perform trend analysis in parallel with clinical surveillance of COVID-19. At the time of this manuscript, it is generally not recommended to use SARS-CoV-2 WBS for direct inference of key epidemiological indicators such as prevalence of active infections Public Health Ontario (2021b); WHO (2020); Medema et al. (2020); Foladori et al. (2020).
Public health response guided by SARS-CoV-2 levels in wastewater is currently hindered by a lack of structured interpretive criteria, which is at present obscured by the inherent complexity and variation imparted by diverse sewersheds and their contributing populations Li et al. (2021); Zhu et al. (2021); Rabson (2021). Sources of data variability includes individual's shedding dynamics, sampling frequency of wastewater, non-standardized laboratory methods, sewershed-specific viral degradation and signal attenuation during its journey from the site of faecal shedding (and potentially from urinary or sputum deposit Jones et al. (2020); Wang et al. (2020)) to the sampling point. Attenuation of RNA signal in wastewater involves several factors, such as dilution in municipal wastewater constituents (e.g., storm water effects in combined sewers and infiltration effects in both combined and separated sewers), RNA degradation (e.g., due to household detergents and industrial wastewaters) and viral degeneration in the harsh wastewater environment due to temperature, bioactive chemicals, pH, etc. Solids sedimentation and resuspension may also play a key role in the transportation and decay of SARS-CoV-2 RNA because of the hydrophobic characteristics of the viral envelope and its strong associations to solids Gundy et al. (2009) ;Foladori et al. (2020). In addition, concentration methods for detection enhancement and minimization of inhibitory substances of molecular tests can result in some loss of the viral target ; Michael-Kordatou et al. (2020).
Here, we present a modelling framework that attempts to link quantified SARS-CoV-2 levels in wastewater with estimates of infections in the population within the sewershed, and to support policy decisions. The model incorporates both the viral transmission within the population via a standard epidemiological SEIR-like model ("Susceptible -Exposed -Infectious -Recovered") Anderson and May (1991) and the fate of SARS-CoV-2 in wastewater using a simplified hydrological transport framework. To illustrate potential applications, we fit our model to WBS data and traditional clinical reports gathered from six wastewater treatment plants (WWTPs) located in three Canadian cities (Edmonton, Ottawa and Toronto) and provide wastewater-informed estimates of key epidemiological metrics. We also perform exploratory simulations to investigate how the wastewater signal can be mechanistically associated with clinical surveillance of COVID-19.

Methods
We develop a mathematical model that mechanistically describes both the transmission at the population level ("above ground") and the concentration of SARS-CoV-2 in wastewater as a result of faecal shedding from the infected individuals ("below ground").

Transmission between individuals
To model SARS-CoV-2 transmission in the population, we use a SEIRtype epidemiological model. The disease progression of individuals is captured through several compartments that reflect their epidemiological states and disease outcomes (Table 1). Individuals can be susceptible (S); exposed (infected but not yet infectious, E); symptomatically infected who will later become hospitalized (J) or recovered without hospitalization during active COVID-19 (I); asymptomatically infected (A); hospitalized (H); those recovered and no longer infectious but still shedding virus in faeces (Z); fully recovered and permanently immune but not shedding anymore (R) and deceased (D). We ignore any migration movements, so at any given time the total population is constant and equal to N = S + E + J + I + A + H + Z + R + D. Infection occurs at a time-dependent transmission rate β t between infectious (states I, J or A) and susceptible individuals (S). Once infected, susceptible individuals enter the latent (non-infectious) state (E) for an average duration of 1∕ϵ days, where no faecal shedding occurs. A proportion α of all infections are asymptomatic. A fraction h of symptomatic individuals are hospitalized (H) for an average duration of 1∕ℓ days and for those, the COVID-19-associated mortality is δ. After their infectious period ends, patients enter the post-infection shedding state Z where SARS-CoV-2 faecal shedding still occurs for 1∕η days on average. The exposed (E), infectious (A, I and J) and post-infection shedding (Z) states are modelled with a series of sub-compartments in order to have their Table 1 Description of the model's compartments and parameters for the SARS-CoV-2 transmission within population and disease outcome.

Symbole Definition
S susceptibles E k exposed susceptibles but not infectious in k th subcompartment A k asymptomatic infectious cases in k th subcompartment I k symptomatic infectious cases in k th subcompartment J k symptomatic infectious cases in k th subcompartment who later admits to hospital Z k non-infectious cases but fecal shedding SARS-  Li et al. (2020). The transmission dynamics are represented by the system of differential equations 1a-1n and illustrated in Fig. 1.
ψ k J k . We use the dot notation to symbolize derivation with respect to time (e.g., Ṡ = dS∕dt).
The parameters ϕ k and ψ k are multiplicative adjustments to the baseline transmission rate β t to represent the infectious profile during the course of infection. The values for ϕ k and ψ k were chosen to represent the best estimate of the temporal infectiousness profile (hence their values are distinct, not constant) given the different results published (see Appendix A-2 and A-3 for details). The parameter ξ models the relative infectiousness of asymptomatic cases compared to symptomatic ones. The effective reproduction number of this model is (see Appendix A-4 for details on its calculation):

SARS-CoV-2 viral concentration in wastewater 2.2.1. Deposited viral concentration
The daily concentration of SARS-CoV-2 in wastewater is directly calculated from the total number of individuals that are actively shedding into the sewage system. SARS-CoV-2 faecal shedding varies according to the infected individual's clinical state and disease outcomes. Depending on the disease progression, infected individuals shed a variable amount of SARS-CoV-2 while they are in the shedding states (A, I, J and Z). We make the simplifying assumption that hospitalized patients (H)-assumed mostly bedridden and a small fraction of the shedding population-do not contribute to faecal shedding. Moreover, we assume that individuals in their latent period (E) do not contribute to faecal shedding: if they are not shedding enough through the respiratory tract be infectious, they may not shed significantly through the faecal route too.
The total concentration of SARS-CoV-2 RNA entering the wastewater at time t is given by The parameters λ k , represent SARS-CoV-2 faecal shedding dynamics per capita when the infected individual is in any of the epidemiological Fig. 1. Diagram of compartmental model. See main text for a description of the epidemiological states. The notation 1: n • indicates a modelling using n • subcompartments to obtain a gamma-distributed sojourn time in the associated epidemiological state. states (i.e., I, J, A and Z). Given the current lack of observational data, we used the same parameters λ k for all epidemiological states. Values for λ k were set at mid-range values of published studies (see Appendix A-3). Note that we assume the same reduction in faecal shedding as in respiratory shedding for asymptomatic cases (parameter ξ). The parameter ω implies that our model can only determine up to a constant the concentration of SARS-CoV-2 in wastewater Brouwer et al. (2018), even if the limit of detection of the assay is known. This reflects our current inability to quantify the various complex processes that affect the concentration, from patients' shedding to the concentration measured in laboratories (e.g., frequency and timing of sampling, RNA degradation in the sewer system, recovery efficiency of assays).

RNA transport and sampled viral concentration
We use a simple advection-dispersion-decay model to simulate the fate of SARS-CoV-2 along its journey in wastewater from the shedding points to the sampling site. This model is a combination of an exponential viral decay  and a τ-day dispersed plug-flow function, g(τ), representing all possible hydrodynamic processes (e.g., dilution, sedimentation and resuspension) that leads to RNA degradation as well as decrease and delay of signal at the time of sampling. The dispersed plug-flow g(τ) acts as a transformation function, which reshapes the initial deposited concentration, W* , into a delayed viral distribution over τ days as a result of the transit of SARS-CoV-2 in the sewer system. Hence, we defined the sampled viral concentration at time t as: where κ is the daily first-order decay rate of SARS-CoV-2 due to the harsh, complex and bioactive environment of wastewater. Because of the lack of decay-specific data for any of our sampling locations, we naively set its value based on the literature ; Silverman and Boehm (2020) at κ = 0.18 day − 1 for all locations. The SARS-CoV-2 RNA concentration entering the sewage system daily is modelled as a single hydrodynamic pulse per day and the plug-flow function, g, is obtained by the analytical solution of the axial dispersed plug flow differential equation Kayode Coker (2001). We then re-parametrize the analytical solution with the mean delay time τ and its standard deviation σ into a Gaussian distribution Note that our advection-dispersion-decay model of the RNA transport is not a spatial model and the physical/biochemical characteristics of each WWTPs and their unique wastewater matrix are not modelled in Equation 4 and Equation 5. The simple dispersed plug flow model (Equation 5) only modelled the average transit time of the viral particles (τ) in a given sewershed influenced by the unknown viral chemical partitioning and hydrodynamical sedimentation. See Appendix A-5 for more details.

Wastewater reported sample
The sample transportation, laboratory processing time and reporting lags, introduce reporting delays of RNA concentration in wastewater. Hence, we define the reported wastewater concentration as where ℓ ww is the reporting lag between wastewater sampling and concentration report after laboratory analysis. (Note that the reporting delay of the wastewater measurement is independent from the delay caused by the transport of RNA particles in the sewer system as defined in Equation 4).

Clinical reported cases
We also model surveillance data derived from laboratory confirmed and clinically diagnosed COVID-19 cases, acknowledging that instantaneous identification and complete reporting after initial infection is not possible. We assume that a fraction ρ of symptomatic incidence is reported with a lag of a ℓ clinical days from the time of infection. If i(t) is the total incidence at time t, we define the number of clinical cases reported at time t as:

Data collection
Wastewater samples were collected approximately two (Edmonton and Toronto) to seven (Ottawa) times a week. The sampling location was at the influent of the wastewater treatment plants. Wastewater samples were collected before de-gritting in Toronto, and after for Edmonton and Ottawa.
Wastewater samples from Edmonton and Toronto were shipped to the National Microbiology Laboratory (NML) in Winnipeg, Manitoba, where SARS-CoV-2 RNA concentration was measured. RNA from wastewater samples was purified using two methods. Prior to February 12th 2021, 15 mL of clarified supernatant (after 4000 × g centrifugation for 20 min at 4 ∘ C), was concentrated using an ultracentrifugal filter device (4000 × g for 35 min at 4 ∘ C) (Amicon Ultra-15, 10 kDa MWCO, Millipore-Sigma, St. Louis, MO, U.S.A). Total RNA was extracted from the resultant concentrate ( ~ 200 μL) using the MagNA Pure 96 DNA and Viral NA Large Volume Kit (Roche Diagnostics, Laval, QC) using the Plasma External Lysis 4.0 protocol as per manufacturer instructions. After February 12th 2021, the pellet resultant from clarifying (4000 g for 20 min at 4 ∘ C) 30 mL of wastewater was resuspended in 700 μL Qiagen Buffer RLT (Qiagen, Germantown, MD) containing 1% 2-mercaptoethanol. To this, 200 μL of 0.5 mm zirconia-silica beads (Biospec, Bartlesville, OK) were added and the sample was processed with a Bead Mill 24 Homogenizer (Fisher Scientific, Ottawa, ON) using 4 × 30 s pulses at 6 m/s, then clarified by centrifugation (12000 × g, 3 min) and the resultant lysate used for RNA extraction using the MagNA Pure 96 instrument as described above. Viral RNA was quantified using RTq-PCR with the US-CDC N1 and N2 primers.
For Ottawa, daily 24-hour composite primary sludge samples, consisting of four discrete samples collected at 6 h intervals and subsequently mixed, were collected and transported on ice to the University of Ottawa, where samples were analyzed within 24 h of reception. Samples were concentrated by centrifugation at 10,000 g for 45 min and RNA was extracted from a 250 mg portion of the resulting pellet using a modified version of the Qiagen RNeasy PowerMicrobiome kit D' Aoust et al. (2021). Quantification was performed using singleplex probe-based RTq-PCR for the N1 and N2 gene regions of the virus.
For Edmonton and Toronto, SARS-CoV-2 RNA was extracted from the solid fraction of influent samples. The solid concentration can be affected by the influent flow volume and environmental factors such as precipitation and sedimentation Bertels et al. (2022); Amoah et al. (2022). Hence, for these locations, SARS-CoV-2 concentrations in wastewater was normalized by the total solid suspension (TSS) measured on the sampling day at the treatment plant. For Ottawa, SARS-CoV-2 RNA concentration was normalized by the concentration of the Pepper Mild Mottle Virus (PMMoV) to account for the total human faecal mass present in the sample. Importantly, the goal of the normalization is mainly to control for the variation of other unobserved variables that can affect the viral concentration in wastewater. TSS and PMMoV may be a better proxy to control for the variations of our measured viral concentrations than daily flow because our the laboratory procedure extracts viral RNA from the solids fraction of the influent wastewater (Toronto and Edmonton) and from primary clarified sludge (Ottawa). For all cities, the reported viral concentration used in the model (W) was the average normalized concentration across all technical replicates for both the N1 and N2 genes.
We obtained clinical cases and hospital admissions (except for Toronto) for the catchment area of each of the six wastewater treatment plants. Hence, we were able to link clinical and wastewater surveillances. The data sets for the three cities are plotted in Fig. 2 Seroprevalence values at the city and province level were obtained from Canadian Blood Services (CBS) Canadian Blood Services (2021). The wastewater and clinical surveillance data used in this study are available in Supplementary File S1.

Fit to data
We use an Approximate Bayesian Computation (ABC) algorithm Beaumont et al. (2002) to fit the unknown or unobserved model parameters to the available data. Our observations consisted of i) SARS-CoV-2 RNA concentration in wastewater, ii) reported clinical cases of COVID-19, and iii) hospital admissions (unavailable for Toronto). For each ABC prior iteration, the error function is defined as a weighted trajectory matching We use 50,000 prior ABC iterations and retain the 100 smallest errors to generate posterior distributions (acceptance ratio 2 × 10 − 3 ). For every site, we fitted the time-dependent transmission rates (β t ), basic reproduction number (R 0 ), mean viral traveling time in sewer (τ) and the scaling factor for SARS-CoV-2 viral concentration measured at the sampling location (w). We chose slightly informative prior distributions to reduce the computing time (Table 2 and Appendix A-9).
The rest of the model parameters are fixed to a value based on the literature ( Table 2). The parameter β t is modelled as a piecewise linear function of time where the time partition was chosen manually to reflect the changes in trends of the reported incidence and wastewater signals. More details about the fitting procedure is given in Appendix A-1.
We define three types of fitting-to-data procedures. "Clinical" when w C = w H = 1 and w W = 0, to use data from clinical sources only; "WW" when w W = 1 and w H = w C = 0, to use wastewater data only; and finally "Combined" by adjusting the weights w C , w H and w W such that the contribution of each of the three error terms (the squared difference in Equation 8) are, on average, approximately equal. The "Combined" fitting procedure aims to have approximately the same contribution from clinical and wastewater data sources despite the differences in observation frequencies and values (in practice, we find that w W is about twice the value of w C in order to reach equal contribution). The simulations are initialized with 10 infectious individuals in the compartment I 1 at the date of the first reported incidence for each location.

Inference of unobserved epidemiological quantities
For a given location, once the model is fitted data, we can infer unobserved quantities of epidemiological importance by generating epidemic trajectories from the posterior samples. The posterior prevalence distribution (at each time point) is defined by simply adding the populations from the compartments representing active infection, that is The posterior cumulative incidence is obtained by summing Equation 1a until time t The fitted model can also provide an estimate of the effective repro-

Fig. 2.
Data sets used in this study for Edmonton, Ottawa and Toronto. Each horizontal panel is a city and colors represent the type of data (reported cases, hospital admissions and SARS-CoV-2 RNA concentration in wastewater). All curves were normalized to 1 (dividing by their respective maximum value) to plot them in one single panel to facilitate visual comparison. All data sets used in this study are available in Supplementary File S1. duction number from the different data sources (e.g., clinical and/or wastewater) using Equation 2.

Detection timing differential
In order to explore wastewater-based surveillance as a leading indicator of infection in the community, the time when SARS-CoV-2 is first reported from wastewater is noted d ww and defined as W(t = d ww ) = LOD where LOD is the limit of detection of the laboratory method. In addition, the time d clinical when COVID-19 is first reported is defined as C (t = d clinical ) = 1. Finally, we define the reported detection differential Δ = d ww − d clinical . As a result, the wastewater signal can be classified as a leading indicator over traditional clinical surveillance when Δ < 0. We assess how the reported detection differential Δ can be impacted by varying model parameters that would typically differ from one community-sewer system to another. We select only three combinations of parameters (among many) to illustrate how Δ can be affected, and most importantly how its sign can change indicating its transition between a leading and lagging indicator. We consider two levels of COVID-19 reporting, with ρ = 30% to reflect the approximate level of clinical under-reporting estimated from seroprevalence studies in Canada Canadian Blood Services (2021), and ρ = 70% that simulates a much more efficient surveillance.

Impact of vaccination
Although the model presented here does not explicitly have a vaccination process, we can mimic the main effects of an infectionpermissive vaccine (despite high efficacy against infection, the currently available COVID-19 vaccines do not induce sterilizing immunity Focosi et al. (2022); Yewdell (2021); Reynolds et al. (2022). We model a simple scenario that rolls out an infection permissive vaccine by gradually decreasing the transmission rate (β) by 70% over 50 days and increasing the proportion of asymptomatic infection (α) from 30% to 90%. This reflects the growing protection of the population from severe outcomes of COVID-19 as well as decrease in transmissions as the vaccine is administered. To assess the differential impact of vaccination on clinical and wastewater observations, we consider the ratio of the level of SARS-CoV-2 in wastewater over the reported clinical cases, W(t)∕C(t).

Results
We present our results in two sections. First, we apply our modelling framework to wastewater and clinical surveillance data from six sampling sites located in three Canadian cities (Edmonton, Ottawa and Toronto) and infer epidemiological parameters such as prevalence, effective reproduction number and incidence forecast. The second section is based on exploratory simulations (not fitted to data of a specific location) that highlight important mechanistic aspects between clinical (a) Each colour represents the different data sources used to fit the model (dark red: "Clinical", pink: "Combined", blue: "WW"). (b) The shaded ribbon indicates the 95% CrI for the estimate fitted on the "Combined" data set (CrIs for other data sources are shown in Appendix A-6). and wastewater surveillances.

Application to Canadian surveillance data
In this section, we compare the inferences made on key epidemiological variables by fitting the model to different data sources. Our goal is to assess the added-value of the wastewater-based data stream. Hence, in the following, we present inferences by fitting the model to different data sources available ("Clinical", "WW" and "Combined") and comparing the outcomes. The fitting outputs from the ABC algorithm are shown in Appendix A-8 and A-9. Fig. 3 shows, for four selected locations (EGB, OTW, TAB and THC), the SARS-CoV-2 prevalence estimated by sampling the posterior distributions fitted to the various data sources "Clinical", "WW" and "Combined" and the evaluation of Equation 9. Estimates of cumulative incidence (Equation 10) from "Combined" is also displayed and compared to available SARS-CoV-2 seroprevalence levels estimated from surveys by the Canadian Blood Services performed on banks of blood donors in Edmonton, Ottawa and Toronto Canadian Blood Services (2021). Note that our model was not fitted to seroprevalence data and this comparison acts as a crude check that prevalence estimates from the model follow the same trends as other independent data sources. Cumulative incidence from wastewater data source ("WW"), compared to seroprevalence levels is also presented in Appendix A-6.

Prevalence estimates
For all locations, as expected, wastewater-only prevalence estimates are close to the clinical-only ones when the levels of SARS-CoV-2 in wastewater mimic the COVID-19 trends in the population (Fig. 2). For example, the prevalence estimated from wastewater-only and clinicalonly are comparable for the December 2020 wave in Edmonton and April 2021 wave in Ottawa. However, when the clinical and wastewater signals are discordant, prevalence estimates can be significantly different. For example, wastewater-based prevalence estimates in January 2021 for Toronto Highland Creek (THC) do not show the peak seen from clinical observations. On the other hand, this January peak was captured in Toronto Ashbridges Bay (TAB)-another part of the city-and the subsequent March-May 2021 wave in Toronto Highland Creek (THC) was identified by both wastewater and clinical surveillance. Further studies are needed to understand the cause(s) of the discordance observed in THC in January 2021. Finally, we note that, because of the larger variability of SARS-CoV-2 WBS and/or their lower sampling frequency as compared to daily clinical surveillance, credible intervals of our wastewater-only inferences can be larger than the clinical-only ones (see Appendix A-6).

Effective reproduction number
The effective reproduction number (R t ) is a key epidemiological parameter that has gained recognition beyond public health circles during the COVID-19 pandemic Hollingsworth et al. (2020);Brauner et al. (2021). Using the same approach as for prevalence estimates, we inferred R t from epidemic trajectories generated from posterior distributions fitted to the three different data sources (i.e., "Clinical", "WW" and "Combined") and Equation 2.
For comparison, we also calculate R t using reported COVID-19 cases using the R package EpiEstim (version 2.2) Cori et al. (2013) as a separate approach based on clinical data exclusively.
Results shown in Fig. 4 exhibit the same behaviour as for the prevalence estimates, that is, mean estimates of R t are similar when trends of clinical and wastewater surveillance are comparable. Despite being based on a different modelling framework, estimates from EpiEstim are consistent with clinical-only estimates from our model. Finally, like for prevalence inferences, R t estimates from wastewater-only data (Fig. 4, blue solid line) tend to have broader uncertainty interval compared to R t from clinical-only data (see Appendix A-7 for associated 95% credible interval widths). Fig. 4. Effective reproduction number. Each panel represents a wastewater treatment plant. For wastewater-based R t (blue curve), only estimates after 2020-Nov-15 are shown for Edmonton and Toronto to avoid the initial assay setup period. The R t estimates from our model are spline-smoothed, see Appendix A-7 for details. (a) Solid lines represent the mean of effective reproduction number estimated with our model for given data sources, and with the R package EpiEstim (gray). (b) The ribbon indicates the 95% credible interval for the "Clinical" data source.

Forecasts
Our modelling framework allows to generate forecasts based on clinical, hospital (if available) and wastewater data.
In Fig. 5 we show three 1-month-ahead forecasting examples for Edmonton, Toronto Highland Creek and Ottawa using either wastewater data only, or clinical data only. The Edmonton example (Fig. 5, left panel) shows forecasts made as of April 1st, 2021. In this case, the forecasts are relatively similar because both the clinical reports and the wastewater signals are comparable and the model fits were similar using either wastewater or clinical data. The Toronto Highland Creek example forecasts as of December 20th, 2020 (Fig. 5, middle panel). For this location, the wastewater signal and clinical reports are discordant from December 2020 to February 2021. During this period the wastewater concentration is low and approximately flat whereas the clinical reports indicate a new wave of infections. As a result, the model fitted on these two data sources interprets the epidemic differently for this period and hence provides contrasting forecasts. The forecast for Ottawa as of December 15th, 2020 (Fig. 5, right panel) illustrates the case when the wastewater forecast is more accurate than the one based on clinical surveillance only. At that time, the wastewater signal in Ottawa has picked up a resurgence earlier than clinical surveillance. This resurgence is then captured by the model fit and hence the wastewater-based forecast correctly projects the resurgence.
Here, we merely intended to demonstrate examples of successful and unsuccessful forecasts using clinical and/or wastewater data, depending on the factors influencing these estimations (including variations in testing policy, sewershed characteristics, environmental events).

Simulations
In this section, we report results from simulations that provide general insights in interpreting WBS.

Leading signal and reported detection differential
We vary the LOD of the wastewater assay across a broad, but realistic, range Pecson et al. (2021) and calculate Δ, the detection time difference, for each simulation for a given value of LOD and the clinical reporting rate ρ. Because SARS-CoV-2 RNA concentration in wastewater can only be determined up to a constant in our model, the LOD values chosen here are rescaled to the parameters used to run our simulations and cannot be directly interpreted as RNA copies per mL, the traditional unit for LOD. Panel A in Fig. 6 shows that, depending on the LOD of the laboratory assay, the wastewater concentration of SARS-CoV-2 RNA can either be a leading (Δ < 0 for assays with low LODs) or a trailing indicator of cases (re)introduction when compared to reported clinical cases. This is the case whether the clinical surveillance system in the population is efficient or not (coloured curves, Fig. 6A).
We also vary the decay rate of RNA SARS-CoV-2 in wastewater within a broad realistic range ; Bivins et al. (2020); Mandal et al. (2020), as well as the transit time of SARS-CoV-2 between the shedding and sampling sites. Fig. 6B shows that, here again, the relative timing of (re)introduction detection by WBS compared to clinical surveillance can be affected by both the harshness of the wastewater (represented by the decay rate) and the transit time of SARS-CoV-2 in the sewer system. We note, as expected, that with a fast transit time (illustrated by a 1-day travel time in the left-most panel of Fig. 6) the decay rate will not have a significant impact on Δ clinical surveillance (efficient, ρ = 70%, or not, ρ = 30%), but as the transit time increases to 3 days (an upper bound considering strong sediment and recirculation effects) the effect of RNA decay becomes more important (increasing slope for the 1-day and 3-day transit times, Fig. 6B).

Decreasing trend in wastewater signal following an epidemic peak
We use a simple simulation approach to compare a declining trend of SARS-CoV-2 RNA in wastewater following a public-health measure (such as gatherings limits, compulsory face covering and lockdown) with traditional surveillance indicators such as reported cases. Previous studies Medema et al. (2020) Hata et al. (2021) suggested that wastewater surveillance could confirm declining trends in clinical infections, while other studies reported an apparent decoupling Weidhaas et al. (2021). We use our mechanistic model to explore this decoupling effect with simulations.
To mimic public health interventions, we use the constant transmission rate (β) in our model and modify it to decrease linearly by 1/3 of its pre-intervention value (β∕3). The time duration by which interventions can reduce the transmission rate to a steady value (time between starting date of imposed interventions and time of β∕3) is captured through T interv . This variable defines the speed of interventions' impact on slowing the transmission rate. For an example, a complete lockdown causing a sudden change in social contact rates and transmission rates, yet, wearing masks requires time to affect the transmission rate. We compared the relative reduction in the number of COVID-19 cases to the SARS-CoV-2 viral concentration in wastewater following 7 days of the epidemic peak by s cl (t) = C(t + 7)∕C(t) − 1 and  Using our baseline parameters (Table 2), we simulate an intervention that reduces the contact rate to a third of its pre-intervention value at different time of the simulations, ranging from 20 to 90 days after the introduction of the index case. Fig. 7A shows that, overall, the effect of an intervention that significantly reduces transmission yields a larger relative decrease in number of COVID-19 cases than the level of SARS-CoV-2 in wastewater. The post-peak relative reduction observed in clinical surveillance (s cl ) is consistently larger than the one from WBS (s ww ) even as the timing of the intervention changes (Fig. 7B). The difference is more pronounced as the intervention reduces transmission more rapidly (T interv small). Hence, given the observation noise typically encountered, we expect that the effect of a sudden change in transmission rate would be more clearly observable from clinical surveillance than from WBS. Viral concentration in wastewater tends to decline slowly following an epidemic peak and indicate an apparent decoupling with COVID-19 cases in the downward trend, caused-in our model -by the prolonged faecal shedding after infection.

Differential impact of vaccination
In Fig. 8 shows how W(t)∕C(t), the ratio of reported wastewater concentration over reported cases, increases following vaccination with a infection-permissive vaccine. Indeed, while an infection-permissive vaccine does reduce transmission, it still allows for infections to occur (mostly asymptomatic) and in particular, faecal shedding. Hence, a smaller proportion of infections are reported (because most of them are asymptomatic or too mild to be reported) but faecal shedding is less affected by this reporting bias and level of SARS-CoV-2 in wastewater decreases less steadily than COVID-19 surveillance. In this simulation, the approximately constant ratio before the start of vaccination (Fig. 8B) indicates that reports from clinical or wastewater data sources provide a similar picture of the epidemic for that period. However, once vaccination is implemented, the increasing ratio (Fig. 8B, green curve) highlights a discordance between the two data sources. In Appendix A-10, a short sensitivity analysis on the timing of vaccination and the duration of viral clearance shows the results presented above remain similar.

Discussion
Surveillance through detection and quantification of targeted pathogens in wastewater has been a noteworthy tool for public health across the world Sims and Kasprzyk-Hordern (2020) Here, we have provided a modelling framework to improve the understanding of the mechanisms at play between the viral transmission in the population and viral concentration shed in wastewater. This model can also provide estimates of key unobserved epidemiological parameters. We demonstrated the applicability of our model by fitting it to data from three Canadian cities and made wastewater-informed inferences of important epidemiological metrics (prevalence, effective reproduction number and forecasted incidence). Our estimates for cumulative incidence were above seroprevalence levels in all locations ( Fig. 3 and Appendix A-6). Although serosurveys bring key insights on the progression of an epidemic, they may not be considered as gold-standard as their sample may not be representative of the general population. We presented their levels as an interesting comparison exercise rather than validation of the model. We also note we did not model seroreversion (patients who were infected but subsequently test seronegative because of loss of immunity or antibodies falling to undetectable levels).
Importantly, we observed that estimates based on wastewater-only data usually provide a similar picture of the epidemic trajectory ( Fig. 3) but discordant signals can occur and lead to drastically different interpretations. This was the case, for example, in January 2021 in Toronto Highland Creek where the wastewater signal did not indicate a resurgence of infections, despite the wave observed from the reported clinical cases. We believe this muted peak in wastewater signal was not caused by a laboratory issue (an independent laboratory confirmed the same observation for this location), but rather from undetermined events in this particular sewershed at that specific time that need to be further investigated.
Similarly, the effective reproduction numbers inferred from wastewater data only are consistent with more traditional methods, such as using clinical reports with the software EpiEstim (Fig. 4). The fact that wastewater data can potentially act as a substitute for clinical surveillance (albeit with more uncertainty) to provide critical epidemiological metrics is encouraging, although more realistically, it will likely act as a complementary data source. The possibility to estimate epidemiological metrics using wastewater surveillance represents a step forward in Fig. 6. Simulations were run varying selected parameters to show their impact on the reported detection time differential (Δ). Panel A: effect of the limit of detection of the quantification assay performed on wastewater. Values below the 0-intercept horizontal dashed line indicate a leading signal from wastewater concentrations than from clinical reports. Panel B: effect of the SARS-CoV-2 RNA decay rate in wastewater for different transit times between the shedding and sampling site. The colour of the curves represents the proportion of clinical cases reported (ρ) out of the total symptomatic incidence.
attempting to use wastewater data for actionable public health metrics, if they are available to public health in a timely manner. In addition, the ability to triangulate the state of an epidemic using alternative data sources would help ensure additional confidence in the estimation of relevant parameters or forecasting. Indeed, the COVID-19 pandemic has consumed public-health resources at levels that are probably not sustainable for long-term surveillance of this pathogen. However, the current wastewater surveillance performed in many communities can probably be continued as long as necessary given its relative low cost . But clearly, as our study illustrates, more research is needed to reach a level of confidence in wastewater-based surveillance that could match clinically-based surveillance.
Our modelling framework provides a more principled alternative to simpler smoothing techniques (e.g., moving averages, polynomial interpolations) that have been used to support the interpretation of WBS D' Aoust et al. (2021). However, we note that less complex modelling options are possible if the focus is on specific epidemiological metrics (Huisman et al. (2021); Xiao et al. (2021)). We also note recent efforts to use machine learning techniques and artificial neural network that incorporate WBS Li et al. (2021). While those methods are promising, they cannot-by design-explain the epidemiological mechanisms at play.
Our model enables in silico experiments on the epidemic/wastewater system to identify key parameters and processes that can play an important role for the epidemiological interpretation. We showed, using simulations, that the relative timing of the wastewater signal (whether it is leading or not) compared to traditional clinical surveillance is actually influenced by the characteristics of both systems (Fig. 6). On the one hand, the laboratory analysis of wastewater samples may not detect the presence of SARS-CoV-2 because, for example, its limit of detection is too high, or prevalence of infection in the community is very small, or the viral RNA has degraded before reaching the sampling site. Shipment time of wastewater samples can also be significant (e.g., several days) for remote sampling locations without any laboratory capacity. On the other hand, the delay in clinical cases reports is usually caused by the incubation period and the reporting time of an infection by the health system (turnaround time for contact tracing and/or laboratory results of individuals' swab) or availability of testing. Some communities would typically have a longer lag for clinical reporting than for wastewater surveillance La ; Randazzo et al. (2020);D'Aoust et al. (2021); Hata et al. (2021), while others may have the opposite (for example when a very effective contact tracing system is in place Mettler et al. (2020) or rapid testing is implemented). Moreover, a community may experience both situations, that is a period when clinical surveillance is extremely efficient at detecting cases so rapidly that it leads wastewater surveillance while, at other times, it can lag (for example when incidence is high, overwhelming contact-tracing and clinical testing capacities). The model presented here allows to quantify how various factors can impacts the relative timing between clinical and wastewater surveillances.
It can be tempting to monitor the effect of public health interventions using changes in the levels of SARS-CoV-2 in wastewater given its noninvasive nature. Indeed, WBS should be less affected by sampling bias than clinical surveillance (for example the latter may miss most of the  fitted subclinical infections). However, our simulations showed that wastewater surveillance may be inferior to clinical surveillance to identify sharp declines in transmission, as typically seen after a lockdown is implemented (Fig. 7). The long period of faecal shedding creates a lag in comparison to the sudden drop of incidence caused by the public health intervention, inhibiting a prompt signal in wastewater. This effect is visible on the Canadian data sets presented here (Fig. 2). We note that this result may not be valid for laboratory assay with a high limit of Fig. 7. Detectability of a sharp transmission reduction. Panel A: example of how the postpeak relative changes are calculated. The colour-coded dashed lines represent the time series of reported clinical cases and SARS-CoV-2 RNA concentration in wastewater. The grey shaded area indicates when the transmission rate decreases (here, T interv = 10 days). The segment illustrates the relative change between the peak value and 7 days later (s ww and s cl ),i.e., how we would typically assess the efficacy to reduce transmission. Panel B: the horizontal axis represents the time (since the start of the epidemic) when transmission begins to reduce to a third of its value. The vertical axis represents the post-peak relative changes from clinical reports (s cl , red lines) or wastewater (s ww , blue lines). Each subpanel indicates a different value (3, 10 and 20 days) for T interv , the time it takes to reduces the transmission rate to a third of its initial value. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) detection of SARS-CoV-2 RNA in wastewater.We also highlight the potential for an infection-permissive vaccine to generate discordant signals between wastewater and clinical surveillances (Fig. 8). Indeed, vaccination implies a larger proportion of asymptomatic infections which are less likely to be detected by clinical surveillance, but still picked up in wastewater because of continued faecal shedding. Note that we use our model to highlight this potential effect, whereas detecting it in real data is probably challenging without studies purposely designed to detect this.
Here, we used a mixed approach regarding the normalization of the levels of SARS-CoV-2 in wastewater, with Ottawa using PMMoVnormalization versus TSS-normalization for Toronto and Edmonton. It is likely that some normalization is necessary to discount the in-sewer daily fluctuations resulting from environmental factors (PH, ammonia, temperature, precipitation, etc.), sampling methods and total mass of faecal shedded, but it is still not clear which normalization is the most appropriate for a given sewershed and the type of sampled wastewater (influent liquid versus solid). A recent study Kim et al. (2022) showed that SARS-CoV-2 concentration measurements from solid samples with PMMoV normalization produce comparable results between different laboratory methods and across sampling locations but comparability was not observed for viral measurements from liquid influent wastewater, perhaps due to multiple recovery methods for liquid wastewater which each extracts different recovered fractions of the SARS-CoV-2 RNA concentration at different efficiencies. This might explain the reduced correlation between COVID-19 case counts and SARS-CoV-2 N1/N2 genes in influent wastewater samples which are normalized with human faecal markers (e.g., PMMoV) Feng et al. (2021). Hence, the choice of normalization might be more relevant to the type of wastewater sample analyzed (liquid versus solid fraction).
Our modelling approach has several weaknesses. Forecasts and R t estimations depend on the quality of the model fitting to data and here, we used a simple ABC algorithm that could certainly be improved. However, the model reasonably fit to data for most locations (Appendix A-8 and A-9). In general, fitting the model to the combined clinical and wastewater surveillance ("Combined") might not be the best practice in the long run because of the reduction of traditional surveillance to prepandemic levels. Given the multiple sources of uncertainties associated with the wastewater data, we think that informing the model with combined data sets may provide a better triangulation of the pandemic.
We did not precisely model the transport and fate of SARS-CoV-2 in municipal sewer systems. The lack of data about flow dynamics and particles binding of SARS-CoV-2 in wastewater hampered a more detailed approach. Hence, we took a simple approach to model the below-ground component and assumed the flow dynamic followed a low-dispersion plug flow model with a plausible fixed decay rate  and let vary the mean transit time (from shedding to sampling sites) within a range of possible values. As more research focuses on the fate of SARS-CoV-2 in wastewater, the transport module of our model can be enhanced.Our model does not model vaccination explicitly. We made this choice to keep the first version of our model relatively simple. However, we believe that we can appropriately approximate the effects of infection-permissive vaccination by reducing the transmission rate and increasing the proportion of asymptomatic infections. As the proportion of vaccinated individuals increases, modelling an explicit vaccination process is necessary. We note that for the Canadian cities studied here, the vaccination coverage was either null or low during the study period. We model SARS-CoV-2 as a single-strain pathogen which is an oversimplification of reality, given the numerous variants circulating in Canada since late 2020 McLaughlin et al. (2021). However, it is not clear how (or if) multi-variants modelling would affect our results, given that the difference of viral shedding (respiratory and faecal) between variants is still not fully understood Kidd et al. (2021); Kissler et al. (2021). Because of ordinary differential equations, this model is not well adapted to either small communities or very low prevalence settings. While its epidemiological structure (Fig. 1) would still be valid for such environments, a more advanced statistical modelling would be preferable to handle low incidence counts and observation uncertainty King et al. (2015); Li et al. (2018). A further limitation of our model is the use of a scaling coefficient for the amount of SARS-CoV-2 shedded in the wastewater by the infected population (parameter ω in Equation 3). This scaling coefficient embeds all the uncertainties associated with sampling strategy and laboratory analysis, such as assay recovery efficiency, limit of detection, and total faecal mass normalization. Most of those processes are currently poorly known for SARS-CoV-2 and, as long as more observational data is not available, will constrain modelling (note that this limitation has already been identified for polio models Brouwer et al. (2018)). An ultimate goal of wastewater surveillance may be to measure all the components of the scaling coefficient (here, ω) in order to estimate infection prevalence in a community directly from viral concentration readings.
To conclude, the model presented here-built upon previous similar approach for other pathogens McMahan et al. (2020);Brouwer et al. (2018); Kraay et al. (2018)-is a first step to better understand the mechanistic relationships between the COVID-19 epidemic spreading in a community and the SARS-CoV-2 RNA concentration in wastewater caused by faecal shedding of infected individuals (and potentially from urinary or sputum shedding). Future developments should explicitly incorporate vaccination and multiple variants/strains given the ability of new assays to detect variants from wastewater samples La Rosa et al. (2021); Agrawal et al. (2021); Jahn et al. (2021). This model can be the basis of quantitative tools to support public health decision making that embraces wastewater-based epidemiology.
Beyond the SARS-CoV-2/COVID-19 pandemic, WBS coupled with the type of model presented here could be leveraged and applied to other transmissible pathogens where urinary or faecal shedding occurs, such as other respiratory diseases (e.g., influenza, respiratory syncytial virus, adenovirus) and some enteric diseases (e.g., norovirus, rotavirus, shigellosis).

Funding
Robert Delatolla acknowledges funding by Ontario Ministry of Environment Conservation and Parks Wastewater Surveillance Initiative Transfer Payment Agreement 2020-11-1-1463261970.

Declaration of Competing Interest
All authors declare they do not have any competing interests.