Uncertainty analysis in a large-scale water quality integrated catchment modelling study

Receiving water quality simulation in highly urbanised areas requires the integration of several processes occurring at different space-time scales. These integrated catchment models deliver results with a significant uncertainty level associated. Still, uncertainty analysis is seldom applied in practice and the relative contribution of the individual model elements is poorly understood. Often the available methods are applied to relatively small systems or individual sub-systems, due to limitations in organisational and computational resources. Consequently this work presents an uncertainty propagation and decomposition scheme of an integrated water quality modelling study for the evaluation of dissolved oxygen dynamics in a large-scale urbanised river catchment in the Netherlands. Forward propagation of the measured and elicited uncertainty input-parametric distributions was proposed and contrasted with monitoring data series. Prior ranges for river water quality-quantity parameters lead to high uncertainty in dissolved oxygen predictions, thus the need for formal calibration to adapt to the local dynamics is highlighted. After inferring the river process parameters with system measurements of flow and dissolved oxygen, combined sewer overflow pollution loads became the dominant uncertainty source along with rainfall variability. As a result, insights gained in this paper can help in planning and directing further monitoring and modelling efforts in the system. When comparing these modelling results to existing national guidelines it is shown that the commonly used concentration-duration-frequency tables should not be the only metric used to select mitigation alternatives and may need to be adapted in order to cope with uncertainties. © 2019 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Meeting the established environmental regulations (e.g. The Water Framework Directive, 2000/60/EC 200) of the European Union) is still a challenge in many densely urbanised catchments, as it often requires the implementation of intensive investment and regulatory plans (i.e. infrastructure construction, control systems or user limitations). Model-based decision-making is applied more frequently to explore and optimise the effect of different alternatives, aiming towards an efficient resource allocation. Therefore Integrated Catchment Modelling (ICM) has become an essential tool in the water quality management process over the last decades (Andr es-Dom enech et al., 2010;Langeveld et al., 2013b;Willems and Berlamont, 2002). ICMs are, by definition, abstractions of highly complex water systems, usually constituted by the joint modelling of two or more subsystems of the urban water system (Keupers and Willems, 2017;Rauch et al., 2002). This often involves the joint simulation of sewer hydrodynamics, wastewater treatment processes, rural hydrology and river physical-biochemical dynamics .
ICMs, like every other modelling process, contain various sources of uncertainty, due to the inherent system characteristics. Complex processes are represented with limited knowledge, relationships are calibrated with reduced data sets (which may lead to a poorly identifiable parameterisation) and linked simulations are carried out over a wide range of spatiotemporal scales. Also, the stepwise process of abstraction from reality to model representation with its necessary simplifications and idealisations of the real systems includes the unavoidable occurrence of uncertainties (Muschalla et al., 2009).
These uncertainties encompass the errors introduced by model parameterisation, model-forcing data (e.g. precipitation), model input data (e.g. digital elevation model, soil or sewer conduit maps), model validation data (e.g. use of incorrect water level rating curves) and model structures (e.g. different mathematical representations).
The definition, recognition and consideration of these uncertainties is therefore of the utmost importance for the application of such models and for the interpretation of the hereby obtained results (Pappenberger and Beven, 2006;Schellart et al., 2010). At present however, a comprehensive uncertainty analysis is mainly applied in science and less in planning practice (Kleidorfer, 2010;Vanrolleghem, 2011).
Several frameworks have been proposed to facilitate the quantification and handling of these uncertainties in integrated urban water systems specifically Tscheikner-Gratl et al., 2017) or environmental modelling in general (Refsgaard et al., 2007). The quantification of modelling statistical uncertainties is often carried out by encoding system knowledge through a probabilistic description (Reichert et al., 2015) and sampling (i.e. Monte-Carlo) to describe the variability at the targeted output variables. Additionally, the identification of the contribution of each uncertainty source is essential in the system analysis process, since it directs the modeller towards a rational reduction of epistemic uncertainties. For instance, Willems (2012) presented a variance decomposition methodology to quantify the partial contribution of uncertainty by source. Reichert and Mieleitner (2009) described the use of time-dependent parameters as a proxy to detect temporal windows of structural mismatch, pointing therefore at examining particular physical processes. Yang et al. (2018) used GLUE to extract the sensitivity from processdriven parameter structures. Inline with this, Gupta et al. (2008) discussed the need of diagnosis tools to guide the model conceptualisation process.
Examples of uncertainty analysis applications in integrated catchment modelling for water quality estimation are still scarce (Tscheickner-Gratl et al., 2019). This is partially due to the significant amount of effort to monitor and set up large-scale modelling studies. Also, computational constraints have severely limited the applicability of proposed formal uncertainty analysis methodologies (see Schellart et al. (2010)). Therefore, only few examples are available in literature which deal with relatively small systems (Freni and Mannina, 2010) or with individual sub-systems Radwan et al., 2004).
Consequently, this work describes the application of a formal uncertainty analysis scheme to quantify dissolved oxygen modelling uncertainties in a large-scale (4400 ha of draining urban areas, a 750,000 p.e. WWTP and a sensitive receiving water body) water quality ICM study. The estimation of the relative contribution of different relevant uncertainty sources is also shown, which served towards directing further modelling and monitoring efforts in the system.
Initially, the forward propagation of all sources of uncertainty was proposed using the best available literature-expertmeasurement derived parametric-input probability ranges. This will be therein referred as prior propagation. Upon analysis of the uncertainty contribution by source, river water quality and quantity parameters captured most of the variability of yearly dissolved oxygen (DO) dynamics. Thus, a dedicated inference scheme was proposed to update the river parametric distributions using local flow and DO measurements. The propagation of all parameterinput uncertainties and the updated parameter set for the river submodel is here presented (referred as posterior propagation). Comparison of the forward propagation from both prior-posterior uncertainty distributions with system observations, along with the current prioritisation of uncertainty sources is presented in this study. Complementarily, the impact of modelling uncertainties in the concentration-duration-frequency environmental assessment metrics is discussed, highlighting the possible implications of using such metrics in the selection of mitigation alternatives in environmental systems.
The discussion arising from this experience also serves to put into context the applicability of proposed uncertainty analysis techniques in real-world scale ICM studies.

Materials and methods
The Dommel is a stream (discharge of 4e30 m 3 /s) located in the south of the Netherlands. It receives the discharge of a wastewater treatment plant (WWTP) of 750,000 p.e. (population equivalent) and the intermittent discharge from 192 combined sewer overflow structures (CSOs). The water system covers and area of~800 km 2 , 29 combined sewer systems (4400 ha of urban connected area) and roughly 110 km of river tributaries. Fig. 1 provides the layout of the system and the measurement locations. Pollution loads from the WWTP and CSO discharges on the river are relatively common under wet weather conditions. This causes oxygen depletion events, which deteriorate the water quality status of the receiving water body.
The Dommel water management authorities operate an integrated catchment model aiming to better understand local pollution dynamics and to test alternatives for improving the ecological status. This integrated model consists of: a) A set of lumped urban drainage system models (conceptualised as in Solvi (2006)) characterizing the 29 municipalities sewer gravity and pressurised transport network. CSO structures were represented by 30 clusters (spatially lumping structures which shared the same sewer system section). Conceptual sewer models were derived using pre-existing detailed hydrodynamic urban drainage models, which have been checked on systematic errors during a prior model calibration, using hydraulic monitoring data of the sewers. Such lumped structures highly simplify the underlying dynamics and their use should be justified for each application. The system under consideration is nevertheless highly controlled and characterised by high in-sewer static volumes, also the simulation of dissolved oxygen processes at the river scale results in smoothed dynamics and are more sensitive to the overall discharged volume than to small errors in the timing and shape of the discharged hydrograph. Similar lumped approaches have been presented which correct for different dynamic phenomena (e.g. back-water effects or dynamic storage, see Wolfs et al. (2013) or van Daal-Rombouts et al. (2016)). b) A fully detailed WWTP model, including primary, secondary and tertiary treatment with capacity for 26,000 m 3 /h and a controlled storm settling tank with a treatment capacity of 9,000 m 3 /h. c) A simplified river model consisting of a lumped conceptualisation of well-stirred tank volumes as a 65 tank-in-series scheme which simulates water flow, pollutant mass fluxes and conversion rates. The physical and biochemical processes modelled at the river scale are presented in Table B1 (Annex B), which were adapted from the water quality module of DUFLOW. The full-integrated model was build using WEST (DHI) simulation platform for ICMs. A detailed description of the integrated catchment model structure and its development and validation process can be found at Langeveld et al. (2013b). A description of the main spatial and temporal characteristic scales in the Dommel system was reported in Moreno-Rodenas et al. (2017a).
The model structure was revisited sequentially, identifying and analysing the major uncertainty sources for each submodel independently. This is reported in Tscheikner-Gratl et al. (2017), where the application of the Quantifying Uncertainty in Integrated Catchment Studies (QUICS) uncertainty framework was described for the Dommel case-study.

Experimental data
Integrated catchment studies require extensive observational datasets for their development . Table 1 shows the main characteristics of the measurement series made available by the Dommel water management authority. The WWTP effluent represents roughly 50% of the total river flow during dryweather flow in the summer period . WWTP and CSOs discharges have the potential to locally saturate the river volume (up to 90% of flow contribution at different river sections, since urban areas present a faster runoff response than rural areas), leading to large pollutant concentration peaks under heavy rainfall conditions (up to 10e80 gBOD/m 3 peaks at several sections of the river). These heavy urban loads result in low dissolved oxygen concentrations at the receiving water body (5 gO 2 / m 3 average during summer) and strong dissolved oxygen depletion processes leading to temporary anoxic conditions with recovery time ranging from hours to days. CSO flow data were estimated from depth measurements at the weir structures and approximated discharge relationships (although such estimation may contain significant errors, see Van Daal-Rombouts et al. (2017)). Additional details on characteristic spatial and temporal scales observed in CSOs, WWTP and the Dommel river can be seen in Moreno-Rodenas et al. (2017a).
Rainfall data was derived from two main sources; a) a network of 13 rain gauges, mixing tipping buckets and weighting gauges from the Dutch national meteorological institute (KNMI), the municipality of Eindhoven and the Waterboard of the Dommel. b) KNMI distributed rainfall estimations from bias-corrected singlepolarisation C-Band Radar (Overeem et al., 2009). Table 2 provides the characterised parameter probability distribution for the urban drainage water flow submodel. The most influential catchments were selected based on connected area and discharged volume contribution (Moreno-Rodenas et al., 2017a). Uncertainties in wetting losses (volume and availability), total connected area, in-sewer maximum storage, wastewater generation per inhabitant and reservoir constants (of the lumped sewer system conceptualisation) were considered. Those uncertainties were derived from expert knowledge in physical plausible constrained ranges. Table 3 presents the parameter distributions considered at the  CSO water quality generator. This contains four fractionation parameters, which were identified as a truncated Gaussian distribution (range [0, 1]) with a mean value provided by a non-formal expert elicitation and a standard deviation of 10% the mean value. Modelling in-sewer water quality dynamics is still a challenge (Willems, 2006) and sufficient data was not available for a reliable submodel calibration. The sewer transport system is characterised by long conduits (up to 20 km mixing gravity and pressurised sections) and it is heavily controlled, thus the measured water quality at the WWTP influent is expected to render a low representativity of the conditions at the CSOs, thus limiting the use of data-driven generators (Keupers and Willems, 2015).

Parametric uncertainty
In order to produce a robust estimation, a mean pollutant vector multiplier (EMC) was used to approximate CSO loads from modelled flow dynamics. A monitoring campaign in the Dommel system reported measured pollutant series for various water quality variables relevant for the integrated model (BOD, COD and NH4) at several CSO events (Moens et al., 2009). This allowed estimating pollutant probability distributions and correlation structures (Fig. 2). A Gaussian copula stochastic model was proposed to generate random pollutant event concentrations (which respect the non-Gaussian marginal distributions and its correlation structure). The stochastic model was based on the following hypotheses; a) Pollutant mean concentration remains fairly constant over the CSO event duration in the system of the Dommel (reported by van Daal-Rombouts (2017)), b) The studied CSO locations have   comparable pollution dynamics over time (van Daal-Rombouts, 2017) and c) the instantaneous pollution concentration was assumed to be spatially independent since we could not establish a clear pattern from the available data. These assumptions are case study dependent. Also, the inclusion of a by-location correlation could be readily accounted for by the copula joint probability distribution, provided additional system measurements would prove its existence. A comparison of the measured and copula sampled pollutant estimation can be found in Moreno-Rodenas et al. (2017b). Since no onsite measurements for CSO dissolved oxygen concentrations were available, literature values were applied (Boomen and Icke, 2004;Diaz-Fierros et al., 2002). Thus CSO dissolved oxygen concentrations were characterised by a uniform distribu-tion~U(3,6) gO 2 /m 3 . Table A1 and Table A2 (Annex A) show the uncertainty distributions associated with the WWTP submodel. Parameter distributions for the WWTP influent model (Table A1) were taken from Langeveld et al. (2017) which used several observed influent timeseries to estimate the model parameters in a Bayesian inference scheme. These parameters were described using a normally distributed density function (truncated between 0 and 1) with mean, the average of each posterior sample chain of the MCMC and standard deviation of 5% from the mean value. The full posterior chains were not available so an estimated normal distribution was used to represent the influent parameters around the inferred mean values estimated from Langeveld et al. (2017). The effect of this approximation in the results of this work was nevertheless considered negligible. Table 4 shows the prior parameter probability distributions assigned to the hydrologic flow and biochemical process models of the river. Prior distributions were defined from expert knowledge and literature. Sediment oxygen demand (SOD) was measured in the system by the Waterboard De Dommel. The river model parameters were later inferred using a Bayesian inference scheme from flow and DO measurements at the closing section of the catchment (between 15-Jan-2012 until 04-Aug-2012) and validated with additional observations (05-Jan-2012 until 31-Dec-2012). A polynomial chaos expansion emulator was used to accelerate the sampling of the computationally expensive model during the inference process (Moreno-Rodenas et al., 2018b). This rendered an updated joint-parameter distribution set, which was later propagated through the full ICM by drawing correlated samples from the posterior parameter chains. Table 5 presents the parameter distributions used for the ruralto-river baseflow inflow pollutant loads. Values were estimated by expert elicitation since no measurements were available.

Dynamic input uncertainty
Errors in measured or estimated time-dependent inputs were represented as stochastic processes. Random sampling was applied to generate input realisation ensembles. Table A3 presents the selected most relevant input processes in the system (Tscheikner-Gratl et al., 2017).
Rainfall uncertain realisations were sampled at the 9 largest catchment areas (which covered 3,930 ha of a total connected area of 4,400 ha, approx. 90%). Rainfall intensity at the spatial blocksupport of each individual catchment was extracted from the KNMI corrected radar measurements (1 km 2 , 5 min) using hourly accumulation (Moreno-Rodenas et al., 2017a). An additive error model (dependent on rainfall intensity) was applied as proposed by Freni and Mannina (2010): in which the instantaneous estimated rainfall intensity, R Rad; i ðtÞ at each urban drainage location (i) was corrected by a random normally-distributed error ε $ Nðm ¼ 0; s ¼ 0:12Þ, for a ¼ 0:2323.
This error model approximates a normal dispersion around the measured value, which parameters were fitted by comparing radarrain gauge series at the same location. The comparison between estimated rainfall from rain gauges and Radar sources showed large differences (especially during heavy convective storm processes, see bottom-left event at Fig. A1, Annex A). This measured difference  (1) could not be represented by the additive error structure alone. Thus a miss-detection error structure was also applied. The system characteristics cause CSOs in the area to be usually activated after rainfall volumes larger than 8e10 mm, and pumping capacities are designed to empty in-sewer storage in 10e12 h. Therefore rainfall estimation differences were considered large when data sources (Radar and rain gauge inverse distance weighted interpolated values) had a cumulative difference larger than 10 mm within 12 h. In such cases, the rainfall input ensemble is updated by sampling from a uniform distribution covering both estimations (interpolated rain gauge network e Radar, Equation (1)). Figs. A1 and A2 (Annex A) show the rainfall input ensemble generated at four characteristic rainfall periods for the municipalities of Eindhoven (2000 ha) and Bergeijk (110 ha). Biochemical processes at the river stretch are highly influenced by water temperature, which results in daily and seasonal variation. There were five temperature stations along the river section of interest with hourly measurements. A spatially homogeneous Gaussian Process (GP) was used to characterize the water temperature input uncertainty: with the average temperature between the five sensors TðtÞ as the mean of the process and covariance matrix described by a squared exponential structure: where s ti is the measured by-location temperature standard deviation at time i and j and t d the time-lag. The river solar radiation input (IO measured ) was measured at the KNMI meteorological station at the city of Eindhoven. This measured data were used as a spatially homogeneous and timedynamic input for the entire river domain. An error model consisting in a normally distributed multiplier was used: Water temperature at the wastewater treatment works was measured and used as input in the treatment process model. A multiplicative independent Gaussian error was implemented as: where 12ℝ d is a vector of ones and I2ℝ dxd the identity matrix, being d the number of time-steps in the modelled series. The parameters s Io ¼ 0:15 and s TempWWTP ¼ 0:05 were assigned based on expected sensor representativity.

Forward uncertainty propagation
A Monte-Carlo (MC) based forward uncertainty propagation was performed using measured and expert-elicited parametric-input uncertainty probability distributions for the full year of 2012. Parametric samples were generated from a Latin hypercube sampler (LHS) to generate a low-discrepancy set. Dynamic inputs were sampled independently from the proposed stochastic processes. A first set of 600 samples was drawn from the fullintegrated model (in parallel model instances) to characterize modelled river flow and dissolved oxygen concentration uncertainties at the closing section of the catchment (M_0121, Fig. 1) using the best-available prior knowledge on the system. A Bayesian inference scheme was used to update the parametric distribution of several hydraulic and biochemical river model parameters (Table 4) given local observations in the system (Moreno-Rodenas et al., 2018b). First, four river flow parameters were inferred using a heteroscedastic, independent Gaussian log-likelihood distribution and drawing 50,000 posterior samples. Secondly, setting the inferred flow parameters, an independently, identically distributed Gaussian log-likelihood was used to sample from the posterior distribution of 8 water quality parameters (50,000 samples). The posterior residual structure respected the assumptions of the assumed error generating process, although a certain residual correlation was still present.
The updated river submodel parameters was used to generate 600 additional samples from the model by drawing correlated samples from the river submodel inferred parameter MCMC chains, from the copula distribution at the CSO pollutant stochastic model, ensembles from the time-dependent inputs and from a LHS for the rest of the parameters. Both prior and posterior parameter distributions were compared with monitoring data available in the system.

Uncertainty analysis by variance decomposition
A variance decomposition scheme was proposed following Willems (2012). This is based on defining independent groups of submodels or parameter-input clusters and analysing its contribution to the total residual error variance. Seven contributing groups were defined by selecting each submodel most relevant parameters-input sources (Table A4, Annex A). A period of~2 months (05-Aug-2012 e 07-Oct-2012) was simulated which captured several summer oxygen depletion processes representative of the dynamics of interest. 15 time-points, which were temporally independent, and represented relevant dynamics, were selected to perform the decomposition of variance as: for which s 2 EY ÀEY 0 represent the variance of the model-observation residuals (which were computed during the forward propagation of all considered stochastic input and parametric uncertainty sources). The total residual variance is assumed to be composed of s 2 EY 0 (variance of the measurement error), s 2 EY jclusteri the partial variance of each selected parameter-input group (propagating the effect of the group meanwhile fixing the rest to their average value) and a term s 2 EY jrest var which represents the rest variance not explained by measurement or input-parameters. s 2 EY jrest var was computed by estimating all other terms.
All variances should be homoscedastic (a Box-Cox transformation is often applied when data shows variance dependency). A very mild heteroscedasticity was found in the DO simulated O2_in inflow from rural~U (5,9) residuals, which was corrected using a l ¼ 0:9 (Box-Cox). A dedicated data quality validation was performed to the DO measurement series in this period and the quality was expected to be high. Tolerances between 3 and 10% are often accepted in DO concentration measurements. Thus, a multiplicative random Gaussian error (s DOt ¼ DO t ,0:05Þ was applied to estimate s 2 EY 0 : Samples were drawn from a LHS (250 samples) varying every parameter-input cluster to characterize the partial variance (s 2 EY jclusteri Þ. The reader is directed to Willems (2012) or Freni and Mannina (2010) for further detail in the variance decomposition approach.

Results and discussion
Uncertainty ranges for all relevant parameters and input sources were propagated through the full integrated model structure. Fig. 3 presents the observed series of flow and DO at the final section of the system (M_0121, Fig. 1) along with the mean of the simulated series and the 95% uncertainty range steming from the inputparametric variability when using all available prior knowledge in the system. The proposed model structure captures the flow dynamics in the river submodel, which are mainly driven by the baseflow inputs from the rural hydrology and the discharges from the WWTP (c.a. 40e50% of baseflow during summer) reasonably well (Nash-Sutcliffe efficiency of 0.82e0.86 at the inferred and validation series respectively). The prior forward propagation of DO dynamics resulted in a relatively large dispersion; the 95% interval distance between model samples had a yearly average range of 3.7 gO 2 /m 3 (s ¼ 0.76 gO 2 /m 3 ). The 95% range was 3 gO 2 /m 3 for DO concentration lower than 2.5 gO 2 /m 3 . Using the inferred water quality parameters, the 95% range was reduced to 1.8 gDO/m 3 (s ¼ 0.42 gDO/m 3 ). Fig. 4 provides the comparison of measured and modelled DO using the posterior forward propagation for a timewindow during the summer period (between July and October). Figure C1 and Figure C2 (Annex C) present the simulated rainfall, DO (measured-modelled) and the BOD dynamics at two sections of the river. Discharged BOD can reach high concentrations (up to 10e60 gBOD/m 3 ) at the receiving water body. Local measurements of BOD were not available at this period, but a monitoring campaign carried out during 2007e2008 (Moens et al., 2009) reported peaks of the same order of magnitude (up to 10e80 gBOD/m 3 ). Consumption of high BOD loads is the dominant process in the acute DO depletion. A fraction of the BOD load settles and degrades in the river bed, which dominates the speed of the DO recovery process (which can take between hours to days).  The depth and recovery pattern of most DO depletion processes are captured by the simulated and measured series (root-mean squared error of 1.25e1.6 gO 2 /m 3 during the inferred and validation time-periods respectively), yet a few events show insufficient depletion in the model (Fig. 4),. This can suggest that a certain process might be missing in the model structure description; i.e. insufficient WWTP loads under certain conditions or a stronger rural contribution. However, not enough system data was available to further validate those hypotheses.
For instance, during the timespan between the 19th and the 29th of November 2012 three acute oxygen depletion processes occurred which were not captured by the model response in both scenarios (Fig. 3). CSO discharges were not reported during this time. The same depletion process was captured by an upstream DO sensor located between the WWTP and the M_0121 sensor, but not by other sensors distributed throughout the system. During this period only minor inflow (c.a. 12,000 m 3 /h) was recorded at the influent of the WWTP (with 35,000 m 3 /h maximum capacity). Thus, this oxygen depletion events could only be explained either by a major disturbance in the WWTP operations (which is not supported by effluent WWTP measurements of TSS, NH 4 or COD) or a disturbance in the sediment bed of the river (i.e. dredging or mowing, which has been confirmed to have occurred by the Waterboard De Dommel). Those processes were not described in the model structure and hence could not be captured by the simulation response. This type of events were not reported in monitoring data for previous years thus were neglected in this study. Yet this can become a relevant source of structural uncertainty and should be further investigated. Fig. 5 presents the cumulative probability density function of the yearly measured and modelled DO/flow series, along with the 95% ranges. When performing the posterior propagation (after inferring the parameter distributions for the river quantity and quality submodel parameters) the uncertainty band was significantly reduced; the average 95% uncertainty range was reduced from 3.7 gO 2 /m 3 to 1.8 gO 2 /m 3 and the two sample Kolmogorov-Smirnov (KS) test rendered that for 99.5% of the time-steps the null-hypothesis (both forward propagations render the same probability distribution) is rejected with p-value < 0.001 and mean K-S value of 0.27. This can be explained by the fact that the ranges attributed to the river parameters prior knowledge where narrowed down by the information provided in river DO-flow measurements. The mean simulated series for flow in both cases represented reasonably well the occurrence probability of high and medium hydrographs. Yet there was a systematic overestimation of low flows (river flow below 3e4 m 3 /s), which is likely caused by an overestimation of the rural dry weather contribution. Yet, this is expected to have a limited influence in DO depletion dynamics which occur mainly during wet weather conditions. The model exhibited a systematic underestimation of high dissolved oxygen concentration. This is seen during winter months (Fig. 3). The seasonal DO variation in the model structure was captured by several factors; i.e. a constant sediment oxygen demand (SOD, see Table B1, process number 5) and temperature driven inhibition coefficients for oxidation rates and reaeration patterns. SOD dynamics were estimated from system observations and we tried to respect the reported values. The inferred probability distributions for the biochemical river parameters could still not match the high DO concentration (Fig. 5 b) well enough. This suggests that either the base DO inflows are underestimated in the current version of the model (e.g. too low WWTP DO effluent/Rural DO for which there were not reliable measurements) or that there is a structural process missing in the river conceptualisation (e.g. a stronger winter-summer sediment oxygen demand variability). Nonetheless, the accurate simulation of high DO concentration is of little interest for the model application. On the other hand, low DO level yearly probability was better matched as can be seen in Fig. 5.
The information from modelled-observed time-series in river water quality assessment studies is often compressed for system evaluation down to a low number of performance indicators. A common reporting method for river water quality status within the EU water framework directive compliance is the use of concentration-duration-frequency (CDF) tables (FWR, 2012). Limit levels are commonly extracted from an ecological assessment study, which defines survival conditions for critical species. Table B2 presents the environmental CDF tables for three levels of water quality compliance (Basic, Critical and Salmonid) in the river Dommel. Exceedance frequencies are computed and contrasted with the tolerated ones from which five status classes are derived: • Class 1: less than 0.5 times the tolerated frequency. • Class 2: less than 1 time • Class 3: more than 1 time • Class 4: more than 1.2 times • Class 5: more than 2 times the tolerated frequency Measured and modelled water quality status classes for three DO CDF tables are shown in Fig. 6, where the black dot refers to the estimated class in the measurement dataset and the histogram shows the modelled status occurrence probability density. Low frequency (i.e. 0.1 and 1 times/year) would require a longer time series to be estimated reliably (3e10 years). Yet the current status of the river presents a fairly low quality status for low frequency-high magnitude oxygen depletion processes, which was captured in both measured-modelled series. Low magnitude-high frequency events with short duration had a higher degree of uncertainty. This Fig. 5. Cumulative probability density of flow-DO measured (black dashed), simulated mean (black solid) and 2.5e97.5% percentiles (grey solid). a) Forward uncertainty propagation of all prior inputs-parameters. b) The resultant parametric-input uncertainty using the updated river water quality-quantity parameters.
concurs with the hypothesis that large DO events have often a lower degree of uncertainty associated, since the system is oversaturated and albeit the uncertainty sources, a strong DO depletion occurs. Yet low-medium intensity events are often more sensitive to variability. Also, the information compression in the status classes for CDF metrics creates a differential sensitivity. Class 5 captures a large range of system responses (>2 times the allowed frequency), meanwhile the occurrence of Class 1 to Class 4 has a shorter range (0e1.2 times the allowed frequency). Therefore systems with a poor environmental water quality status (Class 5) are easier to identify than those falling between good and medium status. This implies that environmental status (from CDF tables) might not be the most appropriate metric to discriminate between the effects of different correction strategies (i.e. when selecting between a new real time control strategy or infrastructure investment to reduce DO depletion events). A modeller could argue that the model performs well on the current system water quality status (poor environmental conditions, Class 5) but stochastic predictions of system improvements (Class 1e4) have the potential to result in a wider uncertainty range (probably beyond the marginal gain between alternatives).

Uncertainty source analysis by variance decomposition
A variance decomposition scheme was applied to estimate the uncertainty contribution of different sources to the DO simulated series. The variance decomposition method provides a picture of the uncertainty contribution by source. However, it has certain limitations, which should be acknowledged and carefully addressed; first, it provides a lumped variance contribution, and thus identification of contribution at characteristic dynamic points proves difficult (limiting its diagnostic power). Secondly, it relies on certain hypotheses that might not be always met, as independence of error sources or homoscedasticity, which have the potential to distort the outcome, thus checking their influence is recommended. The design of this study tried to minimize the effect of these characteristics (following the recommendations of Willems (2012)). Fig. 7 provides the relationship between the estimated variance at DO and flow series in three representative points and the number of simulation samples. This shows that the number of samples selected in the uncertainty decomposition scheme (250 per cluster) was sufficient to provide a robust variance estimation. The decomposition of the prior parameters and input uncertainty (Fig. 8) shows that the river flow and water quality parameters are the contributors of roughly 70% of the total uncertainty. Meanwhile rainfall uncertainty and CSO pollution parameters accounted for about 10% each. On the other hand, WWTP parameters and river inputs had a negligible effect on the global DO uncertainty. This shows that water quality and quantity river parameters captured most of the variability, which is caused by a high sensitivity of DO dynamics to the in-river biological processes and the relative poor knowledge on the actual parameter distributions. Therefore we proposed a parametric inference process for the river parameters, updating the prior assigned pdf's using local DO and flow measurements in the river. The inference process narrowed down the prior distribution of water quantity and quality parameters in the river, thus reducing the contribution of the parametric uncertainty in the river section to 16%. This reduction of parametric uncertainty was also transferred to the selected model-measurement error model in the likelihood formulation, which was captured in the variance decomposition scheme by the increase in the rest uncertainty term (~18%).
In the posterior variance decomposition, the contribution of rainfall uncertainty and the CSO pollution modules increased to 20e30% respectively. This shows that the DO dynamics are overall heavily influenced by rainfall driven discharges (WWTP and CSO) and especially by the water quality characteristics of the discharged volumes. Modelling CSO pollution concentrations is a challenge in urban drainage modelling and render highly uncertain results (Sandoval et al., 2018). Thus further monitoring and modelling efforts are still required in this area. Rainfall data uncertainties are shown to be relatively influential in the DO dynamics. The selected rainfall input error model (Equation (1)) was motivated by the deviations found at rainfall estimations from the radar and the interpolated rain gauge network. Nevertheless, additional efforts should be directed to improve the quality of estimated rainfall inputs in the system.
The influence of the river submodel parameters is still not Fig. 6. Water quality status assessments for the forward posterior propagation (histogram density) and measured data status (black circle). Basic, critical and salmonid tolerated dissolved oxygen depletion duration and yearly frequency (2012, excluded 19 th -29th of November). negligible (~16%), which is mostly due to the relatively uncertain river sediment oxygen demand parameter (SOD). Urban drainage parameters had a comparatively low contribution to the DO variance (~7%). This might be explained by the fact that UD parameters largely influence CSO discharge timing and shape, yet have relatively low effect on total discharged volume during large storm events. DO dynamics in the receiving water body are known to be intensively time-buffered by the nature of the process (Moreno-Rodenas et al., 2017a), thus pollutograph timing-shape errors are less dominant than errors in volume/mass estimation. The comparison between lumped and hydrodynamic sewer water quantity modelling structures was not considered in this study. The selection of model structure and associated uncertain parameters may affect these results, yet it is expected that uncertainties in the water quality routine are significantly larger than those of water quantity for urban drainage modelling . The WWTP parameter set rendered a low contribution to the total variance (~0.5%). WWTP is however a highly relevant process in the system dynamics and its proper conceptualisation and calibration is of foremost importance. Also, uncertainties associated with changes in regulations (Dominguez and Gujer, 2006) or operational changes can become dominant in certain scenarios and were here neglected. This study only focused on the effect of WWTP influent and fractionation parameters, which are reported to be some of the most relevant uncertainty sources (Belia et al., 2009) in WWTP outputs. Yet WWTP influent-fractionation parameters showed little influence in the modelling uncertainty for receiving water body DO concentration in the river Dommel. Few other studies exist that deal with uncertainty analysis of DO dynamic modelling in urbanised river catchments. For instance, Radwan et al. (2004) presented a variance decomposition scheme for the modelling of DO in a river catchment in Belgium. Only the receiving water quality was modelled and CSO and rural pollution sources were considered as model inputs. They showed that roughly 30% of the variance of the process could be attributed to the river water quality parameters; whereas the input pollution loads explained~60% of it. Also, Freni and Mannina (2010) and Freni and Mannina (2012) performed an uncertainty analysis of a small integrated system containing two urban drainage systems (~115 ha, 9,000 inhabitants), a WWTP and a river section in Sicily, Italy. The variance decomposition results showed that WWTP BOD discharge uncertainties were dominated by upstream submodels (e.g. sewer system or rainfall), whereas WWTP parameters had a lower influence. On the other hand, they reported that uncertainties in the water quality-quantity river parameters and the rest of the upstream submodels contributed 40% and 60% respectively to DO uncertainty at the receiving water body. Willems (2008) also presented a small-scale integrated catchment model (simulating urban drainage and WWTP but not receiving water) in which an uncertainty analysis scheme was proposed for several water quality variables (TSS, SS, BOD and NH 4 ) reporting that rainfall uncertainty contribution represents 10e20% of the variance in all variables simulated at the outlet of the WWTP. Although uncertainty contribution is largely a case-dependent process, our results showed a similar structure as in previously reported studies; combined sewer overflow water quality characteristics and rainfall variability are the most relevant sources of uncertainty in DO simulation for the river Dommel in the studied period, provided that a detailed study is directed to identify and calibrate the river water quality dynamic processes.
The applicability of many uncertainty analysis techniques Jakeman and Jakeman, 2017) is limited when dealing with large-scale modelling applications. This is mainly due to insufficient observational data and the computational effort required. The example provided in this work shows as forward MC and variance decomposition schemes can be readily applied even for a computationally expensive systems (i.e. using code parallelisation and efficient sampling schemes). Performing Bayesian parameter inference is however prohibitive in most real-world cases, yet if carefully selecting a reduced number of parameters, model emulation can be used to accelerate the inference sampling (see emulators for hydrology; Machac et al. (2016), Machac et al. (2018) hydrodynamic modelling;Moreno-Rodenas et al. (2018a), or for urban drainage inference; Wani et al. (2017)). In this study, the effect of updating prior knowledge in a set of highly influential parameters using the full ICM structure on the uncertainty contribution analysis was also shown.
Nevertheless, emulation strategies can only deal with lowdimensional parameter sets, thus inferring the full ICM parametric space falls beyond current data and computational capabilities for most real-world scale cases. In this line, Muschalla et al. (2009) discussed the process of abstracting ICM studies, reflecting about the need of partially calibrating (or inferring) sections of the full model based on intermediate measured state variables. Also, available datasets render many processes unidentifiable. This lack of knowledge should be accounted for during the uncertainty analysis scheme, however, process identifiability (especially insewer water quality dynamics) still represent a major obstacle in the deployment of functional ICMs for environmental decisionmaking support.

Conclusions
This study presents an uncertainty analysis scheme of a largescale integrated catchment model. The selected ICM accounts for urban drainage (4400 ha), WWTP (750,000 p.e.) and receiving water processes in an intensively urbanised catchment in the south of the Netherlands with the aim to simulate dissolved oxygen dynamics for ecological assessment.
An uncertainty decomposition scheme showed that the river water quality and quantity parameters are responsible for~70% of the uncertainty in DO river simulations when using expert elicitedliterature based river parameter ranges. A first reduction of the uncertainty was achieved through inferring the river parameter set using local measurement data (which reduced its contribution tõ 16%). CSO water quality parameters were the most relevant source of uncertainty (~30%), indicating that monitoring and modelling efforts should be directed in that direction. Rainfall uncertainty accounted for roughly 20% of the variance in DO simulations, which was resultant of the relative disagreement of local rain gauge and radar rainfall estimations for the area. Our results showed similarities with previous reported studies for DO uncertainty analysis in integrated catchment systems. However, the generalisation of these results to other cases should be performed carefully since the structure and mechanistic relationship of the system may vary significantly and this has the potential to influence the uncertainty distribution by source, thus further uncertainty analysis studies in ICMs are still necessary.
The forward uncertainty propagation showed that the model structure lacks flexibility to accommodate seasonal winter DO levels. A suggested approach is to propose a time-dependent sediment oxygen demand process, yet this requires additional system observations. Also, the study of uncertainty propagation in concentration-duration-frequency ecological status tables showed as low water quality status systems are easier to identify (lower uncertainty associated) than good-moderate status. However, good and moderate water quality status are more sensitive to low DO variability and thus the uncertainty tends to be larger, limiting the identifiability of correction effects. This should be acknowledged when reporting modelling results and should be accounted for by decision-makers when dealing with simulation-based pollution mitigation measure selection.

Author contributions
AMR carried out the experimental design the simulations and data analysis. AMR and FTG contributed to write the manuscript. The work was performed under the active supervision and discussion of JL and FC. All authors copy-edited the manuscript.

ANNEX B: ICM submodels and environmental metrics
The urban drainage submodel processes are described at in detail at Solvi (2006). Table B1 provides the process matrix for the biochemical river water quality model. The river model represents the degradation and sedimentation of four BOD fractions (fast-slow and particulate-dissolved), nitrification processes, degradation of sediment BOD, reaeration and photosynthesis of the river macrophyte. The process equation parameters relate to the river water quality parameters of Table 4.
River flow was approximated with a tank-in-series river scheme using trapezoidal sections. The outflow at each river reach was approximated by the Gauckler-Manning equation. Table 4 covers the four parameters relating river flow; river roughness (n), and three global multipliers for the estimated river bottom width (k W b), trapezoidal bank-slope (k z) and rural baseflow input (k h).
The WWTP submodel was represented by an ASM2d biokinetic model (Gernaey et al., 2004), and a urban drainage to WWTP empirical influent model , which relate to the parameter set at Table A1.  A2. Example of the rainfall error model at four different periods. Estimated rainfall input mean (1000 samples) and 95% range (blue), KNMI radar estimate (dashed black) and interpolated (inverse-weighted-distance) value from the rain gauge network (black solid) at the urban system of Bergeijk. System dynamics detail comparing rainfall variability (KNMI rain gauge 370, KNMI Radar at the same location and the estimated intensity at the city of Eindhoven, c_24), measured-modelled dissolved oxygen, BOD_T (the sum of the four fractions of BOD), BOD_sed (sediment BOD concentration) at two locations of the river M_0121, and M_0002 ( Fig. 1).

Fig. C.2.
System dynamics during 2012 comparing rainfall variability (KNMI rain gauge 370, KNMI Radar at the same location and the estimated intensity at the city of Eindhoven, c_24), measured-modelled dissolved oxygen, BOD_T (the sum of the four fractions of BOD), BOD_sed (sediment BOD concentration) at two locations of the river M_121, and M_0002 (Fig. 1).