HydroFATE (v1): A high-resolution contaminant fate model for the global river system

. Pharmaceuticals and household chemicals are neither fully consumed nor fully metabolized when routinely used by 10 humans, thereby resulting in the emission of residues down household drains and into wastewater collection systems. Since treatment systems cannot entirely remove these substances from wastewaters, the contaminants from many households connected to sewer systems are continually released into surface waters. Furthermore, diffuse contributions of wastewaters from populations that are not connected to treatment systems can directly (i.e., through surface runoff) or indirectly (i.e., through soils and groundwater) contribute to contaminant concentrations in rivers and lakes. The unplanned and unmonitored 15 release of such contaminants can pose important risks to aquatic ecosystems and ultimately human health. In this work, the contaminant fate model HydroFATE is presented which is designed to estimate the surface-water concentrations of domestically used substances for virtually any river in the world. The emission of compounds is calculated based on per capita consumption rates and population density. A global database of wastewater treatment plants is used to separate the effluent pathways from populations into treated and untreated, and to incorporate the contaminant pathways into the river network. The 20 transport in the river system is simulated while accounting for processes of environmental decay in streams and in lakes. To serve as a preliminary performance evaluation and proof of concept of the model, the antibiotic sulfamethoxazole (SMX) was chosen, due to its widespread use and the availability of input and validation data. The comparison of modelled concentrations against a compilation of reported SMX measurements in surface waters revealed reasonable results despite inherent model uncertainties. A total of 390,000 km of rivers were predicted to have SMX concentrations that exceed environmental risk

humans, thereby resulting in the emission of residues down household drains and into wastewater collection systems. Since treatment systems cannot entirely remove these substances from wastewaters, the contaminants from many households connected to sewer systems are continually released into surface waters. Furthermore, diffuse contributions of wastewaters from populations that are not connected to treatment systems can directly (i.e., through surface runoff) or indirectly (i.e., through soils and groundwater) contribute to contaminant concentrations in rivers and lakes. The unplanned and unmonitored 15 release of such contaminants can pose important risks to aquatic ecosystems and ultimately human health. In this work, the contaminant fate model HydroFATE is presented which is designed to estimate the surface-water concentrations of domestically used substances for virtually any river in the world. The emission of compounds is calculated based on per capita consumption rates and population density. A global database of wastewater treatment plants is used to separate the effluent pathways from populations into treated and untreated, and to incorporate the contaminant pathways into the river network. The 20 transport in the river system is simulated while accounting for processes of environmental decay in streams and in lakes. To serve as a preliminary performance evaluation and proof of concept of the model, the antibiotic sulfamethoxazole (SMX) was chosen, due to its widespread use and the availability of input and validation data. The comparison of modelled concentrations against a compilation of reported SMX measurements in surface waters revealed reasonable results despite inherent model uncertainties. A total of 390,000 km of rivers were predicted to have SMX concentrations that exceed environmental risk 25 thresholds. Given the high spatial resolution of predictions, HydroFATE is particularly useful as a screening tool to identify areas of potentially elevated contaminant exposure and to guide where local monitoring and mitigation strategies should be prioritized.

Introduction
Compounds of emerging concern (CECs) are deemed to be an important source of risk due to their potential adverse 30 environmental impacts in the global water system (Gavrilescu et al., 2015;Noguera-Oviedo & Aga, 2016). For instance, pharmaceutically active compounds such as analgesics, antibiotics, estrogens and antiepileptics which are in widespread use globally, are not fully metabolized by the human body; thus, after their excretion and subsequent delivery into the wastewater collection and treatment system, they may ultimately reach the aquatic environment (Aydin et al., 2019;Kümmerer, 2009;Palli et al., 2019;Patrolecco et al., 2018;Praveena et al., 2018). The ongoing release of these compounds and other household 35 chemicals through wastewater discharges often has unknown or poorly understood effects on the environment and human health. Importantly, most wastewater treatment plants (WWTPs) are not specifically designed to remove these contaminants before discharging effluents into receiving waterbodies, such as rivers, lakes, or oceans (Rizzo et al., 2019). As such, wastewaters that are collected from domestic sources and delivered via sewer systems to a WWTP may be only partially -or not at all -treated for such substances, thereby causing the WWTP to serve as a concentrated point source of contamination 40 of CECs into aquatic ecosystems (Daughton & Ternes, 1999;Petrie et al., 2015;Meyer et al., 2019). In addition to these point sources, diffuse sources of contaminants from populations who are not connected to the sewage system can add to the pollution of waterbodies (Lapworth et al., 2012). Risks associated with these contaminants are further exacerbated due to the limited monitoring of their presence in wastewaters and receiving waterbodies into which they are discharged, and incomplete assessment of their impacts downstream. In turn, this lack of information leads to poor regulatory oversight to safeguard the 45 health of aquatic ecosystems and that of populations that rely on them as a source of water (Daughton, 2014). Moreover, robust estimates of current and future changes in water quality are needed to achieve sustainable management of water resources to ensure clean and accessible water for all, as promoted by the Sustainable Development Goal (SDG) 6 Tang et al., 2019;van Vliet et al., 2019).
When measurements of waterborne contaminants are unavailable or insufficient to make informed decisions regarding water 50 pollution arising from CECs, simulation models can be used instead to represent the hydrodynamic and water quality conditions of the waterbody. Contaminant fate models (CFMs), also known as environmental exposure models or georeferenced river models, focus on instream processes such as transport and degradation after the compounds' release from point and non-point sources. CFMs are specifically designed to predict realistic distributions of contaminants in a river catchment (Aldekoa et al., 2016). Examples of models operating at regional to global scales include GREAT-ER (Aldekoa et 55 al., 2013;Feijtel et al., 1997), LF2000-WQX (Johnson et al., 2007), GIS-ROUT (Wang et al., 2000), PhATE (Anderson et al., 2004), Mike 11 (Havnø et al., 1995), WorldQual , ePiE (Oldenkamp et al., 2018), and GWAVA (Johnson et al., 2013). These models require information about the hydrological characteristics of the catchment, consumption rates of the chemical substances, and fate parameters that describe their instream decay. These requirements can limit the performance of the models in regions where this information is unreliable or scarce (Grill et al., 2016). 60 https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License.
Water pollution caused by CECs is an issue of global concern, and water quality assessments must therefore be spatially consistent and comparable across the world to be able to identify locations of high contaminant concentration and regional trends in water pollution over time at a global scale. One of the challenges for global contaminant fate modelling is the lack of spatial consistency in datasets for model inputs, especially in regions where data are insufficient to support detailed assessments. (Kroeze et al., 2016;Strokal et al., 2019;Tang et al., 2019). For this reason, only a few global-scale CFMs models 65 exist, and those that do are typically limited to certain substances and relatively coarse spatial resolutions. For example, GLOBAL-FATE (Font et al., 2019), which was created as an open-source down-the-drain model that includes lake and reservoir modules as well as wastewater input information at a global scale, operates at a 7-km spatial grid. The Global TCS model (van Wijnen et al., 2018) was created to simulate the transport of the antibacterial agent triclosan in global rivers at a 0.5-degree spatial resolution (i.e., corresponding to approximately 55-km grid cells at the equator). 70 To our knowledge, all currently existing global CFMs that require the quantification of the load of wastewater into the river system use population density and national sanitation statistics as proxies to derive the necessary input data (e.g., Beusen et al., 2015;Font et al., 2019;Hofstra et al., 2013;Mayorga et al., 2010;Strokal et al., 2019;Van Drecht et al., 2009;van Puijenbroek et al., 2019;Williams et al., 2012). More specifically, calculations are based on the fraction of the population connected to sewage systems per country. The main source of these statistics is the World Health Organization and the United 75 Nation Children's Fund (WHO/UNICEF) Joint Monitoring Program (JMP) for Water Supply, Sanitation and Hygiene (WASH), which provides regular global reports on drinking-water and sanitation coverage for tracking progress toward SDG 6 (WHO & UNICEF, 2021). This dataset allows for differentiation of wastewater treatment services between countries and over time, but it does not account for spatial variability inside national boundaries, except for an assumed correlation with population density. Herrera (2019) also points out several discrepancies between national-level data and JMP-WASH data. In 80 addition, the dataset does not contain specific locations of wastewater discharge, which can have important implications with respect to the distribution of contaminants in the river system.
Another important limitation of existing global water quality models is that they do not account for diffuse sources of pollution arising from populations who are not connected to WWTPs or for the natural attenuation of contaminants that occurs along their pathway from a source in the landscape through the soil or subsurface before reaching a waterbody. The contribution of 85 diffuse pollution can be substantial as revealed by the high aquatic concentrations of pharmaceuticals that have been measured in countries with low rates of sanitation (Hanna et al., 2020;K'Oreje et al., 2012;Khan et al., 2013). Grill et al. (2016;2018) introduced a regional CFM that estimates the emission of household contaminants and their subsequent transport in river networks at high spatial resolution (river network derived from 500-m grid cells). In this model, transport in the river system is simulated using the global river routing model HydroROUT (Lehner & Grill, 2013). It has been applied 90 and evaluated with respect to its ability to model the fate of several pharmaceuticals in the Saint Lawrence River Basin, Canada (Grill et al., 2016), the pharmaceutical diclofenac in India (Shakya, 2017), and human hormones in China (Grill et al., 2018).
These assessments included not only WWTPs as point sources but also accounted for diffuse sources of contamination from populations not served by WWTPs while accounting for natural attenuation.
In the present work, the CFM by Grill et al. (2016;2018) is fully developed to operate at a global scale in order to: (1) serve 95 as a large-scale screening tool for assessing CECs from domestic sources, especially as a precursor for potential risk assessments; (2) predict critical locations in river networks of potentially high aquatic contaminations; and (3) inform the development and implementation of guidelines, regulations and mitigation strategies that aim to limit chemical pollution and safeguard human and ecosystem health. The model enhancement and expansion are performed by integrating a global WWTP database (HydroWASTE;Ehalt Macedo et al., 2022) and by distinguishing the pathways of contaminants from their population 100 source to the river network depending on whether they are treated (i.e., either in centralized WWTPs or in decentralized facilities) or untreated (i.e., either from urban or rural diffuse sources). The capability of this global model, hereafter called HydroFATE, is then evaluated by applying it to estimate the global distribution of the antibiotic sulfamethoxazole (SMX) in the river network and by then comparing the resulting predictions of environmental concentrations to field measurements reported in the literature. SMX was selected for this proof-of-concept case study due to the abundance of SMX field 105 measurements in surface waters reported globally and the broader availability of model input parameters in the literature compared to many other CECs.
Given the broad goals, the main focus of the model development presented herein is to predict spatial variations in contaminant exposure and to achieve a level of model performance where estimates of concentrations in the river network are generally within an order-of-magnitude of reported field measurements, which is generally considered adequate for these types of 110 screening models (Johnson et al., 2008, Oldenkamp et al., 2018. HydroFATE, with its inherent global applicability due to its reliance on pre-existing data in addition to its high spatial resolution, aims to provide a tool for scientists, practitioners, and regulators to advance and focus their work, especially in regions where data are lacking.

River and lake network 115
The various raster and vector layers representing the river network and catchment boundaries in HydroFATE were obtained from the global hydrographic database HydroSHEDS (Lehner et al., 2008), which was derived from digital elevation data provided by NASA's Shuttle Radar Topography Mission (SRTM) at 90 m (3 arcsecond) resolution. For the present study, we used a derivative of this database in vector format, termed RiverATLAS (Linke et al., 2019), which was extracted at 500 m (15 arcsecond) grid cell resolution and represents all rivers and streams where the average discharge exceeds 100 L s −1 or the 120 upstream catchment area exceeds 10 km 2 , or both. The resulting global river network comprises 8,477,883 individual river reaches with an average length of 4.2 km, representing a total of 35.8 million kilometers of rivers. Each river reach has an associated contributing catchment with an average area of 15.7 km 2 .
Every river reach in RiverATLAS is provided with a series of precalculated hydro-environmental characteristics. From this database, we used the long-term (i.e., 1971 to 2000) average naturalized river discharge in our study. The discharge estimates 125 were derived from the global hydrological model WaterGAP version 2.2 (Müller Schmied et al., 2014), which were downscaled https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License. from their original resolution of 0.5° grid cells to the RiverATLAS resolution of 500 m using geostatistical techniques (Lehner & Grill, 2013). In addition to annual average discharge estimates, minimum discharges (i.e., the lowest monthly flow value within an average year) were also used for assessments under low-flow conditions.
To account for lake processes, a global database called HydroLAKES was used that provides the shoreline polygons of 1.4 130 million lakes with a surface area of at least 10 ha (Messager et al., 2016). All lakes in HydroLAKES are associated with RiverATLAS via their lake pour points.

WWTP information
HydroFATE incorporates the locations and characteristics of wastewater treatment plants (WWTPs) as provided by the HydroWASTE database (Ehalt Macedo et al., 2022). This database contains information on 58,502 WWTPs and provides 135 details for each on the actual location of the plant, the estimated outfall location, and attributes that are relevant for the purposes of this study including: population served, treated-wastewater discharge, and level of treatment; i.e., primary treatment, such as solids removal through mechanical cleaning and sedimentation; secondary treatment, which includes biological processes; and advanced (tertiary or higher) treatment through extra filtration or chemical treatment. HydroWASTE was developed by combining regional and national WWTP datasets and adding auxiliary information, including Open Street Map data, global 140 population data, and the high-resolution river network from RiverATLAS which was used to georeference WWTP outfall locations.
With respect to its implementation in HydroFATE, of the 58,502 WWTPs in the database, the following were excluded (note that some records fall into more than one category): (1) 1,682 WWTPs that were labeled as closed, non-operational, decommissioned, projected, proposed, or under construction; (2) 379 WWTPs that have their outfall location outside of any 145 catchment that is associated with the river network of RiverATLAS (e.g., small islands); (3) 199 WWTPs that serve a population of zero according to records; and (4) 10,196 WWTPs that have their outfall location within 10 km from the ocean coast. The latter category was excluded to avoid overestimation of contaminant loads in coastal rivers as, given the locational uncertainties in HydroWASTE, effluents from WWTPs with estimated outfall locations near the coast might, in reality, discharge directly into the ocean. Of the remaining 46,425 WWTPs, some share the same estimated outfall location, and thus 150 were aggregated to the final number of 45,348 point sources of wastewater discharge into the global river network.
To account for small or decentralized wastewater treatment systems (DWTS) not included in the HydroWASTE database, such as septic tanks, HydroFATE uses country-level statistics provided by the JMP-WASH program (WHO & UNICEF, 2021). For the purposes of our study, sanitation data for each country were acquired for the year 2015 and the information termed 'Proportion of population using improved sanitation facilities (wastewater treated)' was selected. 155

Population and urban area grids
Global gridded population distributions of the year 2015 were provided by the WorldPop dataset (WorldPop & CIESIN, 2018), which was produced using a combination of census, geospatial, and remotely sensed data in a spatial modelling framework https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License. (Tatem, 2017). The WorldPop data were disaggregated from their original spatial resolution of 1 km to the same resolution as the HydroFATE model (500 m) to allow for spatially consistent calculations. 160 Information on the location of global urban areas is determined in HydroFATE according to a land use classification derived from remote sensing imagery. The dataset has a 500 m spatial resolution and is based on data from 2001 to 2002 from the Moderate Resolution Imaging Spectroradiometer (MODIS) (Schneider et al., 2010). Although more recent urban area grids exist today, this version was implemented during earlier model development stages and was kept to ensure model integrity.

Methodology 165
The regional CFM previously developed by Grill et al. (2016;2018) simulates both the emission of household contaminants and their subsequent transport towards and within the river system. Building on this earlier work, we here enhance and then expand this CFM, hereafter termed HydroFATE, to the global scale. Figure 1 provides a conceptual representation of the HydroFATE model. Contaminant emissions are determined based on population distribution, per capita consumption of the modelled substance, human metabolism, and wastewater treatment removal or natural attenuation, depending on the pathway 170 from the source to the waterbody. Emissions from populations served by a WWTP or by smaller and decentralized wastewater treatment systems (DWTS) are reduced in proportion to the treatment efficiency, which is based on the level of treatment, i.e., primary, secondary, or advanced (tertiary or higher), that is provided by the WWTP. Emissions arising from populations that are not served by any type of wastewater treatment system are attenuated by a direct discharge coefficient depending on the distance from the river network and whether the emission is located in a rural or urban area (Grill et al. 2018). The combined 175 loads from all pathways of contaminants inside the catchment boundaries of an individual river reach are aggregated as the total local contaminant load of the reach. HydroFATE then employs the generic river routing model HydroROUT (Grill et al., 2014;2019;Lehner & Grill, 2013) to simulate the transport of the chemical substance in the river system, accumulating the contaminant load downstream and accounting for instream decay and removal in lakes. Finally, the Predicted Environmental Concentration (PEC) for every river reach is calculated by dividing the sum of the local total contaminant load plus the 180 incoming load from upstream reaches by the long-term river discharge of the reach.   The methodologies used to simulate the amount of contaminant emissions and the routing of contaminant loads along rivers and through lakes were previously described at the regional scale (Grill et al., 2016;2018). While these basic processes do not change when applied at the global scale, the model was expanded in the present study by incorporating novel global-scale input data. Furthermore, the model was enhanced by introducing a spatially explicit differentiation of various wastewater and 190 contaminant pathways depending on the access of global populations to wastewater treatment (see dotted rectangle in Figure   1). Within each pathway, contaminants are removed following different removal efficiencies offered by treatment facilities, or different levels of natural attenuation in the soil and subsurface.

Determination of contaminant pathways 195
HydroFATE calculates contaminant emissions using contaminant-specific information (i.e., the annual per capita consumption and the excretion fraction) and the number of people connected to the river system. This connection occurs through different pathways depending on the sanitation system at the location in question. Using the global WWTP database HydroWASTE (Ehalt Macedo et al., 2022), a population grid, an urban extent grid and additional sanitation data, six types of contaminant pathways from populations into the river network were determined and incorporated into the HydroFATE model (see Figure  200 1). These are: point sources of treated wastewater from populations connected to WWTPs that provide (1) primary level of treatment, (2) secondary level of treatment, or (3) advanced (i.e., tertiary or higher) level of treatment; (4) decentralized sources of treated wastewater from populations not connected to a WWTP but served by DWTS; (5) diffuse sources of untreated wastewater from populations in urban areas; and (6) diffuse sources in rural areas. The methods described in more detail below assign a contaminant pathway for every pixel in a global population grid. Figure   First, populations are allocated to individual WWTPs. Although HydroWASTE provides details on the number of people served by a WWTP, it does not specify the spatial distribution of the population served nor the service area associated with it; that is, it does not provide explicit information that is required to spatially allocate the populations that are served versus those not served by WWTPs (top panel of Figure 2). The service area of a WWTP depends on several local factors not easily obtainable at the global scale, such as decisions of the administrative unit responsible for the facility, and the distribution of 220 underground pipes that transfer the wastewater to the facility. Studies have presented different approaches to associate the area contributing to a WWTP. For instance, Keller et al. (2006) defined it as the nearest upstream contiguous urban area from the WWTP discharge point within 2 km, estimating the population served by the WWTP based on the number of people in this contributing area. However, the largest WWTP in their study served only 32,000 people (expressed as population equivalent), whereas HydroWASTE contains almost 5,000 WWTPs that serve more than 100,000 people. Kapo et al. (2017) and Grill et 225 al. (2018) associated the WWTP service area to an administrative unit, but these studies were developed in countries where the information on administrative units is widely available (i.e., USA and China, respectively), which is not typically the case at a global scale.
To allocate explicit spatial population distributions to individual WWTP locations in HydroFATE, we developed a method that follows the approach of Shakya (2017). This approach assumes that a WWTP can serve populations both upstream and 230 downstream as wastewater can be pumped and directed in complex underground sewage systems. It also assumes that the service area of a WWTP can exceed the nearest contiguous urban area, with larger WWTPs typically serving larger distances and populations. Shakya (2017) tested different buffer sizes (i.e., from 5 km to 30 km at 5 km increments) in India, to determine the best-fit service area for different WWTP sizes by comparing the population within the buffer to the reported number of population served. Since distribution and characteristics of WWTPs in different regions of the world can vary substantially, 235 we expanded upon this approach by using an iterative process instead of pre-defined buffer sizes.
The final WWTP allocation method assigns populations from the WorldPop population grid (WorldPop & CIESIN, 2018; see Section 2.3) to the point locations of WWTPs using a ranking system as described in detail in Section S.1 of the supplementary material. The method considers the distance of each population pixel from the WWTP, the size of the population served by the WWTP, whether a population pixel is categorized as 'urban' or not, and whether candidate pixels are clustered in 240 contiguous areas. The settings and thresholds applied in this method were initially set to those reported by Shakya (2017) and were then refined and finalized in a successive trial-and-error approach in which intermediate results were mapped, visually inspected for plausibility, and statistically tested to verify whether they led to further improvements. The final allocation procedure assigns population pixels to individual WWTPs until the reported total of served population of each WWTP was reached, or until maximum distance thresholds are exceeded. Once the allocation is completed, the contaminant pathway from 245 each allocated population pixel to the river reach is defined by the WWTP discharge location and can be separated into one of three treatment levels (primary, secondary, or tertiary/advanced) as specified in HydroWASTE (purple colors in bottom panel of Figure 2). https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License.
Besides explicit WWTP pathways, HydroFATE also accounts for sources of potential contamination from decentralized wastewater treatment systems (DWTS) that are not included in HydroWASTE, such as household septic tanks. To this end, 250 for every country, the difference was calculated between the aggregated population served by WWTPs according to HydroWASTE and according to the global database on sanitation JMP-WASH (WHO & UNICEF, 2021). If JMP-WASH reported higher numbers of population served, this difference was assigned successively to the pixels with highest population numbers within the respective country borders that have not been allocated to WWTPs. The wastewater pathway type of these pixels thus defaults to that of DWTS (orange color in bottom panel of Figure 2) and include a specific removal efficiency. In 255 the absence of explicit information, it is assumed that after DWTS treatment the effluent discharge directly enters the surface drainage system at the pixel's location within a catchment and then flows to the catchment's associated river reach.
Finally, all remaining population pixels that were not assigned in any of the previous steps were considered to be diffuse wastewater sources and were classified as 'untreated' (green colors in bottom panel of Figure 2). They were separated between rural and urban using an urban area grid (see Section 2.3). All population pixels classified as diffuse sources thus have a defined 260 contaminant pathway that goes from the pixel's location within a catchment to the catchment's associated river reach. The contaminant removal along this pathway in soils and the subsurface is determined through distinct urban vs. rural attenuation functions.

Incorporation of contaminant pathways into HydroFATE
The results of the various population allocation steps described above are used as inputs into the HydroFATE model. The total 265 input of contaminants from treated pathways into each river reach is the sum of the contributions from all WWTPs (i.e., point sources) releasing wastewater into that reach and the contribution from populations served by DWTS (i.e., decentralized sources): where , is the total load of the contaminant in river reach r originating from treated pathways in the reach catchment c 270 contributing to r (g day -1 ); , is the population (persons) served by each WWTP i connected to river reach r; , is the population (persons) served by DWTS from pixel m inside catchment c; is the per capita load (excreted) of the contaminant (g cap -1 day -1 ); and , and are the removal efficiencies (%) of WWTPs and DWTS, respectively, releasing wastewater into the reach at treatment level j (primary, secondary, or advanced).
To estimate the diffuse contributions from populations in urban and rural areas who are not served by wastewater treatment 275 systems, it is assumed that not all human releases of untreated wastewater enter directly into surface waterbodies. This is due to various processes of natural attenuation such as absorption in soils or deposition in land surface depressions (Lapworth et al., 2012). Unfortunately, the factors affecting the natural attenuation and partial release of effluents are currently not well understood. Therefore, a proxy variable termed the direct discharge coefficient is incorporated into the model to represent the https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License. fraction (dimensionless) of contaminant load from untreated pathways that reaches a waterbody after processes of natural 280 attenuation. For example, for baseline model applications (see Section 4), the direct discharge coefficient for urban populations was set to 0.8 and for rural populations to 0.5, respectively, following Grill et al. (2018). The higher coefficient value for urban areas is due to the presence of impervious surfaces leading to more direct disposal of wastewater to nearby rivers and streams.
The total input of contaminants from untreated pathways to each river reach is then calculated as: where , is the total load of the contaminant arriving at reach r from all untreated pixels m inside the reach catchment c (g day -1 ); is the per capita load (excreted) of the contaminant (g cap -1 day -1 ); , and , are the total count of population (persons) following the untreated pathway from pixel m, in urban and rural areas, respectively; and and (dimensionless) are the direct discharge coefficients representing the proportion of contaminant load from untreated pathways that are discharged into the river reach r from urban and rural areas respectively.
(dimensionless) is a factor by 290 which loads from rural populations are additionally reduced based on an inverse distance relationship that accounts for limited connectivity in areas that are further away from the river network, following the approach by Grill et al. (2018): where (dimensionless) represents the fractional distance-based contribution factor for pixel m; and , (kilometers) is the Euclidean distance between the pixel m and the river reach r. This equation delivers fractional contribution values between 0 295 and 1, with 1 for locations closest to the river, 0.5 at a distance of 1 km, and continuously decreasing values as the distance increases. In contrast to the original method used in Grill et al. (2018), we refrained from normalizing the factor (i.e., by constraining to 0 at the furthest distance in each reach catchment), considering that contaminant contributions from any distance can reach the river system. Also, we used Euclidean distances rather than distances along the surface hydrological flow path (as proposed by Grill et al. 2018) assuming that contaminants can also travel through soils and groundwater. We 300 tested the sensitivity of the parameter settings by doubling and halving both the distance value and the exponent in Equation 3, finding that the resulting uncertainty ranges were below those of other model parameter settings.

River and lake routing
The mass transport in HydroFATE follows a 'plug-flow' approach (Pistocchi et al., 2010). That is, a 'plug' of substance mass is accumulated downstream as the sum of the input from the current and all upstream reaches flowing into the current reach 305 (Grill et al., 2018): where , represents the total load of the contaminant accumulated at the end of river reach r (g day -1 ), calculated as the mass influx from treated pathways ( , ) plus the mass influx from untreated pathways ( , ) plus the total load (after decay) from those upstream reaches n ( ∑ ) that directly discharge into reach r, multiplied by the instream decay 310 factor , (dimensionless) and the lake decay factor , (dimensionless) that apply at reach r. The instream degradation of a chemical substance in the river body, if applicable, is expected to decrease at a rate proportional to its mass, and is calculated assuming first-order decay: where , (dimensionless) is the instream decay factor for reach r, tr is the time a plug of water needs to travel through the 315 river reach r (days), and k is a first-order decay constant specific to the contaminant (day -1 ), which determines the rate of environmental decay in the river (Grill et al., 2018). Note that the inverse of k represents the half-life of the chemical in the environment.
As an important partial contaminant sink, lakes were integrated and modelled as 'completely stirred reactors' (CSTR) (Anderson et al., 2004). The degradation of a chemical substance in lakes within the river network is calculated: 320 where , (dimensionless) is the instream decay factor for reach r, is the river discharge (L day -1 ) at the river reach r; k is the first-order decay constant specific to the contaminant (day -1 ); and is the combined volume (L) of all lakes along river reach r. The locations and characteristics of lakes in HydroFATE are derived from the global lake database HydroLAKES (Messager et al., 2016). If there are no lakes in the river reach, , is equal to 1. 325 To calculate the Predicted Environmental Concentration ( ) of the contaminant at river reach r (ng L -1 ), the final contaminant load (after any accumulation or removal) at reach r ( , ; g day-1) is divided by the river discharge ( ; L day -1 ) at the same location: 4 Model application and performance evaluation: concentration of sulfamethoxazole in the global river network 330 To evaluate the global applicability and general performance of the HydroFATE model, including the new distinction into six contaminant pathways as described in Section Error! Reference source not found., the model was used in a proof-of-concept study to predict the distribution of the antibiotic sulfamethoxazole (SMX) in the global river network. SMX is considered a contaminant of emerging concern (Wilkinson et al., 2022), and selected due to a relatively high level of data availability, in surface waters were calculated based on long-term naturalized discharge as provided for all river reaches in the RiverATLAS database (Linke et al., 2019). The resulting PECs were compared to MECs to evaluate the model's predictive ability.
Furthermore, PECs were also used to assess the exposure associated with SMX based on a comparison of PECs in surface waters relative to the reported Predicted No-Effect Concentration (PNEC), which is the concentration threshold below which 340 no adverse effects of exposure are observed in laboratory-based toxicity tests (Archundia et al., 2018;Hernando et al., 2006).
Finally, to further assess the model's performance under a range of alternative conditions, HydroFATE was run for a total of four scenarios based on plausible ranges of configuration settings and parameters extracted from literature sources.

SMX properties 345
Sulfamethoxazole (SMX) is a sulfonamide antibiotic, usually sold in combination with trimethoprim. When consumed, SMX is rapidly absorbed upon oral administration, with metabolism mainly hepatic (Rudy & Senkowski, 1973). Residues are mostly excreted in urine, and the proportion of unchanged substance can be between 10 and 30%, depending on urine pH (Straub, 2016). Based on a comprehensive literature search, Straub (2016) found 36 publications that reported 190 removal efficiencies of SMX in WWTPs, with an average of 21% removal, a median of 49%, and an interquartile range from 2% to 73%. Archundia 350 et al. (2018) compiled 6 different studies that measured environmental decay in rivers, finding an average first-order decay constant of 0.73 day -1 , a median of 0.13 day -1 , a minimum of 0.034 day -1 , and a maximum of 2.88 day -1 .
For the main mode run in this study, the average excretion fraction and the median values of wastewater removal efficiency and instream decay constant were used (see Table 1, baseline scenario). But we also explored the ranges of possible values and how they affect the model outputs (see Table 1, alternative scenarios; for more details see Section 4.2.1. below). Since 355 Straub (2016) does not provide specific removal efficiencies for SMX for different treatment levels of WWTPs, the same removal efficiency was assumed for primary, secondary, and advanced treatment levels.

SMX global consumption
Country-level averages of annual consumption per capita of SMX were necessary to estimate emissions to the river network. Klein et al. (2018) analyzed and estimated the consumption of various antibiotics in the world based on the IQVIA database, 360 which reports annual sales for the period of 2012 to 2015 for 91 countries. For our study, the annual consumption of SMX for each country was assumed to be the average between the four available years as provided by Klein et al. (2018). For countries not included in the IQVIA database, the consumption rate was extrapolated based on the average per capita SMX use from the same income group (World Bank, 2019) following the methodology described by Klein et al. (2018). https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License.

Measured Environmental Concentrations (MECs) 365
To evaluate the predictive ability of HydroFATE, 227 data points of MECs were compiled from literature sources (see Figure   3 for their location and see Section S.2 in the supplementary material for literature sources). MECs are reported from every continent except Oceania. The average of reported values is 390 ng L -1 , the median is 28 ng L -1 , the minimum is 0.23 ng L -1 , and the maximum is 21,000 ng L -1 . In addition, 134 non-detects (i.e., concentrations reported were lower than could be detected based on the analytical method used to measure SMX in water samples) were compiled. In order to be selected for inclusion 370 as a MEC in our model evaluation, the literature source had to: (1) report the specific location (i.e., in the form of coordinates, river names, or river intersections) of the measurements; and (2) confirm that in the catchment feeding the river the dominant use of antibiotics is not through veterinary or industrial activities since the current version of HydroFATE is not adapted to account for these sources. Most MECs reported do not include corresponding comprehensive information on the river characteristics, such as width or average discharge which would have helped to identify the precise location in the river 375 network, or on the conditions under which the measurement was taken, such as actual discharge amount or flow season (i.e., low flow, average, or high flow). Therefore, the actual location and amount of the reported MEC may not always accurately correspond to the referenced river network location and/or the assumed flow conditions in HydroFATE.

Predicted No-Effect Concentration (PNEC)
Since reports of SMX detection in rivers and streams first began to appear, its potential impact on the environment and human 380 health has been assessed in several ways. That is, for exposure assessments, PNEC values often serve as thresholds to evaluate the level of environmental exposure to contaminants (Hernando et al., 2006). There are two values of PNEC for SMX published to-date in literature. First, the PNEC-Minimum Inhibitory Concentration (PNEC-MIC), for which a value of 16,000 ng L -1 was estimated by Bengtsson-Palme and Larsson (2016), is intended to be protective of antibiotic resistance. Second, the PNEC-Environment (PNEC-ENV), for which a value of 600 ng L -1 was estimated by Ferrari et al. (2004), is based on eco-toxicology 385 data and is intended to protect aquatic function. For the purposes of the present case study, the lower value of PNEC (600 ng L -1 ) was selected to protect against any possible impact in the exposure analysis.

Scenarios
A total of four main scenarios were created to portray plausible settings of parameters and model configuration, as outlined in 390 Table 1. Scenario 1 represents baseline conditions using input parameters and model configurations that are expected to yield the most plausible predictions for average-flow conditions based on reported values in literature as described in Section 4.1.1. Scenario 2 represents low-flow conditions, but otherwise maintains the same parameters and configuration settings as Scenario 1. Scenarios 3 and 4 represent low-end and high-end settings, yet still within plausible ranges. That is, Scenario 3 (conservative case) uses parameter and configuration settings that represent minimum load emissions and maximum removal efficiencies 395 https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License.
(including full substance removal in lakes) to represent low-end contaminant concentrations in the river network, and vice versa for Scenario 4. Additional scenarios (see Table S-3 in the supplementary material) were designed for the model performance evaluation to analyze the individual contributions of selected parameters and model configurations on the output, including a worst-case scenario assuming that no removal processes affect the contaminant load.

Exposure assessment
The ratio of PEC to PNEC was used as an indicator to designate levels of SMX in the global river network that can lead to 405 environmental health concerns. For that purpose, risk quotients, , (dimensionless), were calculated for every river reach r and every scenario configuration S using the value of PNEC for SMX of 600 ng L -1 and the calculated , (ng L -1 ) at every reach r for the respective scenario S: In instances where the risk quotient is greater than or equal to 1 ( , ≥ 1), it is assumed that this exposure level can cause 410 negative environmental impacts (Archundia et al., 2018;Hernando et al., 2006).

Performance evaluation
The performance of the model was evaluated by comparing reported MECs of SMX (see Section 4.1.3) and PECs calculated using HydroFATE at the coinciding river reaches. The goodness-of-fit indicators used to quantify model performance included the normalized root mean square error (NRMSE), the percentage of bias (PBIAS), the Nash-Sutcliff efficiency coefficient 415 (NSE; Nash & Sutcliffe, 1970), and the Kling-Gupta efficiency coefficient (KGE; Gupta et al., 2009)

Global emission of SMX to rivers
The global consumption of SMX from the world's population is estimated at 2.6 million kg y -1 . From these, 2.4 million kg y -1 are consumed by populations with emission pathways that can potentially reach the river and lake system (including processes of metabolism, excretion, treatment, and/or natural attenuation), while the remaining 0.2 million kg y -1 are consumed by 425 populations with direct emission pathways to the ocean. For the baseline scenario (Table 1), a total of 214,000 kg y -1 of SMX (9% of global consumption) are estimated to reach rivers and lakes (Table 2). From this, 29% are from pathways with some form of wastewater treatment (WWTP or DWTS) versus 71% from untreated pathways. The results show that although most of the consumption occurs among rural populations without access to treatment (i.e., 52% of total consumption), natural attenuation, as simulated in this study, has a high potential to remove the substance in rural areas before it reaches the rivers 430 (i.e., resulting in only 30% of total emission to rivers and lakes). In contrast, populations in urban areas without access to wastewater facilities were modelled to have the highest emissions to the river and lake network (i.e., 42% of total emissions).
The processes simulated in this study that are responsible for removing portions of the substance along its way from the consumer to the final destination at the ocean or an inland sink are, in order of quantity removed: metabolism (80% of total consumption is removed), natural attenuation in rural areas (7.5%), instream decay (3.5%), wastewater treatment in WWTPs 435 and DWTS (2.4%), lake removal (1.7%), and natural attenuation in urban areas (1.0%). The total SMX reaching the ocean or an inland sink through rivers amounts to 92,200 kg y -1 (3.9% of global consumption). Table 2 shows the 20 countries that are estimated to have the largest emissions of SMX. India accounts for the highest national emission to rivers and lakes (30,500 kg y -1 ), despite its lower-than-average per capita consumption (801 µg day -1 ). This is due to a combination of large populations and a lack of access to wastewater treatment in urban areas. South Africa shows the 440 highest per capita consumption (6,220 µg day -1 ), while China shows one of the lowest (67 µg day -1 ), but it is still among the top 20 emitters.  Overall, the spatial patterns of contaminant emissions to rivers and lakes are very similar to the global patterns of consumption, with an average emission to consumption ratio of 9.0% (Table 2). In Ethiopia, the ratio of emission to consumption was predicted to be on the low end (6.1%), which is mostly due to the population being predominantly rural without access to 460 treatment (89%) contributing to 72% of the total emission of SMX. In contrast, other countries such as Mexico, Brazil, and South Africa have a ratio above the global average (12%, 11%, and 11%, respectively) due to the main source of SMX being untreated urban pathways (i.e., impervious surfaces with less attenuation) or treated pathways (i.e., wastewater treatment removal efficiency for SMX is lower than the assumed proportion of SMX removed by processes of natural attenuation in rural areas). 465 Figure 3 illustrates the resulting spatial distribution of SMX concentrations in the global river network for the two baseline scenarios: Scenario 1 corresponding to average-flow conditions and Scenario 2 corresponding to low-flow conditions. Generally, higher concentrations are predicted for rivers in countries with high emissions, such as India, United States, Pakistan, and South Africa. Nonetheless, even in countries that are not among the highest emitters, such as many African countries, low river discharges can cause high concentrations of contaminants in the rivers, especially during low-flow 470 conditions. https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License.

Exposure assessment
The results of baseline Scenario 2 (low-flow conditions) predict aquatic exposure to SMX concentrations above the PNEC 480 (i.e., risk quotient ≥ 1) for 1.6% (i.e., 390,000 km) of all rivers in the world with long-term annual average discharge above 0.1 m 3 s -1 (Table 3). India, Pakistan, and Sudan show the largest extent of rivers in this category which indicates a potential risk for environmental health. This percentage decreases to 0.1% (corresponding to 30,100 km) for baseline Scenario 1, i.e., when average-flow conditions are assumed. Pakistan has a particularly high percentage of rivers in the risk category for both scenarios (i.e., 33.5% and 8.4% of all rivers for Scenarios 2 and 1, respectively). 485 Table 3. Top 20 countries by total length of rivers with a predicted risk quotient (RQ) ≥ 1 for SMX for Scenarios 2 (low-flow conditions) and 1 (average-flow conditions).See Table 1   To assess the contribution of instream decay processes (i.e., decay in rivers and removal in lakes) to the reduction in contaminant concentrations, the increase in length of rivers with SMX concentrations exceeding PNEC was calculated assuming that these processes are not taking effect (i.e., the respective first-order decay constant k is set to 0). Globally, if both 495 river and lake environmental decay processes were omitted, there would be a combined increase of 23.7% and 39.1% in the length of rivers that fall in the risk category compared to Scenarios 2 and 1, respectively (Table 3). If only lake removal processes were excluded, there would be an increase of 14.6% in Scenario 2 and 25.3% in Scenario 1. Lake removal is predicted https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License.
to be particularly important in rivers in South Africa, United States, Mexico, and China. For instance, without lake removal, there would be an increase of 51% of rivers in South Africa falling in the risk category at low-flow conditions. 500 Finally, to demonstrate the utility of a contaminant fate model operating at high spatial resolution, Figure 4 depicts the risk distribution under low-flow conditions (Scenario 2) for the region of Southeast Asia, including the two countries (India and Pakistan) with the longest total length of rivers in which SMX concentrations exceed PNEC. The high spatial resolution permits the detection of local increases in risk immediately downstream of individual WWTPs, which then can diminish along the flow paths once inflowing tributaries cause dilution effects. Model results also reveal the exposure of individual rivers 505 receiving contaminant discharge without any treatment (i.e., areas without any black dots but presenting a high density of rivers at risk).  Table 1

Performance evaluation
Modelling results were evaluated by comparing predicted SMX concentrations with available measurements in river reaches across the world using 227 MECs with values above the detection threshold. Figures 5(a) to (d) show an analysis of results for 515 baseline Scenario 1 (average-flow conditions; see Table 1 for scenario settings), stratified by certain characteristics of the measurements. The different colors of points in the overall scatter plot shown in Figure 5(a) illustrate the global distribution of measurements. The African continent presents the highest SMX concentrations (both measured and predicted) and the predicted concentrations in Asia, Europe and Central America are in their majority below reported measured concentrations.
These results confirm that model predictions for Scenario 1 are generally reasonable, with 79% of the predicted values being 520 within one order of magnitude of the measured concentrations reported in literature (Johnson et al., 2008, Oldenkamp et al., 2018.   concentrations were too high are located downstream of urban populations on rivers with low discharge, which is a challenging combination to model; that is, urban streams can be heavily modified by anthropogenic activities that influence their flow quantities and water quality, such as channelization, dams, and sewers. Besides, if the urban population is not served by WWTPs, the predictions were based on the assumption of a constant direct discharge coefficient, which in all probability is variable in reality. Despite these uncertainties, only two measurements were predicted erroneously above the PNEC threshold, 550 resulting in a risk quotient that is falsely predicted to be above 1. On the other hand, 17 measurements were erroneously predicted to be below the PNEC threshold, which is in accordance with the overall conservative approach and scenario configuration used in this case study. Finally, Figure 5(e) shows substantial uncertainties in PEC calculations (i.e., extent of error bars) when using the parameter and configuration settings of Scenarios 3 and 4, representing low-end and high-end simulations that were within plausible 555 ranges. Taking these uncertainties into account, 78% of all MECs fell within the range of error bars of HydroFATE; i.e., they were reproducible by the model within at least one of the chosen parameter and configuration setting.
In addition to the 227 MECs, 134 measurements were classified as 'not detected' or 'not quantified.' To evaluate these cases, PECs at the same locations were verified to determine if they were correctly predicted to be below the detection or quantification limit (LOD or LOQ, respectively), depending on the limit reported by the study. The rate of success was 60%, 560 and if allowing an error of one order of magnitude, the success rate increased to 93%.
In addition to the performance evaluation presented in Figure 5, three additional scenarios (including 14 sub-scenarios) were analyzed to specifically test the uncertainty ranges introduced by individual parameter and configuration settings (see Section S.3 in supplementary information for details). The results indicate that the model reacts sensitive yet within reasonable boundaries to the permutations of individual parameter settings. Of particular importance are the settings related to substance 565 removal simulations in the model which depend on local characteristics and on the dominating pathway of the contaminant.

HydroFATE: strengths and limitations
In this study, the first global application of the contaminant fate model (CFM) HydroFATE was presented, building upon previous stages of the model that were used to assess the distribution of pharmaceuticals at the regional scale in Canada, China 570 and India. One of the main characteristics that distinguishes HydroFATE from other global CFMs, besides its high spatial resolution, is that contaminant pathways are differentiated based on whether the wastewater undergoes treatment or not and, if so, at what treatment level. Contaminants generated by populations connected to wastewater facilities are partially removed by treatment processes, whereas contaminants generated by populations not connected to any wastewater system are assumed to undergo natural attenuation processes, which in the case of rural populations also depend on how distant they are from any 575 https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License. waterbody. The model application showed that different regions indeed respond differently to pharmaceutical drug consumption depending on the main pathway of the contaminant before reaching the river system. Despite its high spatial resolution, HydroFATE has primarily been designed as a CFM that operates at large scales, including at the global scale, and it can be readily applied with existing input data. Due to the necessary model simplifications to enable such an approach, it is recognized that, even with anticipated future refinements, substantial uncertainties will remain with 580 respect to the model's predictive capability. As such, HydroFATE is intended to serve as a screening model whose primary purpose is to identify critical areas where detailed field studies should be performed.
To apply any model appropriately and interpret its output, it is essential to understand its limitations. The main limitations of HydroFATE stem from its steady-state approach, the difficulty of capturing some underlying processes at the global scale, the lack of information on the behavior, use and disposal of most CECs, and unaccounted variability regarding most input 585 parameters of the model. HydroFATE transport processes are based on long-term average discharge or long-term monthly minimum discharge, which does not account for the seasonality of river flows or any shorter-term fluctuations that affect the dilution capabilities (or lack thereof) of contaminant concentrations. The decay of contaminants along rivers and in lakes is assumed to follow a first-order process, lumping and simplifying complex processes such as deposition, adsorption, photodegradation, and bioaccumulation that occur over time. These processes also depend on local environmental and 590 biological characteristics that are currently very difficult to capture at a global scale. Therefore, more experiments and measurements are needed to reduce the uncertainties inherent in quantifying the decay constants for different substances and under different conditions, especially contaminants of emerging concern. In the presented case study application, an average decay constant arising from only a few reports for sulfamethoxazole was used, which likely represents an overly narrow range that does not adequately capture what happens under different conditions in rivers worldwide. 595 Wastewaters from almost half of the world's population are untreated. The uncertainties in HydroFATE related to contaminant simulations from untreated sources have two main sources: (1) the difficulty to spatially distinguish wastewater contributions from populations as treated or untreated in the first place, based only on WWTP characteristics, country-level statistics, and a global population grid; and (2) the generalization of the pathways of untreated contaminants into only two types (i.e., distinguishing only rural versus urban conditions) based on simplified assumptions and very little evidence from field 600 experiments (Grill et al., 2018). In the presented case study, the untreated pathways contributed an estimated 72% of the global emission of sulfamethoxazole. Nonetheless, despite the large uncertainties, the baseline scenario was able to reproduce field measurements reasonably well, especially considering the large range of possible values for the critically important direct discharge coefficients (see Table 1). Panel (d) of Figure 5 suggests a general overestimation of contaminant concentrations in regions with substantial urban extents upstream (larger bubbles) and a general underestimation for rural areas (i.e., smaller 605 bubbles representing areas with smaller urban extents), an observation which could be used to revise the direct discharge coefficients in future model runs. However, to ensure that HydroFATE is generally applicable to a range of substances, it is recommended that the model be first tested when applied to other substances before a potential calibration of different direct discharge coefficients is carried out to improve model performance. In addition, as further discussed below, the current model https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License. version only accounts for one type of source (domestic), which excludes veterinary and industrial contributions that can be 610 present in waters, making any uncontrolled measurements inadequate for calibration.
Besides the river network, the pathways of the contaminants determine most of the spatial contaminant distribution of the model. The method developed in this study distinguishes the wastewater contributions from the global population as treated or untreated by relying mostly on global population and urban extent grids, and the global WWTP database HydroWASTE.
Both population and urban extent grids have their own uncertainties related to the way they have been developed, their spatial 615 resolutions, and their representative years, potentially misrepresenting actual conditions especially in sprawling cities in developing countries (Sridhar & Mavrotas, 2021). HydroWASTE is a data compilation that contains estimated characteristics instead of official records for 9% of the WWTPs, and it does not include small DWTS that are more common in rural areas.
In the absence of data, country-level statistics on sanitation were used to minimize these uncertainties regarding HydroWASTE. 620

Performance evaluation of HydroFATE using sulfamethoxazole
As a first case study application of HydroFATE, country-level consumption data of sulfamethoxazole (SMX) were used to assess its distribution in the global river and lake network. Results predicted that a total of approximately 214,000 kg of SMX are released into rivers and lakes every year from domestic sources. A cursory exposure assessment shows that this release may potentially result in a risk of environmental impact (i.e., defined as PEC ≥ PNEC) during low-flow conditions throughout 625 390,000 km of the global river network.
In terms of the input information regarding SMX, uncertainties can derive from various assumptions incorporated into the model parameters, including the country-level consumption rate, the excretion fraction (after metabolism), the wastewater treatment removal efficiency, and the instream decay constant. The contaminant emission is estimated based on country averages of consumption and population density based on the assumption that every person consumes the same amount of 630 SMX in a year across a country, which therefore does not account for regional, municipal, or personal (e.g., age-dependent) spatial variability of consumption. The excretion fraction has a relatively small range of uncertainty (Table 1) as the metabolism process of SMX inside the human body is well-known and was extensively studied by pharmaceutical companies before its release during the drug development phase (Zhang & Tang, 2018). The removal efficiency in treatment facilities, including WWTPs and DWTS, depends on the specific type of treatment being employed. The values reported in the literature vary 635 widely, possibly due to SMX being transformed to N4-acetyl-SMX and glucuronide conjugates (the most common SMX metabolites) and vice-versa during the treatment process (Straub, 2016). HydroFATE is, in principle, able to account for different levels of treatment provided by WWTPs (i.e., primary, secondary, or advanced) by using different removal efficiencies. However, due to a lack of consistent data, a choice was made in the presented case study to apply one single average value for the substance removal efficiency across all wastewater facilities, including DWTS. 640 To evaluate the general performance of HydroFATE regarding its simulation of SMX concentrations, PECs resulting from the model were compared with MECs reported in the literature. The results showed an overall reasonable predictive capability https://doi.org/10.5194/egusphere-2023-1590 Preprint. Discussion started: 25 July 2023 c Author(s) 2023. CC BY 4.0 License. with 79% of PECs being within one order of magnitude of reported MECs. This was despite the inherent uncertainties associated with assumptions made in the development of the model and those associated with estimates of the various model parameters and input datasets. Unfortunately, the lack of specificity of field measurements, for which literature sources 645 generally do not provide enough information on the precise locations of measurements nor river discharge conditions, does not allow for a conclusive evaluation of the model under different modelling scenarios.
Importantly, the PECs simulated by the present version of HydroFATE are limited in that they do not include contaminant contributions from veterinary use or pharmaceutical manufacturing operations due to a lack of available data. As it remains unclear whether MECs include SMX from only domestic or all sources, this uncertainty in both PECs and MECs could explain 650 some of the high negative bias found in the evaluation. Once data on veterinary use or manufacturing become available, they could readily be implemented to refine HydroFATE.

Conclusion
Despite its current shortcomings and inherent uncertainties, HydroFATE is the most spatially detailed global CFM currently available. It tracks multiple pathways of contaminants in the river and lake environment and has the potential to be used for 655 any CEC of domestic use. In its current version, HydroFATE is expected to be particularly useful to identify specific areas in the river network where high concentrations of contaminants may be found. As such, potential applications include the support of decision-making in order to prioritize and focus resources, regarding: (1) locations that should be the subject of detailed field measurements and local environmental impact studies; (2) the creation of scenarios for policy-making and management of water resources at regional or international scales; (3) the development of screening methods to inform new regulation or 660 guidelines for the pharmaceutical industry with respect to establishing markets for their products and performing regulatory compliance tests to safeguard ecosystems and human health; (4) the development of new or updated treatment standards for contaminants of emerging concern, including the establishment of design specifications for wastewater treatment system in specific regions; and (5) the deployment of new treatment technologies.

Code and data availability 665
Model predictions for the four main scenarios were obtained with a run time of 18 min using a Desktop PC with Intel Core i7-