Upscaling Tracer‐Aided Ecohydrological Modeling to Larger Catchments: Implications for Process Representation and Heterogeneity in Landscape Organization

Stable isotopes of water are ideal tracers to integrate into process‐based models, advancing ecohydrological understanding. Current tracer‐aided ecohydrological modeling is mostly conducted in relatively small‐scale catchments, due to limited tracer data availability and often highly damped stream isotope signals in larger catchments (>100 km2). Recent model developments have prioritized better spatial representation, offering new potential for advancing upscaling in tracer‐aided modeling. Here, we adapted the fully distributed EcH2O‐iso model to the Selke catchment (456 km2, Germany), incorporating monthly sampled isotopes from seven sites between 2012 and 2017. Parameter sensitivity analysis indicated that the information content of isotope data was generally complementary to discharge and more sensitive to runoff partitioning, soil water and energy dynamics. Multi‐criteria calibrations revealed that inclusion of isotopes could significantly improve discharge performance during validations and isotope simulations, resulting in more reasonable estimates of the seasonality of stream water ages. However, capturing isotopic signals of highly non‐linear near‐surface processes remained challenging for the upscaled model, but still allowed for plausible simulation of water ages reflecting non‐stationarity in transport and mixing. The detailed modeling also helped unravel spatio‐temporally varying patterns of water storage‐flux‐age interactions and their interplay under severe drought conditions. Embracing the upscaling challenges, this study demonstrated that even coarsely sampled isotope data can be of value in aiding ecohydrological modeling and consequent process representation in larger catchments. The derived innovative insights into ecohydrological functioning at scales commensurate with management decision making, are of particular importance for guiding science‐based measures for tackling environmental changes.

Of course, utilizing isotope tracers in modeling studies is strongly dependent on the availability of adequate monitoring data sets McDonnell & Beven, 2014). However, intensive monitoring has mostly been deployed at smaller scales, such as representative experimental plots (Beyer et al., 2020;Landgraf et al., 2022) or small headwater catchments with a drainage area <10 km 2 (Fenicia et al., 2008;Kuppel et al., 2018). At these smaller scales, isotopic signals in different compartments of the ecohydrological systems can be well captured by logistically feasible, intensive monitoring schemes. Meanwhile, small-scale catchments also allow a finer-resolution of spatio-temporal modeling scales, which are often needed to match the scale of localized hydrological processes  and related isotopic transformations through mixing and fractionation (Iorgulescu et al., 2007;Smith et al., 2021). As catchment size increases, water sources and mixing processes become more complex and heterogeneous; as a result, stream water isotopic signals often become highly damped (compared to highly variable precipitation input signals) and less directly reflecting small-scale hydrological processes (IAEA, 2012). These factors have hindered the upscaling of advanced tracer-aided modeling from small-scale, intensively monitored catchments to larger catchments (i.e., >100-1,000s km 2 ), even though current model developments (like EcH 2 O-iso) have prioritized improved spatial representations of catchment characteristics and process heterogeneity regardless of scales .
At larger scales, isotope monitoring is increasingly becoming available, although inevitably with coarser sampling resolutions (e.g., monthly grab sampled at locations with averaged drainage area >100 km 2 (Ala-aho et al., 2018;Gibson et al., 2021;IAEA, 2012;Wollschläger et al., 2016)). Such data sets have been successfully used in aiding larger scale hydrological modeling (e.g., the isoWATFLOOD modeling by Stadnyk et al. (2013)), which largely emphasizes the value of isotope information in improving model calibration/parameterization, thereby inferring improved internal processes and catchment-scale representation of storage-flux dynamics (Holmes et al., 2020;Nan et al., 2021). Given the emerging potential of upscaling of tracer-aided models like EcH 2 O-iso to mesoscale catchments (e.g., Smith et al. (2021) in a 66 km 2 , intensively monitored lowland catchment, northern Germany), it remains to be seen whether further upscaling can be supported by spatio-temporally coarser resolution isotope data in larger (>100 km 2 ), highly heterogenous catchments. Furthermore, it is unclear what additional insights into detailed ecohydrological process representation and heterogenous catchment functioning can be derived from upscaled tracer-aided modeling. Such evidence-based insights into larger-scale catchment functioning are of particular relevance to water management and decision making, especially for urgent needs of more efficient mitigation measures for water-related problems under the changing conditions (e.g., the recent European-wide droughts).
Here, we used a multi-site isotope monitoring data set from the Bode catchment (ca. 3,200 km 2 ), central Germany. The data set consists of 27 precipitation sites and 24 stream water sites, sampled on a monthly basis (Lutz et al., 2018). We adapted and upscaled the tracer-aided EcH 2 O-iso model to apply to the representative Selke sub-catchment (456 km 2 ), where stream water isotopes were sampled over 2012-2017 from seven sites encompassing the main stem and major tributaries of the Selke River. We particularly addressed the following research questions: 1. By integrating a cross-scale isotope data set with the EcH 2 O-iso model, what is the potential, and what are the key challenges, for process representation in the upscaling of advanced tracer-aided ecohydrological modeling in larger catchments? 2. Through comprehensive sensitivity analysis and multi-criteria calibration, what is the added value of using cross-scale stream isotopes in providing insight into modeled behaviors?

The Bode-Selke Catchment
The Bode catchment (ca. 3,200 km 2 ) is a part of the TERrestrial ENvironmental Observatories-TERENO network in Central Germany (Wollschläger et al., 2016) and contributes to the Modular Observation Solutions for Earth Systems-MOSES initiative, which employs specifically designed monitoring modules to investigate the interactions of short-term events and long-term trends across earth compartments with special emphasis on hydrological extremes (Weber et al., 2022). It is located in the transitional region between the Harz Mountain uplands (up to 1,142 m above sea level) and the northern lowland plain (<100 m a.s.l., Figure 1a). It exhibits high gradients of hydro-climatic and physiographic characteristics. Annual precipitation decreases Figure 1. The monitoring networks of the Bode catchment ((a), sampling sites for precipitation isotopes and the German Weather Service-DWD stations for precipitation and climate data) and the Selke sub-catchment ((b), four discharge gauging stations and seven stream water isotope sampling sites). The geographic characteristics of the Selke catchment for the EcH 2 O-iso model setup ((b)-Digital Elevation Model, (c)-land cover types, and (d)-soil types). The stream water isotope sampling covers four nested sites in the main stem of the Selke River (i.e., the discharge gauges HAUS, MEIS and SILB, and the MM site near station MEIS) and three tributaries (i.e., KB-Katzsohlbach, RB-Rodelbach, and HG-Hauptseegraben). The land cover (c) and soil type (d) abbreviations were also used for categorized model parameters.
YANG ET AL.
10.1029/2022WR033033 4 of 22 from >1,700 mm in the Harz Mountains to less than 500 mm in the lowlands, and mean annual temperature increases from 5.0°C to 9.5°C (Wollschläger et al., 2016). The upper Harz Mountain areas are dominated by spruce and beech forests, with significant agricultural areas in the southern lower mountains; while the lowland areas are characterized by intensive arable lands with major crops of winter wheat, winter rapeseed, winter barley and sugar beet (because of highly fertile chernozem soils developed on loess deposits (Altermann et al., 2005)) and urban areas. Climate data were collected from DWD (German Weather Service) climate stations within and surrounding the Bode catchment, including 51 stations for daily precipitation and 15 stations for daily air temperature (daily mean, maximum, and minimum), wind speed and humidity ( Figure 1a).
Our modeling focused on the Selke catchment ( Figure 1b) Figure 1c). The soil types (Figure 1d, adapted from BGR (2018)) change from mountainous cambisols, to mountain foot luvisols, and to loess lowland chernozems (with patches of histosols, gleysols, and highly sandy anthrosols). Here, we defined the drainage areas above the MEIS gauging station as the upland areas (184 km 2 , dominated by cambisols overlaying the shallow schist/claystone bedrock) and the rest as the lowland areas (272 km 2 , dominated by chernozems developed on the loess deposits). In line with recent European-wide changing climate, the catchment has experienced severe droughts in 2018-2019 (Yang, Rode, et al., 2022;Yang et al., 2021), exhibiting lower precipitation and higher air temperature than averages ( Figure  S1 in Supporting Information S1).

Multi-Site Isotope Sampling Data Sets
We used precipitation isotope data sampled from 27 sites throughout the Bode catchment ( Figure 1a) between 2012 and 2017. Monthly composite precipitation samples were collected from evaporation-protected precipitation collectors. Stream water samples were grab samples collected monthly from 24 sites in the Bode catchment. For details of the monitoring scheme, please refer to Lutz et al. (2018). We exclusively used four of these sites located within the Selke catchment ( Figure 1b,  Following earlier work (Yang et al., 2021), we defined four hydrological seasons as Wet (January-March), Drying (April-June), Dry (July-September), and Wetting (October-December). Precipitation isotopes showed clear seasonal patterns, that is, more depleted and more enriched in heavier isotopes during wet-drying and dry-wetting seasons, respectively ( Figure 2a); The signal of the stream water isotopes was more seasonally damped at each sampling site, and the isotopes were mostly located around the Local Meteoric Water Line ( 2 H = 7.3 × 18 O + 3.9 ), except the samples at lowland tributary site HG and the dry-wetting season samples at RB and HAUS (Figure 2b). A more detailed overview of the sampled data set is provided in Text S1 in Supporting Information S1.
YANG ET AL.

The EcH 2 O-Iso Model
The tracer-aided model, EcH 2 O-iso (Kuppel et al., 2018), is developed based on the fully distributed process-based ecohydrological model EcH 2 O (Maneta & Silverman, 2013). The model integrates modules of energy balance, water balance, vegetation dynamics, and flux tracking (evolving isotope compositions and water ages as tracers through model compartments). Here, we provide only a brief introduction of water balance and tracer related model structures (see the schematic diagram in Figure S2 in Supporting Information S1). For complete model details, please refer to Maneta and Silverman (2013) and Kuppel et al. (2018).
Driven by climatic forcing inputs (i.e., radiation, air temperature, humidity, and wind speed), the energy balance is solved at the canopy and the land surface levels, deriving latent heat fluxes that determine evapotranspiration and snowmelt. Canopy and soil evaporation are constrained by the canopy water storage and first-layer soil water storage, respectively. Transpiration is constrained by the canopy level environmental factors that are implemented via a Jarvis-type stomatal conductance model, with vegetation-dependent parameters representing stomatal properties (Table 1). The transpiration flux is further partitioned to the three soil layers based on soil water storages and root fraction distributions. Please refer to Text S2 in Supporting Information S1 for detailed descriptions of the evaporation and transpiration process conceptualization. The water balance uses linear bucket-type approaches for multiple water storages, including interception, snowpack, surface ponding, and soil water. Throughfall after exceeding canopy interception capacity and direct precipitation enter any surface snowpack or ponding storage, where snowmelt and infiltration processes are also considered. Surface infiltration into the first soil layer is calculated using the Green-Ampt equation (Mein & Larson, 1973), and further infiltrating into deeper layers once the upper layers get saturated. The vertical leakage from the third soil layer enters into the deeper groundwater storage implemented by Yang et al. (2021), which also considers hydrologically active storage and additional passive storage for solute-mixing.
Lateral exchanges (and runoff generation where grid cells link to the channel) are enabled at three levels, that is, the surface ponding storage (surface overland flow), the gravitational storage of the third soil layer (groundwater flow), and the deeper groundwater storage (deeper GW flow). The two subsurface flows are calculated using linear kinematic methods and transported to corresponding storage of the downstream grid cell, while ponded surface water is transported as the surface overland flow at the end of each time step. For grid cells where a channel is present, groundwater and deeper GW contributions to the channel are calculated using an exponential decay function of active storage with channel-seepage parameters. Similar for channel-connected cells, any surface ponding becomes surface overland flow to the channel. Stream water is routed along the river network using a kinematic-wave approach considering the channel roughness. Key model parameters are  Table 1 with initial ranges set based on reference values from EcH 2 O documentation (https://ech2o. readthedocs.io, last accessed 13 September 2022) and EcH 2 O-iso applications in a similar northern German catchment . Please also refer to the associated open data (Yang et al., 2022b) for the full parameter list.
The tracer metrics (i.e., isotopic composition and water ages) are fully integrated into the ecohydrological cycling based on the full-mixing assumption in each storage compartment (Kuppel et al., 2018). Water age is treated as a conservative tracer concentration and the age of each storage increases by one after each time-step. All tracer concentrations are updated at each time step according to inflows and previous-step storage, weighted by the corresponding water amount; the concentrations of outflows are equal to that of the storage. Throughfall water is assumed to have the same isotopic composition and age (i.e., zero) as that of incoming precipitation. Isotopic fractionation by evaporation is considered in the first soil layer using the Craig-Gordon fractionation model (Craig & Gordon, 1965) with modifications for saturation in a porous medium, detailed in Kuppel et al. (2018).
Transpiration flux is contributed from the three soil layers based on vegetation-dependent root distributions (parameter Kroot, Table 1) and availability of soil water content (Text S2 in Supporting Information S1). The transpiration water ages are calculated as the flux-weighted means over the soil layers. Here, we further evaluated the transpiration Exit Age Ratio (EAR; adapted from Soulsby et al. (2016)) calculated as the ratio of the transpiration water age and the age of soil-stored water (i.e., the volume-weighted mean age over the soil layers). The EAR (∈ [0, 1]) indicates the transpiration water age extraction characteristics, with lower values highlighting heterogeneity of source water ages (root water uptake typically from the upper, younger soil water) and higher values highlighting the preference of tapping into well-mixed, less-variable stores (Kuppel et al., 2020).

ManningRiv_S
River routing parameter related to channel roughness (Manning's n) [1,20] Note. The labels "_L" and "_S" indicate land cover type and soil type dependent parameters, respectively, and should be replaced for each type as abbreviated in Figures 1c and 1d. Note that the channel seepage and routing parameters were assigned as soil-type dependent in this study. 7 of 22

EcH 2 O-Iso Adaptations for Larger Scale Modeling
The EcH 2 O-iso ecohydrological model has mostly been applied to intensively monitored plots or smaller experimental catchments (up to 66 km 2 ). Hence, some of the technical aspects of implementation needed to be adjusted for larger-scale catchments and the coarser spatio-temporal resolution modeling at these larger scales: 1. An input map of areal proportion of each modeling cell was added, given that the boundary cells might not be fully covered by the catchment domain. 2. Topographic characteristics (i.e., slope, drainage network, and stream morphology) could not be derived directly from a resampled digital elevation model (DEM) due to the coarse resolution at the modeling grid scale (e.g., 1 km 2 ). Therefore, this study retrieved the information from Yang et al. (2018) in the modeled Selke catchment using multi-scale routing model implementations (Samaniego et al., 2010): first, the main drainage line of each modeling cell was derived according to the flow direction (D8 algorithm) and flow accumulation at the finer resolution DEM (e.g., 100 m); Second, the drainage direction at the 1 km 2 model resolution was assigned as the flow direction of the outlet of the main drainage line, and the slope was calculated as the elevation difference between the inlet and the outlet, divided by the main drainage length. 3. For a better representation of spatial variability, multiple soil types and land cover types were allowed for each modeling cell (rather than one soil or land cover type for each cell). This was supported by using input maps of type-specific areal proportions . Hydrological parameters were assigned as soiltype dependent, and the values in each grid cell were aggregated using the area-weighted geometric means if multiple soil types were present. 4. For grid cells with channels, the surface ponded water was partitioned into surface overland flow draining to the down-slope cell and surface overland flow to the channel, according to their relative lengths. Importantly, re-infiltration of any surface run-on into the first soil layer and the subsequent vertical tracer mixing (for both water isotopes and age) throughout the soil profile was activated, especially as the modeling resolution was coarse and the river network was relatively dense.

Sensitivity Analysis and Multi-Criteria Calibration
Sensitivity analysis was used for parameter ranking and diagnosing the information content of multiple data sources in revealing modeling behavior. The Morris method (Morris, 1991) was used: the elementary effect (EE-ratio between model output variation and parameter variation) of each parameter was calculated by changing one parameter at a time (i.e., one trajectory); then, multiple trajectories were repeated with randomly generated starting points, and the average of absolute EEs for each parameter was taken as the global sensitivity index. With low computation requirement, the Morris method is particularly suitable for screening and ranking purposes (Pianosi et al., 2016;Yang et al., 2019). The parameter sets were sampled using the radial-based Latin-Hypercube sampling strategy, due to its efficiency and coverage within the sampling space (Campolongo et al., 2011).
The most sensitive parameters were used in multi-criteria model calibrations using the Monte Carlo based method of Ala-aho et al. (2017) (see also flowchart in Figure S3 in Supporting Information S1). A large number of model runs (400,000 for this study) was first conducted with randomly sampled parameter sets. For each evaluation data source, model performance metrics were calculated for each model run and ranked among all runs (i.e., the data-specific empirical cumulative distribution function-CDF). A threshold quantile was iteratively determined among the CDFs of all data types and corresponding evaluation metrics, above which the common best 100 runs were detected. This method avoids potential uncertainty of objective aggregation between the multi-criteria and is flexible for analyzing the value of specific data in constraining the model performance (Yang et al., 2021). Moreover, the large number of model runs can be easily parallelized, which significantly reduces the computing time.
The evaluation metrics for sensitivity analysis were Kling-Gupta Efficiency-KGE (Gupta et al., 2009) for discharge data and Mean Absolute Error (MAE) for stream water 2 H . The MAE was used for isotopes because the coarse (∼monthly) sampling had limited information on the short-term variability of isotope dynamics. The global sensitivity indices were calculated from 100 trajectories, as the maximum number suggested by Pianosi et al. (2016). For the multi-criteria calibration, we used log-transformed Nash-Sutcliffe Efficiency (NSE) and percentage bias of the water balance (PBIAS) for discharge, to ensure comparable weights for both high-and low-flow periods (Krause et al., 2005) and the overall water balance performance, respectively, and MAE for stream isotopes ( 2 H ). In total, 400,000 model runs were computed, and a two-stage calibration strategy similar 8 of 22 to Yang et al. (2021) was adopted ( Figure S3 in Supporting Information S1): first, the best 40,000 runs according only to discharge metrics were selected to ensure the overall water balance modeling; the second stage selected the best 100 runs out of these 40,000 runs, using discharge data alone (the "discharge-only" calibration) and both discharge and isotope data (the "discharge plus isotope" calibration). All computations were performed at the High-Performance Computing Cluster EVE, a joint effort of both the Helmholtz Centre for Environmental Research-UFZ and the German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig.

Model Set-Up and Data Processing
The daily EcH 2 O-iso modeling (spatial resolution of 1 km × 1 km) was setup in the Selke catchment (Table S1 in Supporting Information S1) during 2012-2019. Topographic characteristics (i.e., slope and drainage network at the 1 km × 1 km resolution) were derived from Yang et al. (2018) based on the original 100 m DEM. The Selke river network was retrieved from the State water authority-LHW (https://lhw.sachsen-anhalt.de/, last accessed on 13 September 2022). Stream channel length was calculated directly from the LHW river network, and channel width was estimated using the empirical equation (i.e., 5.4 ⋅ 0.5 (Rode et al., 2016), where denotes the annual mean discharge obtained from Yang et al. (2018) for each stream reach).
To make most use of the whole monitoring data sets, model climate inputs (spatial resolution of 1 km × 1 km) were prepared at the Bode catchment level. Gridded precipitation input was spatially interpolated using the kriging method with elevation as external drift, while for other climate inputs, the Inverse Distance Weighted (IDW) method was used due to the relatively low number of stations (the "hydroTSM" package in R software (Zambrano-Bigiarini, 2020)). Incoming short-and long-wave radiation measurements are rarely available for DWD stations, therefore the daily radiation input data were retrieved from the ERA5 hourly reanalysis data with 0.25° × 0.25° spatial resolution (Hersbach et al., 2018), downloaded from the Copernicus Climate Change Service Climate Data Store.
The daily gridded precipitation isotope inputs were prepared based on the isotope monitoring data set from the Bode catchment. First, the multi-site monthly composite data were spatially interpolated for each 1 km 2 grid cell. For months having measurements from more than 13 sites, the kriging method drifted by elevation was used; for the remaining months, the IDW method was used instead. Second, the temporal interpolation in each grid cell was conducted in two steps: (a) given the seasonality of isotope signals, mean and standard deviation were calculated separately for January-December and weighted by recorded precipitation amount; and (b) the daily values were then randomly generated following the normal distributions. The interpolated daily and sampled monthly 2 H data fit well with each other, and comparisons at the Selke sampling sites were provided in Figure  S4 in Supporting Information S1.
Daily discharge data at HAUS, MEIS, and SILB in 2012-2014 and 2018 were used for calibration, covering both normal (before 2018) and recent drought (2018 and 2019) years. The discharge data at the above three stations in 2015-2017 and 2019 were used for temporal validation, and the whole period data at GUEN for spatial validation. Stream isotope data from six sites (i.e., HAUS, MM, MEIS, SILB, KB, and RB) in 2012-2014 and 2015-2017 were used for stream isotope calibration and validation, respectively. Additionally, EcH 2 O-iso requires initial conditions of water storages, as well as tracer concentrations and ages therein, therefore, the period 2010-2011 was used as the model warming-up period to stabilize these initial inputs.

Model Parameter Sensitivity Variations Based on Discharge and Isotope Data at Multiple Sites
The sensitivity of the EcH 2 O-iso parameters was ranked separately for discharge and isotope data at individual stations and then cross compared (Figure 3). For the upland discharge and isotope data (i.e., above the MEIS/MM station), the parameter rankings were generally similar, with the common sensitive parameters related to canopy interception (CWSmax), topsoil hydraulic properties (Ks), snowmelt (snowmeltCoe) and leakage to the deeper groundwater (leakance); interestingly, parameters related to vegetation stomatal properties (gsmax, gs_light_coeff, and gs_vpd_coeff) and soil field capacity (BClambda) were more sensitive based on discharge data, while parameters related to soil heat fluxes (Sheatcap), subsurface lateral fluxes (HyFrac_dgw, ChanDGWSeep, and ChanGWSeep) and vertical soil hydraulic conductivity (Aniso) were more sensitive based on isotope data. The rankings also exhibited slight variations at different upland 9 of 22 stations. For instance, at MEIS/MM, the channel routing parameter (ManningRiv) and vegetation parameters of deciduous forest (labeled with "_DF") become more sensitive primarily due to the increased importance of channel routing and deciduous forest area, respectively. In the lowlands, parameters related to hydraulic conductivity (Ks), deeper groundwater dynamics (ChanDGWSeep and HyFrac_dgw), soil heat fluxes (Sheatcap) and crop canopy-interception and transpiration were similarly sensitive to isotope simulations (isoHG, Figure 3; parameters labeled with 1-5 and "AG" represented different soil types and extensive arable lands in the lowlands, respectively).
The sensitivity rankings for discharge and isotopes diverged more significantly at the outlet HAUS (qHAUS and isoHAUS, Figure 3), compared to that of the upland gauging stations. The ranking for discharge generally followed that of the uplands, though with agricultural vegetation parameters ranked higher. In contrast, the ranking for isotopes showed more an equal combination of the most sensitive parameters from the uplands and the lowlands (isoMM and isoHG, respectively; Figure 3).

Discharge and Tracer Performance of the Upscaled EcH 2 O-Iso Modeling
The top 20% sensitive parameters for each sensitivity ranking shown in Figure 3 were then combined and included in model calibrations. For both "discharge-only" and "discharge plus isotope" calibration schemes, the EcH 2 O-iso discharge simulations captured the complex temporal variations at all four gauging stations across all the spatial scales reasonably well (Figure 4). At the two downstream stations HAUS and MEIS (Figures 4b and 4c), the two calibration schemes performed similarly well (KGEs mostly >0.55 and PBIAS mostly within ±15%), with a slight degradation for the "discharge plus isotope" scheme. At the upland stations SILB and GUEN (Figures 4d  and 4e), the simulations performed slightly better in terms of overall water balance (PBIAS mostly within ±10%), but with lower KGE values; especially for the "discharge-only" scheme, the simulations largely overestimated peak flows (e.g., at SILB, median simulations overestimated by 26.8% during periods with discharge >90th quantile, Figure 4d), followed by very quick recessions (KGEs decreased to around 0.40, with the lowest mean value of 0.19 at SILB in the validation period). Importantly, the inclusion of isotope information significantly improved the performance during the high-flow periods of each hydrological year, thereby significantly mitigating the degradation of discharge performance in the uplands (e.g., KGEs mostly >0.45, Figures 4d and 4e). Although both calibration schemes over-estimated the extreme low-flows during the drought 2018-2019, the prolonged Figure 3. Parameter sensitivity rankings based on discharge ("q") and isotope ("iso") data at different stations and sampling sites. The top 20% sensitive parameters for each criterion were combined for model calibrations (parameter names labeled in blue font color). Labels at the end of vegetation-and soil-type dependent parameters corresponded to abbreviations in Figures 1c and 1d, respectively. The full sensitivity ranking results were provided in the associated repository (Yang et al., 2022). 10 of 22 discharge decreases and the subsequent recovering patterns were captured, and the simulation uncertainty during the driest periods was better constrained with the isotope information.
The inclusion of isotope data in the calibration improved the isotope simulations, both in terms of goodness-of-fit metrics (i.e., smaller MAEs) and better constrained uncertainty (Figures 5a-5f), although the overall errors remained quite high at some sites. The median values of the "discharge-only" calibration were consistently biased to the upper boundary of the uncertainty ranges; moreover, the simulated median isotopes and ages failed to reflect the prolonged drought disturbances in last 2 years (solid blue and dashed gray lines, Figures 5a-5f). In contrast, the isotope simulations of the "discharge plus isotope" scheme reproduced the highly damped stream water signals relatively well (MAEs of 2 H mostly <7‰), especially during the validation periods with lower MAEs and the simulated median values closely following the observations (Figures 5a-5f). In particular, the performance was consistent across the spatial scales, ranging from smaller scales at upland sites KB and RB to larger scales at the downstream main stem sites MEIS/MM and HAUS (with drainage areas <15 and >200 km 2 , respectively). The model simulations also captured the significant fluctuations during the wet high-flow periods, though we were not able to validate this due to the monthly observations. The simulated uncertainties in dryer , 99, and 28 km 2 , respectively) in the Selke catchment. The lines and uncertainty ranges represent the median values and 90% of simulation ranges, respectively, of the best 100 runs from the two calibration schemes. The performance metrics were shown as mean ± standard deviation of the best 100 runs. The calibration periods are shaded (the whole GUEN data were used for a spatial validation). Note that the y-axes are log-transformed for a better presentation.

of 22
periods and in the drought 2018-2019 were also higher and could be probably better constrained if more and higher resolution data were available (e.g., Yang et al., 2021).
In the calibration period 2012-2014, the isotope simulations generally overestimated the observations in 2013-2014 (except for sites HAUS and RB, Figures 5a-5f). Surprisingly in the validation period 2015-2017, these systematic overestimates were mostly eliminated, whereas the simulations at sites HAUS and RB showed larger underestimates. The exclusive lowland tributary HG exhibited heavily enriched and strongly fractionated isotope signals (Figure 2b), and model performance degraded largely showing a continuous underestimation, although . The lines and uncertainty ranges represent the median values and 90% of the simulation ranges, respectively, of the best 100 runs from the two calibration schemes. The isotope performance metrics (Mean Absolute Error, MAE) are shown as mean ± standard deviation of the best 100 runs. The calibration periods are shaded, and the data at site HG (g) were not included in calibrations. the simulated isotopes were significantly more enriched compared to the upland simulations (e.g., the median 2 H simulations at HG and MM were −53.7‰ ± 4.0‰ and −59.4‰ ± 3.0‰, respectively; ANOVA test, 0.001 ; Figures 5b and 5g). Interestingly, exceptions were that two observation points in December 2013 and October 2014 exhibited abnormal depletions, which fitted with the simulated levels.
The temporal variability of isotopic tracer behavior has also been correspondingly reflected in the stream water age simulations-a useful metric to characterize hydrological functioning (the median values of the best 100 runs shown as gray solid lines in Figure 5). Stream water ages were much younger during high-flow periods (as low as <0.5 years) and continuously increased (up to ca. 10 years in the normal years 2012-2017) with decreasing discharge, in accordance to the strongly fluctuating and gradually depleting isotope simulations, respectively. Such seasonal patterns of water ages were generally consistent across the spatial scales, with the dry period ages slightly increased from the upland tributary sites (KB and RB, Figures 5e and 5f) to the main stem sites (SILB and MEIS/MM, Figures 5b-5d). Notably, the ages during April-September were significantly younger in 2013-2014 than the other years (e.g., mean values <3.4 years vs. >5.1 year at the main stem sites, Figures 5a-5d), indicating relatively wetter conditions. This is supported by the higher surface overland flow contributions during these periods ( Figure S5 in Supporting Information S1). In the drought years 2018-2019, water ages in the wet season (January-March) were similar to that of the normal years, while followed by dramatic increases (up to ca. 20 years) crossing the upland tributary sites to the main stem sites (Figures 5a-5f). At the exclusive lowland tributary HG, the ages were 3.9 ± 1.0 years, showing a more stationary state compared to other sites (e.g., the age standard deviation at MM was 2.3 years); interestingly, the ages showed a much more moderate increase in the drought 2018-2019 (5.7 ± 1.5 years), but the seasonal recovery behaviors were not observed during the intervening wet season of 2019. The ages at the outlet HAUS ( Figure 5a) varied seasonally, following the general pattern of the Selke upland sites, while with reduced dynamic amplitudes (e.g., the ages increased to ca. 15 years during the droughts) due to the flow-weighted mixing of the main Selke and the lowland tributary.

Spatio-Temporal Variability of Water Flux and Age Dynamics and Their Responses to Drought
The best-performing model simulations provided further insights into the highly heterogeneous catchment functioning and its responses to the severe droughts 2018-2019, as well as the effects of spatial scale on process representation. Due to the high gradients of climatic and physiographic conditions, the runoff generation and component partitioning strongly contrasted between the uplands and the lowlands (Figure 6). For the normal years, the high-runoff producing upland areas (i.e., >75th quantile annual runoff in upland grid cells, with corresponding annual precipitation as high as 828.0 ± 60.6 mm, Figure S1a in Supporting Information S1) generated 390.9 ± 79.0 mm annual runoff, with overland flow contributions accounting for 19.6% of total runoff in these cells; while the lowland areas exhibited an average annual runoff of 70.7 mm, with a large number of cells having almost no overland flow (Figures 6a and 6b). The groundwater flow from gravitational drainage of soil water storage dominated the total runoff, accounting for 81.3% and 93.9% of total runoff in the uplands and the lowlands, respectively, with the deeper GW flow contributing an additional 8.0% and 1.2%, respectively (Figures 6c and 6d). Such heterogeneous runoff generation patterns also responded differently to drought stress (Figures 6e-6h). The upland areas exhibited 18.4% decreases of annual runoff compared to the normal years (191.4 ± 102.0 vs. 234.7 ± 116.4 mm; ANOVA test, = 0.00125), while its runoff partitioning changed in terms of significantly increased deeper GW flow contributions (an increase of up to 15.0 ± 7.0%, ANOVA test, 0.001; Figure 6h vs. Figure 6d) and large areas with nearly no surface overland flow (except for the high-runoff cells; Figure 6f). In contrast, for the lowlands, the annual total runoff homogeneously decreased as low as 36.8 ± 32.2 mm, with nearly no surface overland flow (<0.01%, Figure 6f) and only a marginal contribution from the deeper groundwater (mean of 2.5%, Figure 6h).
The simulated runoff ages provided further insights into the spatio-temporal heterogeneity of runoff mechanisms and hydrological functioning (Figure 7). Runoff ages of each grid cell exhibited strong seasonal variations in the uplands and a steadier pattern in the lowlands. The spatial variability of upland runoff ages also varied seasonally: the ages were generally 1-year younger in the high-runoff headwaters than in the central cells during the wet and drying seasons (e.g., means of 2.5 vs. 3.9 years during the drying period in Figure 7b); then, the uplands become more uniformly older during the dry season (spatial mean of 4.0 years with the lowest coefficient of variation as 0.325, Figure 7c), followed by spatially varying age recoveries during the re-wetting season (similar age ranges and spatial patterns to the drying season, Figures 7b and 7d). The lowland areas exhibited more homogeneous spatial patterns across seasons (mean ages of ∼3.0 years with standard deviations <1.0 years, Figures 7a-7d). The runoff partitioning changes under drought stress induced marked impacts on the spatio-temporal variability of runoff ages for the uplands (Figures 7e-7h): the runoff ages and spatial patterns were similar to those of the normal years during the wet season, but ages dramatically increased and homogenized during the drying season (similar spatial pattern to the normal year dry season, with even older ages; Figure 7f vs. Figure 7c); the ages increased from 5 years to >10 years as the drought further developed during the dry and wetting seasons (Figures 7g and 7h). The lowland runoff ages showed relatively moderate and spatially homogeneous increases in all seasons of the drought years (e.g., mean ages of 3.7 and 5.0 years during the drying and wetting seasons, respectively, Figures 7f and 7h), which is in accordance with the reduced soil water replenishment due to reduced precipitation (annual means of 547.1 ± 58.0 vs. 400.5 ± 51.2 mm in the normal and drought years, respectively, Figure S1 in Supporting Information S1) and marginal changes in runoff source waters (Figure 6). The simulated ecohydrological fluxes also showed high spatial variations between the uplands and the lowlands (annual evapotranspiration means of 436.5 vs. 369.4 mm, respectively in the normal years, and 361.0 vs. 311.2 mm, respectively in the drought years; detailed seasonal patterns of the flux partitioning into transpiration, soil evaporation and interception evaporation shown in Figure S6 in Supporting Information S1). Transpiration, as the major component of catchment ecohydrological fluxes involving all three-layer soil water dynamics, showed spatio-temporally varying usage of soil water both quantitatively and qualitatively (Figures 8 and 9), although this was not specifically constrained by xylem isotopes. In the normal years, the transpiration fluxes were similarly high during April-September as the major vegetation period (mean of 0.6 ± 0.24 mm d −1 , Figures 8j-8k, Figure  S6 in Supporting Information S1); however, the transpiration water ages were somewhat older during the drying season than the dry season across the catchment (means of 150 vs. 140 days and 251 vs. 225 days in the uplands and lowlands, respectively; Figures 8b and 8c). Meanwhile the transpiration EARs were also higher during the drying season, indicating further higher degrees of vertical mixing of soil waters (Figures 8j, 8k, 9b and 9c; ANOVA test, 0.001 ). The uplands exhibited significantly younger transpiration water ages, but higher EARs, in the channel-connected grid cells compared to the hillslope cells not connected to the channel across all seasons (e.g., during the drying season, mean ages and EARs were 123.1 vs. 196.7 days and 0.242 vs. 0.236, respectively, Figures 8 and 9b; ANOVA test, 0.001 ). The lowland transpiration water ages were mostly >200 days, with some scattered areas exhibiting exceptionally old ages (i.e., >400 days).
The modeling implied that the severe drought in 2018-2019 had dramatically altered the seasonal and spatial patterns of transpiration. The transpiration fluxes were largely maintained during the drying season and dramatically decreased as the drought further developed; importantly, the increases of EARs started earlier since the drying season and further increased during the dry season (Figures 8j and 8k). Moreover, these drought-induced changes of water ages showed significant spatial variability (Figures 8e-8h and 9e-9h). The upland transpiration water ages were significantly younger than that in the normal years during the wet season (means of 123 vs. 147 days, Figures 8a and 9a; ANOVA test, 0.001 ) and continuously increased in the following seasons (i.e., 15 of 22 means of 187 and 165 days during the drying and wetting seasons); while the lowland transpiration ages were significantly older than that of the normal years in all seasons (e.g., means of 282 vs. 251 days during the drying seasons), with an increasing expansion of the high-age (>400 days) areas (Figures 8a-8h). Interestingly, the upland channel-connected cells still exhibited younger transpiration water ages than non-channel connected cells (Figures 8e-8h). Such landscape organizational differences did not always apply for transpiration age extractions (indicated by EARs) under drought stress, that is, EARs in the uplands were spatially more homogeneous during the wet and drying seasons (Figures 9e and 9f) but spatially varying during the dry and wetting seasons (i.e., with higher EARs in channel-connected cells than non-channel cells, Figures 9g and 9h).

Upscaling Tracer-Aided Ecohydrological Modeling to Larger Catchments: Opportunities and Challenges for Process Representation
Through successful use at the plot-scale and in smaller catchments, the fully distributed EcH 2 O-iso model has shown flexible structures for representing intra-and inter-grid heterogeneity and catchment spatial organization (e.g., the hillslope-channel configuration) (Kuppel et al., 2020;Smith, Tetzlaff, Maneta, et al., 2022;Yang et al., 2021). This highlights the model's potential suitability for upscaling and the necessity for testing in larger catchments like the Selke. First, the model allows spatially distributed climate inputs and catchment physiographic properties (e.g., soil types and vegetation species/types) to be represented independent of modeling resolution. This ensures adequate representation of multiple environmental drivers and their various interactions that encompass the functional heterogeneity across larger catchments. For example, given the similar landscape characteristics and geology, the spatial variability of runoff processes in the Selke uplands ( Figure 6) was primarily driven by the spatial pattern of precipitation ( Figure S1 in Supporting Information S1). Meanwhile, the lowland variability of water ages in transpiration fluxes and the soil water sources (Figure 8, Figure S7 in Supporting Information S1) was likely explained by the heterogeneity of soils and related hydraulic properties. In addition, the process heterogeneity within a single model cell became more pronounced with coarser modeling resolution at larger scales (1 km × 1 km in this study). The EcH 2 O-iso model addresses this issue through separate calculation of vegetation processes for each vegetation type present in a grid cell and then areal-weighting them for the overall grid outputs. Additionally, hydrological parameters related to soil processes are soil-type dependent and aggregated using the area-weighted geometric means if multiple soil types are present in one grid cell. Such designs of model structure and parameterization at the finer input resolution have advantages in terms of increasing representation of spatial heterogeneity and modeling robustness across scales (Beven, 1995;Clark et al., 2017). Second, beyond heterogeneity and process representation, the EcH 2 O-iso model structure accommodates catchment organizational heterogeneity, and thus, allows emergent process representation rather than forcing only by distributed parameterization. Accommodating such catchment organizational principles is essential in representing overall catchment functioning (McDonnell et al., 2007;Sivapalan, 2005).
However, the upscaling application to the Selke catchment required some model adaptations (Section 2.2). Specifically, initial simulations highlighted the importance of re-infiltration of surface overland flow from up-slope cells. This process is not considered in the original model when grid cells include river channels; while for coarser modeling resolutions with dense river networks, switching on this process was needed to reproduce stream isotope simulations and damping variability (detailed trial-and-error results not shown due to the limited space). Such upscaling adaptations need to be carefully evaluated since the importance of specific processes might differ at different scales and new processes might emerge (Fatichi et al., 2016). The redistribution of surface runoff between multiple pathways, for example, re-infiltrating into soil, transport to a down-slope grid cell, or drainage to the stream channel as an overland flow component, is highly non-linear at small scales and extremely challenging to constrain for larger-scale investigation with coarser resolution modeling. For example, the clear systematic mismatch of our isotope simulations at some sites during March-May 2013 and June-August 2014 ( Figure 5) was likely due to the abnormally wet weather that invoked significant overland flow contributions (Figure 4a, Figure S5 in Supporting Information S1). This also links to the general limits of coarse-resolution modeling in reproducing localized processes with shorter characteristic length scales (Skøien et al., 2003), and particularly so for the tracer mixing between compartments with contrasting isotopic signatures . In addition, although the simulated 2 H in the drier, warmer lowlands were substantially more enriched in both stream waters and soil waters than in the uplands ( Figure 5, Figure S8 in Supporting Information S1), the simulations could not reach the enrichment level of the HG observations. However, beyond modeling uncertainties in the lowland loess areas, this mismatch is most likely explained by an effect of the interactions between stream water and the nearby mining lake water (impacted by open water evaporation) through complex groundwater dynamics in this region (Z.-Y. Zhang et al., 2021). Note that the well-matched two depleted observations at HG were sampled during the first flow-events after dry summers, likely before stream-lake hydraulic connectivity was re-established. In addition, the surface open water evaporation and the associated isotopic fractionation effects on isotope signatures would need more explicit representation of lake impacts for future modeling of larger-scale catchments where lakes and reservoirs are likely to be present (e.g., our simulations missed the isotope observations at RB, which were also impacted by an upstream small reservoir, Figure 5f).
Although the logarithmic NSE discharge metric was used particularly to better constrain low-flow simulations (Krause et al., 2005), the prolonged low flows under the droughts were still slightly overestimated (means of 0.51 vs. the observed 0.21 m 3 l −1 at the outlet station, note the logarithmic y-axes in Figure 4). It is generally very difficult to further overcome such subtle mismatches of summer low flows and would need further effort to resolve alternative model structures (Staudinger et al., 2011). We also noted that our ecohydrological model representation of deeper groundwater dynamics and their interactions with surface waters remains overly simplistic. Moreover, at some sites the overall isotope simulations exhibited considerable mismatches against the monthly observations. This primarily reflects the challenges of upscaling tracer-aided modeling as detailed above and the general limitation of coarse resolution isotope data available in large catchments for calibration. Similar to the monthly samples in the Selke/Bode catchment, large-scale catchments are unfortunately often lacking of high-resolution sampling, which is necessary for characterizing short-term dynamics of tracer and solute transport. Therefore, our tracer-aided modeling sought to reproduce the general seasonality and spatial patterns of isotopes, which still increases confidence in the general flux-storage-age representation compared to conventional hydrological models (Birkel et al., 2014). This is also the reason that we chose MAE as the metric of isotope calibration, rather than the typical KGE or NSE for discharge simulations with high-resolution timeseries. We noted that the isotope performance might be further improved through different calibration strategies (e.g., using different evaluation metrics and calibration methods), though the general spatio-temporal patterns of catchment functioning derived in this study are still insightful (see Section 5.3).
There was also increased uncertainty in the isotope simulations and the inferred water ages during the droughts (e.g., periods with water ages >15 years, Figure 5), which was most likely due to the increased contributions of deeper groundwater flows with generally much older ages. We stress, however, that we were not intending to derive exact age values in these periods (which would require other tracers), but rather to use the water 17 of 22 stable isotopes to indicate changes in catchment functioning. In addition, and in line with the challenge of all process-based modeling in larger catchments, we acknowledge that the model estimates of internal processes might differ from the real catchment functioning, and that additional data from different sources could be critical to further verify and/or constrain the modeling. For example, the model simulated reasonable total evapotranspiration (e.g., 396.5 ± 43.7 and 332.6 ± 36.2 mm in the normal and drought years, respectively), similar to other regional estimates (e.g., flux tower measurements in Gillefalk et al. (2021) and modeled estimates in Smith et al. (2021)). However, the model seemed to underestimate the proportion of transpiration, especially in the lowlands ( Figure S6 in Supporting Information S1), given that this process is only indirectly constrained by stream isotopes as part of the modeling work flow while not specifically by vegetation isotopes, for example, in xylem waters. To address this would need direct calibration objectives based on, for example, measurements of sap flux (H. , soil water isotopes and/or energy information from an eddy covariance flux tower.

The Value of Tracer Information on Aiding Ecohydrological Modeling and Implications of Isotope Monitoring at Larger Scales
We demonstrated that the information content of even relatively coarsely sampled isotope tracer data can still be a useful complement to conventional hydrometric data and enhance larger-scale tracer-aided modeling through comprehensive cross-scale parameter sensitivity analyses and model calibrations.
The most sensitive EcH 2 O-iso parameters for discharge and isotope data were generally similar, including parameters related to canopy interception (CWSmax), snowmelt (snowmeltCoe), soil hydraulic properties (e.g., Ks and BClambda), vegetation stomatal and transpiration properties (e.g., gsmax, gs_light_coeff, gs_vpd_coeff, and Kroot), and runoff generation (e.g., ChanGWSeep, leakance, and ChanDGWSeep) (Figure 3). This indicates the consistency of the information content of hydrometric and tracer data (Lischeid, 2008). Meanwhile, parameters related to the overall water balance ranked higher for discharge simulations, while parameters related to runoff partitioning, soil water dynamics and the energy balance ranked higher for isotopes. This supports the utility of isotope data in constraining internal processes in terms of more realistic representation of storage-flux interactions, water partitioning and the associated evolution of water ages. These findings are in line with Birkel et al. (2014) who included soil water isotope measurements in sensitivity analysis in the well-studied Bruntland Burn catchment, Scotland (3.2 km 2 , with annual precipitation of around 1,000 mm and ET of ∼350 mm). The inclusion of isotope data in calibration maintained similarly good discharge simulations, with particularly significant improvements at the upper stations in validation (e.g., mean KGEs improved from 0.19 to 0.45 at SILB; Figures 4d and 4e), though slight degradation in fit at the two downstream stations where the "discharge-only" performances were already good (Figures 4b and 4c). Crucially, the 2 H tracer simulations and stream water age estimates improved significantly in terms of seasonal patterns and responses to nonstationary high-flow perturbations ( Figure 5), although the overall performance of isotope simulations was still inhibited by the general challenges of the upscaled tracer-aided modeling as detailed in Section 5.1.
Embracing the upscaling challenges discussed above, the isotope-informed stream water age simulations ( Figure 5, Figure S9 in Supporting Information S1) agreed quite well with the independent age estimations from Lutz et al. (2018) who used lumped transit time models (e.g., our simulated mean age was 2.9 years at the main stem site MM during February 2012-May 2015, compared with that of 2.2 years in Lutz et al. (2018)). Moreover, the simulated stream water ages were significantly younger during the abnormally wet periods of March-May 2013 and June-August 2014 ( Figure 5). Therefore, our upscaled tracer-aided modeling plausibly provided a first larger-scale approximation that captures storage-flux-age interactions and flow path distributions, as well as their nonstationary dynamics, in larger catchments.
Most coupled flow-tracer modeling studies in small-scale experimental catchments are based on high-resolution sampling (e.g., daily) and include distributed measurements in soil and groundwater compartments Soulsby, Birkel, Geris, Dick, et al., 2015). However, such intensive isotope data are less feasible at larger scales, and often available stream water isotopes are coarsely sampled (e.g., monthly), exhibiting highly damped temporal patterns compared to rainfall (like the Selke). Notwithstanding that the isotopic amplitude might be masked by the coarse sampling frequency, the strong damping effects are most likely due to the dominance of subsurface flow path interactions at larger scales, which involve strong internal mixing with stored water (e.g., McDonnell & Beven, 2014;McDonnell et al., 2010). This is also consistent with independent perceptional understanding of the well-studied Selke catchment derived from hydrological and nitrate investigations based 18 of 22 on process-based modeling (Yang et al., 2018) and data-driven analysis (X. Zhang et al., 2020). Therefore, a strategic implication for large-scale monitoring is that the availability of the tracer information can be invaluable, even though the data might be inevitably coarse. Of course, higher frequency data collection, particularly during storm events can provide additional insights into catchment function (e.g., Stevenson et al., 2021). Indeed, we observed that our upscaled modeling was weaker in reproducing the depleted isotope signals induced by enhanced surface overland flow dynamics under wetter weather conditions. Optimizing isotope tracer monitoring design (e.g., higher frequency sampling that targets characteristic flow regimes (Stevenson et al., 2021;L. Wang et al., 2019)) could perhaps better aid representation of highly non-linear processes and modeling constraints, meanwhile reducing the logistical and financial burdens that are often already high for larger-scale monitoring.

Insights Into the Heterogeneous Ecohydrological Functioning Under Varying Hydroclimatic Conditions
The fully distributed EcH 2 O-iso modeling uniquely provided spatio-temporally explicit insights that allowed us to disentangle blue-and green-water flux dynamics and the associated output age distributions across scales. Such detailed information is particularly critical in larger-scale catchments that often show contradictory upland-lowland behaviors (e.g., Tetzlaff et al., 2011), like the Selke catchment.
In the Selke uplands, the high-elevation headwaters exhibited much higher precipitation and annual runoff than the leeward-positioned central areas. The thin soils and shallow bedrock have resulted in a relatively flashy flow regime, with rapid replenishment of soil water storage deficits (Dupas et al., 2017;Yang, Rode, et al., 2022). Coupled with the spatially varying precipitation, the simulated total runoff and contributions of surface overland flow exhibited significant spatial variability (Figure 6b). This has further resulted in younger runoff ages in the more runoff producing headwaters, except for the dry season when the uplands were dry and deeper groundwater uniformly sustained the low baseflows (Figures 7a-7d). It is notable that stream water along the main stem showed more homogeneity and maintained young ages moving downstream, due to the runoff-weighted age mixing and the dominance of younger runoff from the wetter headwaters ( Figure S9 in Supporting Information S1). This implies that stream water tracer signals are significantly impacted by the hillslope-channel organization (e.g., outlined as a first-order control in mountainous regions by Jencso et al. (2009)) and hydroclimatic variability in larger catchments, and that spatially explicit investigations are important to show the actual transit time distributions across the catchment (e.g., Hrachowitz et al., 2010;Soulsby et al., 2010). In contrast, the dryer, low-gradient lowlands homogeneously exhibited more stationary hydrological regimes, due to the flat topography and limited lateral hydrological connectivity ( Figure S10 in Supporting Information S1).
The prolonged droughts in 2018-2019 provided an opportunity to investigate spatio-temporal variations of catchment water balance dynamics under nonstationary conditions. The upland annual runoff was largely maintained in the drought years, but with significant changes in runoff partitioning (i.e., relatively deeper GW runoff contribution almost doubled and there were extensive areas generating no overland flow; Figures 6e-6h). Intensive groundwater monitoring data in the Selke's headwater Schäfertal catchment (Yang et al., 2021) provided direct evidence that the regional deeper groundwater system could sustain significant stream flows under the droughts, primarily because of the upland shallow schist/claystone bedrock with significant fractures (Jiang et al., 2014;Wollschläger et al., 2016). This further resulted in the general increases in runoff ages (Figures 7e-7h), since longer flow paths through the fractured bedrock typically exhibits longer water transit times, although with similar catchment forms and hydrological regimes (Hale & McDonnell, 2016). In contrast for the lowlands, the drought induced significant decreases in annual runoff but only marginal changes in runoff partitioning and more moderate increases in runoff ages. On the one hand, the groundwater fluxes from soil water storage ( Figure S10 in Supporting Information S1) plausibly sustained the lower stream flows due to the flat lowland topography with low hydraulic gradients (Tetzlaff et al., 2011). On the other hand, the deeper groundwater storage in the lowland loess aquifer decreased dramatically in recent drought years (groundwater level measurements from the water authority LHW, https://lhw.sachsen-anhalt.de/, last accessed on 13 September 2022), and thus, did not evoke significant nonstationary ages responses in lowland stream waters.
Transpiration fluxes exhibited significantly younger ages in the upland channel-connected grid cells with higher EARs (e.g., mean ages of 123 vs. 197 days during the drying season; Figures 8a-8d and 9a-9d), indicating more uniformly younger soil water sources in these cells. This likely resulted from interactions between blue-and green-water dynamics: the lateral groundwater seepage to channels has plausibly YANG ET AL.

10.1029/2022WR033033
19 of 22 enhanced overland flow re-infiltration and the consequent vertical exchanges between soil layers, resulting in whole-profile younger soil water ages ( Figure S7 in Supporting Information S1). In the flat lowlands, due to limited lateral flux exchanges, the spatial patterns of transpiration water ages were consistently similar to those of soil water ages in all layers; the areas with much older ages were characterized as higher outgoing potentials, such as the highly sandy areas and the higher-slope headwater grid-cells (Figure 8, Figures S7 and S10 in Supporting Information S1). In responding to the drought disturbances, transpiration fluxes exhibited a delayed response compared to that of EARs (Figures 8i-8l and 9). Such differences indicated that the transpiration fluxes at the beginning of the droughts were maintained largely by deeper layer soil water sources with less-variable age distributions (i.e., higher EARs). Furthermore, compared to the absolute transpiration ages that exhibit landscape organizational and upland-lowland variabilities, the transpiration EARs were more spatially homogeneous, and therefore, can be a sensitive indicator of environmental disturbance (like the droughts) impacts across larger catchments. Assessing catchment-scale vegetation water use is still fundamentally challenging (Asadollahi et al., 2022;Evaristo et al., 2019;Soulsby et al., 2016), and the results of our upscaled modeling largely depended on the preliminary conceptualization and specific catchment settings in this study. In particular, our transpiration simulations were not directly constrained by isotopes. Therefore, increasingly available information from, for example, xylem water samples, will be useful to further reduce model uncertainty and increase the process inferences (e.g., Smith, Tetzlaff, Landgraf, et al., 2022). Also, data-driven improved conceptualization and specific considerations of the complex transpiration processes and their impacts on water use would be advantageous, for example, taking advantage of recent advances from in situ experiments and controlled lysimeters (Benettin et al., 2021;Evaristo et al., 2019;Smith, Tetzlaff, Landgraf, et al., 2022).

Conclusions
We upscaled process-based, tracer-aided ecohydrological modeling using the EcH 2 O-iso model in the large, heterogeneous Selke catchment (456 km 2 ). Several adaptations to EcH 2 O-iso were made to further improve the model's process and spatial representations. To our knowledge, such tracer-aided, process-based ecohydrological modeling has not been conducted at such large spatial scales. The novel larger-scale modeling used spatially distributed flow and isotope data across the Selke catchment with contrasting landscape properties. Multi-site and multi-data sensitivity analysis revealed that the information content of discharge and stream isotope data was generally "consistent" (similar sensitive parameters) and "orthogonal" (e.g., parameters related to runoff partitioning and dynamics in soils ranked high for isotopes). Further, the inclusion of isotope information in multi-criteria calibration generally preserved the discharge performance with improved transferability to validations, and more importantly, substantially improved the isotope performance even at such large scales (though these remained uncertain due to the general challenges of the upscaled tracer-aided modeling using relatively coarse tracer data for calibration). This still provided the basis for more plausible inferences of stream water ages in terms of general seasonal dynamics and spatial variability.
Following comprehensive process representation and isotope-aided model constraints, the novel fully distributed modeling has unique advantages in providing new insights into spatial heterogeneities in ecohydrological functioning in larger catchments, in terms of tracer mixing and water ages, and their response to environmental disturbances like droughts. The upland areas exhibited marked spatial variability in hydroclimatic conditions and further in simulated ages of terrestrial water sources; the hydrological regimes under the drought disturbances were largely maintained by the increased contributions from deeper groundwater in the fractured bedrock, while associated with dramatically increased stream water ages. In contrast, the flat and drier lowlands exhibited more constant flow regimes and water age distributions, with moderate responses to drought disturbances, primarily due to the flat topography and limited hydrological connectivity. As a major green-water flux, transpiration water ages exhibited strong spatial patterns related to catchment organization in the uplands but were generally more homogeneous in the lowlands. The more spatially generalized transpiration EAR, indicating transpiration water age extraction preference, can be a sensitive indicator of functional changes across larger catchments under environmental disturbance (like the droughts). All these innovative insights from the upscaled tracer-aided ecohydrological modeling are of particular relevance to inform larger-scale management of heterogeneous catchments under environmental change and climate extremes.