A Novel Emergent Constraint Approach for Refining Regional Climate Model Projections of Peak Flow Timing

Global climate models (GCMs) are unable to produce detailed runoff conditions at the basin scale. Assumptions are commonly made that dynamical downscaling can resolve this issue. However, given the large magnitude of the biases in downscaled GCMs, it is unclear whether such projections are credible. Here, we use an ensemble of dynamically downscaled GCMs to evaluate this question in the Sierra‐Cascade mountain range of the western US. Future projections across this region are characterized by earlier seasonal shifts in peak flow, but with substantial inter‐model uncertainty (−25 ± 34.75 days, 95% confidence interval (CI)). We apply the emergent constraint (EC) method for the first time to dynamically downscaled projections, leading to a 39% (−28.25 ± 20.75 days, 95% CI) uncertainty reduction in future peak flow timing. While the constrained results can differ from bias corrected projections, the EC is based on GCM biases in historical peak flow timing and has a strong physical underpinning.


Introduction
Future warming is expected to substantially alter many aspects of Earth's hydroclimate, with significant implications for policy and stakeholder concerns. For example, past studies show that warming will drive reductions in snowfall across all but the coldest portions of the planet (Mankin & Diffenbaugh, 2015; Räisänen, 2008). Moreover, a warmer atmosphere can hold more moisture, driving increases in both the occurrence and magnitude of heavy precipitation events in most areas (Trenberth et al., 2003; Westra et al., 2014). These factors combine to dramatically impact hydrology, and more specifically flood timing, in many parts of the globe (Bloschl et al., 2017; Wasko et al., 2020). From a flood management perspective, earlier streamflow events with a greater influence from precipitation as rain are expected to eventually transition to increased streamflow intensity compared to snowmelt-driven events (Davenport et al., 2020). From a water supply management perspective, earlier or cold season floods allow for less water to be retained for water supply throughout summer months, when water is most needed (Barnett et al., 2005). Signs of these changes are already evident across the western United States, where snow storage has decreased over recent decades (Fyfe et al., 2017; Hale et al., 2023). However, to date, analysis of changes in the timing of high flow conditions has largely been limited to the historical period (e.g., Wasko et al., 2020). How the timing of high flows will change in the future remains relatively unexplored.
Global climate models (GCMs) are the ultimate source for our understanding of future flood hazards, but they have two key limitations: coarse resolution and large inter-model uncertainty. First, their coarse resolution inherently limits the usefulness of their output in areas of complex terrain such as the western United States (Wrzesien et al., 2018). Second, despite substantial model development efforts, the latest phase of the Coupled Model Intercomparison Project (CMIP6; Eyring et al., 2016) exhibits substantial model disagreement in many key metrics of climate change, including those shaping flood risk, such as warming and precipitation change.
Dynamical downscaling methods can be used to translate GCM output to regional scales (e.g., Giorgi & Gutowski Jr, 2015), but original GCM biases largely remain and may increase (Diaconescu & Laprise, 2013; Rahimi et al., 2024). Consequently, stakeholder-relevant variables produced from the dynamical downscaling of GCMs contain biases that may present unrealistic projections of change, or add to uncertainty. Most hydroclimate projections take the model mean as the most likely future, and some measure of associated spread from a suite of GCMs to quantify uncertainty in a given variable (Knutti et al., 2010). However, such an approach does not consider how the spread in future projections may be influenced by biases in GCMs.
Emergent constraints (ECs) are a common approach for reducing uncertainty in CMIP projections of climate change (e.g., Hall et al., 2019; Williamson et al., 2021). This method was developed to work around the challenge of assessing GCMs for their ability to project future climate scenarios that have not yet occurred. In this method, the uncertainty in projections of a variable Y is reduced by considering the historical bias of GCMs relative to an observed variable X that is correlated to Y. However, applications of ECs have been limited to GCM output to date. This is because the EC approach requires a sufficiently large ensemble of simulations, which can be difficult to produce given the high computational and data storage costs associated with dynamical downscaling. This limitation matters because the complex topography and hydroclimate of inland regions, such as the western US, make it difficult to apply the EC approach directly to coarse GCM outputs for land surface variables in such regions.
For the first time, we seek to apply the EC method to an ensemble of regional climate model simulations (rather than raw GCM data), which can allow for the assessment of a new set of highly uncertain climate change metrics.
Here, we focus on future changes in peak flow timing, defined as the timing of the annual maximum 1-day natural flow (Qx1d). We evaluate peak flow timing since it is recognized as an important climate change metric for both flood hazards and water resources (Bloschl et al., 2017; Wasko et al., 2020) and it cannot be evaluated at the native GCM resolution. Rather than evaluating drivers of the change in peak flow timing, which for snowmelt-influenced basins has been previously studied and is primarily driven by warming causing earlier snowmelt and greater precipitation falling as rain rather than snow (e.g., Bloschl et al., 2017; Huang et al., 2018), we focus on whether the EC method can be used to constrain inter-GCM uncertainty in projected changes in peak flow timing. To inform our analysis, we use an ensemble of 12 dynamically downscaled GCMs across the western US (Rahimi et al., 2024). Peak flow timing, a stakeholder-relevant variable that requires dynamical downscaling, is typically obtained from subsequent hydrologic simulations with bias-corrected data. However, whether or not the removal of GCM biases influences projected change signals in a physically realistic manner is a matter of ongoing debate and study (Maraun et al., 2017). As such, we focus on the EC approach's ability to constrain projections of peak flow timing based on dynamically downscaled results. However, we also compare the EC approach against unique bias correction (BC) methods, and each approach's varying levels of uncertainty. We attempt to address a series of questions that both the atmospheric and hydrologic communities commonly confront with respect to inter-model GCM uncertainty at the landscape-resolving scale: (a) if inter-model spread is large, can the EC approach be applied to build confidence in dynamically downscaled hydrologic output; and (b) how are various types of projections (unconstrained, EC-based, apriori BC, and post-downscaling BC) similar or different? Does the removal of the historical biases via two bookend BC techniques reinforce a particular outcome of the EC approach?

CMIP6 GCMs and Modeling Methods
Here, we use CMIP6 GCMs dynamically downscaled with the Weather Research and Forecasting (WRF; Rahimi et al., 2022) model to assess shifts in peak flow timing at a landscape-resolving scale. We evaluate 12 single-member CMIP6 GCMs (Table S1 in Supporting Information S1) that were downscaled to a 9 kilometer (km) resolution across the western US following the historical and SSP3-7.0 scenarios (Rahimi et al., 2024). We briefly describe the GCMs selected for analysis in this study in Text S1 in Supporting Information S1, and further details can be found in Rahimi et al. (2024). Because WRF is tightly coupled to a land surface model, Noah-Multiparameterization (Noah-MP), this downscaling framework provides the opportunity to assess peak flow timing given its reliable representation of this metric when compared against observational data (Figure S1a and Figure S2a in Supporting Information S1). Further details of the downscaling set-up are described in Text S1 in Supporting Information S1 and Rahimi et al. (2022, 2024).
We primarily focus on the application of the EC approach using the dynamically downscaled data set; however, we also compare the EC method against two BC techniques. We evaluate a mean-state BC that is applied apriori to downscaling (apriori BC) and a post-downscaling BC (post BC) technique similar to that commonly used in standard hydrologic projection frameworks. These two techniques represent the bookends of BC methods, such that apriori BC is applied upstream of dynamical downscaling, while the post BC is applied downstream of dynamical downscaling. Given space limitations, we describe the apriori and post BC methods in Text S2 and Text S3 in Supporting Information S1, respectively. After applying the BC, the apriori BC uses the same WRF downscaling set-up and tightly coupled Noah-MP model as the EC method, while the post BC requires an offline hydrology model. For the offline hydrology, we use a calibrated Noah-MP model with a 9 km resolution which is further described in Text S4 in Supporting Information S1 and Bass et al. (2023). The combination of the post BC step and the calibrated hydrology modeling encompasses the framework typically used for hydrologic climate assessments. For simplicity, we refer to the results from these simulations as post BC.

Representation of Peak Flow Timing
The primary focus of this work is on projected end-of-century shifts in peak flow timing, defined as the change in Qx1d timing. To represent natural flow from our model output, we use the total aggregated runoff across a given basin. Since our simulations of natural flow accurately represent Qx1d timing for the basins in this study (Text S5, Figures S1 and S2 in Supporting Information S1), we refer to Qx1d timing interchangeably as peak flow timing. In our analysis, we focus on different subregions of the Sierra-Cascade in the western US, where historical and projected shifts in Qx1d timing are evaluated in basins with observed natural flow data in each subregion. We represent the projected shift in Qx1d timing for each GCM based on the difference between the mean Qx1d timing during the end of the century (2066-2099) and the historical time-period (1981-2014).
Historically measured Qx1d timing is used to assess the performance of the WRF model, to represent uncertainty in the EC method, and to evaluate the bias of each GCM. Observational natural flow data, at the daily time resolution, was obtained from the sources outlined in Table S2 in Supporting Information S1. We determine historical peak flow timing based on observed data for water years (water year defined as 10/1/YYYY-1 to 9/30/YYYY) that contain at least 330 days of data for all of the basins within a given subregion (reference Table S2 and Table S3 in Supporting Information S1). In our analysis, we calculate Qx1d timing based on the water year. Also, we evaluate natural flow to explicitly focus on climate change impacts on Qx1d timing, which purposely excludes infrastructure or land use changes that occur during the time-period of analysis.
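The water-year bookkeeping described above can be illustrated with a minimal sketch. It assumes a daily natural flow series held in a pandas Series with a DatetimeIndex; the function names `qx1d_timing` and `projected_shift` are hypothetical and are not taken from the study's code. Timing is expressed as day of water year (1 = October 1), and water years with fewer than 330 days of data are skipped.

```python
import numpy as np
import pandas as pd


def qx1d_timing(daily_flow: pd.Series, min_days: int = 330) -> pd.Series:
    """Day of water year (1 = Oct 1) of the annual maximum 1-day flow.

    Hypothetical helper: daily_flow is a daily series with a DatetimeIndex.
    """
    flow = daily_flow.dropna()
    # Water year YYYY runs from Oct 1 of YYYY-1 through Sep 30 of YYYY.
    water_year = np.asarray(flow.index.year) + (np.asarray(flow.index.month) >= 10)
    timing = {}
    for wy, group in flow.groupby(water_year):
        if len(group) < min_days:
            continue  # too little data in this water year
        peak_date = group.idxmax()  # date of the annual maximum 1-day flow
        start = pd.Timestamp(year=int(wy) - 1, month=10, day=1)
        timing[int(wy)] = (peak_date - start).days + 1
    return pd.Series(timing, dtype=float)


def projected_shift(hist_timing: pd.Series, future_timing: pd.Series) -> float:
    """Mean future Qx1d timing minus mean historical Qx1d timing (days)."""
    return float(future_timing.mean() - hist_timing.mean())
```

A negative value returned by `projected_shift` corresponds to peak flow arriving earlier in the water year.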

Emergent Constraint Analysis
The EC method involves identifying a measurable aspect of the current climate, referred to as element X, which shows considerable variation among different GCMs and has a meaningful statistical correlation with a projected climate variable Y. Here, we employ the EC approach to constrain projected shifts in Qx1d timing (variable Y) based on historical Qx1d timing (variable X). We employ the statistical methods outlined in Bowman et al. (2018) to determine the constrained central estimate and uncertainty of Qx1d timing (66% and 95% confidence intervals). This method considers both the correlation of the emergent relationship and the observational uncertainty in the current climate metric. To represent observational uncertainty, we take the difference between the observed Qx1d timing from natural flow measurements and our ability to reproduce Qx1d timing from our dynamical downscaling set-up when applied to downscale ERA5 (Rahimi et al., 2022). As outlined in Bowman et al. (2018), this observational uncertainty impacts the EC method's estimate of uncertainty. The method assumes that the GCMs conform to a Gaussian distribution in the spread around the regression line that relates the historical metric to future change, which required the removal of a GCM that is a clear outlier for several subregions (GISS-E2.1-G, Figure S3 in Supporting Information S1). Further details associated with the EC method and the approach to determine the constrained estimate, and the 66% and 95% CI, can be found in Text S6 in Supporting Information S1, Bowman et al. (2018) and Williamson et al. (2021).
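To make the role of the correlation and the observational uncertainty concrete, the sketch below conditions a bivariate Gaussian for (X, Y) on a noisy observation of X. This is a simplified illustration in the spirit of the Gaussian framework cited above, not a reproduction of the study's implementation (see Text S6 for that). The function name `emergent_constraint` and its arguments are assumptions introduced here.

```python
from statistics import NormalDist

import numpy as np


def emergent_constraint(x_hist, y_change, x_obs, sigma_obs):
    """Constrain Y given an observed X under a Gaussian assumption (sketch).

    x_hist, y_change: one value per GCM (historical metric, projected change).
    x_obs, sigma_obs: observed metric and its standard error (assumed inputs).
    """
    x = np.asarray(x_hist, dtype=float)
    y = np.asarray(y_change, dtype=float)
    var_x, var_y = x.var(ddof=1), y.var(ddof=1)
    rho = np.corrcoef(x, y)[0, 1]
    # Observational noise weakens the constraint: damping is 1 for a perfect
    # observation and tends to 0 as the noise grows relative to model spread.
    damping = 1.0 / (1.0 + sigma_obs**2 / var_x)
    slope = rho * np.sqrt(var_y / var_x) * damping
    y_con = y.mean() + slope * (x_obs - x.mean())
    var_con = max(var_y * (1.0 - rho**2 * damping), 0.0)  # guard round-off
    sd_con = float(np.sqrt(var_con))
    z66 = NormalDist().inv_cdf(0.5 + 0.66 / 2)
    z95 = NormalDist().inv_cdf(0.5 + 0.95 / 2)
    return {
        "rho": float(rho),
        "constrained_mean": float(y_con),
        "ci66": (float(y_con - z66 * sd_con), float(y_con + z66 * sd_con)),
        "ci95": (float(y_con - z95 * sd_con), float(y_con + z95 * sd_con)),
        # Fractional reduction in standard deviation relative to the raw ensemble
        "reduction": 1.0 - sd_con / float(np.sqrt(var_y)),
    }
```

In this form, a stronger correlation or a smaller observational uncertainty both narrow the constrained interval, while an uncorrelated metric leaves the ensemble spread unchanged.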

Future Changes in Peak Runoff Timing
Relative to historical annual maximum 1-day (peak) runoff timing (Figure 1a), under SSP3-7.0 we see widespread shifts to earlier peak runoff timing across the western US (Figure 1b). We observe >66% model agreement in a shift toward earlier peak runoff timing in mountainous regions, with the ensemble mean reaching up to 100 days earlier. The shift toward earlier peak runoff in mountainous regions is expected with warming due to earlier snowmelt and/or more precipitation falling as rain, resulting in rain-on-snow or rainfall-driven flow events that occur earlier in the water year (Bloschl et al., 2017; Wasko et al., 2020). However, as outlined in Figure 1c, the inter-model uncertainty (represented by two standard deviations) associated with the projected changes in peak runoff timing can roughly equal the change signal, particularly in the Sierra-Cascade region.
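The grid-scale summary statistics mentioned here (ensemble mean change, fraction of models agreeing on direction, and a two standard deviation spread) can be sketched as follows. The function name `ensemble_summary` and the input layout, with models along the first axis, are assumptions for illustration; agreement is taken here to mean sharing the sign of the ensemble mean change.

```python
import numpy as np


def ensemble_summary(changes):
    """Summarize projected timing changes across models (sketch).

    changes: array of shape (n_models, ...) in days; negative = earlier.
    Returns the ensemble mean, the fraction of models sharing the sign of
    the mean, and two inter-model standard deviations.
    """
    changes = np.asarray(changes, dtype=float)
    mean_change = changes.mean(axis=0)
    agreement = (np.sign(changes) == np.sign(mean_change)).mean(axis=0)
    two_sigma = 2.0 * changes.std(axis=0, ddof=1)
    return mean_change, agreement, two_sigma
```

Grid cells where `agreement` exceeds 0.66 would correspond to the hatched areas described for Figure 1b under these assumptions.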

An Emergent Constraint for Peak Flow Timing
Given the large inter-model spread in projected changes in peak runoff timing, here we evaluate whether historical GCM biases can be used to help reduce the uncertainty in changes associated with peak flow timing through an EC. Henceforth, we shift our focus from peak runoff as shown at the grid-scale level in Figure 1 to peak flow conditions across basins using the methods and validation described in Section 2.2. We focus our analysis on the Sierra-Cascade mountain region based on (a) its relatively large uncertainty associated with projected shifts in peak runoff timing (Figure 1c), (b) the general expectation that earlier peak flows will lead to more intense peak flows in this region (Figure 1d), and (c) the availability of ground-truthed observational data across this region. We break down our analysis to focus on subregional shifts across basins in the Southern California Sierra Nevada, Northern California Sierra Nevada, Oregon Cascades, and Washington Cascades (Figure 1e). While statistically significant shifts in peak flow timing exist for all the subregions evaluated except the Oregon Cascades (Figure S4 in Supporting Information S1), substantial inter-model uncertainty remains. The range of projected changes in peak flow timing varies between 40 and 100 days for each subregion evaluated (Figure 2a), based on the difference between the smallest and largest projected shift in peak flow timing amongst the GCMs. While several GCMs demonstrate a shift towards earlier peak flow timing, a few show either no change or even delayed peak flows (later in the water year), as demonstrated for the Northern Sierra Nevada and Oregon Cascades (Figure 2a). Better understanding these expected shifts in peak flow timing can provide critical information for flood and water supply adaptation planning.
In all subregions, we find that projected changes in peak flow timing are strongly negatively correlated with historical biases in peak flow timing. Simulations with earlier historical Qx1d exhibit a smaller change in peak flow timing. The magnitude of the Pearson-r correlation coefficient ranges from 0.77 to 0.85, with a mean magnitude of 0.81 across the four subregions evaluated (Figure 2a). While other historical climate metrics were considered (Table S4 in Supporting Information S1), this metric provides the strongest correlation to projected changes in peak flow timing. Furthermore, this metric (historical biases in peak flow timing) was chosen for the following reasons: (a) reliability of observational data, (b) reliability of our regional climate modeling methods in representing this metric, and (c) the likelihood of a physically meaningful relationship between this metric and projected shifts in peak flow timing. Regarding the first reason, the basins used in this analysis (Figure 1e and Table S3 in Supporting Information S1) contain sufficient observational data (Tables S2 and S3 in Supporting Information S1) during the historically downscaled GCM time-period from 1981 to 2014, from which historical mean peak flow timing could be derived. For all intents and purposes, the gauged observational data is considered the only reliable measurement of daily natural flow and thus the Qx1d timing. Regarding the second reason, the small difference between observed peak flow timing and the dynamically downscaled representation of historical peak flow timing (Figure 2a, Figures S1a and S2a in Supporting Information S1) builds confidence in the downscaling method's ability to represent the Qx1d timing. In addition, the small difference reduces observational uncertainty and thus allows for a strong future constraint (Bowman et al., 2018). Prior to discussing the third reason for our choice of the current climate metric (the physically meaningful aspect of historical peak flow timing), we discuss this constraint's ability to reduce the inter-model uncertainty associated with projected changes in peak flow timing.
In our EC analysis, we generally obtain similar projections for the most likely estimate of changes in peak flow timing based on the ensemble mean; however, the uncertainty in the projected shift in peak flow timing is reduced by 32%-43%, depending on the subregion considered (Figure 2b). Based on the average across the subregions, the constrained estimate of −28 days is only slightly modified from the unconstrained estimate of −25 days. However, the domain-averaged 95% CI is reduced by 39%, with the unconstrained 95% CI ranging from −60 to 9 days, and the constrained 95% CI ranging from −49 to −8 days. This domain-averaged result and the constrained results for each individual subregion indicate that peak flow timing in these snowmelt-influenced basins is very unlikely to shift later in the water year. At the same time, the constrained spread indicates larger shifts in peak flow timing (e.g., beyond 2 months for all regions except the Southern Sierra Nevada) to be very unlikely as well. In other words, the constrained estimate leads to a reduced spread in projected shifts in peak flow timing, with outlier GCM estimates of projected shifts in peak flow timing deemed unlikely. We generally find similar results when changes in Qx1d timing are normalized by GCM differences in warming (Figure S5 in Supporting Information S1).
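As a quick arithmetic check, the domain-averaged intervals can be recovered from the central estimates and half-widths quoted in the abstract (−25 ± 34.75 days unconstrained, −28.25 ± 20.75 days constrained). The helper names below are hypothetical. Note that the ratio of these two rounded half-widths gives roughly 40%, close to the reported 39%; the small difference presumably reflects rounding or averaging of per-subregion reductions, which is an assumption on our part.

```python
def ci_bounds(center, half_width):
    """Lower and upper bound of a symmetric interval (days)."""
    return center - half_width, center + half_width


def spread_reduction(half_unconstrained, half_constrained):
    """Percent reduction in interval half-width."""
    return 100.0 * (1.0 - half_constrained / half_unconstrained)


# Values quoted in the abstract (days, 95% CI)
unconstrained = ci_bounds(-25.0, 34.75)   # (-59.75, 9.75)
constrained = ci_bounds(-28.25, 20.75)    # (-49.0, -7.5)
reduction = spread_reduction(34.75, 20.75)  # about 40%
```

The constrained interval lies entirely below zero, which is the basis for the statement that a shift to later peak flow is very unlikely.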
Next, we seek to understand sources of model spread in historical peak flow timing. Generally, we find that models with later historical peak flow timing are colder and have a lower percentage of precipitation falling as rain, and vice versa (Figure S6 in Supporting Information S1). This is demonstrated based on cold season (November-March) temperature and the percent of precipitation falling as rain. Such conditions determine whether a basin is characterized primarily by rainfall-driven or snowmelt-driven conditions during large streamflow events (Davenport et al., 2020). For example, the downscaled simulation of UKESM1-0-LL features cooler conditions than other GCMs (Figure S6a in Supporting Information S1), which results in a greater percent of cold season precipitation as snow (Figure S6b in Supporting Information S1). This in turn leads to delayed spring or summer snowmelt-driven peak flows (Figure 2a). Conversely, FGOALS-g3 produces a warmer historical climate (Figure S6a in Supporting Information S1) with a greater percentage of precipitation as rain during the cold season (Figure S6b in Supporting Information S1), leading to rainfall-driven peak flows earlier in the water year (Figure 2a). Warmer models are less likely to experience large shifts in future peak flow timing to earlier in the season. This is because earlier (rainfall-driven) peak flow events are already occurring in such models' historical simulations, due to a higher percentage of precipitation falling as rain. This intuitive relationship between historical peak flow timing, driven largely by historical temperature biases and percent of precipitation falling as rain, and projected shifts in peak flow timing, is the likely physical underpinning of the EC shown in Figure 2.

Comparison With Bias Correction Approaches
Here, we evaluate whether eliminating historical biases, such as the historical peak flow timing used in the EC method, can achieve reductions in uncertainty comparable to those attained by the EC approach itself. To test this, we compare the constrained regional EC estimates with those generated from apriori and post-downscaling bias-corrected data sets (Section 2.1, Text S2-S4 in Supporting Information S1). Relative to the ensemble used for the EC, UKESM1-0-LL was not available from the post BC data set. As such, this analysis uses 11 rather than 12 GCMs for the comparison. The Pearson-r correlation coefficient of the EC with this reduced set of GCMs generally remains strong (Figure 3a).
The apriori and post BC techniques result in substantial reductions of the bias and spread associated with historical peak flow timing across the GCM ensemble compared to when no bias-correction is used (Figure 3b). For projected shifts in peak flow timing, the EC method provides consistent reductions in uncertainty; however, the bias-corrected data sets demonstrate varying changes in uncertainty (Figure 3c). When compared to the originally downscaled data, the EC method leads to a reduction in uncertainty ranging from 24% to 40%, the apriori BC method ranges from 14% to 48%, and the post BC method ranges from 35% to 54%. The mean shift in peak flow timing generally aligns for the unique methods (EC method, apriori BC, and post BC); however, there is limited agreement in the mean shift in the Southern Sierra Nevada. Nonetheless, it is worth noting that the BC techniques can provide more consistent reductions in uncertainty when large outlier GCMs are introduced (Figure S7 in Supporting Information S1).

Discussion and Conclusions
For the first time, we illustrate how the emergent constraint (EC) technique can be used to reduce uncertainty in future hydroclimate projections derived from a dynamically downscaled ensemble. This could open the door to a variety of follow-on studies, seeking to produce more robust projections of hydroclimate change across the western US. We demonstrate the ability to reduce inter-model spread in projected end-of-century changes in annual maximum 1-day (peak flow or Qx1d) timing across snowmelt-influenced subregions of the western US. As a constraint, we use biases in historical Qx1d timing. Such information is relevant for flood and water supply management adaptation measures that may be required under a warming climate. We found our constraint to be physically meaningful, owing to its relationship to cold season temperature and the percent of precipitation falling as rain during the cold season. This drives peak flow timing due to the impact such conditions have on snowmelt-driven versus rainfall-driven flows.
With an appropriate dynamical downscaling set-up, the application of the EC method with regional climate data can be expanded to reduce uncertainty in additional climate change metrics and applied to other regions where stakeholder-relevant climate information is needed. The approach takes data from uncertain projections of downscaled GCMs, reduces that uncertainty, and provides information that typically cannot be obtained from GCMs. Like this study, other studies can use the EC approach to evaluate regional variables that cannot be extracted from GCMs at a higher resolution such as the basin-scale. Here, we focus on peak flow timing, a metric that cannot be derived from native GCMs, which highlights the benefit of this first application of the EC approach with dynamically downscaled data.
Limitations in our analysis were primarily driven by computational constraints. The robustness of regional EC analysis can additionally be assessed across multiple phases of CMIP as the storage and computational costs of dynamical downscaling become less of a limitation. Future studies may also consider evaluating consistency in the EC approach between unique downscaling and land surface or hydrologic models. Finally, prior to the application of the EC method to constrain shifts in peak flow timing for snowmelt-influenced basins in other regions, we recommend first testing whether historical peak flow timing remains the strongest constraint. Despite these limitations, we found substantial reductions (39% on average) in the uncertainty of projected shifts in peak flow timing, at the subregion scale we focused on, by using the EC approach with a dynamically downscaled ensemble. This inaugural study, employing a dynamically downscaled ensemble for EC analysis, opens avenues for future research to refine regional climate projections, reduce uncertainties and aid stakeholder-relevant decision-making.
Another aspect of our study, distinct from typical applications of the EC method, involved the comparison of the EC findings to bias-corrected results. Such a comparison becomes relevant at the regional scale, where BC is more commonly applied, as compared to applications of the EC method with native GCM data. We hypothesized that the removal of historical peak flow timing biases could translate to a reduction in the uncertainty associated with projections. While BC techniques reduce GCM biases in historical peak flow timing, and the mean change in peak flow timing was generally consistent between the unique methods, some subregions showed a greater spread in changes in peak flow timing with the BC approaches. This comparison is critical to discuss and explore further between the atmospheric and hydroclimate communities. Since the dynamical downscaling approach accurately represents the historical peak flow timing, we suggest that the EC method is appropriate and physically intuitive given the relationship we explored between GCM biases and projected shifts in peak flow timing. The EC method avoids well-known issues associated with BC techniques, including stationarity assumptions and a reduction in the physical consistency between the original GCM synoptic conditions and local scale conditions (Maraun et al., 2017). Because of this, the EC method may provide the most plausible change and spread in peak flow timing. While there is agreement in the projected mean change in most subregions, the uncertainty introduced between the unique approaches calls for a continued effort to disentangle how the hydroclimate modeling community can reduce uncertainty in projections. To support such efforts, we suggest that follow-on studies perform further comparisons of the EC method against the BC techniques for additional variables and varying spatial and temporal resolutions to evaluate where the unique approaches agree and differ. Also, while we explore two BC approaches that represent the bookends of BC techniques, a future study may consider evaluating a larger variety of BC procedures and their impact (e.g., a simpler apriori BC and a multivariate post BC). This could be useful to evaluate the physical credibility of different bias-correction techniques when compared against the EC method. For synoptically-driven variables like wet precipitation extremes, such analysis could also assess the physical consistency of bias-corrected results to ensure extreme precipitation aligns with expected synoptic scale conditions from the original GCM.

Figure 1. (a) Historical peak runoff timing based on downscaled ERA5, (b) projected change in peak runoff timing (days) based on 12 global climate model (GCM) mean (hatching where >66% GCMs agree in direction) for end of century (2066-2099) versus historical (1981-2014) conditions, (c) two standard deviations from the inter-model change in peak runoff timing (days), masked for regions with historical peak snow water equivalent > 50 mm, (d) correlation between peak runoff timing change (days) and Qx1d intensity change (red indicates earlier peak runoff timing is correlated to larger peak runoff events), and (e) basins with sufficient observational natural flow data evaluated in each subregion of the Sierra-Cascade mountain range.

Figure 2. (a) Emergent constraint between historical Qx1d timing and change in Qx1d day. The correlation between the x- and y-axes is shown at the top right of each region, each global climate model is represented by a different colored marker, and dashed vertical lines illustrate observational estimates. (b) Change in model spread between the unconstrained (left boxplot for each subregion) and constrained (right boxplot for each subregion) estimates. Boxplot includes the mean (dashed white line) and the 66% (thicker portion of boxplot) and 95% (thinner portion of boxplot) confidence intervals. The mean and 95% CI values are included as text at the bottom of each boxplot, and the percent reduction in spread is indicated at the top right of each region.

Figure 3. (a) Historical and projected change in Qx1d day across the four subregions for 11 downscaled global climate models (GCMs) (missing UKESM from prior ensemble) with no bias correction (BC) (black), with apriori BC (green) and with post BC (red). The Pearson-r correlation in the top right for each region is based on results without BC (since this is the data set the emergent constraint (EC) method is applied to). (b) Historical peak flow timing spread without and with BC compared against the historically observed peak flow timing for each region (horizontal dashed line). (c) Spread of downscaled GCMs, spread with EC, and spread with BC. In panels b and c, the boxplots and text represent the same statistics as described for Figure 2.