Parabolic northern-hemisphere river flow teleconnections to El Niño-Southern Oscillation and the Arctic Oscillation

It is almost universally assumed in statistical hydroclimatology that relationships between large-scale climate indices and local-scale hydrometeorological responses, though possibly nonlinear, are monotonic. However, recent work suggests that northern-hemisphere atmospheric teleconnections to El Niño-Southern Oscillation (ENSO) and the Arctic Oscillation can be parabolic. The effect has recently been explicitly confirmed in hydrologic responses, though associations are complicated by land surface characteristics and processes, and investigation of water resource implications has been limited to date. Here, we apply an Akaike Information Criterion-based polynomial selection approach to investigate annual flow volume teleconnections for 42 of the northern hemisphere’s largest ocean-reaching rivers. Though we find a rich diversity of responses, parabolic relationships are formally consistent with the data for almost half the rivers, and the optimal model for eight. These highly nonlinear water supply teleconnections could radically alter the standard conceptual model of how water resources respond to climatic variability. For example, the Sacramento river in drought-ridden California exhibits no significant monotonic ENSO teleconnection but a 0.92 probability of a quadratic relationship, reducing mean predictive error by up to 65% and suggesting greater opportunity for climate index-based water supply forecasts than previously appreciated.


Introduction
River flows are critically important to drinking, agricultural, and industrial water supply availability, hydroelectric power generation, ecosystems and fisheries, flooding, erosion and sedimentation, coastal ocean circulation, and river and sea ice processes. Year-to-year variations in river flow volume, however, are complex functions of climate and other factors. Indices of coherent patterns of large-scale ocean-atmosphere variability, such as El Niño-Southern Oscillation (ENSO) and the Arctic Oscillation (AO), provide a convenient and relatively effective framework for understanding and predicting such interannual variability. Many studies have examined the effects of ENSO or various other climate Oscillations on river flows worldwide, and these indices are routinely used in operational water supply forecast systems. As an example of the attendant benefits, Hamlet et al (2002) demonstrated how using such climate information in water supply forecasting could yield efficiencies which increase annual Columbia River hydropower revenue by about US$153 million.
It is often recognized, particularly for ENSO, that atmospheric and hydrologic teleconnections can be nonlinear (e.g., Hoerling et al 1997, Khan et al 2006, Fleming et al 2007 insofar as they are often asymmetric, that is, warm-and cool-phase responses are typically not simple mirror images of one another, and better predictive skill may Environmental Research Letters Environ. Res. Lett. 9 (2014) 104007 (9pp) doi:10.1088/1748-9326/9/10/104007 3 Also at University of British Columbia and Oregon State University.
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. be attained using nonlinear models. However, in the hundreds if not thousands of teleconnection studies in various communities of practice (climatology, hydrology, forestry, fisheries, ecology, economics, and so forth) to date, it is normally assumed-either explicitly or, more often, implicitly-that such relationships between large-scale climate indices and local-scale hydrometeorological responses remain monotonic. The majority of teleconnection analyses have been performed using either strictly linear methods like Pearson correlation or regression, or more robust nonparametric statistical techniques like Spearman rank correlation, and none of these directly accommodate parabolic functional forms in their standard applications. Even composite analysis, another common teleconnection detection technique that is relatively robust to data complexities such as nonlinearity, is cumbersome for evaluation of non-monotonic teleconnections (further discussion of these issues may be found in Wu et al 2005 and Khan et al 2006, for example).
In contrast, artificial neural network analyses, applied to gridded continental-to hemispheric-scale meteorological data, have revealed that wintertime temperature and precipitation responses to ENSO and the AO can exhibit significant and widespread non-monotonicity across much of the northern hemisphere (Wu et al 2005, Hsieh et al 2006. In particular, these responses often appear to be parabolic instead, with extreme positive-and negative-state climate events eliciting similar temperature or precipitation responses. A small handful of local-to regional-scale climate studies have independently hinted at such effects (Pozo-Vázquez et al 2005) or since confirmed (B Taylor, Environment Canada, pers. com. 2012, Bai et al 2012, Fleming and Dahlke 2014) the wider findings of Wu et al (2005) and Hsieh et al (2006). Additionally, Hsieh et al (2006) suggested these parabolic teleconnections propagate climatic perturbations farther than linear or monotonically nonlinear processesindeed, into regions where classical linear teleconnections are insignificant. While the specific physical mechanism for nonmonotonic teleconnections does not appear to have been fully resolved, their particular form may be related to quadratic terms in the governing equations for atmospheric circulation (Hsieh et al 2006).
It cannot simply be assumed, however, that atmospheric teleconnections associated with a particular climate mode, such as the strongly nonlinear associations identified by Hsieh et al (2006) and Wu et al (2005), produce straightforwardly parallel effects in hydrology. Streamflow represents a spatiotemporal integration of all upstream hydrometeorological dynamics, including the modulating impacts of terrestrial hydrologic characteristics. As one example, the presence or absence of glacial ice in a river's headwaters can fundamentally control the direction and even existence of downstream water resource responses to atmospheric teleconnections (e.g., Neal et al 2002, Fleming et al 2006, Dahlke et al 2012. Thus, understanding the water supply effects of parabolic temperature and precipitation responses to climate modes requires directly evaluating hydrologic data on a river-by-river basis. Accordingly, Fleming and Dahlke (2014) performed a detailed study of a small number of alpine headwater catchments in Canada and Norway, evaluating their responses to ENSO and the AO, respectively. They directly confirmed the existence of parabolic hydrologic impacts under the strongly nonlinear atmospheric teleconnections identified by Wu et al (2005) and Hsieh et al (2006) but also found that such hydroclimatic dynamics can be complex.
Overall, such strongly nonlinear hydroclimatic relationships remain only preliminarily explored. If widely present, parabolic streamflow teleconnections might radically alter the standard conceptual model of (and the standard suite of methodological tools used to study) how water resources respond to climatic variability. Here, we address the fundamental environmental science question of identifying highly nonlinear climate signals in the hydrology of the northern hemisphere. Specifically, we ask: do total annual water supplies for the hemisphere's largest ocean-reaching rivers exhibit parabolic teleconnections to ENSO or the AO, which form dominant modes of interannual climatic fluctuation (Wallace and Thompson 2002)? The question is addressed by applying Akaike Information Criterion (AIC)-based polynomial selection procedures advanced by Fleming and Dahlke (2014) to annual flow volume data obtained from the global hydrology dataset of Dai and Trenberth (2002) and Dai et al (2009). Some potential implications for a route toward improving water supply forecasting are also briefly explored.

Data
We employed the observational (streamgauge-based) global hydrology dataset compiled by Dai and Trenberth (2002) and Dai et al (2009). This dataset is relatively comprehensive, upto-date, quality-controlled, and freely available. The database focuses on flows measured near the mouths of the world's largest ocean-reaching rivers, and characterizes each river by which ocean basin it flows into: Atlantic, Pacific, Arctic, Indian, or Mediterranean. We selected the ten largest (by approximate average annual flow) northern-hemisphere rivers for each of these ocean basins, to ensure that effects for each are explored. Attention was restricted to the northern hemisphere to capitalize upon prior work on highly nonlinear temperature and precipitation teleconnections to ENSO and the AO by Wu et al (2005) and Hsieh et al (2006).
Monthly mean data were processed to annual volumes on a hydrologic-year basis (for example, the 1984 water year spans 1 October 1983 through 30 September 1984). Several elements of the hydrograph can be of potential interest to various different applications. Annual total flow volume is our metric of choice here for three reasons: (i) it is the most fundamental measure of a river's size and, for example, its ability to deliver water supplies to societies and freshwater fluxes to oceans; (ii) it is statistically well-behaved, insofar as annual flow volume is in general normally distributed, reflecting the fact that it is an additive measure such that the central limit theorem applies; and (iii) it is comparatively robust to upstream flow modifications like reservoir operation (e.g., Dai et al 2009) relative to, for example, seasonal or extreme flow measures. Note that in the context of relatively high-frequency climatic phenomena like ENSO and the AO (as opposed to analyses of long-term trends, for example), the general impact of noisy or non-natural hydrologic time series should be to obscure hydroclimatic relationships, rather than to induce artificial associations. An annual value was only calculated for a given hydrologic year if no monthly values were missing. Pre-1951 data were removed from consideration, due in part to limited available AO index information prior to that date, but this generally had little effect as most of the rivers considered had little or no pre-war flow data. Rivers having less than 20 years of data following these procedures were then screened out from the teleconnection analysis. Locations for the final set of hydrometric stations, 42 in total, are illustrated in figure 1.
We concentrated on ENSO and AO effects because these are commonly viewed as dominant modes of northernhemisphere climate variability (Wallace and Thompson 2002) and because these climate modes were considered by Wu et al (2005) and Hsieh et al (2006) in their atmospheric teleconnection work, freeing us up to focus on water resource impacts instead. ENSO and AO indices were obtained from the US National Oceanographic and Atmospheric Administration (NOAA) Climate Prediction Center. For ENSO, we chose the Oceanic Niño Index (ONI), which is used by NOAA for operational classification of ENSO state (e.g., Kousky and Higgins 2007). Mean wintertime (December--January-February) values were used, as is common practice, and were assigned to the corresponding hydrologic (January-February) year.

Method
Standard teleconnection detection methods, like correlation and composite analysis, can be cumbersome for examining the parabolic relationships of interest to us here (Wu et al 2005). A few, more suitable alternatives are available. Here, climate relationships were assessed using an AIC-based polynomial selection technique, as adapted to nonlinear hydroclimatic trend and teleconnection analysis by Fleming and Dahlke (2014). Reviews of the theory and application of AIC-based modeling in environmental science are given by Anderson et al (2000), Burnham and Anderson (2002), and Hobbs and Hilborn (2006). A brief summary of only the most salient aspects is provided here. The Kullback-Leibler (K-L) information quantifies how much information about a system is lost when approximating truth using a model: where f is full reality or truth, g is the model with parameters θ, and I is the information lost when g is used to approximate f. Combining these information theoretic considerations with maximum likelihood concepts, Akaike (1973Akaike ( , 1974 developed a model selection criterion, which for the small-sample case is: where Y is sample size and σ 2 is RSS/Y. RSS is the residual sum of squares: y Y y y 1 2 where x y and x y are the observed and model-simulated values for year, y. λ is the number of parameters explicitly appearing in the model plus one, as in this context the fit metric (RSS) is considered an additional estimated parameter. The AIC is a model selection tool. From M models, g m = 1,M , each having different parameter values or model structures, the one having the lowest AIC is considered best. By 'best' we mean that model m gives the optimal combination of performance and parsimony, as AIC increases with both model error (RSS) and model complexity (λ). That is, AIC-based model selection amounts to a rigorous formalization of Occam's razor, and by construction avoids model over-fitting. In particular, amongst the various a priori candidate models identified as potential explanations for a dataset on the basis of physical reasoning, past experience, or other considerations, model ranking is accomplished by rescaling the AIC values: The best-performing model has Δ = 0, and all other candidates have Δ > 0. The standard rule of thumb is that if some model yields Δ > 2, it can be ruled out as a viable model given the available data. On the other hand, if multiple models show 0 ⩽ Δ < 2, then all such models remain acceptable candidates on the basis of the available data. The Akaike weight is the approximate probability that model m is the K-L best model in the set of models considered: If this set of M models adequately spans the range of physically plausible models for the situation at hand, then we can in turn treat w m AIC as an approximate estimate of the probability that model m is true, given the data.
Here, we seek to determine whether there is some relatively simple relationship between annual flow volume and a climate index (AOI or ONI). To this end, we follow the procedure of Fleming and Dahlke (2014) by using polynomial fits of varying degrees. Specifically, M = 3 models are physically plausible and consistent with prior work on teleconnections between ENSO or AO and northern hemisphere temperature and precipitation:  model is a polynomial of degree 0 and simply amounts to the mean of the hydroclimatic variable (i.e., no-effect model).
Model m = 2, a polynomial of degree 1, is equivalent to linear regression (i.e., linear-effect model). The third model, a second-degree polynomial, is a quadratic function which can take on various graphical forms over a given v(t) range, including the parabolic shape described by Wu et al (2005) and Hsieh et al (2006) (i.e., nonlinear-effect model). Wu et al (2005) and Hsieh et al (2006) found no statistical evidence or physical basis (see also our introduction) for relationships corresponding to polynomials of degree higher than 2. As such, we may assume that the three models above fully span the potential range of plausible univariate models relating water supply to a circulation index. The corresponding Akaike weights can, therefore, be interpreted as approximate estimates of the probability that each model is true, given the data. For example, the probability that a teleconnection exists can be estimated as p(°=1 or 2) = p(°=1) + p(°=2), where°is polynomial degree.
In standard teleconnection studies (e.g., correlation analysis), the normal procedure is to assess for teleconnection presence using a two-sided significance test (e.g., on the correlation coefficient), and if a relationship is found, the characteristics of that relationship-whether it is a positive or negative association-are subsequently determined. We followed an analogous path here. Further information about the character of the AIC-acceptable models was extracted using the derivatives of the best-fit polynomials. For the case that Δ < 2 for m = 2, we determine the nature of this significant linear association much as we would in the case of a standard correlation analysis, i.e., from dQ/dv = d(β 0 + β 1 v)/dv = β 1 : If Δ < 2 for m = 3, then whether this nonlinear relationship is monotonic (e.g., a nonlinear growth function) vs. nonmonotonic (e.g., a parabola) over the hydroclimatically relevant part of the solution space is determined by comparing the signs of the first derivative of the quadratic model, dQ/dv = d (β 0 + β 1 v + β 2 v 2 )/dv = β 1 + 2β 2 v, evaluated at the lowest v value over the period of record for a given river, v min , and at the highest value, v max : If Δ < 2 for m = 3 and the relationship is additionally determined from (8) to be monotonic, we can judge whether it is a positive or negative association from the sign of the first derivative of the quadratic model, dQ/dv = d(β 0 + β 1 v + β 2 v 2 )/ dv = β 1 + 2β 2 v, evaluated for convenience at v = 0. This gives a decision protocol identical to that for the m = 2 model (equation (7)), except of course the β 1 value used in the evaluation is in this case taken from the best-fit quadratic, not linear, model. Conversely, if Δ < 2 for m = 3 and the relationship is determined from (8) to be non-monotonic instead, we can deduce its geometric orientation on the basis of the second derivative of the quadratic model, d 2 Q/ dv 2 = d 2 (β 0 + β 1 v + β 2 v 2 )/dv 2 = 2β 2 : β β > < 2 0 : concave up 2 0 : concave down.
2 2 4. Results Figure 1 illustrates the firstand second-best teleconnection models, on the basis of the AIC results, for each basin and climate mode considered. The full set of polynomial selection results for all 42 rivers is summarized in table 1.
While the outcomes offer a number of interesting features, of particular note here is that parabolic teleconnections are clearly, though not universally, apparent. Of the 38 basins for which any linear or nonlinear teleconnection to either ENSO or the AO was identified as one of the AIC-acceptable models, 19 of these rivers (50%) included a parabolic relationship among the corresponding set of Δ < 2 models. Similarly, of the 24 rivers for which any type of teleconnection to ENSO or the AO was identified as the single best model amongst the candidates, in eight cases (33%) this AIC-optimal model was a parabolic teleconnection. Unsurprisingly, the geographic patterns for ENSO and AO teleconnections are different, but the overall statistics appear reasonably similar. For instance, no fewer than 11 (ENSO) or 10 (AO) basins include strong nonlinearity among the AICcredible models, and of these, a parabolic water supply teleconnection is the preferred or sole model for 6 (ENSO) or 2 (AO).
Though 20 years of data are generally viewed as sufficient for identifying the linear impacts of relatively highfrequency climatic Oscillations like ENSO and the AO, in the quadratic case longer records should in principle better-sample the comparatively rare climatic events that define the potentially highly nonlinear tails of the associations. Table 1 tells an ambiguous story around possible relationships between record length and the appearance of strongly nonlinear teleconnections: basins for which parabolic ENSO or AO teleconnections were identified tend to have longer records, but several rivers which have longer records do not exhibit parabolic teleconnections. In any event, these recordlength considerations suggest that the number of major oceanreaching northern hemisphere rivers found here to experience parabolic ENSO or AO teleconnections might be a minimum estimate.

Implications: a water management case study of the Sacramento Basin
Obtaining a clear and rigorous view of precisely how and why each of these 42 rivers exhibits its specific teleconnection behaviors, including detailed comparisons to prior studies in each basin, lies far beyond the scope of this short note. Nevertheless, we can take a brief look at how well some of the outcomes match with other lines of investigation. We focus on the Sacramento river in drought-ridden California, the results for which are particularly intriguing and relevant. A strongly nonlinear, concave-up relationship was the only AIC-acceptable model in this case, corresponding to p (ENSO) = 0.92. That result appears fully consistent with Wu et al (2005), who found that the peak level of nonlinearity in North American precipitation responses to ENSO is located in California. This example also appears consistent with the suggestion of Hsieh et al (2006) that highly nonlinear Table 1. Summary of results. 'Basin' gives the ocean basin into which each river empties, and A is the corresponding drainage area, as listed in the database of Dai and Trenberth (2002) and Dai et al (2009). N is the number of yearly data used in this analysis (see text). P(ENSO) and P(AO) give estimated probabilities that any teleconnection (linear or nonlinear) exists with ENSO or the AO, respectively. 'Decision' gives final outcomes based on inspection of the AIC-identified (i.e., Δ < 2) polynomial model(s) for each river: L+ indicates a linear positive association with the ONI or AOI, as appropriate; L-indicates a linear negative association; NL+ indicates a nonlinear but monotonic positive association; NL-indicates a nonlinear but monotonic negative association; NL↑ indicates a nonlinear and non-monotonic, concave-up relationship; NL↓ indicates a nonlinear and non-monotonic, concave-down relationship;-indicates no association. For a given river and circulation pattern, the AIC-acceptable models are listed from left to right in increasing order of Δ values, i.e., in decreasing order of preference.

River
Basin A (km 2 ) N P(ENSO) ENSO decision P(AO) AO decision We further tested the Sacramento result using alternative data types and analytical techniques, which also serves as a spot-check on the sensitivity of our overall results to such methodological choices. First, total annual precipitation (TAP) and mean annual temperature (MAT) were examined for three climate stations along a north-south transect through the Sacramento valley (Mount Shasta (NCDC COOP id 045983), Orland (NCDC COOP id 046506), and Colfax (NCDC COOP id 041912)), over the same period of record as the Sacramento flow data used here. Little evidence was found for a MAT teleconnection, but a quadratic (concaveup) relationship was an AIC-acceptable model for TAP teleconnections to ENSO at all three stations, and the only acceptable model for the more northerly stations (Mount Shasta and Orland), which exhibited p(ENSO) ∼1. This AICbased analysis of station precipitation data is fully consistent with results from Wu et al (2005) and Hsieh et al (2006) obtained using neural network-based nonlinear projections for a gridded 0.5°× 0.5°meteorological data product from the University of East Anglia's Climate Research Unit. Thus, analyses of meteorological data within the Sacramento Basin are fully physically consistent with outcomes for Sacramento flow volumes: the parabolic flow teleconnection in this case reflects an unambiguous parabolic precipitation teleconnection.
Second, as a methodologically independent check on the flow analyses, we abandoned the AIC-based approach in favor of conventional linear regressions upon ONI and ONI 2 . In both cases, regression residuals were Gaussian (both Jarque-Bera and one-sample Kolmogorov-Smirnov tests) and free of significant serial correlation (both Durbin-Watson and runs tests). There was no statistically significant relationship between Q and ONI (conventional p-value of 0.45), whereas there was a highly significant relationship between Q and ONI 2 (p-value of 0.003). This replication of our primary result by suitably applied conventional regression techniques provides further evidence that the strongly nonlinear teleconnection between Sacramento annual flow volume and ENSO is not a methodological artefact. While the corresponding prediction quality remains mediocre (correlation skill of 0.40), this is to be expected as ENSO is obviously not the only control on northern California water supply, and it nevertheless represents a notable improvement over the linear model (correlation skill of 0.10), reducing root mean square error by 65%. Clearly, exploring the use of a quadratic climate predictor greatly improves the odds of uncovering underlying teleconnections, which are of potential predictive use along with other quantities such as antecedent soil moisture. In this particular case, using ONI 2 proved particularly useful for diagnosing unusually high-flow years (see figure 2).
This improved prediction skill is important. ENSObased, long-range forecasting skills of seasonal and intraseasonal precipitation and streamflow are considered poor in northern California (Gershunov et al 2000). Despite longterm investments in the monitoring and forecasting of climate and hydrologic states (soil moisture and snowpack) here, most operational water resources predictions at seasonal lead times are still based on methods and data sources that have been in place since the 1950s. Specifically, the prevalent method used for forecasting seasonal streamflow in the western US remains a regression model of spring and summer streamflow on the previous winter's precipitation and accumulated snowpack (Wood and Lettenmaier 2006). California experiences considerable interannual variability in water supplies, largely due to variable winter precipitation patterns reflecting the behavior of northeast Pacific winter storm tracks. In southern California, winter precipitation is influenced by ENSO, where El Niño and La Niña conditions typically correspond to positive precipitation anomalies and meteorological droughts, respectively (e.g., Johnstone 2011). However, ENSO influence on winter precipitation differs between the coastal region and the Sierra Nevada (Andrews et al 2004, Cayan et al 1999. As streamflow integrates climatic forcing spatially, and variability in rainfall can be enhanced in runoff (e.g., Chiew et al 1998, Cayan et al 1999, associations between atmospheric circulation patterns and California rivers which originate from both the Coast Range and the Sierra Nevada (as the Sacramento river does) may be more complex. Further, ENSO teleconnections to precipitation and/or streamflow, and the attendant seasonal forecasting skill, vary markedly between southern, central, and northern California (e.g., Schonher and Nicholson 1989, Andrews et al 2004, Cayan et al 1999, Gershunov et al 2000. However, figures 8(c) and (d) of Cayan et al (1999) hinted that the absence of simple (linear or monotonically nonlinear) ENSO teleconnections in northern California may be misleading, and that warm-and cool-phase responses may in fact both yield higher-than-normal frequencies of high-flow days. Our analysis provides deeper confirmation of these strongly nonlinear effects. Although an operational water supply forecast model would obviously include additional predictors, such as antecedent streamflow (an indicator of initial conditions and catchment storage), it seems reasonable to suggest from figure 2 that incorporating parabolic teleconnections into hydrologic forecasting models could open new avenues for water resources management in California.

Conclusions
The possibility of strongly nonlinear teleconnections in annual water supply was investigated using AIC-based polynomial selection for several dozen of the northern hemisphere's largest ocean-emptying rivers. A variety of responses were found, underlining the inherent complexity of atmospheric teleconnections in general and of fluvial responses in particular. A key point, however, is that parabolic teleconnections to ENSO or the AO were indeed identified in the yearly flow volumes of several of the world's largest rivers. These results have significant methodological implications going forward. For example, the presence of parabolic climate associations in certain areas where linear water supply teleconnections are not clearly present (e.g., the Sacramento River) may open new possibilities in those regions for understanding and forecasting interannual water supply variations. Highly nonlinear teleconnections to other circulation patterns additionally await exploration, as do a clearer understanding of how parabolic teleconnections affect the details of the annual hydrograph and whether these relatively high-frequency hydroclimatic patterns may shift under possible long-term climatic changes, along with assessment of implications to the many downstream uses and impacts of water, ranging from ecological instream flow requirements to coastal ocean circulation.