Interannual hydroclimatic variability and its influence on winter nutrients variability over the southeast United States

Introduction

to improve national water quality conditions resulted in the enactment of the 1972 Clean Water Act, with Section 303(d) requiring the states and territories to list impaired water bodies and develop Total Maximum Daily Loads (TMDLs) for these waters. Despite these efforts and frequent updates to TMDLs, the US Environmental Protection Agency's recent update reveals that nutrients affect 20 % of impaired and 12 % of the assessed river miles (EPA, 2006). The increase in aquatic nutrients might result from population growth as well as from increased fertilizer application (Meybeck, 1982; Vitousek et al., 1997). However, natural variability associated with weather (e.g. hurricanes) and climatic events (e.g. El Niño) could also induce significant increases in nutrient concentrations beyond critical levels (Chen et al., 2007) even if the basin is not experiencing any pressure from urban development or changes in agricultural practice. Thus, it is critical to estimate the seasonal nutrient loadings conditioned on the expected runoff from nonpoint sources. The National Research Council (NRC, 2001, 2002) has emphasized that a detailed understanding of various sources of uncertainties, including the role of climate change and climate variability, is required for improving water quality prediction in natural systems.
One of the dominant and well-understood modes of global climatic variability is ENSO, which has a periodicity of 3-7 yr and exhibits anomalous warm/cold SST conditions in the equatorial Pacific, thereby modulating the climate particularly in the tropics and sub-tropics (Ropelewski and Halpert, 1987). Considerable research now exists on the recurrence and regime structure of ENSO, its teleconnections to rainfall/streamflow, and the resulting potential predictability of interannual hydroclimatic variability over the United States (Ropelewski and Halpert, 1987; Dettinger and Diaz, 2000; Devineni and Sankarasubramanian, 2010). It is also well known that instream nutrient concentration and loadings primarily depend on streamflow variability (Borsuk et al., 2004; Paerl et al., 2006; Lin et al., 2007) and antecedent flow conditions (Vecchia, 2003; Alexander and Smith, 2006). Recent studies on the relationship between coastal water quality conditions and SST conditions also show that there is a strong association between climatic modes and concentrations of phosphorus (Childers et al., 2006), aquatic vegetation (Cho and Poirrier, 2005), and chlorophyll and phytoplankton levels (Arhonditsis et al., 2004). However, systematic research in associating climatic variability to instream nutrient variability and utilizing that linkage to estimate season-ahead nutrient loadings is very limited. Most of the studies on estimating instream nutrient concentrations have focused primarily on predicting the average annual concentrations using runoff and various basin attributes (Smith et al., 1998, 2003; Mueller and Spahr, 2006; Mueller et al., 1997).
Studies have also recommended approaches to predict daily and seasonal loadings and concentrations of nutrients using streamflow and their time of observation (Cohn et al., 1992; Runkel et al., 2004). However, these nutrient models rely on the observed information (e.g. streamflow) during that season, which has limited utility in developing season-ahead estimates of nutrients. Findings from the hydroclimatic literature clearly show that interannual variability in streamflow can be predicted by developing low-dimensional models contingent on SST conditions (Devineni et al., 2008) as well as by using precipitation forecasts from General Circulation Models (GCMs) (Sankarasubramanian et al., 2008). Similarly, the water quality literature emphasizes that streamflow is the most important descriptor in explaining nutrient variability (Cohn et al., 1992; Runkel et al., 2004; Cohn, 2005). To our knowledge, this is the first effort that associates the interannual variability in all three of the above variables (climate, streamflow and total nitrogen, TN) to develop TN forecasts over a region. The purpose is to understand the "controls" that are required for developing skillful seasonal nutrient forecasts and also to assess how the skill in hydroclimatic predictions translates to skill in nutrient forecasts at the regional scale. For this purpose, we consider two low-dimensional models that use season-ahead climate forecasts and land surface conditions as predictors for developing season-ahead estimates of winter nitrogen loadings over the SEUS.
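The manuscript is organized as follows: a brief description of climate forecasts, streamflow and water quality databases employed in the study is first provided in Sect. 2. Following that, Sect. 3 provides the details of the low-dimensional statistical models and skill measures utilized in developing and evaluating the season-ahead nitrogen loadings forecasts. In Sect. 4, we present results from the winter nutrient forecasts developed using the low-dimensional models. Next, we discuss the potential implications of the findings in the context of developing adaptive water quality management plans. Finally, in Sect. 6, we summarize the findings and conclusions from the study.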

Data sources
In this section, we discuss various hydroclimatic and water quality databases employed for associating climate forecasts with the nutrient loadings over 18 watersheds from the Southeast US.

HCDN streamflow database
Given that the intent of the study is to associate interannual variability in winter nutrient loadings with climatic variability, we focus our analysis on 18 undeveloped basins over the Southeast United States (SEUS) from the Hydro-Climatic Data Network (HCDN) database (Slack et al., 1993). Daily streamflow records in the HCDN basins are purported to be relatively free of anthropogenic influences such as upstream storage and groundwater pumping, and the accuracy ratings of these records are at least "good" according to United States Geological Survey (USGS) standards. The HCDN database contains the mean daily discharge for about 1600 sites across the continental United States with an average length of 48 yr. Figure 1 shows the location of the 18 HCDN stations and Table 1 provides the list of the 18 stations considered in this study along with their drainage areas. Since the streamflow data (Q) in the HCDN database is available only up to 1988, we have extended it up to 2009 based on the USGS historical daily streamflow database.

WQN Water Quality Network database
USGS provides national and regional descriptions of stream water quality conditions in the Water Quality monitoring Network (WQN) across the nation (Alexander et al., 1998). The WQN database comprises water quality data from USGS monitoring networks from both large watersheds (National Stream Quality Accounting Network, NASQAN) and minimally developed watersheds (Hydrologic Benchmark Network, HBN). We employ the observed daily concentrations of Total Nitrogen (TN) available for these 18 stations from the NASQAN. Observed streamflow during the time of sampling is also available as part of the WQN database. The length of the available water quality record varies from 10-30 yr depending on the measured water quality variable and station. By selecting watersheds from the HCDN basins, we ensure that both the streamflow and water quality data are minimally affected by anthropogenic influences.
For additional details about WQN, see Alexander et al. (1998). The selected 18 HCDN stations have observed TN concentrations for 12-22 yr (Table 1). However, the number of samples for each station ranges from 54-152 daily observations, with an average of five to seven observations per year.

Simulated nutrients database
Though nutrient data in the WQN database is available for 12-23 yr over 18 watersheds (see Table 2), the samplings are intermittent. Using the daily observations over this period, we first obtain continuous daily nutrients for the observed period using the LOADing ESTimation (LOADEST) program developed by USGS (Runkel et al., 2004). The LOADEST model allows the user to select the best-fitting regression model from eleven predefined regression models using the Akaike Information Criterion (AIC) (Akaike, 1974). Five regression models that include the "dtime" term are not appropriate for extrapolation since those models incorporate a linear time trend. Therefore, the simulated nutrient loadings based on the remaining regression models (i.e. model forms: 1, 2, 4 and 6) in the LOADEST program do not have any time trend. For details on model forms, see Runkel et al. (2004). Table 2 shows the "goodness of fit" statistics (coefficient of determination ($R^2$) and AIC) in predicting the observed daily loadings in the WQN database (Table 1) and the coefficients of the best-fitting regression model for TN for the selected 18 stations. From Table 2, we infer that $R^2$ ranges from 0.83-0.97, indicating a good fit in predicting the observed daily loadings over the 18 stations.
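The AIC-based choice among candidate rating-curve regressions described above can be illustrated with a minimal sketch. It is not the LOADEST program: it uses ordinary least squares rather than AMLE, the four candidate forms (`linear`, `quadratic`, `seasonal`, `quadratic_seasonal`) are hypothetical stand-ins for regressions without a linear time trend, and the data are synthetic. For the actual model forms, see Runkel et al. (2004).

```python
import numpy as np

def design_matrix(ln_q, dec_time, form):
    """Regressors for a hypothetical candidate form (none has a linear time trend)."""
    cols = [np.ones_like(ln_q), ln_q]
    if form in ("quadratic", "quadratic_seasonal"):
        cols.append(ln_q ** 2)
    if form in ("seasonal", "quadratic_seasonal"):
        cols.append(np.sin(2 * np.pi * dec_time))
        cols.append(np.cos(2 * np.pi * dec_time))
    return np.column_stack(cols)

def fit_aic(ln_load, X):
    """OLS fit; returns coefficients and the Gaussian AIC n*ln(RSS/n) + 2*(k+1)."""
    beta, *_ = np.linalg.lstsq(X, ln_load, rcond=None)
    rss = float(np.sum((ln_load - X @ beta) ** 2))
    n, k = X.shape
    return beta, n * np.log(rss / n) + 2 * (k + 1)

def select_model(ln_load, ln_q, dec_time):
    """Pick the candidate form with the smallest AIC."""
    forms = ["linear", "quadratic", "seasonal", "quadratic_seasonal"]
    fits = {f: fit_aic(ln_load, design_matrix(ln_q, dec_time, f)) for f in forms}
    best = min(fits, key=lambda f: fits[f][1])
    return best, fits[best][0]

# Synthetic example: log-loadings with a flow dependence and a seasonal cycle.
rng = np.random.default_rng(0)
n = 120
dec_time = rng.uniform(0, 15, n)      # decimal years since start of record
ln_q = rng.normal(3.0, 0.8, n)        # log of streamflow
ln_load = (1.0 + 1.2 * ln_q + 0.5 * np.sin(2 * np.pi * dec_time)
           + rng.normal(0, 0.1, n))
best_form, coeffs = select_model(ln_load, ln_q, dec_time)
```

Because the synthetic loadings contain a seasonal cycle, the selected form should include the sine and cosine terms; with real data the choice depends on the station.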
To relate the loadings to retrospective climate forecasts (discussed in the next section), we estimated the daily TN loadings from 1957-2009 using the observed streamflow data available from the extended HCDN database and the best-fitting regression model (in Table 2). The simulated daily loadings obtained from the LOADEST model over the period 1957-2009 are aggregated during JFM to develop winter loadings ($L_t$) of TN. Given that the estimated loadings are based on the Adjusted Maximum Likelihood Estimation (AMLE) procedure in the LOADEST model, the simulated daily and the aggregated winter loadings are statistically unbiased (Cohn, 2005). We also computed $R^2_{(LOADEST)}$ and the root-mean-square error (RMSE$_{(LOADEST)}$) for the simulated winter TN loadings obtained from the LOADEST model. For additional details on computing errors in seasonal predictions, see Cohn (2005). Further, to ensure that there is no trend in the winter loadings, we performed the Mann-Kendall test. At the 1 % significance level, the null hypothesis of no trend (Kendall's tau equal to zero) could not be rejected at any of the sites for TN. We also performed a regional Mann-Kendall test to account for spatial correlation among the 18 stations (Douglas et al., 2000). The p-value for TN is 4 %, indicating no trend at the regional level. Our study will consider the simulated winter TN loadings ($L_t$) available during 1957-2009 for relating the interannual hydroclimatic variability to nutrient variability over 18 stations in the SEUS.
For developing season-ahead TN forecasts, we utilize the retrospective winter precipitation forecasts from the ECHAM4.5 general circulation model forced with constructed analogue SSTs (http://iridl.ldeo.columbia.edu/SOURCES/.IRI/.FD/ .ECHAM4p5/.Forecast/ca sst/.ensemble24/.MONTHLY/.prec/, International Research Institute of Climate and Society (IRI) data library) (Li and Goddard, 2005). Retrospective precipitation forecasts from ECHAM4.5 are available for 5 months in advance for every month beginning January 1957. To force ECHAM4.5 with SST forecasts, retrospective monthly SST forecasts were developed from the observed SST conditions in that month using the constructed analogue approach. For additional details on forcing ECHAM4.5 using constructed analogue SST forecasts, see Li and Goddard (2005). Figure 1 also shows the locations of 56 grid points of precipitation forecasts from ECHAM4.5 along with their latitude and longitude over the SEUS. For this study, we utilize only the forecasted mean (obtained by averaging the 24 ensemble members) of winter retrospective precipitation forecasts issued at the beginning of January for developing 3-month-ahead retrospective nutrient forecasts over the period 1957-2007.
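The single-site trend check on the winter loadings mentioned in this section can be sketched as follows. This is a minimal version of the Mann-Kendall test with the normal approximation; it assumes no tied values and does not implement the regional test of Douglas et al. (2000). The function name `mann_kendall` and the input series are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Assumes no ties, so Var(S) = n(n-1)(2n+5)/18.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)   # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / sqrt(var_s)          # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return s, z, p_value

# A strictly increasing series gives a very small p-value (trend detected).
s_up, z_up, p_up = mann_kendall(list(range(20)))
```

A p-value above the chosen significance level (1 % in the text) means the hypothesis of no trend is not rejected.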

Low-dimensional models development and performance validation metrics
Given that winter streamflow over the SEUS is predominantly rainfall driven with limited snow accumulation, we hypothesize that precipitation is the primary driver in controlling the JFM loadings. To verify this, we correlate simulated JFM loadings with both observed precipitation (Fig. 2) and principal components of the forecasted precipitation from ECHAM4.5 (Table 3). In this study, we employ only the Spearman rank correlation for all correlation analyses. The computed rank correlation was checked for statistical significance against the threshold $1.96/\sqrt{n-3}$ at the 95 % confidence level, where "n" denotes the number of data points used in calculating the correlation. Thus, the computed correlation in Fig. 2 needs to be greater than 0.29 (n = 50) to indicate a statistically significant relationship between the observed precipitation and simulated loadings.
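The significance threshold quoted above can be checked with a short calculation. The sketch below also includes a plain Spearman rank correlation so that a computed value can be compared with the threshold; the helper names (`critical_correlation`, `ranks`, `spearman`) are illustrative and not taken from the study.

```python
from math import sqrt

def critical_correlation(n, z=1.96):
    """Threshold z / sqrt(n - 3) used in the text at the 95 % level."""
    return z / sqrt(n - 3)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

threshold = critical_correlation(50)   # about 0.286, i.e. 0.29 after rounding
```

Squaring the threshold gives roughly 0.08, which is the corresponding minimum $R^2$ for a record of about 50 yr.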
From Fig. 2, we infer that the correlation between observed precipitation and simulated loadings is statistically significant and greater than 0.55 for all the basins. Given this dependency, we first identify relevant grid points of the precipitation forecasts that have statistically significant correlation with JFM observed precipitation for each watershed. The nearest grid points that are significantly correlated to each watershed (Fig. 1) are selected. The variance explained by the first principal component (PC1) of the precipitation forecasts from these grid points is around 74-95 %, indicating strong spatial correlation among the gridded forecasts. Further, from Table 3, we also infer that rank correlations of PC1 of the precipitation forecasts with streamflow and with seasonal loadings of TN are statistically significant for all stations (>0.29), the only exception being station 18. The primary reason for the low correlation at station 18 is the poor coefficient of determination from the LOADEST model (Table 2) in predicting observed WQN data. To summarize, Fig. 2 and Table 3 support using the low-dimensional components of precipitation forecasts for developing season-ahead forecasts of TN loadings over the 18 selected stations.

Low-dimensional models
Given that our interest is primarily in understanding how large-scale hydroclimatic information could be utilized for seasonal nutrient predictions over the SEUS, we consider two low-dimensional models: principal components regression (PCR) and canonical correlation analysis (CCA). Low-dimensional models reduce the correlated predictors and predictands so that a subspace of uncorrelated predictors and predictands can be used for regression model development (Tippet et al., 2003; Sankarasubramanian et al., 2008). Further, these low-dimensional models also recalibrate the GCM forecasts so that any marginal bias in predicting the observed precipitation can be adjusted based on the regression model (Landman and Goddard, 2003). A brief description of the low-dimensional models is provided next.

Principal Components Regression (PCR)
PCR, which is otherwise known as Model Output Statistics (MOS) (Wilks, 1995), eliminates systematic errors and biases in GCM fields and also recalibrates the principal components (PCs) of GCM fields to predict the hydroclimatic variable of interest using regression analyses. The predictand could be streamflow ($Q_t$) or loadings ($L_t$) over a watershed. Since the gridded precipitation forecasts over a given region are spatially correlated, employing precipitation forecasts from multiple grid points as predictors would raise multicollinearity issues in developing the regression. To avoid this, we employ PCR based on Eq. (1):

$$\hat{L}_t = \hat{\beta}_0 + \sum_{k=1}^{K} \hat{\beta}_k \, PC_t^k \qquad (1)$$

where $\hat{L}_t$ denotes the estimate of daily average TN loadings during the JFM season in year "t", $PC_t^k$ denotes the "k"th PC from the retained "K" PCs of precipitation forecasts and the $\beta$s denote the regression coefficients, whose estimates are obtained by minimizing the sum of squared errors. We employ step-wise regression to select "K" PCs out of the rotated grid points of precipitation (given in Table 3) for developing the PCR model. From Table 5, we infer that most of the stations (except stations 8, 10 and 11) require only up to the first four principal components for developing the PCR model.
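A principal components regression of this kind can be sketched in a few lines. The example below is a simplified illustration rather than the procedure used in the study: the number of retained PCs (`n_pcs`) is fixed by hand instead of being chosen by step-wise regression, an intercept is assumed, and the precipitation forecasts and loadings are synthetic. The names `fit_pcr` and `predict_pcr` are hypothetical.

```python
import numpy as np

def fit_pcr(precip, loads, n_pcs):
    """Fit a PCR model.

    precip: (n_years, n_grid) gridded precipitation forecasts
    loads:  (n_years,) winter loadings
    """
    mean = precip.mean(axis=0)
    anomalies = precip - mean
    # Rows of vt are the spatial patterns (eigenvectors of the covariance).
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_pcs].T                       # (n_grid, n_pcs)
    pcs = anomalies @ eofs                    # uncorrelated predictors
    X = np.column_stack([np.ones(len(loads)), pcs])
    beta, *_ = np.linalg.lstsq(X, loads, rcond=None)   # least squares
    return {"mean": mean, "eofs": eofs, "beta": beta}

def predict_pcr(model, precip_new):
    """Project new forecasts onto the retained patterns and apply the regression."""
    pcs = (np.atleast_2d(precip_new) - model["mean"]) @ model["eofs"]
    return model["beta"][0] + pcs @ model["beta"][1:]

# Synthetic example: 10 grid points sharing one common signal.
rng = np.random.default_rng(1)
signal = rng.normal(size=50)
precip = np.outer(signal, rng.uniform(0.5, 1.5, 10)) + rng.normal(0, 0.3, (50, 10))
loads = 2.0 + 3.0 * signal + rng.normal(0, 0.5, 50)
model = fit_pcr(precip, loads, n_pcs=1)
pred = predict_pcr(model, precip)
```

Because the PCs are mutually uncorrelated, the regression avoids the multicollinearity that would arise from using the raw grid-point forecasts directly.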

Canonical Correlation Analysis (CCA)
In PCR, we develop separate regression models for each site. Given that the predictands, the winter loadings, across the basins are also spatially correlated, one could utilize that information to develop a reduced set of regression models. This could help in utilizing the inter-site correlations to develop many-to-many (multiple predictands to multiple predictors) regression relationships. Consider winter loadings available from "m" sites represented by $L^T = (L_1, L_2, \ldots, L_m)$ (dimension: $n \times m$) whose corresponding "p" grid points of precipitation forecasts ($p > m$) are represented as $X^T = (X_1, X_2, \ldots, X_p)$ (dimension: $n \times p$). Canonical correlation analysis then finds a linear combination of the "p" predictors, $X^* = b^T X$, that maximally correlates with a linear combination of the "m" predictands, $L^* = a^T L$. Mathematically, the canonical correlation is obtained by choosing the vectors a and b that maximize the relationship

$$\rho = \frac{a^T \Sigma_{LX}\, b}{\sqrt{a^T \Sigma_{LL}\, a}\;\sqrt{b^T \Sigma_{XX}\, b}}$$

where $\Sigma$ denotes the variance-covariance matrix between the two variables in the subscript. For a detailed mathematical treatment of CCA, see Wilks (1995). The number of components from the "m" predictands and "p" predictors to be retained for the regression is decided based on step-wise regression. Squared values of the canonical correlation represent the percentage of variance explained in each predictand by the predictors under that dimension. Thus, the skill in predicting the loadings for each site could be obtained based on the precipitation forecasts by developing a reduced set of models.
Before performing CCA, we first group the basins based on k-means clustering (Hartigan and Wong, 1979) so that CCA can be performed on each cluster. Based on clustering, four groups were identified (Table 4), with the sites having the highest loadings placed under group 1 and those with the lowest average loadings placed under group 4. A separate CCA was performed for each group. For instance, CCA on group 1 is performed on loadings from two sites (m = 2), and the corresponding grid points of precipitation forecasts for the two sites (#13 and #18) from Table 3 are combined (p = 20) as predictors. The skill in predicting the winter loadings for each station is evaluated based on two different skill scores, which are discussed next.
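The canonical correlations for one group of sites can be computed with a short linear-algebra sketch. The version below is a textbook construction (whitening each block, then a singular value decomposition), not the code used in the study; it assumes the sample covariance matrices are invertible, which may require prefiltering with PCs when grid points are strongly correlated. The symbols follow the text loosely: `X` holds the predictors and `L` the predictands, and the data are synthetic.

```python
import numpy as np

def inv_sqrt(sym):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(sym)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def canonical_correlations(X, L):
    """X: (n, p) predictors; L: (n, m) predictands.

    Returns the canonical correlations and the weight matrices a (for L)
    and b (for X); column k of each gives the k-th canonical pair.
    """
    Xc, Lc = X - X.mean(axis=0), L - L.mean(axis=0)
    n = X.shape[0]
    s_xx = Xc.T @ Xc / (n - 1)
    s_ll = Lc.T @ Lc / (n - 1)
    s_lx = Lc.T @ Xc / (n - 1)
    w_l, w_x = inv_sqrt(s_ll), inv_sqrt(s_xx)
    u, rho, vt = np.linalg.svd(w_l @ s_lx @ w_x, full_matrices=False)
    a = w_l @ u          # predictand weights
    b = w_x @ vt.T       # predictor weights
    return rho, a, b

# Synthetic example: two predictands driven by the first two of four predictors.
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 4))
L = X[:, :2] @ np.array([[1.0, 0.5], [0.2, 1.0]]) + rng.normal(0, 0.1, (60, 2))
rho, a, b = canonical_correlations(X, L)
```

The singular values are the canonical correlations, and the squared leading value indicates how much variance the first pair of canonical variates shares.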

Skill scores for nutrients forecasts
To evaluate the skill in predicting the interannual variability in winter TN loadings using climate forecasts, we consider two error metrics: the coefficient of determination ($R^2$) (Eq. 2) and the root mean square error (RMSE, defined in Eq. 3) per unit area of the watershed (A). These metrics need to account for both sources of error: the error in predicting the observed JFM nutrient loadings from the WQN database using the LOADEST program ($R^2_{(LOADEST)}$, RMSE$_{(LOADEST)}$) and the error of the low-dimensional models in predicting the simulated loadings ($R^2_{(PCR/CCA)}$, RMSE$_{(PCR/CCA)}$). Since these two models are developed independently, $R^2$ and RMSE in predicting winter nutrient loadings using climate information could be expressed as given in Eqs. (2) and (3). For each station, $R^2_{(PCR/CCA)}$ and RMSE$_{(PCR/CCA)}$ were computed based on the estimated TN loadings from the low-dimensional models and the simulated winter TN loadings for the period 1957-2006. Thus, we compute skill measures using Eqs. (2) and (3) to quantify our ability to explain the interannual variability in winter TN loadings using precipitation forecasts from ECHAM4.5.

Results and analyses
To ensure that the skill in forecasting winter nutrients is reliable, we evaluate the low-dimensional models based on two different types of validation, namely leave-X-out cross-validation (LCV) and split-sample validation (SSV). Both methods are commonly adopted in the forecasting literature for validating models (Wilks, 1995).

TN loadings forecasts based on PCR models
For validating the PCR models under LCV, the methodology suggested by Towler et al. (2009) is modified to evaluate the skill of the model over 51 yr. The LCV steps for PCR models are as follows: (i) 10 % of the data (5 yr) are randomly removed along with the year for which the prediction is desired, (ii) a PCR model is developed using the remaining 45 yr of loadings ($L_t$) and retained PCs, (iii) the developed model is then used to predict the left-out year, and (iv) steps (i) to (iii) are repeated to develop a prediction for each year, and skill measures ($R^2$ and R-RMSE) were computed based on the 51 yr of predicted data. This entire procedure (i)-(iv) is repeated 100 times, and a box-plot of $R^2$ (Fig. 3) and the median of RMSE (in Table 5) are presented.
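Steps (i)-(iv) can be sketched as a short resampling loop. This is a simplified illustration under stated assumptions: the PCs are taken as given (a stricter version would recompute them inside each fold), $R^2$ is taken as the squared correlation between predicted and target loadings, and the data are synthetic. The function name `lcv_r2` and its arguments are hypothetical.

```python
import numpy as np

def lcv_r2(pcs, loads, n_drop=5, n_trials=100, seed=0):
    """Leave-X-out cross-validation; returns one R^2 value per trial."""
    rng = np.random.default_rng(seed)
    n = len(loads)
    X = np.column_stack([np.ones(n), pcs])
    r2 = np.empty(n_trials)
    for trial in range(n_trials):
        pred = np.empty(n)
        for t in range(n):
            others = np.delete(np.arange(n), t)
            # (i) drop n_drop random years in addition to the target year
            dropped = rng.choice(others, size=n_drop, replace=False)
            keep = np.setdiff1d(others, dropped)
            # (ii) fit the regression on the remaining years
            beta, *_ = np.linalg.lstsq(X[keep], loads[keep], rcond=None)
            # (iii) predict the left-out target year
            pred[t] = X[t] @ beta
        # (iv) skill over all predicted years
        r2[trial] = np.corrcoef(pred, loads)[0, 1] ** 2
    return r2

# Synthetic example with 51 years and two retained PCs.
rng = np.random.default_rng(3)
pcs = rng.normal(size=(51, 2))
loads = 1.0 + 2.0 * pcs[:, 0] + rng.normal(0, 1.0, 51)
r2_trials = lcv_r2(pcs, loads, n_trials=10)
```

With 51 yr of data, each fit uses 45 yr, matching the counts in the text; the spread of `r2_trials` is what a box-plot such as Fig. 3 would summarize.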
Figure 3 shows the box-plot of $R^2$ under LCV for 18 stations. Under LCV, we compute $R^2$ based on the predicted loadings for 51 yr. Hence, $R^2$ needs to be higher than 0.08 (correlation > 0.29) to demonstrate statistically significant skill in predicting season-ahead nutrient loadings. In fact, more than 12 stations exhibit $R^2$ greater than 0.16 over 100 trials of LCV. From Fig. 3, sixteen stations show statistically significant skill for TN. The developed PCR model under LCV explains more than 10 % of the interannual variability in TN loadings in all the 100 different fittings (Fig. 3) except at stations 6 and 18. For the remaining 16 sites, the correlation between the predicted nutrient loadings obtained using climate forecasts and the loadings simulated from LOADEST using the WQN database is greater than 0.29, which is statistically significant for the 51 yr of data considered. The forecasted TN loadings show a statistically insignificant relationship with observed TN loadings at stations 6 and 18, with correlation coefficients of 0.28 and 0.16, respectively. Poor goodness of fit (see Table 2, $R^2_{(LOADEST)}$) from the LOADEST model is the primary reason behind the poor performance of these two stations during the winter season. Further, station 18 shows poor correlation between the principal components of precipitation forecasts and JFM loadings (Table 3). Another possible reason for the poor prediction by the LOADEST model at those two stations is the limited number of years of data availability (see Table 1), with station 6 (18) WQN observations spanning 14 (12) years and having a total of 56 (57) daily samples. The median RMSE (Table 5) computed under LCV also shows that the error in predicting the observed WQN loadings during the winter season is less than 1 kg day$^{-1}$ km$^{-2}$ for most of the stations.
Under split-sample validation (SSV), PCR models are developed using $L_t$ and PCs available over the calibration period (1957-1986), and skill measures are computed in predicting $L_t$ during the validation period. Stations 1 and 12 perform well under LCV, but their skill is statistically insignificant under SSV even though the simulated TN loadings (Table 2) exhibit significant correlation with both observed precipitation (Fig. 2) and forecasted precipitation (Table 3). Thus, based on two different validation methods, we understand that eleven stations (2-4, 7-11 and 13-15) exhibit statistically significant skill in predicting the observed WQN loadings using the PCR model developed separately for each site. Next, we evaluate the ability to predict the loadings at these stations under a different low-dimensional model, Canonical Correlation Analysis, which utilizes the spatial correlation in the TN loadings to develop a predictive model.

TN loadings forecasts based on Canonical Correlation Analyses
A separate CCA was performed for each of the four groups (listed in Table 4), and the developed models were evaluated under LCV and SSV. For LCV, we simply perform leave-5-out cross-validation instead of repeated fitting of the model (as described in Sect. 4.1 for the PCR model). Under leave-5-out cross-validation, we randomly leave out five predictands and predictors along with the year for which the prediction is desired for each station under a given group (in Table 4), and the CCA model is developed using the remaining 46 years of data. The developed CCA model was employed to predict the year for which the prediction is desired. This procedure was repeated for all the years of observation under a given group to develop the CCA model estimated loadings. The $R^2_{(CCA)}$ and RMSE$_{(CCA)}$ were further adjusted according to Eqs. (2) (Fig. 5) and (3) (Table 5) to account for the errors in the LOADEST model in predicting the WQN database.
Under SSV (Fig. 6 and Table 5), we compute $R^2$ based on the predicted loadings during 1987-2007 using the CCA model developed over the period 1957-1986. From Fig. 6, the CCA model did not exhibit any skill in predicting the winter loadings at stations 5, 6, 11-13 and 16-18. Compared with the PCR model, the CCA model performs similarly, the exception being a very low $R^2$ at station 11. For station 11, PCR ($R^2$ = 0.47) performs significantly better than the CCA model ($R^2$ = 0.14).
One possible reason for the poor performance of the CCA model is that station 11 has low correlation with the rest of the sites under group #4, which has 8 stations. Under SSV, the $R^2$ of the CCA model for the rest of the stations is similar to that of the PCR model. However, RMSE$_{CCA}$ is consistently higher than RMSE$_{PCR}$ (Table 5). This implies that the conditional bias (over-prediction and under-prediction) of the CCA model is much higher. One possible reason for the increased conditional bias under CCA is increased heteroscedasticity in the observed loadings under a given group. However, the ability of the CCA model to explain the observed variance in loadings is comparable to that of the PCR model, indicating that the source of interannual variability in winter nutrients is the same across the region (discussed further in Sect. 4.4). To summarize, using ECHAM4.5 precipitation forecasts alone, we infer that both low-dimensional models demonstrate significant ability in predicting the winter TN loadings.

Role of antecedent flow conditions in improving season-ahead TN forecasts
Though instream loadings primarily depend on streamflow and precipitation variability during the season, antecedent moisture/flow conditions also play a critical role in influencing the nutrient loadings from the watershed (Vecchia, 2003; Alexander and Smith, 2006). At seasonal time scales, antecedent flow conditions could be considered as the surrogate for basin storage or initial conditions in influencing the streamflow variability.
To understand the role of antecedent storage conditions, we consider the observed December streamflow at each station as an additional predictor along with the gridded precipitation forecasts (Table 3) to develop nutrient forecasts for each station. Forecasts of TN loadings were developed using both the PCR model (Fig. 7a) and the CCA model (Fig. 7b), and the modified $R^2$ and RMSE (Table 6) are computed (Eqs. 2 and 3).

Source of climatic information influencing the winter TN variability
To understand the source of climate information that modulates the TN variability over the SEUS, we performed principal component analysis on the simulated loadings ($L_t$) of TN over 18 stations. The first component explains approximately 59 % of the total variability in TN loadings over 18 stations. It is well known in the hydroclimatic literature that ENSO is one of the important climatic conditions that influence the winter precipitation, temperature and streamflow over the SEUS (Ropelewski and Halpert, 1987).
Figure 8 shows the correlation between the first component of JFM TN loadings over 18 stations and JFM Nino3.4, an index used to denote ENSO conditions by averaging the SSTs (Kaplan et al., 1998) over the tropical Pacific (170° W-120° W and 5° S-5° N). From Fig. 8, we infer that roughly 36 % of the variability in the first principal component of nutrient loadings over the SEUS could be explained purely based on ENSO conditions. ENSO plays an important role in the winter climate of the US since its peak activity typically occurs during December-February. In fact, the precipitation forecasts from ECHAM4.5 incorporate the forecasts of tropical SST conditions (i.e. Nino3.4 region), which are obtained from constructed analogue SST forecasts for forcing ECHAM4.5. Thus, ENSO is one of the sources of climatic variability that primarily influence both JFM hydroclimatic and nutrient variability over the SEUS.

Discussion
Analyses presented in Figs. 2-8 show that interannual variability in nutrient loadings could be predicted well before the beginning of the season contingent on the climate forecasts. By selecting grid points of precipitation forecasts that are significantly correlated with the observed precipitation in the basin, we ensure that the skill in predicting nutrient loadings is related to the basin process as well. Since obtaining long continuous records of daily observations of nutrients is difficult, particularly over a large region, the analysis relies on loadings simulated by the LOADEST model to understand the role of climate variability in modulating the interannual variability in nutrients over the SEUS. However, to account for the errors in the LOADEST model in predicting the observed WQN database, the reported skill measures (Eqs. 2 and 3), $R^2$ and RMSE, are adjusted for both the LOADEST model error and the error of the low-dimensional models.
Thus, the intent of this study is to understand how well climate and basin storage conditions control the seasonal TN loadings rather than to develop skillful nutrient forecasts using low-dimensional models. In principle, the analyses provided here could also be extended with other sophisticated statistical models, including nonparametric and Bayesian hierarchical models, to estimate the entire conditional distribution of loadings. Similarly, one could also develop nutrient forecasts by forcing a mechanistic water quality model with forecasted streamflow and water temperature, which in turn could be obtained by dynamical downscaling (Leung et al., 1999) or statistical downscaling (Devineni et al., 2008) of the climate forecasts.
Perhaps the most important utility of the season-ahead forecasts of nutrient loadings is in promoting water quality trading. Some of the successful water quality trading programs in the country (e.g. Tar-Pamlico River basin and Neuse River basin in NC) typically allow trading of nutrient loadings across different point sources as well as with nonpoint sources (e.g. farmers participating in the voluntary nutrient reduction program) through the basin-level trading association so that the seasonal/annual load caps are always met for the basin. Research on climate forecasts and water allocation clearly shows that probabilistic streamflow forecasts could be effectively utilized to specify the failure probability (1-reliability) of reservoir releases as well as to ensure that the end-of-season target storage conditions are met with high probability (Sankarasubramanian et al., 2009). Similarly, in the context of seasonal water quality management, the developed forecasts of loadings could be used to estimate the probability of violation of target loadings for the upcoming season. One could also develop an optimal nutrient loading model such that the probability of violating the total loadings from multiple sources is within the acceptable level. Thus, utilizing season-ahead forecasts of nutrient loadings and updating them throughout the season provides an opportunity to develop adaptive nutrient control strategies that ensure target nutrient loadings and desired concentrations.
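The idea of estimating the probability of violating a target loading from a season-ahead forecast can be illustrated with a small calculation. The sketch below is only an assumption-laden example: it treats the forecast error as Gaussian with a standard deviation equal to the forecast RMSE, which is not a choice made in the study (loadings are often skewed, so a transformed or nonparametric distribution may be more appropriate), and all numbers are hypothetical.

```python
from statistics import NormalDist

def violation_probability(forecast_mean, forecast_rmse, target_load):
    """P(loading > target), assuming Gaussian errors with std = RMSE."""
    dist = NormalDist(mu=forecast_mean, sigma=forecast_rmse)
    return 1.0 - dist.cdf(target_load)

# Hypothetical values in kg/day/km^2: forecast 2.0, RMSE 0.8, target cap 3.0.
p_violate = violation_probability(2.0, 0.8, 3.0)
```

A manager could compare such a probability with an acceptable risk level and revisit it as the forecast is updated through the season.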

Summary and conclusions
The study primarily focused on understanding the process controls in estimating winter nutrient loadings by considering 18 HCDN watersheds over the SEUS. Given the discontinuous observed daily TN loadings, the study reconstructed simulated TN loadings using the LOADEST model for the winter season. The ability to predict these simulated loadings was evaluated with two low-dimensional models that utilize winter precipitation forecasts and pre-season flow conditions. However, the reported skill in predicting the TN loadings accounts for both the error from the LOADEST model and the error from the low-dimensional models.
Out of 18 stations, nine stations (#2-4, 7-10 and 14-15) exhibited statistically significant skill in predicting the observed winter nutrient loadings under both low-dimensional models based on two different validation methods. Given that these stations exhibit skill under two different validation methods (LCV and SSV), the reported skill is also significant over the entire validation period. Findings from the study could be summarized as the following "controls" that influence the skill in predicting seasonal TN loadings: stations that have very high $R^2_{(LOADEST)}$ (>0.8) in predicting the observed WQN loadings during the winter (Table 2) exhibit significant skill in loadings. Incorporating antecedent flow conditions (December flow) as an additional predictor did not increase the explained variance at these stations, but substantially reduced the RMSE in the predicted loadings. Understanding the source of climatic variability that controls the TN variability revealed that Nino3.4, an index denoting ENSO conditions over the tropical Pacific, accounted for 36 % of the observed spatial variability in the TN loadings over the SEUS. Given that using climate forecasts has been very beneficial in improving reservoir management over seasonal time scale

Conclusions References
Tables Figures

Back Close
Full  Full  Full  Full  Full to improve national water quality conditions resulted in the enactment of the 1972 Clean Water Act with section (303) d requiring the states and territories to Figures Back Close Full Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | streamflow and water quality databases employed in the study is first provided in Sect. 2. Following that, Sect. 3 provides the details of the low-dimensional statistical models and skill measures utilized in developing and evaluating the season-ahead nitrogen loadings forecasts.In Sect.4, we present results from the winter nutrient Figures Back Close Full Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | the list of the 18 stations considered in this study along with their drainage areas.Since the streamflow data (Q) in the HCDN database is available only up to 1988, we have extended it up to 2009 based on the USGS historical daily streamflow databaseDiscussion Paper | Discussion Paper | Discussion Paper | model forms: 1, 2, 4 and 6) in the LOADEST program do not have any time trend.For details Figures Back Close Full Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

For
developing season-ahead TN forecasts, we utilize the retrospective winter precipitation forecasts from ECHAM4.5 General circulation model forced with constructed analogue SSTs (http://iridl.ldeo.columbia.edu/SOURCES/.IRI/.FD/ .ECHAM4p5/.Forecast/ca sst/.ensemble24/.MONTHLY/.prec/,International Research 10941 Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | correlation is obtained by choosing the vectors a and b that maximizes the relationship a Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | (iii) are repeated to develop prediction for each year and skill measures (R 2 and R-RMSE) Discussion Paper | Discussion Paper | Discussion Paper | vation under a given group to develop the CCA model estimated loadings.The R CCA were further adjusted according to Eqs.
Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | 2 and 3) based on SSV by evaluating the model over the period 1987-2007.Comparing Fig. 7a and b with Figs. 4 (PCR model) and 6 (CCA model), we infer that only station #13 (under CCA model) has resulted in statistically significant skill by adding streamflow as an additional predictor.However, by using streamflow as an additional predictor, R 2 of the CCA model substantially improved over all stations which indicate the importance of incorporating local information in spatial-dimension reduction.Further, RMSE of both PCR and CCA models are substantially reduced by adding the observed December streamflow as an additional predictor.This implies that antecedent storage/flow conditions are very critical in reducing the conditional bias in developing season-ahead TN forecasts resulting in reduced over/under prediction compared to the models developed using the precipitation forecasts alone.Thus, from a process control perspective, given the good skill in the reconstructed seasonal nutrient loadings, the interannual variability in nutrient loadings could be partially explained based on climatic variability.But, to obtain improved prediction (i.e.RMSE), it is important to incorporate both climatic variability and antecedent storage conditions in developing season-ahead nutrient forecasts.Discussion Paper | Discussion Paper | Discussion Paper | we employed simulated nutrient loadings from the LOADEST model to understand the Figures Back Close Full Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | 
Discussion Paper | (Sankarasubramanian et al., 2009), we argue the need to develop nutrient loadings forecasts conditioned on climate forecasts.Our future work will utilize these seasonal nutrient forecasts in developing adaptive water management plans over the SEUS.Discussion Paper | Discussion Paper | Discussion Paper | maximum likelihood to type 1 censored data, Water Resour.Res., 41, W07003, doi:10.1029/2004WR003833,2005.Cohn, T. A., Caulder, D. L., Gilroy, E. J., Zynjuk, L. D., and Summers, R. M.: The Validity of a Simple Statistical-Model for Estimating Fluvial Constituent Loads -an Empirical-Study Involving Nutrient Loads Entering Chesapeake Bay, Water Resour.Res., 28, 2353-2363Discussion Paper | Discussion Paper | Discussion Paper | ods to Model Total Organic Carbon, Alkalinity, and pH after Conventional Surface Water Treatment, Environ.Eng.Sci., 26, 1299-1307, 2009.Vecchia, A. V.: Relation Between Climate Variability and Stream Water Quality in the Continental United States, Hydrological Science and Technology, 19, 77-98, 2003.Vitousek, P. M., Aber, J. D., Howarth, R. W., Likens, G. E., Matson, P. A., Schindler, D. W.Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

Fig. 1. Location of the 18 HCDN stations along with the considered grid points over the SEUS.

Fig. 2. Rank correlation between the simulated TN loadings from the LOADEST model and observed precipitation over the selected 18 stations.

Fig. 4. Modified R² (based on Eqs. 2 and 3) of PCR model predicted TN loadings obtained using PCs of forecasted precipitation under SSV.

Fig. 5. Modified R² (based on Eq. 2) of CCA model predicted TN loadings obtained using PCs of forecasted precipitation under LCV.

Fig. 6. Modified R² (based on Eq. 2) of CCA model predicted TN loadings obtained using PCs of forecasted precipitation under SSV.

Fig. 8. Relationship between the first principal component of TN loadings over the SEUS and ENSO conditions, as indicated by Nino3.4.
... Hence, R² needs to be higher than 0.21 (correlation > 0.46 for 21 yr of data) to demonstrate statistically significant skill in predicting season-ahead nutrient loadings. Based on this, Fig. 4 indicates that eleven stations (2-4, 7-11 and 13-15) show significant skill in predicting TN loadings. Stations 6 and 18 perform poorly because of the limited number of years of WQN data, which results in a very low R² of the LOADEST model.

... stations 5, 6 and 18 do not exhibit statistically significant correlation in predicting the loadings from the WQN loadings. As discussed under the PCR model, stations 5, 6 and 18 did not perform well because of the limited number of years ...
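The significance threshold quoted above (R² > 0.21, correlation > 0.46 for 21 yr of data) can be checked with a short calculation. The sketch below assumes the common large-sample approximation in which the critical correlation at the 5 % level is 1.96 / sqrt(n - 3); the text does not state which test was used, so this rule and the function name `critical_correlation` are assumptions that happen to reproduce the quoted values.

```python
import math


def critical_correlation(n_years, z=1.96):
    """Approximate |r| required for significance, using z / sqrt(n - 3).

    The choice of this approximation is an assumption, not taken from the text.
    """
    return z / math.sqrt(n_years - 3)


r_crit = critical_correlation(21)
r2_crit = r_crit ** 2
print(f"critical correlation: {r_crit:.2f}")
print(f"critical R^2: {r2_crit:.2f}")
```

An exact t-test on the correlation coefficient would give a slightly different critical value, so the numbers here should be read only as a consistency check on the quoted threshold.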

Table 1. Baseline information for the 18 selected stations, showing the number of years of observed daily TN records available in the WQN database. Values in parentheses in the number-of-years column give the total number of daily observations available for each station.

Table 3. Rank correlation of observed winter streamflow and TN loadings with the first principal component of the winter precipitation forecasts for the 18 selected stations. Locations of the grid points indicated in the table are shown in Fig. 1.

Table 4. Grouping of the 18 selected stations based on K-means clustering.

Table 5. Skill, expressed as RMSE (based on Eq. 3), in predicting winter TN loadings using climate forecasts. The table also gives the number of principal components considered and the percentage of variance they explain for the total grid points selected (given in Table 3) for each station.

Table 6. RMSE (Eq. 3) of forecasted TN loadings based on PCR and CCA models that consider ECHAM4.5 precipitation forecasts and December streamflow as predictors under SSV.