A watershed modeling approach to streamflow reconstruction from tree-ring records

Insight into long-term changes of streamflow is critical for addressing implications of global warming for sustainable water management. To date, dendrohydrologists have employed sophisticated regression techniques to extend runoff records, but this empirical approach cannot directly test the influence of watershed factors that alter streamflow independently of climate. We designed a mechanistic watershed model to calculate streamflows at annual timescales using as few inputs as possible. The model was calibrated for upper reaches of the Walker River, which straddles the boundary between the Sierra Nevada of California and the Great Basin of Nevada. Even though the model incorporated simplified relationships between precipitation and other components of the hydrologic cycle, it predicted water year streamflows with correlations of 0.87 when appropriate precipitation values were used.


Introduction
Climate change effects are predicted to impact water resources all over the world, and are likely to cause drier conditions in the western USA [1]. Climatic perturbations also play an important role in changing ecosystem structure and function [2], including disturbance regime, as recently shown by modifications of wildfire dynamics [3]. Studies on ecosystem response to relatively short perturbations have indicated that fish populations or species assemblages may recover rapidly from drought [4], but quantitative assessments of ecological impacts from extreme, decades-long wet or dry episodes have revealed more pervasive ecological impacts than previously thought [5,6]. A potential strategy available to policy makers for understanding and coping with the risk associated with these future climate impacts is to obtain a clear definition of past hydrological variability and extremes [7]. Instrumental records of precipitation, temperature, and surface-water flow in many areas of the world, including the mountain regions of the western USA, are often limited to the past century, but long-term estimates of streamflow variability are critical for mitigating impacts of floods and droughts [8]. While demands for water expand and change as population increases and becomes more urbanized, estimates of water runoff statistics (averages, extremes, etc) derived from long records can result in sustainable water planning and resource allocation, thereby reducing conflicts among stakeholders [9].
Streamflow records can be extended by stochastic approaches to generate synthetic data [10]. Statistics for the existing record are incorporated in a stochastic model that produces streamflow sequences that replicate these statistics for a longer period. Stochastic methods also generate long time series of precipitation that are transformed into streamflow using deterministic hydrologic models [11]. These approaches assume that existing instrumental data adequately represent the characteristics of streamflow or precipitation well beyond the actual period of observations.
Another method for extending instrumental records of streamflow is the use of tree-ring records from moisturesensitive species [12]. Since water availability is a growthlimiting factor, its influence on radial growth increments diminishes as soil moisture increases. Hence, tree-ring data are especially well suited for drought and low-flow analyses [13]. Reconstruction of streamflows has generally been based on multivariate regression and principal component analysis [14], although simple bivariate regression has also been used [15].
An attractive feature of the dendroclimatic approach for streamflow reconstruction is that tree-ring chronologies are much longer than instrumental records, thus it is easier to quantify dry and wet periods with duration, magnitude, and peak (as defined by [16]) outside the limits of the existing instrumental record.
However, tree-ring reconstructions cannot directly test the influence of watershed factors that alter streamflow independently of climate, because there is usually no direct physical connection between streamflow and wood growth. Any statistical relationship between treering chronologies and water runoff is based on the web of connections that both wood accumulation and streamflow have with precipitation, temperature, soil features, and other environmental variables. Because of this, dendrohydrological records of streamflow cannot easily incorporate the influence of factors that alter runoff even when climate remains the same. Such factors include watershed topography, vegetation dynamics, natural disturbance (such as wildfire), and human land use.
In this letter, we present a novel approach that combines tree-ring records and watershed modeling to produce estimates of long-term streamflow. We designed a watershed model that can be run using data at annual (and possibly seasonal) timescales using as few inputs as possible. The model was calibrated using instrumental data from the upper reaches of the West Walker River basin in California. As an example of model applicability to proxy reconstructions, a simplified tree-ring record of precipitation was also used, and streamflows produced by the model were compared to instrumental values.

Site description
The Walker River basin (total drainage area ∼10 485 km 2 ) is located at the boundary between California and Nevada (figure 1). The basin originates in the eastern Sierra Nevada and terminates in Walker Lake, a remnant of prehistoric Lake Lahontan that encompassed most of the Great Basin during the last ice age [17]. Basin elevations range from about 1200 m to more than 3700 m. The Walker River system consists of two forks, the East Walker River and the West Walker River, which originates near the divide between the Walker River basin and Yosemite National Park. A major tributary (the Little Walker River) joins the West Walker River near Sonora Junction. Average annual precipitation at the Bridgeport, CA, meteorological station (figure 1) is about 230 mm with approximately 1100 mm of snowfall each year. Average monthly temperatures range from −7 to 17 • C. The model was set up to simulate discharge at the United States Geological Survey's (USGS) gaging station on the West Walker River near Coleville (elevation 2009 m; figure 1), where average annual discharge over 68 water years (WY) is 53.8 cm. 5 A WY begins October 1st, ends September 30th, and is designated by the year in which it ends. Based on a GIS coverage provided by the State of California, human settlement within that portion of the watershed is minimal. Thus, the watershed is relatively undeveloped and has not changed much over the period of instrumental streamflow records.

Model development
Because our modeling approach was intended for use with treering proxy records of climate, it had to operate on a seasonal or annual timescale. In addition, tree-ring data can extend over hundreds (sometimes thousands) of years in the past, but most instrumental records only cover the last century at best. Therefore, the model had to perform reasonably well with just precipitation as input. Given these premises, a water balance model was deemed most appropriate. Watershed models at the monthly timescale have been developed since the 1960s [18], so that many seasonal and monthly water balance models now exist [19], and comparisons have been done between alternative models that consider different processes and data [20].
For our purposes, we used a simple water balance model that operates at the annual timescale and only requires precipitation as input [21,22]. This model assumes the watershed is homogeneous and composed of three storages (appendix; figure A.1): surface, subsurface (unsaturated zone), and groundwater (saturated zone). Each storage has input and output variables that are either known (e.g., measured or reconstructed precipitation) or calculated by parametric relationships, and the storages are linked to each other by inputs and outputs (e.g., infiltration, deep percolation, etc). The basic processes considered in the model are: surface runoff, infiltration, evapotranspiration, deep percolation, baseflow, and streamflow (appendix). Model parameters include a = fraction of precipitation that becomes surface runoff; b = fraction of infiltrated water that evaporates; c = fraction of groundwater storage that becomes baseflow; and d = fraction of groundwater storage that becomes groundwater flow. These parameters do not change with time. In addition, the model requires an initial boundary condition of starting groundwater 5 Value obtained by dividing average annual discharge (7.65 m 3 s −1 ) by the area of the watershed of interest (448.5 km 2 ), then multiplying by 31 536 000 s yr −1 . storage (GS). A copy of the model is available from the authors and is located on the dendrolab.org website.
Model inputs and setup. Precipitation data were obtained from two National Weather Service (NWS) meteorological stations, one located at Bridgeport, CA (NWS COOP station no. 041 072 at elevation 1972 m) and the other at Topaz Lake, NV (NWS COOP station nos 268 186 and 268 202 at elevations 1530 m and 1701 m, respectively). Topaz Lake is part of the West Walker River watershed, but the stations are at least 300 m lower than the region being modeled; the Bridgeport station is about 30 m lower than the USGS gaging station at Coleville, CA and is outside the watershed. Precipitation records starting in WY 1895 and continuous until the present were also obtained from the parameter-elevation regressions on independent slopes model (PRISM; [23]) dataset. PRISM data at 2.5 arc-minute resolution are freely available on the internet, and we selected grid cells that include the Coleville station and cover most of the basin area ( figure 1).
In addition to these annual precipitation inputs, initial boundary conditions were necessary. Soil survey data from the Natural Resources Conservation Service (http://websoilsurvey. nrcs.usda.gov/app/WebSoilSurvey.aspx; accessed 1/11/08) were available for watershed areas in the Humboldt-Toiyabe National Forest. The dominant soil type in the West Walker River watershed is Hydrologic Soil Group D, which tends to have a slow infiltration rate when wet [24], promoting more surface runoff. Because of this, the upper bound of the a parameter was set at 0.9 (i.e., 90% of precipitation could become surface runoff). Additional constraints were that the sum of parameters c and d was 1.0 (i.e., the amount of baseflow and groundwater flow could not exceed the available groundwater storage) and that all parameter values were positive (appendix).

Model simulations
Several experiments were set up to test model performance, and their outcomes were compared to historical discharge during WY 1939-2001.
One model employed PRISM precipitation, using the average of either 4 grid cells surrounding the Coleville gage (figure 1), or of 12 cells in the watershed (the original 4 plus 8 others upstream of the gage). In addition, two models used measured precipitation, one from Bridgeport, which had 35 years of data overlap with Coleville discharge, and the other from Topaz Lake, which had 25 years of overlapping data with Coleville discharge. The best performing model was then modified to use a simple tree-ring index of WY precipitation to evaluate if the model could reconstruct streamflow. The tree-ring index was developed using western juniper (Juniperus occidentalis) samples collected at two sites within the basin (figure 1). Details on the development of this tree-ring index are in [16]. Although the tree-ring record spans 2300 years, only the period of overlap with the instrumental record  was used in this analysis. Since the tree-ring record was in standard deviation units, we transformed it into WY precipitation using the mean and standard deviation of the PRISM data.
Because there were five parameters in each model, 500 runs were made for each model with randomly chosen starting values for each parameter. Starting values of a, b, c, and d were allowed to be 0 and 1, and starting values of GS were 0 and 254 cm. The 'best' fit between measured and simulated discharge was determined using the pseudooptimization function in the Excel Solver module to minimize the root mean squared error (RMSE) between predictions and observations. We also calculated the r 2 between measured and simulated discharge and the relative RMSE (RRMSE, which is RMSE divided by the standard deviation of the observations).

Results and discussion
Parameter values that resulted in the 'best model fit' and corresponding model evaluation statistics indicate that our model can reproduce annual streamflows remarkably well, with r 2 up to 0.87 (table 1; figure 2). PRISM precipitation produced a much better fit to observed streamflow than singlestation precipitation outside of the basin (Bridgeport: r 2 = 0.61; RRMSE = 0.72; n = 35; Topaz lake: r 2 = 0.67; RRMSE = 0.75; n = 25). This is most likely due to PRISM's ability to accommodate topographic variation and its influence on precipitation. In addition, PRISM data were specific to the watershed that drains to the Coleville gage. Both Bridgeport and Topaz Lake are located at elevations lower than the gage, have much shorter records than PRISM, and also had data gaps that resulted in more uncertain model estimates. Model fit statistics were essentially the same between models using 4 or 12 PRISM cells, so only results using 12 PRISM cells are shown in table 1 and figure 2. This indicates that while precipitation needs to be representative of the watershed being Table 1. Mean ± standard deviation (ranges in parentheses) for best fit parameters and metrics of West Walker River basin models after 500 runs with random starting values of a, b, c, d, and GS. a = fraction of precipitation that becomes surface runoff in each timestep; b = fraction of infiltrated water that evaporates in each timestep; c = fraction of groundwater storage that becomes baseflow in each timestep; d = fraction of groundwater storage that becomes groundwater flow in each timestep; GS = initial groundwater storage; RRMSE = relative root mean squared error; Avg Q pred = mean annual discharge predicted by model; Std Q pred = mean standard deviation of the annual discharge predicted by the model; Avg Q obs = average annual discharge observed for the years simulated; Std Q obs = standard deviation of the annual discharge observed for the years simulated. n = number of years simulated in each model. modeled, spatial coverage is not so critical. The model using proxy precipitation did not reproduce observed streamflow as well as the model using PRISM precipitation (r 2 = 0.22; RRMSE = 0.88), most likely because of the crude way in which it was generated: proxy precipitation had a relatively low correlation with PRISM precipitation (r 2 = 0.35; n = 63). These results are promising for simulating streamflows at annual timesteps over long periods with minimal inputs. Although our model incorporates simplified relationships between precipitation and other components of the hydrologic cycle, it predicted WY streamflows with correlations 0.9 if appropriate precipitation values were used. Streamflows reconstructed directly from tree-ring data for WY 1939-2001 (n = 63) had an r 2 of 0.22 and RRMSE of 1.18, which is slightly worse than model results for the same period in terms of RRMSE (0.88). Thus, while streamflow variability is reconstructed similarly by both approaches, model-derived streamflows are closer to observed values than statisticallyderived streamflows.
Future work will address reconstruction of streamflows over the entire period covered by the proxy data (up to 2300 years in this basin). Because our approach is aimed at quantifying the effect on past streamflows of vegetation dynamics, natural disturbances (such as wildfire or insect outbreaks), geomorphological changes (e.g., landslides or modifications of the stream channel), and land use, we plan to include evapotranspiration in the model. We will use the approach of [25], who derived monthly equations that do not require air temperature data to estimate potential evapotranspiration in Nevada as a function of elevation. Treering reconstructions of air temperature can also be used to estimate annually varying evapotranspiration. Preliminary modeling based on pan evaporation inputs indicated that model results are sensitive to evapotranspiration. This is promising because changes in species dynamics, natural disturbances and land use alter the distribution and type of vegetation, which would then affect evapotranspiration. Such factors are not considered in empirical streamflow reconstructions. Furthermore, seasonality could be explored by developing seasonal (i.e., wet and dry) tree-ring records (such as from earlywood or latewood; e.g., [26]). We have been using a model that can operate at the seasonal timescale, allowing for finer temporal resolution. The same model can also be used with sub-watersheds, allowing for finer spatial resolution of parameter values to reflect differences in evapotranspiration and infiltration due to elevation and soil characteristics.
To our knowledge, the approach of combining tree-ringderived precipitation with a deterministic watershed model has not been applied to generate long-term streamflows. Time series of climatic variables generated from general circulation models (GCMs), mesoscale meteorological models, and regional-scale climate models have been input into hydrologic models to produce long-term streamflow sequences. [27] used several atmospheric models coupled with the semidistributed SLURP model to predict hydrologic output including streamflows for the upper Columbia River basin and the MacKenzie basin. [28] used a GCM with a distributed hydrologic model to predict flows in the Uruguay River basin. In the approach we have described here, climatic input for hydrologic models is given by multi-century long, exactly dated, and annually resolved tree-ring records rather than through stochastic or atmospheric models. The ability to use proxy records of climate combined with mechanistic watershed modeling provides a potentially transformative option for longterm reconstructions of streamflow.

Acknowledgments
Research partially supported by NSF grant ATM-0503722. F Biondi was funded in part by NSF grant ATM-CAREER 0132631. L Saito was funded in part by the Nevada Agricultural Experiment Station (Publication No. 52087097). The comments of two anonymous referees and of the anonymous board member improved an earlier version of the manuscript. F Biondi thanks Paolo Burlando (ETH, Zurich) for useful discussions.

Appendix
Equations for the model are as follows [21,22]: where BF t = baseflow for year t; DP t = deep percolation for year t; E t = evaporation for year t; GF t = groundwater flow for year t; GS t−1 = groundwater storage for previous year (or initial boundary condition for year 1); GS t = groundwater storage for year t; I t = infiltration for year t; P t = precipitation for year t; Q t = streamflow for year t; SR t = surface runoff for year t. All model components have units of length.