Runoff prediction in a poorly gauged basin using isotope-calibrated models

Abstract Predictions in ungauged basins have been a major challenge in hydrologic sciences, and there is still much work needed to achieve robust and reliable predictions for such basins. Here, we propose and test a novel approach for predicting runoff from poorly gauged basins using a minimum complex model calibrated with isotope data alone (i.e., without observed discharge data). The model is composed of two water-stores (soil water and groundwater) and considers their connectivity to runoff in terms of both water and isotope budgets. In a meso-scale basin in which riverbed deformations frequently occur, making automatic observation of river discharge difficult, we measured hydrogen and oxygen isotope composition (δ 2 H and δ 18 O) of precipitation and river water twice-weekly for one year. Runoff predicted by the model agreed well with that observed monthly or bimonthly. Monte Carlo simulation revealed a strong coherence between model performance in isotope simulation and runoff prediction, demonstrating that the use of isotopes as dynamic proxies of calibration targets helps reliably constrain model parameters. Our results indicate that this approach can serve as a powerful tool for prediction of runoff hydrographs, particularly for basins in which the stage-discharge relationship is highly variable.


Introduction
River gauging (i.e., measuring water level and, by inference, flow) has been conducted since the foundation of human civilization, as indicated by the Nilometer in ancient Egypt (Dooge, 2004). However, many drainage basins worldwide are ungauged or poorly gauged, and in some cases existing gauging stations are even declining, so that prediction of runoff in such basins is considered to be a challenge (Sivapalan, 2003). The Prediction in Ungauged Basins (PUB) initiative  launched by the International Association of Hydrological Sciences has resulted in considerable advances in hydrology , although much of this success has been in gauged rather than in ungauged basins; thus, there is still a long way to go to achieve robust and reliable predictions without river gauging .
Prediction of runoff hydrographs has been made using a wide spectrum of rainfall-runoff models (Beven, 2012). However, these models all must be calibrated by comparing predicted runoff with observed one (Wagener et al., 2004). Accordingly, if sufficient high quality records of river discharge are not available for a basin, reliable runoff predictions cannot be made. Measured water levels can usually be converted to discharge using known stage-discharge relations (also known as H-Q curves, rating curves). However, in some basins in which riverbed morphology and stream geometry are frequently changed because of high sediment yield, the stage-discharge relation is variable and the discharge cannot be determined from water level measurements (Herschy, 2009). In this case, it is difficult to calibrate models and thus reliably predict runoff hydrographs.
There are three methods of predicting runoff hydrographs in ungauged basins (Parajka et al., 2013). The first is to estimate model parameters from basin characteristics a priori. Although some parameters of completely physics-based models can be well constrained, their performance is generally worse than that of calibrated models (Duan et al., 2006). The second method is to calibrate model parameters for well gauged basins and then apply them to an ungauged basin of interest by extrapolation of the model parameters along with basin characteristics while assuming hydrological similarity between basins. Extrapolations tend to be more reliable if process realisms are held ; however, it is not easy to confirm the realisms without observation. The third method is to calibrate models using observed proxy data other than (but intrinsically associated with) discharge. For instance, McGuire et al. (2007) calibrated a hillslope model with artificially applied tracer (Br -), then obtained reasonable runoff predictions. Tracer data can yield integral fingerprints of flow paths and storage (Leibundgut et al., 2009;McDonnell and Beven, 2014), and provide a powerful test of whether the model is right for the right reasons (Kirchner, 2006). Therefore, the third approach is expected to help assure process realisms in the models, although its feasibility has not been assessed in detail.
Here, we propose a novel approach for predicting river runoff from poorly gauged basins using a model calibrated with environmental stable isotopes of water alone. We selected environmental isotopes as a proxy of calibration targets because they are ideal tracers for water flow/storage and involve information integrated over the basin as well as discharge (Genereux and Hooper, 1998;McDonnell and Beven, 2014). Stable isotope tracers have been conventionally used for hydrograph separation (see Klaus and McDonnell (2013) for overview) and/or mean transit (or residence) time estimation (see McGuire and McDonnell (2006) for overview), and such ''soft data" can be used for calibrate/validate runoff models (Seibert and McDonnell, 2002;Stadnyk et al., 2005;Vaché and McDonnell, 2006;Sayama and McDonnell, 2009). However, the present study directly incorporate stable isotopes into runoff model. The objectives of this study were to test the proposed approach for a basin in central Japan, to validate its usefulness and to clarify its limitations.

Study basin
The study basin was the upstream part of the Midai basin (35.6°N, 138.4°E, 55.2 km 2 ) in Yamanashi Prefecture, central Japan (Fig. 1). The Midai River is a tributary of the Kamanashi River, which is a northwestern branch of the Fuji River. The Midai River is steep, with a bed slope of 1/10-1/50, and is known to have caused severe floods since prehistoric times (Takeuchi, 2004). The Midai River carries a large amount of sediments because it originates from fragile areas in mountains (Ministry of Land, Infrastructure and Transport, Kofu Work Office, 2005). Although the Yamanashi Prefectural Office has a gauging station that routinely measures the water level of Midai River, frequent bed deformations due to strong flow and large sediment yield prevent conversion of the measured water levels to discharge rates. Therefore, it is necessary to establish a novel approach for runoff prediction without discharge data, which is the principal reason we selected this basin.
The study basin has a temperate inland climate with warm rainy seasons and cold dry seasons. At Kofu station of the Japan Meteorological Agency (JMA), which is adjacent to Midai basin, the long-term mean annual precipitation is 1135 mm and the annual mean temperature (mean annual range) is 14.7°C (23.8°C). Relatively light snow accumulation is common. The vegetation is dominated by natural Quercus crispula stands and Pinus densiflora plantations.
Topography is characterized by a large elevation range exceeding 2000 m and steep slopes with a mean of 32.4° (Table 1). Geology is dominated by andesite and mudstone (including shale/ slate). Forest, the dominant land cover, occupies 73.5% of the total area. The study basin is not an experimental one, and no further information about runoff-related characteristics are available.

Observation
For isotopic measurements, we collected river water and precipitation samples twice-weekly (i.e., 3 or 4 days intervals) for one year from 1 May 2011 to 30 April 2012. River water sampling was conducted at a water purification plant that receives water from the Midai River. Precipitation was sampled at an open space within the plant using a precipitation collector with a polyethylene funnel that was designed to prevent evaporation of stored water (Yamanaka et al., 2015). Hydrogen and oxygen stable isotope ratios ( 2 H/ 1 H and 18 O/ 16 O) in all samples were measured by tunablediode laser spectroscopy using a liquid water isotope analyzer (L1102-i, Picarro Inc., Santa Clara, CA, USA) at the Center for Research in Isotopes and Environmental Dynamics, University of Tsukuba. Measurement results are expressed using d notation (d 2 H and d 18 O) relative to the Vienna Standard Mean Ocean Water (V-SMOW). Measurement errors for the analyzer were 1.0‰ for d 2 H and 0.1‰ for d 18 O (Yamanaka and Onda, 2011).
River discharge was observed monthly (or bimonthly) by the velocity-area method (Herschy, 2009) at a site 1.8 km downstream from the sampling site. Flow velocity was measured by an electromagnetic flowmeter (AEM1-D, JFE Advantech, Nishinomiya, Japan) at 0.6 of the depth from the water surface for each vertical. The river stream usually branched off within a channel, and the number and location of small branches within the channel was not fixed owing to the riverbed deformation. Thus, we measured discharge for all the branches and then summed up. On 9 Sep. 2011, flow velocity could not be measured because the flow was too fast for safe wading, and thus discharge data at that time are not available.
To confirm the possibility of reconstructing continuous discharge data, we referred to water level data observed in close proximity to our site at 10 min intervals by the Yamanashi Prefectural Office. However, the stage-discharge relationship was not fixed month by month.

Model
The exponential tank model originally developed by Kondo et al. (1992) was modified and employed in the present study. The model is composed of two tanks, A and B (Fig. 2). Tank A represents soil water storage and tank B represents groundwater storage. Water storage in the tank (S A , mm) and flow from the tank (F A , mm d À1 ) at a time step j were calculated as follows: where E T is the evapotranspiration (mm d À1 ), f I is the interception fraction, P G is the gross rainfall (mm d À1 ), Dt is the time interval (=1 d), and superscripts j and j À 1 denote time steps. The remaining portion of precipitation in tank A, d A (mm), is represented as: where S Amax is a model parameter representing the maximum storage of tank A (mm). Similarly, the water budget in tank B can be given as follows: where S B is the water storage in tank B (mm), d B is the remaining portion of F a in tank B (mm), and S Bmax is the maximum storage of tank B (mm). A base flow component, F C (mm d À1 ), is given as: where k C (mm À1 d À1 ) is a factor assuming that base flow is proportional to the square of storage height, and represents drainage ability of the basin. Finally, river discharge, Q (mm d À1 ), can be given as the sum of quick flow and base flow as follows:  ( Schematic illustration of the model (see text for definition of symbols).
We introduced additional procedures into the tank model for computing isotope mass balance, as explained below. The d values of water in tank A (d A , ‰), tank B (d B , ‰) and river water (d R , ‰) were computed as follows: where d P is the isotopic composition of precipitation (‰), and S B ⁄ is a model parameter (mm) representing the water storage portion that does not affect the base flow-storage relationship but contributes to isotopic variations in groundwater storage (Barnes and Bonell, 1996). We assumed the isotopic composition of F A was as follows: where f hc is a parameter representing the hydrologic connectivity of soil water to runoff. This treatment is a reflection of incomplete mixing of precipitation and soil water during groundwater recharge and runoff generation. Recently, Good et al. (2015) suggested that partial disconnection between mobile and immobile soil water should be considered in isotope budget studies. Brooks et al. (2009) and Evaristo et al. (2015) showed isotopic evidence of such a disconnection. At hillslope scale, density of macro pores for preferential flow can control the disconnection. At the basin scale, we may have to consider connectivity between landscape units (Burt, 2005) such as hillslope-riparian-stream connectivity (Jencso et al., 2010;Tetzlaff et al., 2014). At or near the riparian zones, saturation-excess overland flow tends to occur easily, as described by well-known variable source area concept (Hewlett and Hibbert, 1967;Ward and Robinson, 1990); thus, mixing of soil water with runoff water is limited (Birkel et al., 2015). This is also true if infiltration-excess overland flow occurs. In the present model, f hc implicitly represents all such factors causing incomplete mixing of precipitation and soil water.

Input data and calibration
The JMA Radar-AMeDAS (Automated Meteorological Data Acquisition System) precipitation data (e.g., Makihara, 1996) with a spatial resolution of 1 km were employed for computation of basin-mean daily precipitation as P G . The f I was estimated to be 0.128 as an area-weighted average of typical f I for every land use/vegetation types within the study basin (Ma and Yamanaka, 2016). The E T was estimated by the FAO Penman-Monteith method (Allen et al., 1998) using daily meteorological data (solar radiation, air temperature, relative humidity, and wind speed) observed at Kofu station (Fig. 1) of JMA. We assumed that the crop coefficient is unity. Validity of the estimated E T for several different basins adjacent to the present study basin was confirmed by Ma and Yamanaka (2016). Daily d P data were reconstructed from data observed twice a week under an assumption that the precipitation d value does not change during each sampling period (i.e., 3 or 4 days). In addition, d P was corrected considering isotopic lapse rates (À11.66‰ km À1 for d 2 H and À1.724‰ km À1 for d 18 O; Yamanaka et al., 2015), and the difference between the basinmean elevation (1362.7 m) and the sampling site elevation (476 m). Consequently, basin-mean d P values are 10.34‰ (1.529‰) lower in d 2 H (d 18 O) than the observed at the sampling site.
We performed a simulation for a period from 1 January 2010 to 30 September 2012, using the first 1-yr period for spin-up. Initial values of S A and S B were set to 500 mm and 100 mm, respectively. Initial values of both d A and d B were set to À95‰ for d 2 H and À13‰ for d 18 O. The d P values for periods other than the observation period (i.e., 1 May 2011 to 30 April 2012) were complemented by assuming the same annual cycle. In other words, we assumed the same d P value for the same day of year (DOY). Though this assumption may not be always true, biases in simulated d R were expected to be the largest in May 2011 and we confirmed no such a bias in that period.
After this, five unknown parameters remained, S Amax , S Bmax , k C , S B ⁄ , and f hc . The values of these parameters were fixed through calibration by stochastic hill climbing optimization (Brownlee, 2012), with a measure of the root mean square error (RMSE) of d R for d 2 H and d 18 O. In this optimization, multiple restarts with different initial parameter-sets were adopted to confirm whether the result does not become trapped in local optima. To avoid unrealistic drifts in water budget, we added the following constraint.
where R denotes the annual total; although the constraint value of 100 does not have any physical meanings, we confirmed that optimum values of unknown parameters (especially S Amax and S Bmax ) tend to unrealistically diverge when we used the larger values.
In addition, f hc should range from 0 to 1, and the other 4 parameters as well as S A and S B should have positive signs.
The authors of the present study have proposed an isotopecalibrated five-layer tank model for estimating time-variant transit time Yamanaka, 2013, 2016). However, the model contains too many parameters to calibrate without discharge data. The reason we developed and employed the minimum complex model comprising only two tanks with five unknown parameters was to reduce the equifinality problem (Beven and Binley, 1992). Reducing model layers implies that accuracy of transit time (if we estimate) will be reduced. However, it is expected that reproducibility of d R is less affected, because isotopic contents in deeper layers are not very different generally.

Monte Carlo simulation
Simulation results from calibrated (i.e., optimized) models still involve uncertainties in both isotope and runoff predictions. In addition, higher performance in isotope prediction is not always associated with that in runoff prediction. To confirm the relationship between reproducibility of predicted isotopes and runoff, we performed Monte Carlo simulations after calibrating unknown parameters. A total of 1000 runs were made with different randomly selected values of S Amax , S Bmax , and k C , within the range of optimum values ±100%. Because S B ⁄ and f hc do not affect runoff prediction (i.e., no appearance in Eqs. (1)-(8)), they were not considered.
In addition, three sets of 100 runs were made with randomly selected values of one out of S Amax , S Bmax , and k C to reveal the relationship of uncertainty in each parameters with model performance. Fig. 3 shows the daily precipitation amount, elevation-corrected d P , and observed d R . The corrected d P ranged from À31.8 to À153.6‰ for d 2 H and À2.57 to À20.44‰ for d 18 O, with lower values in winter and higher values in summer. Variations in observed d R were less than 10% of those, and the annual cycle corresponding to those for precipitation was not clear. However, moderate increases in summer and short-term changes throughout the year were found in observed d R .

Isotope reproducibility
These tendencies were captured well by our simulations (Fig. 4). The coefficient of determination (R 2 ) between observed and simulated d R was 0.84 for d 2 H and 0.73 for d 18 O. The RMSE of d R was 1.1‰ for d 2 H and 0.13‰ for d 18 O, being almost equivalent to (but slightly worse than) the measurement accuracy of the isotope analyzer used. The ratio of RMSE to the standard deviation of observed data (RSR) was 0.40 for d 2 H and 0.52 for d 18 O. According to a performance rating proposed by Moriasi et al. (2007), these figures correspond to 'very good' (0.00 6 RSR 6 0.50) or 'good' (0.50 < RSR 6 0.60).
Errors in simulated d R can be attributed to input data of d P with coarse temporal resolution and spatial heterogeneity, E T with low accuracy, and f I as a constant first approximation, as well as model structure and parameterizations. Although isotopic enrichment caused by soil surface evaporation, which was not considered in this model, could potentially introduce errors in simulated d R , especially for d 18 O, our preliminary assessment considering this effect (see Ma and Yamanaka (2016) for inclusion into tank models) showed no improvements, suggesting that it is a minor factor, at least for the study basin, or was cancelled by errors in assumed d P and/or initial values of d A and d B .
Calibrated values of unknown model parameters are summarized in Table 2. For S Amax , S Bmax , and k C , the difference between the best values in d 2 H-and d 18 O-calibarion were relatively large, indicating uncertainties in model structure and/or parameter optimization. However, those were nearly the same for S B ⁄ and f hc , suggesting robustness of the calibrated values.

Runoff prediction
Runoff predicted by the d 2 H (d 18 O)-calibrated model agreed well with the observed river discharge (Fig. 5), with an R 2 of 0.96 (0.93). The Nash-Sutcliffe efficiency (NSE; Nash and Sutcliffe, 1970) of the d 2 H (d 18 O)-calibrated model was 0.86 (0.92), corresponding to 'very good' (0.75 6 NSE 6 1.00) performance (Moriasi et al., 2007). Because the number of available data (=9) is not sufficient, absolute performance rating may not be appropriate. However, relative significance of calibrated models can be evaluated using NSE as shown below. Fig. 6 shows the relationships between RSR in isotope simulation (RSR i ) and NSE in runoff prediction (NSE r ) from 1000 Monte Carlo runs. Strong coherence between them was observed for the entire domain (Fig. 6a), whereas data plots were scattered in a smaller domain around the best results (Fig. 6b). For runoff prediction, slightly better results than the isotopically optimized prediction were obtained, even if isotope reproducibility was worse. This means that isotopically optimized model is not always the best one for runoff prediction. However, all 'very good' runs for isotope simulation were also 'very good' for runoff prediction, demonstrating that better isotopic calibration deduced acceptable better model for runoff prediction.
Both RSR i and NSE r were insensitive to changes in S Amax , although the best values were slightly different between isotope simulation and runoff prediction (Fig. 7a). Changes in S Bmax introduced large changes in RSR i , but smaller changes in NSE r (Fig. 7b). Changes in k C had greater effects on both isotope simulation and runoff prediction than the other two parameters (Fig. 7c). Sensitivity of NSE r to k C changes was greater around the best k C value for isotope simulation, suggesting the importance of local optimization for improving runoff prediction. These results suggest that isotope simulation is highly sensitive to base flow, but less sensitive to soil water storage associated with quick flow. The lower sensitivity of the model to soil water was likely caused by lower temporal resolution (i.e., twiceweekly) of input isotope data. In contrast, runoff prediction appears to be sensitive to both components, but less sensitive to the groundwater storage associated with quick flow. For a model with the best parameter values in d 2 H (d 18 O) calibration, base flow (i.e., F C ) accounted for more than 76% (93%) of the total runoff even under peak flow conditions. This explains why both isotope simulation and runoff prediction are highly sensitive to a base flow related parameter, k C .

Usefulness of isotope calibration for runoff prediction
Some studies have partially demonstrated that incorporation of isotopes into runoff models helped reduce the uncertainty of model parameters (Dunn et al., 2008;Fenicia et al., 2008;Birkel et al., 2010aBirkel et al., , 2010bBirkel et al., , 2011Yamanaka, 2013, 2016). However, it has also been argued that incorporating isotope (or other tracer) data may not always improve models owing to uncertainties associated with additional parameters required in isotope simulations (Seibert et al., 2003;Wissmeier and Uhlenbrook, 2007;McDonnell et al., 2010). As shown in Fig. 6, better models for isotope simulation can provide better prediction of runoff, confirming that isotope calibration is useful in practice for runoff prediction.
Isotope calibration can constrain more accurately the base flow prediction than the quick flow prediction. Thus, this approach is expected to be useful for basins where base flow generally dominates, like the present study basin (base flow contribution > 76%). However, in quick flow dominated basins, inconsistencies among the best parameter values between isotope simulation and runoff prediction may result in worse runoff prediction. One potential factor causing the uncertainty of quick-flow-related parameters is the temporal resolution of isotope data. Twice-weekly data cannot capture quick flow components exactly. Isotope data with higher temporal resolution (e.g., daily or sub-daily) have the potential to improve runoff prediction by isotope-calibrated models (Birkel et al., 2010a). Advanced tunable diode laser spectroscopy could enable isotope analysis at low cost for a large number of samples (Lyon et al., 2009) or even high frequency (e.g., hourly) monitoring in situ, increasing the usefulness of our approach.
According to McDonnell and Beven (2014), the flow velocities in the system control the tracer response and the celerities control the hydrograph. In addition, velocities are controlled by the water storage filled in the system, while celerities are controlled by the storage deficit. The results that model calibration with single source of information (not discharge but isotopes) was useful suggest a constant (i.e., temporally stable) relationship between velocities and celerities in the study basin. It is also indicated that our model structure helped successfully represent the above two mechanisms. In other words, it should be noted that usefulness of isotope calibration for runoff prediction may depend on constancy of the velocity-celerity relationship in the basin and the selection of model structure.

Limitations and ways forward
Use of dual isotopes (d 2 H and d 18 O) allows us to know the degrees of uncertainties associated with the calibrated model. However, it is still difficult to estimate the reliability of runoff prediction. For instance, if the model structure as presented herein was not suitable for a basin of interest, predicted runoff may include non-negligible errors despite there being good reproducibility in isotope simulation. Therefore, several or more observations of discharge in different stage conditions are needed to confirm the validity of the model. Once this is accomplished, predicted runoff is expected to be reliable, regardless of changes in riverbed morphology.
However, as shown in Fig. 5, uncertainty in peak runoff prediction is relatively large. Although it should be noted that hydrometric measurements (Herschy, 2009) and other runoff models (Beven,  2012) tend to include large errors in peak discharge, intrinsic errors in isotope-calibrated models should be carefully examined. Quick flow (i.e., F B ) contributions in the model became large before the peak for every event, after which base flow dominated, suggesting that errors in predicted peak runoff are associated with both components rather than either one alone. Therefore, determination of k C appears to be the most important aspect, because this parameter is more strongly influenced by slight changes in isotopic reproducibility (see Section 3.2). In other words, more data are needed for calibration to ensure better parameter optimization. Long-term isotope records that include large storm events will be useful for improving the accuracy of peak runoff prediction.
Calibration resulted in f hc values of 0.2-0.3 (Table 2). Interestingly, these values are comparable to a global estimate (0.38 ± 0.28) by Good et al. (2015). Of course, this consistency is physically less meaningful, and the f hc is not always constant. Nevertheless, these findings indicate the need for more attention to the causes and mechanisms of the weak connectivity between soil water and runoff. Further investigations in both headwater small catchments and meso-scale basins with diverse topographical/ geological settings will help improve our understanding of catchment hydrology and conceptualization of model structure for isotope calibration.

Conclusions
From a practical point of view, we developed a novel approach that calibrate a minimum complex runoff model using isotopic data alone, and showed its ability to predict runoff without hydrometric calibration. Model performance in isotope simulation was closely related to that in runoff prediction, demonstrating the model is right for the right reasons. Thus, we conclude that isotope calibration helps reliably constrain model parameters, and that this approach is useful for runoff prediction in poorly gauged basins such as the study basin, where riverbed deformations frequently occur.
Uncertainties still remain, especially with respect to constraining quick-flow-related parameters and predicting peak runoff. Higher frequency and/or longer-term isotope data are expected to improve these uncertainties. The applicability of the model structure in this study (i.e., two-layer tank representing soil water and groundwater storage) should be further investigated for other basins (including well gauged ones), with a focus on hydrological connectivity on a basin scale.