Multimodel uncertainty changes in simulated river flows induced by human impact parameterizations

Human impacts increasingly affect the global hydrological cycle and indeed dominate hydrological changes in some regions. Hydrologists have sought to identify human-impact-induced hydrological variations by parameterizing anthropogenic water uses in global hydrological models (GHMs). The resulting increase in model complexity is likely to introduce additional uncertainty among GHMs. Here, using four GHMs, between-model uncertainties are quantified in terms of the signal-to-noise ratio (SNR) for average river flow during 1971–2000 simulated in two experiments, with representation of human impacts (VARSOC) and without (NOSOC). This is the first quantitative investigation of the between-model uncertainty resulting from the inclusion of human impact parameterizations. Results show that the between-model uncertainties in terms of SNRs in the VARSOC annual flow are larger (about 2% globally, with varying magnitude across basins) than those in the NOSOC, and the increase is particularly significant in most areas of Asia and in areas north of the Mediterranean Sea. The SNR differences are mostly negative (−20% to 5%, with negative values indicating higher uncertainty) for basin-averaged annual flow. The VARSOC high flow shows slightly lower uncertainties than the NOSOC simulations, with SNR differences mostly ranging from −20% to 20%. The uncertainty differences between the two experiments are significantly related to the fraction of irrigated area in the basins. The large additional uncertainties in VARSOC simulations introduced by the inclusion of human impact parameterizations highlight the urgent need for GHM development based on a better understanding of human impacts. Differences in the parameterizations of irrigation, reservoir regulation and water withdrawals are discussed as potential directions for improvement in future GHM development.
We also discuss the advantages of statistical approaches for reducing between-model uncertainties, and the importance of calibrating GHMs, not only for better performance in historical simulations but also for more robust and confident future projections of hydrological changes under a changing environment.


Introduction
Human activities have greatly affected the hydrological cycle [1,2], yet the simulation of human water uses and the associated model uncertainties remain great challenges for global hydrological modeling [3]. Model simulations have shown that discharge was increasingly disturbed by human water uses in the late 20th century [4]. In the recent decade, hydrologists have made considerable efforts to identify human impacts on the hydrological cycle under a changing environment [5–11]. The major human impacts (e.g. irrigation and reservoirs) have been parameterized to varying degrees in many global hydrological models (GHMs) [12–18].
However, large discrepancies among models result from differences in model input, algorithms, parameters, etc. [3,19]. The parameterizations of human impacts vary greatly across GHMs and thus may introduce additional discrepancies among models.
Hydrologists are aware of the uncertainties among GHMs, and several intercomparison projects have been initiated to characterize them. For example, the between-model uncertainties for naturalized simulations of GHMs have been investigated through the Water Model Intercomparison Project (WaterMIP) [20] and the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP) Fast-track [21,22]. Human water uses such as irrigation are substantial in some regions with intensive human impacts (e.g. western United States, China, and South Asia), and can induce considerable uncertainties in hydrological projections for the future [9,23]. All these prior studies showed large discrepancies among GHMs in future hydrological projections. However, these uncertainties might result from numerous differences among GHMs, e.g. different input data and model algorithms, which makes it difficult to identify the uncertainty sources. ISIMIP phase 2 provides a framework for comparing and evaluating multiple GHMs based on consistent input data, e.g. meteorological forcings, human impacts (reservoirs and irrigation area), and the drainage network for flow routing. Given the potential influence of human impacts on GHM simulations, it is now possible to quantitatively examine the changes in between-model uncertainty induced by the inclusion of human impacts in GHMs, based on the ISIMIP2 simulation protocol.
In this study, we use four GHMs to investigate the uncertainty changes between simulations with and without human impact parameterizations. On this basis, we further discuss the differences in the parameterization of human impacts that are associated with between-model uncertainties. This paper is organized as follows: the models, experiments and methods are described in section 2; results are presented in section 3; the implications of the results are discussed in section 4; and a summary is presented in section 5.

Human impacts in GHMs
In the experiment, human impacts are considered in terms of irrigation and reservoir regulation. Time-varying areas of both irrigated and rainfed cropland are represented as the combination of present-day (year 2000) areas of crop types from MIRCA2000 [36] and backward trends of agricultural land cover from HYDE [37]. The reservoir (dam) information is derived from the Global Reservoir and Dam (GRanD) Database [38], with the locations re-arranged to half-degree grid cells based on the global drainage direction map (DDM30) [39]. Each reservoir is included in the regulation only from its documented year of completion onward. Reservoirs and irrigation areas used in the experiment are shown in figure S1. The river basin delineations defined by the DDM30 data [39] are used for analysis at basin scale. The parameterizations of human impacts in the four GHMs are summarized in table S1 by referring to the relevant literature, e.g. [9,30], which documents the human water uses in several state-of-the-art GHMs, specifically including those used here.

Uncertainty measurement
Streamflow simulations from the experiment with human impacts are compared with the observed station data from the Global Runoff Data Centre [40] to evaluate the performance of the GHMs. Annual flow (AF) and highest monthly flow (HMF) are computed for each year, together with their means over the study period, i.e. mean annual flow (MAF) and mean highest monthly flow (MHMF). Relative errors between simulated and observed MAF and MHMF, and the correlation coefficients between simulated and observed AF and HMF, are calculated for each station. The corresponding simulated streamflow is extracted from the global grid at the latitude and longitude of each station. The stations (1235 in total) with record lengths of more than 20 years and catchment areas larger than 10 000 km² are used for comparison (see figure S2).
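The evaluation metrics described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function names are hypothetical, the flow values are made up, and AF is assumed here to be the plain mean of the 12 monthly values (the text does not specify how months are weighted).

```python
from math import sqrt
from statistics import mean


def annual_and_high_flow(monthly_by_year):
    """Return (AF, HMF) series from a list of 12-value monthly flow lists.

    Assumption: AF is the plain mean of the 12 monthly values (month
    lengths are ignored); HMF is the largest monthly value of the year.
    """
    af = [mean(year) for year in monthly_by_year]
    hmf = [max(year) for year in monthly_by_year]
    return af, hmf


def relative_error_percent(simulated_mean, observed_mean):
    """Relative error of a simulated long-term mean, in percent."""
    return 100.0 * (simulated_mean - observed_mean) / observed_mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical two-year record for one station (values are made up).
sim = [[2.0] * 11 + [8.0], [4.0] * 11 + [10.0]]
obs = [[2.0] * 11 + [6.0], [3.0] * 11 + [9.0]]
sim_af, sim_hmf = annual_and_high_flow(sim)
obs_af, obs_hmf = annual_and_high_flow(obs)

# MAF is the mean of the annual flows over the record.
maf_error = relative_error_percent(mean(sim_af), mean(obs_af))
```

In this toy case the simulated MAF exceeds the observed MAF, so the relative error is positive, which corresponds to the overestimation discussed for many stations.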
The signal-to-noise ratio (SNR), defined as the mean divided by the standard deviation, is used as an indicator of uncertainty among the multimodel simulations. SNR is calculated for global and basin-averaged MAF and MHMF from model grid cells to address the uncertainty in the simulations with and without human impacts. SNR differences between the experiments with and without human impacts are interpreted as the change in uncertainty caused by the inclusion of human impacts in GHMs. Annual SNR is also computed for global AF and HMF to analyze temporal changes during the 1971–2000 period.
Figure 1 shows the observed MHMF (figure 1(a)) and MAF (figure 1(b)) versus the ensemble means of simulations across all GHM-forcing combinations. Both MHMF and MAF simulations show large deviations at many stations with relatively small catchment areas, while stations with large catchment areas tend to show little deviation. For the ensemble means, about 10% of the stations show small relative errors of −10% to 10%, while more than 40% of stations show relative errors of −50% to 50% for both MHMF and MAF. For the ensembles of individual GHMs, no more than 10% (15%) of stations show small relative errors of −10% to 10% for MHMF (MAF), and about 20% to 30% have relative errors of −50% to 50% for both MHMF and MAF (see table S2). The simulations of MAF generally perform better than those of MHMF at most stations, but both appear to be overestimated at many stations.
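As a concrete reading of the SNR definition used in this study (ensemble mean divided by ensemble standard deviation), the following sketch computes the SNR for each experiment and their difference. The function names and numbers are illustrative assumptions; in particular, the use of the sample standard deviation and the expression of the percentage change relative to the NOSOC SNR are choices made here, not details taken from the text.

```python
from statistics import mean, stdev


def snr(values):
    """Signal-to-noise ratio: ensemble mean divided by ensemble spread.

    Assumption: the sample standard deviation (n - 1) is used; the text
    does not say whether the sample or population form was applied.
    """
    return mean(values) / stdev(values)


def snr_difference(with_human, without_human):
    """Return (absolute, percent) SNR difference, VARSOC minus NOSOC.

    A negative value means a lower SNR, i.e. higher between-model
    uncertainty, once human impacts are included. Expressing the percent
    change relative to the NOSOC SNR is an assumption.
    """
    snr_with = snr(with_human)
    snr_without = snr(without_human)
    absolute = snr_with - snr_without
    return absolute, 100.0 * absolute / snr_without


# Hypothetical basin-mean annual flow from three models (made-up numbers).
nosoc_maf = [9.0, 10.0, 11.0]
varsoc_maf = [8.0, 10.0, 12.0]
diff_abs, diff_pct = snr_difference(varsoc_maf, nosoc_maf)
```

Here the ensemble mean is unchanged but the model spread doubles when human impacts are included, so the SNR difference is negative, the case interpreted in this study as increased between-model uncertainty.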

Evaluation of GHMs
The correlation coefficients between the simulated and the observed HMF and AF are shown in figure 1(c). The correlation coefficients for AF are significantly larger than those for HMF. Nearly 70% (85%) of stations have correlation coefficients greater than 0.6 for HMF (AF). These proportions are larger than those for individual models (see table S3). This brief evaluation indicates that improvements of GHMs are necessary to capture river flows, particularly in small catchments, and that ensemble means of multimodel simulations usually fit observations better.
Figure 2 shows the SNRs for the experiment with human impacts and the SNR differences between the experiments with and without human impacts for global HMF and AF. During the 1971–2000 period, the all-ensemble SNR of global HMF ranges from 4 to 5, and the SNR of global AF ranges from 4.5 to 5.5. SNRs show a large spread among different meteorological forcings (see figures 2(a) and (c)): the SNR for WATCH is the smallest (∼4 for HMF and 4.5–5 for AF) over the historical period; the SNR for WFDEI is identical to that for WATCH before 1979 (WFDEI and WATCH share the same data for this period) and then increases greatly to become the largest (∼5.5 for HMF and ∼7.5 for AF) among the four forcings; the SNRs for PGMF and GSWP3 are very close (5–5.5) for HMF, but the former (from 6.5 to 5.5) is slightly smaller than the latter (from 7.5 to 6.5) for AF. This indicates that the uncertainties in historical climate data bring large discrepancies to GHM simulations, which agrees with previous studies [41]. The SNR for the simulations with human impacts is generally larger (smaller) than for the naturalized simulations regarding global HMF (AF).
SNR differences for global HMF (figure 2(b)) increase over time, and the ensemble SNR difference ranges from 0.1 to 0.3 (2%–6%); the SNR difference for WATCH is the smallest, ranging from 0.1 to 0.2, while the SNR differences for the other forcings mostly range from 0.2 to 0.5 and show relatively large interannual variations. SNR differences for global AF (figure 2(d)) show considerable interannual variation. The all-ensemble SNR difference ranges from −0.12 (2%) to zero; the SNR difference for WATCH is again the smallest, and the other SNR differences mostly range from 0.05 to 0.15.
Figure 3 shows the SNR differences between the simulations with and without human impacts for the basin-averaged MHMF and MAF, respectively, over the 1971–2000 period. The SNR differences for HMF at basin scale show many negative values (indicating larger uncertainties), e.g. in some basins in Europe, North India, and South China (figure 3(a)). Changes are generally small in the basins with few reservoirs and little irrigated area, but considerable positive SNR differences (lower uncertainties) are found for the Yenisey and Lena basins. Lower uncertainties are also found in some major basins with great human impacts, such as the Liao River and Hai River of China, the Don River of Russia, the Amu Darya River in Central Asia, the Tigris-Euphrates River in West Asia, the Zambezi River in Africa, and the São Francisco River in South America. However, only a relatively weak relationship (with a correlation coefficient of 0.18) is found between the SNR difference of basin MHMF and reservoir storage capacity, as shown in figure 3(c).

Uncertainty assessment
SNRs for MAF simulations with human impacts are mostly smaller than those for naturalized simulations at basin scale. Large differences are observed in some major river basins with great human impacts, such as the Chang Jiang and Huang River basins of China, the Ganges, Godavari and Krishna Rivers of India, the Indus River of Pakistan, the Amu Darya River in Central Asia, the Tigris-Euphrates River in West Asia, and the Danube River in Europe. This indicates that uncertainty increases in MAF simulations with human impacts in these regions. Only a few positive SNR differences (lower uncertainty) are found, e.g. in the Hai and Liao Rivers of China and the East Coast of the Caspian Sea. The SNR differences for MAF are relatively well related to the basin irrigation area (correlation coefficient −0.41; figure 3(d)), indicating that between-model uncertainty is higher for basins with larger irrigation area.
Environ. Res. Lett. 12 (2017) 025009
Figure 4 shows the ratios of the SNR of human-impact-induced MAF differences to the SNR of naturalized MAF at basin scale. The numerator is the SNR of the MAF differences between the simulations with and without human impacts. The smaller the ratio, the larger the uncertainty in the human impact simulations, and vice versa. The ratios are less than one (mostly < 0.5) for many basins (particularly those with numerous irrigation areas and reservoirs), that is, the SNRs of human-impact-induced MAF differences are clearly smaller than the MAF SNRs (see figure S3). The southern basins and the Hai River basin in China, and many basins in Europe with intensive human activities, show very small ratios of less than 0.2. Only a few basins show ratios greater than one, such as the Tarim Interior of China, the Volga of Russia, the St Lawrence in North America, and the Shebelli-Juba and some northern basins in Africa, where irrigation areas are small and reservoirs are few. This indicates that the human impact simulations (i.e. the human-impact-induced MAF differences) show larger uncertainties than the naturalized simulations in the basins with small ratios.
Unlike the SNR differences in figure 3, the ratios in figure 4 are only very weakly related to both the fraction of irrigation area and reservoir storage capacity. Nevertheless, uncertainty in irrigation, the largest human water use, perhaps plays a key role in the small SNRs. For example, the actual water withdrawal for irrigation (IRRWW) simulated by the GHMs shows considerable differences among models and is significantly underestimated compared to reported data (see figure S4).
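The ratio shown in figure 4 can be illustrated with a short sketch. It assumes that the human-impact-induced difference is taken per model as the MAF with human impacts minus the naturalized MAF, and that the absolute value of the mean is used so that a mean reduction in flow still yields a positive SNR. Both choices, the function names and the numbers are assumptions made for illustration only.

```python
from statistics import mean, stdev


def snr(values):
    """|mean| / sample standard deviation across models.

    Assumption: the absolute value of the mean is used so that a mean
    decrease in flow still gives a positive signal-to-noise ratio.
    """
    return abs(mean(values)) / stdev(values)


def human_impact_snr_ratio(with_human, without_human):
    """Ratio of the SNR of per-model MAF differences to the SNR of
    naturalized MAF. Values below one mean the human-impact signal is
    less consistent across models than the naturalized flow itself."""
    differences = [w - n for w, n in zip(with_human, without_human)]
    return snr(differences) / snr(without_human)


# Hypothetical basin MAF from three models (made-up numbers).
nosoc_maf = [9.0, 10.0, 11.0]
varsoc_maf = [8.0, 8.0, 8.0]
ratio = human_impact_snr_ratio(varsoc_maf, nosoc_maf)
```

In this toy example the models disagree strongly on how much flow is removed (1, 2 and 3 units), so the ratio falls well below one, as reported for many heavily irrigated basins.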

Discussion
The simulated river flows show large deviations, with overestimation at many hydrological stations compared to GRDC observations. This may be due to regional overestimation of runoff generation and the underestimation of anthropogenic water uses (e.g. see figure S4) and soil water storage [42]. The between-model uncertainties are measured in terms of SNR and are larger (slightly smaller) in the annual flow (high flow) simulations with human impacts than in the naturalized simulations. The differences in between-model uncertainty between the two experiments are relatively small (2%–4%) at global scale but are more significant for some regions. Previous studies showed that human intervention (e.g. irrigation water withdrawal) largely altered the regional water cycle [9,43]. Human impacts are primarily represented in the GHMs by anthropogenic water uses (irrigation, industrial, domestic, etc.) and reservoir regulation, which increase the model complexity with respect to model structure and parameters.
The different model algorithms and various parameters are likely responsible for the large between-model uncertainties [3]. The uncertainties due to the different responses of GHMs to climate input are beyond the scope of this paper. However, it is noted that the differences in naturalized simulations resulting from these different responses will also influence the simulation of human water uses (e.g. irrigation). Regarding the simulations of human impacts, the severe lack of water use data, primarily in developing countries, is likely one of the major reasons for the large deviations from observations and the uncertainties among GHMs. Here, we focus on the differences in between-model uncertainty between the experiments with and without human impacts, and on the potential major sources of uncertainty associated with the different parameterizations of human impacts in GHMs.

Uncertainties in irrigation simulations
The simulation of irrigation, the largest anthropogenic water use, is likely to contribute to the discrepancies in the simulations of human impact by GHMs, as indicated by the relationship between the SNR difference and irrigation area (figure 3(d)). Irrigation water demand is usually estimated as the difference between potential crop evapotranspiration and locally available soil (green) water. Therefore, uncertainties in IRRWW simulations are largely associated with the estimation of crop evapotranspiration, soil moisture, and irrigation efficiency. Though all four GHMs use the FAO Penman-Monteith equation to estimate the potential crop evapotranspiration, the simulated potential water withdrawal for irrigation can be significantly different (see figure S4(c) and (d)). The irrigation efficiency (the ratio of irrigation water use to the total water withdrawal) varies across the GHMs (see table S1) and may significantly influence the estimation of potential and actual water withdrawal for irrigation. In addition, the implementations of water withdrawals in GHMs may differ in several aspects that are partly responsible for the differences in IRRWW, such as the accessibility of available water for a grid cell, the proportions of withdrawal from rivers, reservoirs and groundwater, and the allocation of water supplies from a reservoir to different sectors. Hence, irrigation schemes and associated parameters need to be reconciled against observed regional conditions to provide more consistent IRRWW simulations at both global and regional scales.
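The dependence of irrigation withdrawal on the efficiency parameter can be sketched as follows. This is a simplified illustration of the generic relationship described above, not the scheme of any of the four GHMs: the function names are hypothetical, clamping the demand at zero is an assumption, and the efficiency values are made up.

```python
def net_irrigation_demand(potential_crop_et, green_water_available):
    """Irrigation demand as potential crop evapotranspiration minus the
    locally available soil (green) water, in the same depth units.

    Assumption: demand is clamped at zero when green water suffices.
    """
    return max(0.0, potential_crop_et - green_water_available)


def irrigation_withdrawal(net_demand, efficiency):
    """Withdrawal needed to deliver net_demand to the crop, with
    efficiency defined as irrigation water use / total withdrawal."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return net_demand / efficiency


# Same demand, three hypothetical efficiencies (illustrative values only).
demand = net_irrigation_demand(potential_crop_et=5.0, green_water_available=3.0)
withdrawals = {eff: irrigation_withdrawal(demand, eff) for eff in (0.4, 0.5, 0.8)}
```

With an identical net demand, the withdrawal in this example doubles between the highest and lowest efficiency, which illustrates why differing efficiency assumptions alone can separate the IRRWW estimates of otherwise similar models.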

Uncertainties in reservoir simulations
The reservoir regulation scheme is critical in coupling human-induced and natural hydrological changes in GHM simulations. Human impacts on hydrological processes could be much more complex than the simulations in this study because they are associated with many socioeconomic factors. For instance, irrigation is linked to reservoir regulation and regional water allocation, while reservoir regulation rules are mostly defined by energy demand, flood control, various water supplies, and even energy and food prices [44]. The role of reservoir regulation therein makes the simulation relatively uncertain. Water losses due to evaporation are particularly significant for some small reservoirs [45], which may result in uncertainty in reservoir regulation since not all GHMs consider this process (table S1). The different reservoir regulation schemes inevitably bring uncertainties to river flow simulations. Though the GHMs draw to varying degrees on [46] or [47] in developing their reservoir regulation, the adapted rules are still (perhaps largely) different [15,48,49]. As a result, reservoir regulation may produce significantly different simulated hydrographs of dammed rivers among the GHMs [50]. Nevertheless, the uncertainty differences for both annual flow and high flow are only weakly related to reservoir storage in this study. It is noted that the simulations of high flow are less discrepant among GHMs in some basins (figure 3(c)) with large reservoir storage capacity (e.g. the Bratsk and Irkutsk reservoirs in the Yenisey basin and the Vilyui reservoirs in the Lena basin) and small irrigation areas, where reservoir regulation largely determines the variations of downstream flow [51]. At global scale, the slightly higher consistency in the high flow simulations with human impacts perhaps results from the universal flood control rules in GHMs, which are closely associated with reservoir storage capacity and annual average inflows.

Uncertainties in simulations of groundwater withdrawal
Groundwater withdrawal is also a key source of IRRWW in some regions [52]. However, modeling of groundwater availability remains a challenge due to the complex interactions between surface water and groundwater [10,53], and the large differences in the implementation of groundwater withdrawal give rise to significant discrepancies among GHMs [54]. The proportion of withdrawals from groundwater (R_grd) is a key parameter in the estimation of groundwater withdrawal. In current GHMs, due to insufficient historical data at global scale, the proportion of groundwater withdrawal is often estimated according to water use demand and surface water availability (in which case the amount of groundwater pumping is often unlimited [55]) or is further constrained by estimated groundwater availability and historical groundwater pumping data [30,56–58]. Leng et al [59] showed that calibrating R_grd using historical census data could largely improve the simulation of irrigation amount in the USA (see their figure 3). The PCR-GLOBWB model limits the groundwater withdrawal according to its availability and the reported groundwater pumping data from the International Groundwater Resources Assessment Centre, and obtains better performance in the simulations of groundwater withdrawal [30], although this may result in deviations in the IRRWW estimates in regions like India and Pakistan, where groundwater pumping remains unreported in many parts. These studies suggest that R_grd could be determined from historical data and is useful for improving the simulation of groundwater withdrawal, and thus for reducing the uncertainty among GHMs. In addition, the uncertainties in estimated water use demand, surface water availability and groundwater availability will propagate to the groundwater withdrawal estimation. Groundwater use efficiency is usually assumed to be the same as that of surface water, although it is expected to be higher [60].
The potential uncertainty resulting from this parameter needs further investigation.
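The two broad approaches to groundwater withdrawal described above (unlimited pumping to meet residual demand versus pumping capped by an availability estimate) can be contrasted in a small sketch. The function name, the surface-water-first ordering and all numbers are assumptions for illustration; they do not reproduce the scheme of any particular GHM.

```python
def split_withdrawal(demand, surface_available, groundwater_available=None):
    """Split a water demand between surface water and groundwater.

    Assumption: surface water is used first and groundwater covers the
    remainder. With groundwater_available=None pumping is unlimited;
    otherwise it is capped at the given availability.
    Returns (surface, groundwater, r_grd), where r_grd is the share of
    the total withdrawal that comes from groundwater.
    """
    surface = min(demand, surface_available)
    groundwater = demand - surface
    if groundwater_available is not None:
        groundwater = min(groundwater, groundwater_available)
    total = surface + groundwater
    r_grd = groundwater / total if total > 0 else 0.0
    return surface, groundwater, r_grd


# Same cell, two schemes (made-up volumes in arbitrary units).
unlimited = split_withdrawal(demand=10.0, surface_available=6.0)
constrained = split_withdrawal(demand=10.0, surface_available=6.0,
                               groundwater_available=1.0)
```

In this example the unlimited scheme meets the full demand with a groundwater share of 0.4, whereas the constrained scheme delivers less water with a much smaller share, so the two schemes differ in both the simulated withdrawal and the implied R_grd.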

Potential of reducing uncertainties in multimodel simulations
Validation and calibration of GHMs against historical observations would advance model development, and are perhaps a crucial means to refine GHM simulations and to narrow the spread therein [61,62]. Given the large spread in the simulations of human water uses, validation of individual sectoral water uses or hydrological components is necessary to achieve more constrained and confident hydrological modelling. A robust hydrological response to climate change in GHMs during the historical period does not necessarily imply good model performance, or a narrow model spread, in future projections. Nevertheless, the historical credibility of GHMs would allow the ranges of projected hydrological changes to be assessed with higher confidence [3,63].
On the other hand, the discrepant GHM simulations to some degree further call for multimodel assessment rather than assessment based on a single model [64]. Until better performing and more consistent hydrological predictions by GHMs can be achieved, advanced statistical tools may be useful to improve the projections from multimodel ensembles for the assessment of climate change impact. For example, the Bayesian model averaging scheme can be an effective tool to obtain hydrological projections with less between-model uncertainty by weighting the individual model predictions by their likelihood measures [64].
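The idea of weighting model predictions by a likelihood measure can be sketched as below. This is a deliberately simplified stand-in for Bayesian model averaging, not the scheme of [64]: it assumes independent Gaussian errors with a single fixed standard deviation sigma chosen by the user, whereas full implementations typically estimate the weights and error variances jointly. All names and numbers are hypothetical.

```python
import math


def gaussian_log_likelihood(simulated, observed, sigma):
    """Log-likelihood of the observations given one model's simulation,
    assuming independent Gaussian errors with standard deviation sigma."""
    norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((s - o) / sigma) ** 2 - norm
               for s, o in zip(simulated, observed))


def likelihood_weights(historical_sims, observed, sigma):
    """Normalized model weights proportional to each model's likelihood."""
    log_l = [gaussian_log_likelihood(sim, observed, sigma)
             for sim in historical_sims]
    shift = max(log_l)  # subtract the maximum for numerical stability
    raw = [math.exp(value - shift) for value in log_l]
    total = sum(raw)
    return [value / total for value in raw]


def weighted_projection(projections, weights):
    """Likelihood-weighted average of the individual model projections."""
    return sum(w * p for w, p in zip(weights, projections))


# Two hypothetical models evaluated against made-up observations.
observed = [1.0, 2.0, 3.0]
historical = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
weights = likelihood_weights(historical, observed, sigma=1.0)
projection = weighted_projection([10.0, 20.0], weights)
```

The model that matches the historical record receives the larger weight, so the combined projection is pulled towards its value instead of sitting at the unweighted ensemble mean.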
Modeling the dynamics of human water uses is still a great challenge, since sectoral water use efficiencies keep changing (improving) in the wake of technological developments and management changes. Döll et al [3] pointed out that the major challenges in modeling human water uses in GHMs come from input data, model algorithms, scaling issues, etc. (see their table 1). In particular, more data on human water uses are urgently needed to further understand human disturbances of the hydrological cycle and thereby to derive better mathematical descriptions of them. We note that capturing the linkages between sectors in terms of water use would also be a major challenge. Two-way coupling of human water uses at different scales with the natural hydrological processes in GHMs is perhaps necessary to mimic the connected and competitive water uses among sectors. Therefore, full representation of human impacts in global hydrological modeling under climate change calls for dynamic coupling of the so-called nexus of climate-water-energy-food [65,66]. This would be even more essential for future projections of water resources and uses with regard to various socioeconomic scenarios [31,67].

Conclusions
Global river flows simulated by four GHMs are validated, and the between-model uncertainties in terms of SNR are investigated with respect to the inclusion of human impacts in the GHMs. The GHMs show relatively poor performance, with considerable between-model uncertainties, in the simulations of annual flow (AF) and high flow (HMF). The main conclusions are as follows.
1. Over the historical period, the GHMs show limitations in modeling AF and HMF at many stations, particularly those in small basins, while relatively better performance is observed in many large basins. The multimodel ensemble means fit observations better than individual models.
2. With consideration of human impacts (irrigation and reservoirs in this paper), the between-model uncertainties of simulated annual flow are higher (∼2% on average globally in terms of SNR) than those from naturalized simulations, but are lower (2%–4% globally) for the simulated high flow. The uncertainty differences are largest in most areas of Asia and in the countries north of the Mediterranean Sea, and they appear to be significantly related to the fractional irrigation area of river basins.
3. The consistency among GHMs is much lower for the human impact simulations than for the naturalized simulations, probably due to differences in the parameterizations of human impacts (especially irrigation).
The large uncertainties in human impact parameterizations point to the need for further development of GHMs (not only the models used in this paper) to reduce between-model uncertainties associated with irrigation and reservoir regulation. This is the first quantitative investigation of between-model uncertainty resulting from the inclusion of human impact parameterizations, and the quantitative method may be used to examine uncertainty caused by other parameterizations in GHMs. In this study, we emphasize that calibration of GHMs, including their representations of the anthropogenic effects on the water cycle, is essential for global hydrological modeling of a changing environment. Reconciliation of human water use schemes and associated parameters in GHMs with global and regional observations would facilitate improvement of human impact parameterizations. Ensemble prediction approaches are promising tools for reducing uncertainty in model intercomparison projects, and would benefit future hydrological projections in the assessment of climate change impact.