Assessing regional groundwater stress for nations using multiple data sources with the groundwater footprint

Groundwater is a critical resource for agricultural production, ecosystems, drinking water and industry, yet groundwater depletion is accelerating, especially in a number of agriculturally important regions. Assessing the stress of groundwater resources is crucial for science-based policy and management, yet water stress assessments have often neglected groundwater and used single data sources, which may underestimate the uncertainty of the assessment. We consistently analyze and interpret groundwater stress across whole nations using multiple data sources for the first time. We focus on two nations with the highest national groundwater abstraction rates in the world, the United States and India, and use the recently developed groundwater footprint and multiple datasets of groundwater recharge and withdrawal derived from hydrologic models and data synthesis. A minority of aquifers, mostly with known groundwater depletion, show groundwater stress regardless of the input dataset. The majority of aquifers are not stressed with any input data while less than a third are stressed for some input data. In both countries groundwater stress affects agriculturally important regions. In the United States, groundwater stress impacts a lower proportion of the national area and population, and is focused in regions with lower population and water well density compared to India. Importantly, the results indicate that the uncertainty is generally greater between datasets than within datasets and that much of the uncertainty is due to recharge estimates. Assessment of groundwater stress consistently across a nation and assessment of uncertainty using multiple datasets are critical for the development of a science-based rationale for policy and management, especially with regard to where and to what extent to focus limited research and management resources.


Introduction
Groundwater is a critical resource for agricultural production, ecosystems, drinking water and industry [1][2][3]. Approximately 43% of the global consumptive use of water for irrigation is groundwater [3] and groundwater is the primary drinking water source for an estimated 2 billion people globally [1]. Yet several recent studies have estimated the accelerating magnitude of groundwater depletion (the permanent loss of stored groundwater) on a global scale, in order to examine the contribution of decreasing continental water storage to sea-level rise [4][5][6][7][8]. Although groundwater depletion is an obvious and observable manifestation of groundwater stress, few studies have focused on quantifying groundwater stress. Water stress, an indicator of water scarcity for humans, is generally defined as the ratio of withdrawals to availability [9][10][11][12], sometimes taking into account ecosystem needs [13]. Most assessments of global water stress have focused on surface water and have not examined the uncertainty due to various input data [9][10][11][12]. By focusing on surface water, previous water stress assessments exclude an important water resource, groundwater, that is often used in regions where surface water is stressed. Assessments of water stress are dependent not only on the metric and methodology but also on the input data used in the methodology. By not examining the uncertainty due to various input data, it is more difficult to interpret results in risk, management or policy frameworks since it is uncertain how consistent the results would be with different input data.
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Numerous indicators of groundwater sustainability, vulnerability and stress have been proposed [14][15][16][17]. Indicators include the groundwater quantity and quality indicators developed by Webb et al [14] as well as the water use regime [18], the social sustainable aquifer yield [17], the groundwater sustainability infrastructure index [15] and the groundwater footprint [16]. Groundwater stress can be assessed at multiple scales using the recently developed groundwater footprint methodology that has been applied globally but only for large aquifers (rather than consistently across nations) [16]. The groundwater footprint is an area-based methodology like the ecological footprint [19] rather than a volume-based method like the water footprint [20][21][22]. The groundwater footprint of an aquifer is the area required to sustain groundwater use and groundwater-dependent ecosystem services for the aquifer. The ratio of groundwater footprint to aquifer area (GF/A_A) is an indicator of groundwater stress as we discuss further below [16]. To the best of our knowledge, groundwater stress has not previously been assessed consistently across a nation with multiple input data, in part because detailed social data [15,17] or hydrologic data [18] are not consistently available. Yet assessing groundwater stress consistently across a nation is critical for developing a science-based rationale for policy goals and management plans, especially where and to what extent to focus limited management resources.
Our objective is to consistently analyze and interpret groundwater stress across whole nations using multiple data sources for the first time. We use the groundwater footprint which has not previously been calculated consistently across a nation or using multiple data sources. The ratio of groundwater footprint to aquifer area is mapped across two nations, the United States and India, which have the highest national groundwater abstraction rates in the world [3] as well as significant localized groundwater depletion [5,7,23-25]. We map this ratio for multiple data sources of groundwater abstraction and recharge to highlight the uncertainty between different data sources. These cohesive and relatively consistent data sources of national-scale or global-scale groundwater abstraction and groundwater fluxes have been recently derived from hydrologic models and data synthesis [3,26-30]. Our maps can also be viewed as a 'dashboard' that highlights groundwater stress for water managers or policy makers. By comparing and integrating multiple datasets, this dashboard reveals how the data used affects how we quantify and potentially interpret groundwater stress.

Calculating groundwater stress
Groundwater stress in the aquifers of India and the contiguous United States was assessed by first calculating the groundwater footprint (GF). The methodology for calculating the groundwater footprint is summarized here and more details can be found in Gleeson et al [16]. The groundwater footprint (GF) is defined as $\mathrm{GF} = A\,\frac{C}{R - E}$, where C, R and E are respectively the area-averaged annual abstraction of groundwater, recharge rate including artificial recharge (from irrigation), and the groundwater contribution to environmental streamflow, all in units of L/T such as m d^-1 [16]. A (units of L^2 such as m^2) is the areal extent of any region of interest where C, R and E can be defined. Environmental flow (E) is the quantity of groundwater that needs to be allocated to surface water flow (e.g., baseflow) to sustain ecosystem services including vegetation (e.g., forest), which is most important during low flow conditions [13,31]. Although environmental flow requirements are best determined by detailed hydroecological data and multidisciplinary expert consultation [13,32], to be consistent in our analysis we assume that E is a fraction of R for a basin, following Gleeson et al [16]. To calculate this fraction, the ratio of Q_90 (the monthly streamflow exceeded 90% of the time, which we consider the low flow) to Q_Avg, the long-term average streamflow, is used.
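The definition above can be illustrated with a minimal sketch. It assumes that E is obtained by multiplying R by the ratio Q_90/Q_Avg, as the text describes; the function names and all numerical values are hypothetical and are not taken from the datasets used in this study.

```python
def environmental_flow(recharge, q90, q_avg):
    """Return E, assuming E = R * (Q90 / Qavg) as described in the text."""
    if q_avg <= 0:
        raise ValueError("q_avg must be positive")
    return recharge * (q90 / q_avg)


def groundwater_footprint(abstraction, recharge, env_flow, area):
    """Return GF = A * C / (R - E), in the same area units as `area`."""
    net_recharge = recharge - env_flow
    if net_recharge <= 0:
        raise ValueError("R - E must be positive for GF to be defined")
    return area * abstraction / net_recharge


def stress_ratio(abstraction, recharge, env_flow):
    """Return GF / A_A, which reduces to C / (R - E)."""
    return groundwater_footprint(abstraction, recharge, env_flow, area=1.0)


# Hypothetical aquifer: fluxes in m/yr, area in m^2 (made-up values).
C, R = 0.3, 0.5
E = environmental_flow(R, q90=2.0, q_avg=10.0)
GF = groundwater_footprint(C, R, E, area=1.0e9)
ratio = stress_ratio(C, R, E)
is_stressed = ratio > 1.0  # GF/A_A > 1 suggests groundwater stress
```

Because the aquifer area cancels in GF/A_A, the ratio depends only on the three fluxes, so C, R and E must share the same units.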
The quantities of R, E and C were aggregated over previously mapped aquifers in India [28] and the United States [33] with a known areal aquifer extent (A_A). Aquifer characteristics in both countries are extremely diverse with a wide range of permeabilities and storage capacities (e.g. bedrock aquifers versus alluvial aquifers) and recharge rates (e.g. humid versus arid regions). The largest aquifers in India and the United States with well-documented localized groundwater depletion as well as differences in recharge were further divided to ensure representative conditions [5,7,23-25]. The Ganges-Brahmaputra aquifer system was subdivided into four aquifers by ridges in the underlying bedrock: Delhi ridge, Faizabad ridge, and the Munger Saharso ridge [34]. The High Plains and Central Valley aquifer systems were subdivided into three and five aquifers, respectively, based on detailed hydrogeological mapping [35,36]. Figure S1 (available at stacks.iop.org/ERL/8/044010/mmedia) shows the geographic location of these aquifer divisions in the United States and India.
The assumptions and methodology for each data source (table 1) are described in detail in the appropriate reference but here we summarize salient differences between the data sources. Groundwater abstraction is calculated based on the number of wells [28], surveys (USGS; http://water.usgs.gov/watuse/) or reported country statistics from the year 2000 (www.un-igrac.org). The reported country statistics were downscaled spatially relative to the local surface water deficit (i.e., the amount of total water demand in excess of surface water availability including reservoirs) or total water demand in cases where the country-wide abstraction is larger than the sum of the local surface water deficit over a country [26]. However, this method likely overestimates the amount of abstraction in regions where extensive water diversions (e.g., aqueducts) provide additional surface water availability (e.g., the Indo-Gangetic plains and California's Central Valley) that are not accounted for in the model, and underestimates the amount in humid regions where people tend to rely more on groundwater than on surface water abstraction. Long-term groundwater recharge is calculated using (1) water level fluctuation and a rainfall infiltration factor [28]; (2) interpolation of the baseflow index [30]; or (3) large-scale hydrologic models [26,27,29]. It is important to note that only Wada et al [26] and CGWB [28] include an estimate of return flows from irrigation; all the other methods only estimate natural recharge. Each method has potential biases, assumptions and uncertainties. None of the methods used to calculate recharge use isotopes to compare, validate or derive recharge rates [37]. The water table fluctuation method based on seasonal data [28] represents net recharge over the monsoon and may underestimate total recharge [38]. Rainfall infiltration factors [28] assume a certain range of recharge rates for each aquifer type which may not be accurate in all hydroclimatic regions. Using the baseflow index is likely to underestimate recharge in regions of irrigation [30]. The large-scale hydrologic models [26,27,29] do not fully incorporate groundwater-surface water interactions, which are critical to quantifying the impact of abstraction. Only the recharge and abstraction estimates of Wada et al [26] include an estimate of uncertainty; as such this combination of data is the focus of the 'internal' uncertainty analysis described below. Given that each method has potential biases, assumptions and uncertainties but these biases are largely unknown, especially spatially, we assume a priori that all potential combinations of data sources are equally likely in the 'ensemble' uncertainty analysis described below.
The groundwater footprint method is essentially a steady-state aquifer water balance so GF/A_A ratios greater than 1 indicate aquifer outputs are greater than aquifer inputs, suggesting groundwater stress [16]. We note that as a steady-state calculation the groundwater footprint does not include the transient response of groundwater systems to abstraction such as decreased baseflow or increased recharge, as occurs during groundwater development [8]. Also as a long-term, steady-state calculation, the groundwater footprint does not quantify sub-annual or even interannual groundwater stress. Therefore the impact of extremely seasonal climates, such as the monsoons in India, or interannual stress, such as droughts in the Central Valley [24], is not incorporated. In these cases other metrics such as seasonal or annual water table elevations may be useful additional metrics [28].

Groundwater stress in the United States and India
The ratio of groundwater footprint to aquifer area (GF/A_A) is mapped consistently across India and the United States for each combination of datasets outlined in table 1. Figures 1 and 2 show that groundwater stress consistently occurs in some regions, and these regions generally have previously documented groundwater depletion [5,7,23-25]. In India, groundwater stress is common in northwestern India in the Gangetic Plain and in the hard rock aquifers of southeastern India. In the United States, groundwater stress is prevalent in the Central Valley of California and High Plains aquifer systems as well as other areas of the western United States. Subdividing the Ganges-Brahmaputra, High Plains and Central Valley aquifer systems allows for a higher resolution assessment of groundwater stress than the previous global assessment [16], given the substantial local variability in abstraction and recharge over these important aquifers. In the Ganges-Brahmaputra aquifer systems, groundwater is acutely and consistently stressed in the west whereas it is unstressed in the east, consistent with the assessment of the Central Ground Water Board [28]. For the High Plains and Central Valley aquifer systems, groundwater stress consistently increases from north to south, in both cases as the aridity increases, which is also consistent with assessments of groundwater depletion [23]. Patterns of groundwater stress are qualitatively similar to GRACE-based assessments of depletion of groundwater resources in India [25] and the United States [39], although the multiple data sources in figures 1 and 2 do not suggest groundwater stress around Houston, Alabama or the Mid-Atlantic states as GRACE data suggest [39].
Qualitative patterns in the groundwater stress in figures 1 and 2 suggest that differences in recharge estimates contribute more significantly to differences in groundwater stress than differences in abstraction estimates. This can be seen by comparing the patterns of groundwater stress in columns versus rows in figures 1 and 2. The differences between rows are much greater than the differences between columns in both the United States and India. The abstraction estimates likely agree well with one another since the estimates may stem from similar or the same data sources. Compared to recharge, abstraction is more measurable and more often reported, particularly in developed countries (e.g., the USA). However, recharge is rarely observed directly, especially at the scale at which it is examined in this study. Moreover, recharge estimates are derived from a diversity of methods each with their own uncertainties. We further quantify the uncertainty between different data sources below.
Figure 1. Groundwater stress across India for the six possible combinations of groundwater recharge and abstraction data, compared to the stage of groundwater development at bottom [28]. The combination of abstraction and recharge from Wada et al [26] is used to calculate the 'internal uncertainty' whereas all combinations are used to calculate the 'ensemble uncertainty' (figure 4).
Although figures 1 and 2 importantly show that the assessment of groundwater stress depends on input data, it is also interesting to examine where results are consistent with different data sources. To do so, each aquifer was classified as stressed (GF/A_A > 1) or unstressed (GF/A_A < 1) for each possible combination of input data and then categorized as stressed for all, some or none of the combinations of input data (figure 3). A minority of aquifers (11% in India and 5% in the United States) have a GF/A_A > 1 regardless of input data (table S1 available at stacks.iop.org/ERL/8/044010/mmedia). In the United States, these are the High Plains and Central Valley aquifer systems as well as the Basin and Range aquifer system in the southwest. In India this is primarily the western Ganges-Brahmaputra aquifer systems. A number of aquifers (36% in India and 17% in the United States) can be considered stressed for some input data but not others. This includes much of the western United States and peninsular India. The majority of aquifers (53% in India and 78% in the United States) are not considered stressed by any input data, primarily in the eastern and northwestern United States and northeastern India.
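The categorization described above can be sketched as follows. This is an illustration only: the aquifer names and flux values are hypothetical, E is assumed to be the same fixed fraction of R for every recharge dataset, and a ratio of exactly 1 (which the text does not classify) is treated here as unstressed.

```python
from collections import Counter
from itertools import product


def stress_category(recharge_estimates, abstraction_estimates, env_fraction):
    """Return 'all', 'some' or 'none' depending on how many combinations
    of recharge and abstraction estimates give GF/A_A > 1."""
    # Assumption: E = env_fraction * R, so R - E = R * (1 - env_fraction).
    net_recharge = [r * (1.0 - env_fraction) for r in recharge_estimates]
    stressed = [c / n > 1.0 for n, c in product(net_recharge, abstraction_estimates)]
    if all(stressed):
        return "all"
    if any(stressed):
        return "some"
    return "none"


# Hypothetical aquifers: (recharge estimates, abstraction estimates, E/R fraction).
aquifers = {
    "A": ([0.10, 0.20], [0.30], 0.2),
    "B": ([0.10, 0.50], [0.20], 0.2),
    "C": ([0.50, 0.60], [0.10, 0.20], 0.2),
}
categories = {name: stress_category(*inputs) for name, inputs in aquifers.items()}
counts = Counter(categories.values())
# Percentage of aquifers in each category, analogous to the shares reported above.
shares = {category: 100.0 * n / len(aquifers) for category, n in counts.items()}
```

Aquifers in the "some" category are those for which the conclusion depends on the choice of input data.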
Although a minority of aquifers in each country are stressed regardless of input data, the regions where groundwater is stressed have different extents and characteristics (table 2). We calculated the areal extent of each aquifer category as well as the population [40] and water well density in each region [41,42]. The areal proportion of India where groundwater is stressed with all or some combinations of input data (∼70%) is much greater than that of the United States (∼26%). Similarly with population: ∼70% of the population of India live in regions where groundwater is stressed with all or some combinations of input data, whereas only ∼30% of the population in the United States live in such regions. This means that in India and the United States, ∼700 million and ∼80 million people, respectively, could be directly impacted by groundwater stress although more people could be indirectly impacted via the virtual water trade [22,43,44]. The density of water wells is markedly different, which has implications for groundwater management and policy of common pool resources such as groundwater. The water well density in India is 2-3 orders of magnitude greater than in the United States. In sum, in the United States, groundwater stress potentially impacts a smaller percentage of the population and regions with lower population and water well density compared to India. Although groundwater stress affects a lower percentage of the population in the United States, in both countries groundwater stress affects agriculturally important regions. For example, the High Plains and Central Valley aquifer systems substantially contribute to crop and food production in the United States, with their market value of agricultural products amounting to 12% and 7% of the total in 2007, respectively [23].

The uncertainty of multiple data sources
In order to examine the importance of using multiple input data, we examine how the uncertainty within a single combination of input data (herein called 'internal uncertainty') compares to the uncertainty between different data sources (herein called 'ensemble uncertainty'). The internal uncertainty of recharge and groundwater abstraction was estimated by Wada et al [26] using a Monte Carlo simulation with 100 independent realizations of R and C. This resulted in 10 000 values for the groundwater footprint from which the mean (µ) and standard deviation (σ) of the GF/A_A ratio were computed. The ensemble uncertainty was estimated by calculating the mean and standard deviation of all the possible combinations of groundwater recharge and abstraction data from figure 1 (n = 6) and figure 2 (n = 8).
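The two uncertainty measures can be sketched as follows. This is a hedged illustration rather than the procedure of Wada et al [26]: the Gaussian draws, their parameters, the fixed E/R fraction, the dataset values and the use of the sample standard deviation are all assumptions made for the example.

```python
import random
import statistics
from itertools import product


def stress_ratio(abstraction, recharge, env_fraction):
    """GF/A_A = C / (R - E), assuming E = env_fraction * R."""
    return abstraction / (recharge * (1.0 - env_fraction))


def coefficient_of_variation(values):
    """Return sigma/mu, using the sample standard deviation (an assumption)."""
    return statistics.stdev(values) / statistics.fmean(values)


def all_ratios(recharge_values, abstraction_values, env_fraction):
    """GF/A_A for every pairing of a recharge value with an abstraction value."""
    return [stress_ratio(c, r, env_fraction)
            for r, c in product(recharge_values, abstraction_values)]


# Internal uncertainty: 100 realizations each of R and C give 100 x 100 ratios.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
r_draws = [max(rng.gauss(0.5, 0.05), 1e-6) for _ in range(100)]  # hypothetical
c_draws = [max(rng.gauss(0.3, 0.03), 0.0) for _ in range(100)]   # hypothetical
internal = all_ratios(r_draws, c_draws, env_fraction=0.2)
internal_cv = coefficient_of_variation(internal)

# Ensemble uncertainty: one value per dataset, here 3 recharge x 2 abstraction
# hypothetical datasets, giving n = 6 combinations.
ensemble = all_ratios([0.2, 0.5, 0.8], [0.25, 0.35], env_fraction=0.2)
ensemble_cv = coefficient_of_variation(ensemble)
```

Pairing every recharge realization with every abstraction realization is one way to arrive at 10 000 values from 100 realizations of each quantity; expressing both measures as σ/µ makes them comparable across aquifers with very different GF/A_A ratios.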
The internal and ensemble uncertainties for 3191 aquifers in the United States and 309 aquifers in India are reported as the coefficient of variation (σ/µ) to reduce the impact of outliers. The ensemble uncertainty is much greater than the internal uncertainty for most aquifers (figure 4). For example, the median coefficient of variation for the ensemble is nearly equal to the mean GF/A_A (σ/µ = 0.7 for United States and σ/µ = 1.0 for India) whereas the median internal uncertainty is two to five times smaller (σ/µ = 0.18 for United States and σ/µ = 0.2 for India). This suggests that the uncertainty may be much larger than that accounted for by internal uncertainty for many aquifers. The large ensemble uncertainty could be due to the difference between data sources or potentially a large unknown internal uncertainty of one of the data sets. A minority of aquifers (20% in the United States and 4% in India) have an internal uncertainty greater than the ensemble uncertainty (figure S2 available at stacks.iop.org/ERL/8/044010/mmedia). The aquifers where the internal uncertainty is greater than the ensemble uncertainty were mapped and no spatial pattern was detected (figure S3 available at stacks.iop.org/ERL/8/044010/mmedia), suggesting that there are no regions where one type of uncertainty predominates. Like the overall patterns (figure 4), the normalized ensemble uncertainty is generally greater than the normalized internal uncertainty for the important and stressed aquifers discussed above, the Central Valley, High Plains and Ganges-Brahmaputra aquifer systems (figure 5).
Uncertainty, expressed as the coefficient of variation of the GF/A_A ratio, is plotted against the frequency normalized by the total number of aquifers. The internal uncertainty is two to five times less than the ensemble uncertainty.
In order to further examine what controls uncertainty, we considered the possible relationship between uncertainty and GF/A_A ratios. For both the normalized internal uncertainty and the normalized ensemble uncertainty, there are no relationships with GF/A_A ratios in the United States or India (figure S4 available at stacks.iop.org/ERL/8/044010/mmedia). This suggests that the uncertainty is not dependent on the GF/A_A ratios but rather on the difference between the multiple data sources.

From groundwater stress to management and policy
We quantify the distribution and uncertainty of groundwater stress across nations for the first time. Here we have focused on the United States and India, because these countries have multiple input data sets, the highest national groundwater abstraction rates in the world [3], and significant localized groundwater depletion [5,7,23-25]. The spatial coherence (figure 3) and consistency with other assessments of groundwater stress [28] and groundwater depletion [5,7,23-25,39] suggest that our assessment of groundwater stress is reasonable. We also importantly develop a methodology that can be applied to other countries or regions. A crucial step to enabling more detailed assessment of groundwater stress globally is detailed aquifer maps, which are not available for most countries.
Although it is difficult to move from metrics of groundwater stress to groundwater management and policy, our analysis may be important for water management and policy for multiple reasons:
(1) The analysis is consistent across the scale over which regional to national water management and policy is developed and implemented. Previously, aquifer stress has been assessed globally but only for regional aquifers [16] or nationally using a single data set [28].
(2) The maps (figures 1-3) can also be viewed as a 'dashboard' that highlights groundwater stress for water managers or policy makers. By comparing and integrating multiple datasets, this dashboard reveals how the data used affects how we quantify and potentially interpret groundwater stress. Categorizing aquifers based on whether they are stressed for different input data (figure 3) could be a useful screening-level tool for groundwater management and policy. For example, the minority of the aquifers that are stressed regardless of input data could be the focus of limited management resources. Research efforts to better quantify water budgets could be focused on the aquifers that are stressed for some but not all of the input data. Both of these aquifer categories may be important to focus on for future climate and water use scenarios.
(3) A large number of users makes managing common pool resources such as groundwater more difficult. The absolute number of water wells and the water well density are much greater in India than in the United States, which can be taken into account in management and policy.
(4) The results indicate that uncertainty is generally greater between datasets than within datasets (figure 4) and that much of the uncertainty is due to recharge estimates (figures 1 and 2). Therefore, to more fully incorporate uncertainty into management and policy, the ensemble approach using multiple input data sets is important. Future research that further constrains recharge estimates on regional to continental scales would be valuable.
Even though groundwater is locally stressed and the assessment of water stress depends on the quality and type of input data, our analysis shows the value in assessing groundwater stress consistently across a nation with multiple datasets. The two nations with the highest national groundwater abstraction rates in the world [3] as well as significant localized groundwater depletion [5,7,23-25] have significant similarities and differences, both of which affect groundwater policy and management. In both countries the uncertainty of stress is much greater when multiple data sets are considered. However, the number of people impacted by groundwater stress and the density of wells are vastly different.