Machine learning based downscaling of GRACE-estimated groundwater in Central Valley, California

https://doi.org/10.1016/j.scitotenv.2022.161138

Highlights

  • Machine Learning approach to integrate in-situ groundwater and remote sensing data

  • Groundwater storage variations over one and a half decades at 5 km resolution

  • Model the impact of two droughts on groundwater storage variations

  • Novel application for broader applicability of GRACE gravimetry data

Abstract

California's Central Valley, one of the most agriculturally productive regions, is also home to one of the most stressed aquifers in the world due to anthropogenic groundwater over-extraction, primarily for irrigation. Groundwater depletion is further exacerbated by climate-driven droughts. Gravity Recovery and Climate Experiment (GRACE) satellite gravimetry has demonstrated the feasibility of quantifying global groundwater storage changes at uniform monthly sampling, but its coarse spatial resolution makes it impractical for effective water resources management. Here, we employ the Random Forest machine learning algorithm to establish empirical relationships between GRACE-derived groundwater storage and in situ groundwater level variations over the Central Valley during 2002–2016, and achieve spatial downscaling of GRACE-observed groundwater storage changes from a few hundred km to 5 km. Validation of our modeled groundwater levels against in situ groundwater levels yields high Nash-Sutcliffe Efficiency coefficients ranging from 0.94 to 0.97. In addition, the secular components of modeled groundwater show good agreement with those of vertical displacements observed by GPS and by CryoSat-2 radar altimetry, and are consistent with findings from previous studies. Our estimated groundwater loss is about 30 km³ from 2002 to 2016, which also agrees well with previous studies in Central Valley. We find that the maximum groundwater storage loss rates of −5.7 ± 1.2 km³ yr⁻¹ and −9.8 ± 1.7 km³ yr⁻¹ occurred during the extended drought periods of January 2007–December 2009 and October 2011–September 2015, respectively, while Central Valley also experienced groundwater recharge during prolonged flood episodes. The 5-km resolution Central Valley-wide groundwater storage trends reveal that groundwater depletion occurs mostly in southern San Joaquin Valley, collocated with severe land subsidence due to aquifer compaction from excessive groundwater withdrawal.

Introduction

Groundwater is an important freshwater resource that meets agricultural, industrial, and domestic needs (Siebert et al., 2010; Wada et al., 2014; Zektser et al., 2004). Over the past few decades, several aquifers worldwide, such as those of the Central Valley, High Plains, Indus Plain, and Middle East, have faced unprecedented human-induced stress due to population growth, expansion of irrigated areas, and other economic activities that have drastically increased groundwater consumption (Bierkens and Wada, 2019; Famiglietti, 2014). Climate change might affect the natural recharge cycle of groundwater reservoirs by significantly altering precipitation and evapotranspiration patterns. Climate extremes such as floods and droughts might drastically increase or decrease the recharge (Taylor et al., 2012). Groundwater abstraction and outflow exceeding groundwater recharge over a long period of time and over large areas have been reported as the main causes of groundwater depletion (Konikow and Kendy, 2005; Wada et al., 2010). Groundwater depletion can threaten global water and food security and cause environmental problems (Famiglietti, 2014; Wada et al., 2010), which could trigger mass emigration. There is an urgent need to quantify long-term groundwater storage (GWS) changes at frequent temporal sampling, which can help in better managing groundwater resources and in characterizing groundwater depletion in these stressed regions.

Quantifying GWS changes is especially important for Central Valley. Here, ever-increasing irrigation demands, limited availability of surface water, and climate extremes such as prolonged and intensified droughts resulting from climate change have forced farmers to depend more on groundwater. As a result of the continuing groundwater depletion, several adverse impacts such as falling groundwater levels, decreasing groundwater yields, increase in pumping costs, degrading water quality, and damage to the aquatic ecosystems and wetlands have been observed (Faunt, 2009; Faunt and Sneed, 2015; Konikow, 2015). San Joaquin Valley, a major agricultural region in Central Valley, has witnessed the largest share of such adverse impacts, which have become more severe during prolonged and recurrent droughts in California.

Several approaches for quantifying GWS changes have been applied in the past (e.g., Bierkens and Wada, 2019). Groundwater levels from in situ ground wells provide essential information about stresses acting on the aquifers and play a key role in developing groundwater models (Faunt, 2009; Taylor and Alley, 2001). However, it is infeasible to use only these data for quantifying regional GWS changes, as several aquifers have poor coverage of such wells owing to the high cost of their installation and maintenance. Moreover, spatio-temporal gaps in the groundwater level data might necessitate their interpolation, which might lead to additional errors (Ahamed et al., 2022; Thomas et al., 2017). Further, uncertainties in the value of storage coefficients at well sites might translate into errors when computing GWS changes (Alam et al., 2021; Scanlon et al., 2012). Another approach to quantifying GWS changes is to use data from the Gravity Recovery and Climate Experiment (GRACE) twin-satellite gravimetry mission. GRACE has enabled a continuous and uniform global Terrestrial Water Storage (TWS) record spanning April 2002 to October 2017, at a “true” spatial resolution longer than 666 km (full wavelength) and monthly sampling (Frappart and Ramillien, 2018). Innovative processing of GRACE data has enabled the uniform global quantification of GWS change by removing surface water storage changes using hydrologic data and model outputs (Famiglietti et al., 2011; Rodell et al., 2009), as well as data assimilation (e.g., 50 km resolution in Mehrnegar et al. (2021); 12.5 km resolution in Schumacher et al. (2018)). However, due to the limited spatial resolution and the associated errors in disaggregating GRACE-derived TWS (Scanlon et al., 2012), the direct application of GRACE data for groundwater assessment is not feasible at the local scale (Alley and Konikow, 2015). In Central Valley, Famiglietti et al. (2011) was the first study to use GRACE-derived TWS changes and other hydrological variables to quantify GWS changes during 2002–2011. Scanlon et al. (2012) used updated GRACE processing and in situ groundwater level variations to compute groundwater depletion from 2002 to 2011. The above studies estimated GWS changes by removing soil moisture estimates simulated by Land Surface Models (LSMs) from GRACE-derived TWS (Scanlon et al., 2012). However, LSMs do not simulate irrigation water use; hence soil moisture values will be particularly erroneous in the Central Valley, where groundwater irrigation is predominant (Famiglietti et al., 2011).
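As a hedged illustration of the disaggregation described above, the GRACE-based groundwater estimate can be written as the residual of a terrestrial water budget. The symbols are illustrative rather than taken from the paper: the text explicitly mentions soil moisture (SM) and surface water (SW), while the snow water equivalent (SWE) term is an assumption that is commonly included in such budgets, and the exact set of terms varies between studies.

```latex
\Delta \mathrm{GWS}(t) = \Delta \mathrm{TWS}(t) - \left[ \Delta \mathrm{SM}(t) + \Delta \mathrm{SW}(t) + \Delta \mathrm{SWE}(t) \right]
```

Written this way, any error in the LSM-simulated soil moisture term, such as the missing irrigation signal noted above, passes directly into the groundwater residual.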

Vertical deformation observed by Interferometric Synthetic Aperture Radar (InSAR) during droughts has also been inverted to derive GWS changes. Recent studies have used a combination of in situ, satellite, and modeling data to quantify GWS changes. Alam et al. (2021) used a combination of GRACE, in situ wells, water balance and hydrological modeling to quantify GWS variations during 2003–2019. Ahamed et al. (2022) used remote sensing data and an ensemble of water balance methods to quantify GWS changes in Central Valley during 2002–2020. While all these studies have confirmed the continued loss of GWS along with dramatic rates of subsidence during the last two decades, all the techniques except those incorporating in situ groundwater levels have limited capability to model GWS changes at high spatial resolutions and frequent temporal intervals. Groundwater levels in Central Valley can reflect complex variations due to withdrawal for irrigation, recharge due to partial infiltration of irrigation water, surface water impoundment, or precipitation. Climate extremes such as drought, which have put unprecedented stress on groundwater reserves, are also reflected in the groundwater fluctuations (Faunt, 2009). This necessitates the incorporation of in situ groundwater level data in groundwater models. Therefore, we propose to use Machine Learning (ML), an effective data-driven approach, to estimate GWS changes at a higher spatial resolution by downscaling GRACE-derived GWS changes to model in situ groundwater level variations. ML has been used for solving several complex non-linear problems in geoscience (e.g., Berner et al., 2020; Chen et al., 2021; Dramsch, 2020; Sun and Scanlon, 2019), as it does not require knowledge of the exact physical relationships between input and response variables. Further, ML methods can jointly use different types of data with different units, scales and accuracy, and are thus suitable for empirically modeling complex hydrological processes, such as basin-wide groundwater variations. Several studies in the past have incorporated ML algorithms to downscale GRACE data and produce GWS changes at high resolution for various aquifers (Chen et al., 2019; Chen et al., 2020; Miro and Famiglietti, 2018; Rahaman et al., 2019).
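To make the idea of an empirical, data-driven mapping concrete, the following is a minimal sketch of bootstrap aggregation with random feature selection, the two ingredients of Random Forest regression. It is a deliberately simplified stand-in (single-split trees, standard library only) and not the model used in this study; the predictor names, the toy values, and all function names are hypothetical.

```python
import random
from statistics import mean

def fit_stump(X, y, feature):
    """Fit a one-split regression tree on a single feature (least squared error)."""
    best = None
    values = sorted({row[feature] for row in X})
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= thr]
        right = [t for row, t in zip(X, y) if row[feature] > thr]
        m_left, m_right = mean(left), mean(right)
        sse = (sum((t - m_left) ** 2 for t in left)
               + sum((t - m_right) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, thr, m_left, m_right)
    if best is None:
        # The feature is constant in this bootstrap sample: predict the mean.
        m = mean(y)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])

def fit_forest(X, y, n_trees=50, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(p)                  # random feature for this tree
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Average the predictions of all trees."""
    return mean(m_left if row[f] <= thr else m_right
                for f, thr, m_left, m_right in forest)

# Hypothetical toy data: [GRACE-derived GWS anomaly (cm), precipitation (mm)]
# as predictors and an in situ groundwater level anomaly (m) as the response.
X = [[-8, 5], [-6, 10], [-4, 20], [-2, 30], [0, 40], [2, 60], [4, 80], [6, 90]]
y = [-3.0, -2.5, -1.5, -1.0, 0.0, 0.8, 1.6, 2.4]

forest = fit_forest(X, y)
print(predict(forest, [-8, 5]), predict(forest, [6, 90]))
```

A practical model would grow deeper trees and use many more predictors and samples; the sketch only shows how predictors with different units can be combined without specifying a physical relationship.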

The primary objective of this study is to downscale GRACE-derived GWS changes in Central Valley, California, using the Random Forest ML algorithm to model and simulate monthly groundwater level and GWS changes at a spatial resolution as fine as 5 km. This study contrasts with Miro and Famiglietti (2018), which used Artificial Neural Networks (ANN) to model annual GWS changes during 2003–2010 for a portion of San Joaquin Valley. We chose the period from October 2002 to September 2016, which covers most of the operational phase of GRACE satellite data. GRACE data beyond November 2016 were excluded to avoid errors due to the accelerometer data transplant; the accelerometer instrument onboard one of the twin satellites (GRACE-B) had thermal issues and remained non-operational until the end of the mission (Bandikova et al., 2019). We use GRACE data along with hydro-meteorologic/geologic data as input and in situ groundwater level data as the response variable for developing the RF model. Further, the Central Valley has a record of geodetic measurements from in situ GPS, synthetic aperture radar interferometry, extensometers, and others, which have been used to quantify the subsidence due to groundwater overdraft (Ojha et al., 2018; Sneed and Brandt, 2015). While groundwater level change and land subsidence are two different physical processes, the subsidence measurement data can be used to qualitatively compare or validate our ML-modeled groundwater levels. We then validate the ML-modeled groundwater level using GPS vertical deformation data and the basin-wide subsidence rate measured by a radar altimeter over Central Valley, CA (Yang, 2020). Here we compute inelastic storage coefficients using geodetic satellite subsidence measurements for severely subsiding regions in Central Valley for validation. This approach of combining multiple hydrological and geodetic data can further enhance our understanding of aquifer dynamics. The ultimate goal of this study is to verify the feasibility of using ML-downscaled GWS change over the whole Central Valley. We also compare our results with estimates from prior studies, which further validates the overall results. An ML approach, such as the one presented here, is hypothesized to be able to produce local-scale groundwater storage and level information for Groundwater Sustainability Agencies to make informed management decisions under the Sustainable Groundwater Management Act (SGMA).
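The link between subsidence, groundwater level, and storage mentioned above can be sketched with two textbook relations. This is a minimal sketch and not the authors' procedure: it assumes the inelastic skeletal storage coefficient is approximated as compaction per unit head decline, and that a storage volume change is the storage coefficient times head change times area. All function names and numbers are hypothetical.

```python
def inelastic_storage_coefficient(subsidence_m, head_decline_m):
    """Approximate the inelastic skeletal storage coefficient (dimensionless)
    as permanent compaction divided by the head decline that caused it."""
    if head_decline_m <= 0:
        raise ValueError("head decline must be positive")
    return subsidence_m / head_decline_m

def storage_change_km3(storage_coeff, head_change_m, area_km2):
    """Convert a head change to a storage volume change.

    volume = S * dh * A, with dh converted from m to km so the result is in km^3.
    A negative head change (falling water level) gives a negative volume.
    """
    return storage_coeff * (head_change_m / 1000.0) * area_km2

# Hypothetical example: 0.5 m of subsidence accompanying a 20 m head decline
# over a 1000 km^2 subsiding area.
s_kv = inelastic_storage_coefficient(0.5, 20.0)
delta_v = storage_change_km3(s_kv, -20.0, 1000.0)
print(s_kv, delta_v)
```

Note that a coefficient derived this way captures only the inelastic (compaction-related) part of storage change, which is why the text restricts it to severely subsiding regions.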

The rest of the paper is organized as follows. Section 2 introduces the study area and describes the data and methodology, along with the details of model building and validation. Section 3 presents the numerical results and comparisons with previous studies. Section 4 discusses the findings of the study, its main limitations, and future perspectives. Finally, Section 5 draws conclusions.

Section snippets

Study area

The Central Valley aquifer system in California covers an area of 52,000 km² (Fig. 1) and produces one-fourth of the food in the US (Faunt, 2009). Central Valley is primarily semi-arid, and most precipitation occurs during the winter and early spring months and not in summer when it is most needed for irrigation and drinking (Jasechko and Perrone, 2020). San Joaquin Valley is the major agricultural region and surface water quantity here depends on seasonal snowmelt from the Sierra Nevada in the

Validation of modeling results

The results from the RF models show high accuracy for both the San Joaquin and Sacramento Valleys (Fig. 4). For San Joaquin Valley, the correlation coefficient, RMSE, NSE, and R* for training (test) data are 0.99 (0.97), 1.35 (2.72), 0.99 (0.95), and 0.12 (0.21), respectively. The same metrics computed over Sacramento Valley for training (test) data are 0.99 (0.95), 1.21 (2.12), 0.98 (0.94), and 0.14 (0.26). Additional validations of model results with respect to the out-of-bag data are provided in
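For readers unfamiliar with the metrics quoted above, the following sketch shows the standard definitions of the correlation coefficient, RMSE, and Nash-Sutcliffe Efficiency (NSE). R* is not defined in this excerpt and is therefore omitted. The two series are hypothetical values chosen for illustration, not data from the study.

```python
from math import sqrt

def rmse(obs, sim):
    """Root-mean-square error, in the units of the observations."""
    return sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - num / den

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    n = len(obs)
    mean_obs, mean_sim = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / sqrt(var_obs * var_sim)

# Hypothetical groundwater level anomalies (m): observed vs. modeled.
observed = [-1.2, -0.4, 0.3, 1.1, 0.6, -0.8]
modeled = [-1.0, -0.5, 0.2, 1.0, 0.8, -0.7]

print(pearson_r(observed, modeled), rmse(observed, modeled), nse(observed, modeled))
```

Unlike the correlation coefficient, NSE penalizes bias and amplitude errors, which is one reason the two are usually reported together.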

Machine learning modeling

Our study achieved high accuracy for both training and test data in the Sacramento and San Joaquin Valleys (Fig. 4). This suggests that downscaling GRACE data to model groundwater level variations at the sites of in situ wells was successful. By adopting a cross-validation scheme, the model development and training process avoided overfitting. Overfitting can reduce confidence in ML results, which can be a challenge for downscaling studies as we seek to model groundwater variations at higher resolutions (
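The cross-validation idea mentioned above can be sketched as a k-fold split, in which every sample is held out exactly once. This is an illustrative sketch only: the number of folds, the random shuffling, and the function name are assumptions, since the scheme actually used is not given in this excerpt.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # reproducible shuffle
    folds = [idx[i::k] for i in range(k)]     # fold sizes differ by at most one
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

# Each sample appears in exactly one test fold; the model is refit on the
# remaining folds and scored on the held-out one.
for train_idx, test_idx in k_fold_indices(12, k=4):
    print(sorted(test_idx), len(train_idx))
```

Comparing the scores on held-out folds with the scores on the training folds is the usual way to detect the overfitting discussed above.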

Conclusions

This study advances the application of remote sensing data in the field of hydrological sciences by demonstrating an effective and improved downscaling of GRACE-estimated groundwater storage variations in Central Valley to a spatial resolution of 5 km using a Random Forest ML approach together with other hydrologic, meteorologic, and geologic datasets. We applied it in the Central Valley region, which has developed an ever-increasing groundwater demand for irrigation given the lack of surface water

CRediT authorship contribution statement

Vibhor Agarwal: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing. Orhan Akyilmaz: Conceptualization, Methodology, Resources, Writing – review & editing, Supervision. C.K. Shum: Conceptualization, Writing – review & editing, Supervision, Project administration, Funding acquisition. Wei Feng: Software, Writing – review & editing, Supervision. Ting-Yi Yang: Resources, Writing – review &

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

Vibhor Agarwal received an OSU Fellowship and a salary as a Teaching Assistant while a PhD student at OSU. Orhan Akyilmaz and Metehan Uz are partially supported by the Scientific and Technological Research Council of Turkey - TÜBİTAK (119Y176). C.K. Shum is partially supported by NSF's Partnerships for Innovation Program (2044704), and NASA's Earth Surface Interior Focus Area Program (80NSSC20K0494). Ehsan Forootan is supported by the Danmarks Frie Forskningsfond [10.46540/2035-00247B] under DANSk-LSM

References (87)

  • M. Schumacher et al.

    Improving drought simulations within the Murray-Darling Basin by combined calibration/assimilation of GRACE data into the WaterGAP global hydrology model

    Remote Sens. Environ.

    (2018)
  • W.M. Seyoum et al.

    Improved methods for estimating local terrestrial water dynamics from GRACE in the Northern High Plains

    Adv. Water Resour.

    (2017)
  • Y. Sun et al.

    Comparison of interpolation methods for depth to groundwater and its temporal and spatial variations in the Minqin oasis of Northwest China

    Environ. Model Softw.

    (2009)
  • B.F. Thomas et al.

    GRACE groundwater drought index: evaluation of California Central Valley groundwater drought

    Remote Sens. Environ.

    (2017)
  • M. Uz et al.

    Bridging the gap between GRACE and GRACE-FO missions with deep learning aided water storage simulations

    Sci. Total Environ.

    (2022)
  • W. Yin et al.

    Comparison of physical and data-driven models to forecast groundwater level changes with the inclusion of GRACE – a case study over the state of Victoria, Australia

    J. Hydrol.

    (2021)
  • V. Agarwal

    Machine Learning Applications for Downscaling Groundwater Storage Changes Integrating Satellite Gravimetry and Other Observations [Doctoral dissertation, Ohio State University]

    (2021)
  • G. A et al.

    Computations of the viscoelastic response of a 3-D compressible Earth to surface loading: an application to Glacial Isostatic Adjustment in Antarctica and Canada

    Geophys. J. Int.

    (2013)
  • S. Alam et al.

    Post-drought groundwater storage recovery in California's Central Valley

    Water Resour. Res.

    (2021)
  • W.M. Alley et al.

    Bringing GRACE Down to Earth

    Groundwater

    (2015)
  • L.T. Berner

    Summer warming explains widespread but not uniform greening in the Arctic tundra biome

    Nat. Commun.

    (2020)
  • G.L. Bertoldi

    Ground-water Resources of the Central Valley of California

    (1989)
  • G. Biau et al.

    A random forest guided tour

    Test

    (2016)
  • M.F.P. Bierkens et al.

    Non-renewable groundwater use and groundwater depletion: a review

    Environ. Res. Lett.

    (2019)
  • L. Breiman

    Random forests

    Mach. Learn.

    (2001)
  • C. Chen et al.

    A comparative study among machine learning and numerical models for simulating groundwater dynamics in the Heihe River Basin, northwestern China

    Sci. Rep.

    (2020)
  • L. Chen et al.

    Downscaling of GRACE-derived groundwater storage based on the random forest model

    Remote Sens.

    (2019)
  • M.K. Cheng et al.

    Monthly Estimates of C20 From 5 SLR Satellites Based on GRACE RL06 Models

    (2018)
  • C. Daly et al.

    Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States

    Int. J. Climatol.

    (2008)
  • Periodic groundwater level measurements. [Data file]

  • Continuous groundwater level measurements. [Data file]

  • J.S. Famiglietti

    The global groundwater crisis

    Nat. Clim. Chang.

    (2014)
  • J.S. Famiglietti et al.

    Satellites measure recent rates of groundwater depletion in California's Central Valley

    Geophys. Res. Lett.

    (2011)
  • T.G. Farr et al.

    Progress Report: Subsidence in the Central Valley, California

    (2015)
  • C.C. Faunt

    Groundwater Availability of the Central Valley Aquifer, California: U.S. Geological Survey Professional Paper

    (2009)
  • C.C. Faunt et al.

    Development of a three-dimensional model of sedimentary texture in valley-fill deposits of Central Valley, California, USA

    Hydrogeol. J.

    (2010)
  • C.C. Faunt et al.

    Water availability and subsidence in California's Central Valley

    San Francisco Estuary Watershed Sci.

    (2015)
  • C.C. Faunt et al.

    Water availability and land subsidence in the Central Valley, California, USA

    Hydrogeol. J.

    (2016)
  • M. Feurer et al.

    Hyperparameter optimization

  • F. Frappart et al.

    Monitoring groundwater storage changes using the Gravity Recovery and Climate Experiment (GRACE) satellite mission: a review

    Remote Sens.

    (2018)
  • D.M. Hawkins et al.

    Assessing model fit by cross-validation

    J. Chem. Inf. Comput. Sci.

    (2003)
  • T. Hengl et al.

    Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables

    PeerJ

    (2018)
  • S. Jasechko et al.

    California's Central Valley groundwater wells run dry during recent drought

    Earth’s Future

    (2020)