Machine learning based downscaling of GRACE-estimated groundwater in Central Valley, California
Introduction
Groundwater is an important freshwater resource that meets agricultural, industrial, and domestic needs (Siebert et al., 2010; Wada et al., 2014; Zektser et al., 2004). Over the past few decades, several aquifers worldwide, such as Central Valley, High Plains, Indus Plain, Middle East, and others, have faced unprecedented human-induced stress as population growth, expansion of irrigated areas, and other economic activities have drastically increased groundwater consumption (Bierkens and Wada, 2019; Famiglietti, 2014). Climate change might affect the natural recharge cycle of groundwater reservoirs by significantly altering precipitation and evapotranspiration patterns. Climate extremes such as floods and droughts might drastically increase or decrease recharge (Taylor et al., 2012). Groundwater abstraction and outflow exceeding groundwater recharge over long periods and large areas have been reported as the main causes of groundwater depletion (Konikow and Kendy, 2005; Wada et al., 2010). Groundwater depletion can lead to global water security, environmental, and food security issues (Famiglietti, 2014; Wada et al., 2010), which could trigger mass emigration. There is an urgent need to quantify long-term groundwater storage (GWS) changes at frequent temporal sampling, which can support better management of groundwater resources and help characterize groundwater depletion in these stressed regions.
Quantifying GWS changes is especially important for Central Valley. Here, ever-increasing irrigation demands, limited availability of surface water, and climate extremes such as prolonged and intensified droughts resulting from climate change have forced farmers to depend more on groundwater. As a result of the continuing groundwater depletion, several adverse impacts have been observed, including falling groundwater levels, decreasing groundwater yields, increasing pumping costs, degrading water quality, and damage to aquatic ecosystems and wetlands (Faunt, 2009; Faunt and Sneed, 2015; Konikow, 2015). San Joaquin Valley, a major agricultural region in Central Valley, has borne the largest share of these impacts, which have become more severe during prolonged and recurrent droughts in California.
Several approaches for quantifying GWS changes have been applied in the past (e.g., Bierkens and Wada, 2019). Groundwater levels from in situ wells provide essential information about stresses acting on the aquifers and play a key role in developing groundwater models (Faunt, 2009; Taylor and Alley, 2001). However, it is infeasible to use these data alone for quantifying regional GWS changes, as several aquifers have poor well coverage owing to the high cost of well installation and maintenance. Moreover, spatio-temporal gaps in the groundwater level data might necessitate interpolation, which might introduce additional errors (Ahamed et al., 2022; Thomas et al., 2017). Further, uncertainties in the storage coefficients at well sites might translate into errors when computing GWS changes (Alam et al., 2021; Scanlon et al., 2012). Another approach to quantifying GWS changes uses data from the Gravity Recovery and Climate Experiment (GRACE) twin-satellite gravimetry mission. GRACE has enabled a continuous and uniform global Terrestrial Water Storage (TWS) record spanning April 2002 to October 2017, at a “true” spatial resolution longer than 666 km (full wavelength) and with monthly sampling (Frappart and Ramillien, 2018). Innovative processing of GRACE data has enabled the uniform global quantification of GWS change by removing surface water storage changes using hydrologic data and model outputs (Famiglietti et al., 2011; Rodell et al., 2009), as well as through data assimilation (e.g., 12.5 km resolution in Mehrnegar et al. (2021); 50 km resolution in Schumacher et al. (2018)). However, due to the limited spatial resolution and the associated errors in disaggregating GRACE-derived TWS (Scanlon et al., 2012), applying GRACE data directly to groundwater assessment is not feasible at the local scale (Alley and Konikow, 2015). In Central Valley, Famiglietti et al. (2011) was the first study to use GRACE-derived TWS changes and other hydrological variables to quantify GWS changes during 2002–2011. Scanlon et al. (2012) used updated GRACE processing and in situ groundwater level variations to compute groundwater depletion from 2002 to 2011. The above studies estimated GWS changes by removing soil moisture estimates simulated by Land Surface Models (LSMs) from GRACE-derived TWS (Scanlon et al., 2012). However, LSMs do not simulate irrigation water use; hence soil moisture values will be particularly erroneous in the Central Valley, where groundwater irrigation is predominant (Famiglietti et al., 2011).
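The disaggregation described above amounts to a water balance on storage anomalies: groundwater is the residual after the other storage components are subtracted from TWS. The following is a minimal sketch, assuming the non-groundwater components are soil moisture, snow, and surface water. The component list, the function name `gws_anomaly`, and the sample values are illustrative assumptions, not the processing used in the cited studies.

```python
def gws_anomaly(tws, soil_moisture, snow=0.0, surface_water=0.0):
    """Groundwater storage anomaly as the residual of a storage water balance.

    All inputs are anomalies in the same unit (e.g., cm of equivalent water height).
    """
    return tws - soil_moisture - snow - surface_water


# Hypothetical monthly anomalies in cm, for illustration only
tws = [-4.0, -6.5, -9.0]   # GRACE-derived total water storage
sm = [-1.0, -2.0, -2.5]    # soil moisture, e.g., from a land surface model
swe = [0.5, 0.0, 0.0]      # snow water equivalent
sw = [-0.5, -1.0, -1.5]    # surface water (reservoirs, rivers)

gws = [gws_anomaly(t, s, n, w) for t, s, n, w in zip(tws, sm, swe, sw)]
print(gws)
```

Because groundwater is a residual, any error in the subtracted soil moisture term passes directly into the GWS estimate, which is why the missing irrigation signal in LSMs matters in this region.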
Vertical deformation observed during droughts from Interferometric Synthetic Aperture Radar (InSAR) has also been inverted to derive GWS changes. Recent studies have used a combination of in situ, satellite, and modeling data to quantify GWS changes. Alam et al. (2021) used a combination of GRACE, in situ wells, water balance, and hydrological modeling to quantify GWS variations during 2003–2019. Ahamed et al. (2022) used remote sensing data and an ensemble of water balance methods to quantify GWS changes in Central Valley during 2002–2020. While all these studies have confirmed the continued loss of GWS along with dramatic rates of subsidence during the last two decades, all the techniques except those incorporating in situ groundwater levels have limited capability to model GWS changes at high spatial resolution and frequent temporal intervals. Groundwater levels in Central Valley can reflect complex variations due to withdrawal for irrigation and recharge from partial infiltration of irrigation water, surface water impoundment, or precipitation. Climate extremes such as drought, which have put unprecedented stress on groundwater reserves, are also reflected in the groundwater fluctuations (Faunt, 2009). This necessitates incorporating in situ groundwater level data into groundwater models. Therefore, we propose to use Machine Learning (ML), an effective data-driven approach, to estimate GWS changes at a higher spatial resolution by downscaling GRACE-derived GWS changes to model in situ groundwater level variations. ML has been used to solve several complex non-linear problems in geoscience (e.g., Berner et al., 2020; Chen et al., 2021; Dramsch, 2020; Sun and Scanlon, 2019), as it does not require knowledge of the exact physical relationships between input and response variables.
Further, ML methods can jointly use different types of data with different units, scales, and accuracy, and are thus suitable for empirically modeling complex hydrological processes, such as basin-wide groundwater variations. Several past studies have used ML algorithms to downscale GRACE data and produce GWS changes at high resolution for various aquifers (Chen et al., 2019; Chen et al., 2020; Miro and Famiglietti, 2018; Rahaman et al., 2019).
The primary objective of this study is to downscale GRACE-derived GWS changes in Central Valley, California, using the Random Forest (RF) ML algorithm to model and simulate monthly groundwater level and GWS changes at a spatial resolution as fine as 5 km. This study contrasts with Miro and Famiglietti (2018), which used Artificial Neural Networks (ANN) to model annual GWS changes during 2003–2010 for a portion of San Joaquin Valley. We chose the period from October 2002 to September 2016, which covers most of the operational phase of the GRACE satellites. GRACE data beyond November 2016 were excluded to avoid errors due to the accelerometer data transplant; the accelerometer onboard one of the twin satellites (GRACE-B) had thermal issues and remained non-operational until the end of the mission (Bandikova et al., 2019). We use GRACE data along with hydro-meteorologic and geologic data as input and in situ groundwater level data as the response variable for developing the RF model. Further, the Central Valley has a record of geodetic measurements from in situ GPS, synthetic aperture radar interferometry, extensometers, and others, which have been used to quantify the subsidence due to groundwater overdraft (Ojha et al., 2018; Sneed and Brandt, 2015). While groundwater level change and land subsidence are two different physical processes, the subsidence measurement data can be used to qualitatively compare or validate our ML-modeled groundwater levels. We then validate the ML-modeled groundwater level using GPS vertical deformation data and the basin-wide subsidence rate measured by a radar altimeter over Central Valley, CA (Yang, 2020). Here we compute inelastic storage coefficients using geodetic satellite subsidence measurements for severely subsiding regions in Central Valley for validation. This approach of combining multiple hydrological and geodetic data can further enhance our understanding of aquifer dynamics.
The ultimate goal of this study is to verify the feasibility of using ML-downscaled GWS change over the whole Central Valley. We also compare our results with estimates from prior studies, which can further validate the overall results. An ML approach, such as the one presented here, is hypothesized to be able to produce local-scale groundwater storage and level information that Groundwater Sustainability Agencies can use to make informed management decisions under the Sustainable Groundwater Management Act (SGMA).
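The inelastic storage coefficient mentioned above links the geodetic and hydrological observations. A common definition, assumed here and not necessarily the exact procedure of this study, is the ratio of permanent compaction (observed as subsidence) to the groundwater level decline that caused it. The function name and the numbers below are hypothetical.

```python
def inelastic_storage_coefficient(subsidence_m, head_decline_m):
    """Ratio of permanent compaction (subsidence) to the causal head decline.

    Both inputs are in meters, so the result is dimensionless.
    """
    if head_decline_m <= 0:
        raise ValueError("head decline must be positive")
    return subsidence_m / head_decline_m


# Hypothetical values: 0.25 m of subsidence for a 10 m decline in groundwater level
skv = inelastic_storage_coefficient(0.25, 10.0)
print(skv)
```

With subsidence from satellite geodesy and head decline from modeled groundwater levels, a coefficient that falls within a physically plausible range offers an indirect consistency check on the modeled levels.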
The rest of the paper is organized as follows. Section 2 introduces the study area and describes the data and methodology, along with the details of model building and validation. Section 3 presents the numerical results and comparisons with previous studies. Section 4 discusses the findings of the study, its main limitations, and future perspectives. Finally, section 5 draws conclusions.
Section snippets
Study area
The Central Valley aquifer system in California covers an area of 52,000 km² (Fig. 1) and produces one-fourth of the food in the US (Faunt, 2009). Central Valley is primarily semi-arid, and most precipitation occurs during the winter and early spring months rather than in summer, when it is most needed for irrigation and drinking (Jasechko and Perrone, 2020). San Joaquin Valley is the major agricultural region and surface water quantity here depends on seasonal snowmelt from the Sierra Nevada in the
Validation of modeling results
The results from the RF models show high accuracy for both the San Joaquin and Sacramento Valleys (Fig. 4). For San Joaquin Valley, the correlation coefficient, RMSE, NSE, and R* for training (test) data are 0.99 (0.97), 1.35 (2.72), 0.99 (0.95), and 0.12 (0.21), respectively. The same metrics computed over Sacramento Valley for training (test) data are 0.99 (0.95), 1.21 (2.12), 0.98 (0.94), and 0.14 (0.26). Additional validations of model results with respect to the out-of-bag data are provided in
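Three of the metrics reported above have standard definitions and can be sketched as follows: the Pearson correlation coefficient, the root-mean-square error (RMSE), and the Nash-Sutcliffe efficiency (NSE). R* is left out because its definition is not given in this excerpt. The function names and the sample series are illustrative assumptions, not data from the study.

```python
import math


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    spread_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim))
    return cov / (spread_o * spread_s)


def rmse(obs, sim):
    """Root-mean-square error, in the units of the data."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_o = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - residual / variance


# Hypothetical groundwater levels, for illustration only
obs = [10.0, 12.0, 15.0, 11.0, 9.0]
sim = [10.5, 11.5, 14.0, 11.5, 9.5]
print(pearson_r(obs, sim), rmse(obs, sim), nse(obs, sim))
```

Comparing each metric between training and test data, as in the values reported above, shows how much skill is lost on wells the model has not seen.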
Machine learning modeling
Our study achieved high accuracy for both training and test data in the Sacramento and San Joaquin Valleys (Fig. 4). This suggests that downscaling GRACE data to model groundwater level variations at the sites of in situ wells was successful. The model development and training process adopted a cross-validation scheme, which avoided overfitting. Overfitting can reduce confidence in ML results, which can be a challenge for downscaling studies as we seek to model groundwater variations at higher resolutions (
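The role of cross-validation in detecting overfitting can be illustrated with a generic k-fold loop. This is a minimal sketch under stated assumptions: the fold count, the helper names, and the synthetic data are hypothetical, and the model is a one-predictor least-squares line used as a stand-in, not a Random Forest and not the scheme used in this study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(x, y, fit, predict, k=5, seed=0):
    """Return a (train_rmse, test_rmse) pair per fold for any fit/predict pair."""
    scores = []
    for test_idx in k_fold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])

        def fold_rmse(indices):
            sq_err = [(predict(model, x[i]) - y[i]) ** 2 for i in indices]
            return (sum(sq_err) / len(sq_err)) ** 0.5

        scores.append((fold_rmse(train_idx), fold_rmse(test_idx)))
    return scores


# Stand-in model (NOT a Random Forest): least-squares line with one predictor
def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
             / sum((a - mean_x) ** 2 for a in xs))
    return slope, mean_y - slope * mean_x


def predict_line(model, x_value):
    slope, intercept = model
    return slope * x_value + intercept


# Synthetic predictor and response, for illustration only
rng = random.Random(1)
x = [float(i) for i in range(50)]
y = [2.0 * v + 1.0 + rng.gauss(0.0, 0.5) for v in x]

scores = cross_validate(x, y, fit_line, predict_line, k=5)
for train_rmse, test_rmse in scores:
    print(round(train_rmse, 3), round(test_rmse, 3))
```

A held-out error that is consistently much larger than the training error across folds is the usual sign of overfitting. The same loop accepts any model that exposes fit and predict functions with these signatures.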
Conclusions
This study advances the application of remote sensing data in the hydrological sciences by demonstrating an effective and improved downscaling of GRACE-estimated groundwater storage variations in Central Valley to a spatial resolution of 5 km, using a Random Forest ML approach together with other hydrologic, meteorologic, and geologic datasets. We applied it in the Central Valley region, which has developed an ever-increasing groundwater demand for irrigation given the lack of surface water
CRediT authorship contribution statement
Vibhor Agarwal: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing. Orhan Akyilmaz: Conceptualization, Methodology, Resources, Writing – review & editing, Supervision. C.K. Shum: Conceptualization, Writing – review & editing, Supervision, Project administration, Funding acquisition. Wei Feng: Software, Writing – review & editing, Supervision. Ting-Yi Yang: Resources, Writing – review &
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
Vibhor Agarwal received an OSU Fellowship and a salary as a Teaching Assistant while a PhD student at OSU. Orhan Akyilmaz and Metehan Uz are partially supported by the Scientific and Technological Research Council of Turkey - TÜBİTAK (119Y176). C.K. Shum is partially supported by NSF's Partnerships for Innovation Program (2044704), and NASA's Earth Surface Interior Focus Area Program (80NSSC20K0494). Ehsan Forootan is supported by the Danmarks Frie Forskningsfond [10.46540/2035-00247B] under DANSk-LSM
References (87)
et al. Assessing the utility of remote sensing data to accurately estimate changes in groundwater storage. Sci. Total Environ. (2022)
et al. GRACE accelerometer data transplant. Adv. Space Res. (2019)
et al. Long-term groundwater variations in Northwest India from satellite gravity measurements. Glob. Planet. Chang. (2014)
Thermokarst acceleration in Arctic tundra driven by climate change and fire disturbance. One Earth (2021)
Geostatistics
70 years of machine learning in geoscience in review. Adv. Geophys. (2020)
et al. Monitoring groundwater change in California's Central Valley using Sentinel-1 and GRACE observations. Geosciences (Basel) (2019)
et al. Exploring groundwater and soil water storage changes across the CONUS at 12.5 km resolution by a Bayesian integration of GRACE data into W3RA. Sci. Total Environ. (2021)
et al. Prediction of GWL with the help of GRACE TWS for unevenly spaced time series data in India: analysis of comparative performances of SVR, ANN and LRM. J. Hydrol. (2018)
Deep learning, explained: fundamentals, explainability, and bridgeability to process-based modelling. Environ. Model. Softw. (2021)
Improving drought simulations within the Murray-Darling Basin by combined calibration/assimilation of GRACE data into the WaterGAP global hydrology model. Remote Sens. Environ.
Improved methods for estimating local terrestrial water dynamics from GRACE in the Northern High Plains. Adv. Water Resour.
Comparison of interpolation methods for depth to groundwater and its temporal and spatial variations in the Minqin oasis of Northwest China. Environ. Model. Softw.
GRACE groundwater drought index: evaluation of California Central Valley groundwater drought. Remote Sens. Environ.
Bridging the gap between GRACE and GRACE-FO missions with deep learning aided water storage simulations. Sci. Total Environ.
Comparison of physical and data-driven models to forecast groundwater level changes with the inclusion of GRACE – a case study over the state of Victoria, Australia. J. Hydrol.
Machine Learning Applications for Downscaling Groundwater Storage Changes Integrating Satellite Gravimetry and Other Observations [Doctoral dissertation, Ohio State University]
Computations of the viscoelastic response of a 3-D compressible Earth to surface loading: an application to Glacial Isostatic Adjustment in Antarctica and Canada. Geophys. J. Int.
Post-drought groundwater storage recovery in California's Central Valley. Water Resour. Res.
Bringing GRACE Down to Earth. Groundwater
Summer warming explains widespread but not uniform greening in the Arctic tundra biome. Nat. Commun.
Ground-water Resources of the Central Valley of California
A random forest guided tour. Test
Non-renewable groundwater use and groundwater depletion: a review. Environ. Res. Lett.
Random forests. Mach. Learn.
A comparative study among machine learning and numerical models for simulating groundwater dynamics in the Heihe River Basin, northwestern China. Sci. Rep.
Downscaling of GRACE-derived groundwater storage based on the random forest model. Remote Sens.
Monthly Estimates of C20 From 5 SLR Satellites Based on GRACE RL06 Models
Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol.
Periodic groundwater level measurements. [Data file]
Continuous groundwater level measurements. [Data file]
The global groundwater crisis. Nat. Clim. Chang.
Satellites measure recent rates of groundwater depletion in California's Central Valley. Geophys. Res. Lett.
Progress Report: Subsidence in the Central Valley, California
Groundwater Availability of the Central Valley Aquifer, California: U.S. Geological Survey Professional Paper
Development of a three-dimensional model of sedimentary texture in valley-fill deposits of Central Valley, California, USA. Hydrogeol. J.
Water availability and subsidence in California's Central Valley. San Francisco Estuary Watershed Sci.
Water availability and land subsidence in the Central Valley, California, USA. Hydrogeol. J.
Hyperparameter optimization
Monitoring groundwater storage changes using the Gravity Recovery and Climate Experiment (GRACE) satellite mission: a review. Remote Sens.
Assessing model fit by cross-validation. J. Chem. Inf. Comput. Sci.
Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables. PeerJ
California's Central Valley groundwater wells run dry during recent drought. Earth's Future