Reconstructing Terrestrial Water Storage Variations from 1980 to 2015 in the Beishan Area of China

College of Water Sciences, Beijing Normal University, Beijing 100875, China Beijing Key Laboratory of Urban Hydrological Cycle and Sponge City Technology, Beijing Normal University, Beijing 100875, China Engineering Research Center of Groundwater Pollution Control and Remediation of Ministry of Education, Beijing Normal University, Beijing 100875, China School of Engineering, University of Newcastle, Callaghan, New South Wales, Australia


Introduction
As a critical state variable in the hydrological cycle, terrestrial water storage (TWS) integrates surface water storage (SWS, including canopy interception, reservoirs, wetlands and lakes, rivers, and snow water equivalent), soil moisture storage (SMS), and groundwater storage (GWS) [1].TWS changes also reflect changes in accumulated precipitation, evapotranspiration, and surface and subsurface runoff within a given area or basin [2].Therefore, the accurate estimation of TWS changes is essential to understand behaviors of the hydrological cycle and improve the accuracy of predictions under the influence of human activities.For the past few decades, quantification of TWS variability has been accessible from hydrological modeling only.The construction and calibration of hydrological models require ground-based observations.However, establishment and maintenance of observation stations are time and money consuming, and the distribution of stations is usually uneven in terms of spatial resolution.In addition, hydrological information is subject to parameter uncertainties due to the lack of field measurements [3].Satellite remote sensing offers new ways of measuring hydrological fluxes at unprecedented spatial coverage and resolution, especially useful for regions where in situ measurements are sparse or nonexistent [4,5].Strategies that merge in situ model and satellite observations within a framework that result in consistent water cycle records are essential.
Large variations in water mass alter the gravity field of a region, which can be measured by the Gravity Recovery and Climate Experiment (GRACE).The advent of GRACE, launched in March 2002, provides an effective and useful method for the detection of variations in TWS at large scales [6].Many previous studies have demonstrated that seasonal and interannual changes in water storage for continental-scale patterns and large river basins can be inferred from GRACE observations with unprecedented accuracy [7][8][9].However, limited by its coarse spatial resolution (~70,000 km 2 ), the GRACE data uncertainty increases for small basins because the required noise reduction by filtering leads to undesired damping of signals.In addition, the GRACE lifetime is still short and does not allow the study of long-term historical variability in TWS [10,11].
Nuclear energy is considered as an additional dependable and clean energy source all over the world.The production of nuclear energy will result in high-level radioactive waste (HLRW), which brings potential environmental dangers [12,13]; thus, the selection of a proper disposal repository for nuclear waste is very important.The characteristics of water storage variations and movement play an important role in the site selection for nuclear waste disposals.In China (specifically the Beishan area), there is tectonically stable rock and an arid climate; therefore, the Beishan area was selected as one of the most prospective site for a disposal repository of HLRWs in 1993 [14].Due to wide areas and limited ground-based observations in remote areas, the understanding of long-term water storage changes in the Beishan area is very limited.Up to now, GRACE satellites may be the only method to detect TWS changes particularly in remote regions.The area of Beishan regions is large enough to use GRACE data considering its native spatial resolution.
The objective of this study is to reconstruct long-term water storage changes in the Beishan area to better understand patterns in water storage change for assisting the site selection of HLRW.This paper is organized into the following sections: (1) assessment of the applicability of different products from four land surface models (LSMs); (2) analysis of the interannual and seasonal variation of hydrological fluxes using the water balance equation (i.e., precipitation and evapotranspiration from multisource remote sensing datasets); (3) calculation of TWS variations from GRACE observations and the water balance method, then a comparison of the differences among them; and (4) characterization of the multidecadal TWS variations reconstructed in the Beishan area.

Method and Data
2.1.Study Area.The Beishan area is located in the northwestern remote region of China and covers a total area of approximately 122,374 km 2 (Figure 1).The longitude is 96 °-100 °E, and the latitude is 40 °-43 °N.The region is surrounded by Mongolia to the north and Yumen City to the south.The area includes the Shule River basin to the west and the Heihe River basin to the east [13].There are only seasonal rivers in the study area, so no hydrological station is established within the study area, as shown in Figure 1.The average annual precipitation ranges from 60 to 100 mm, and approximately 60% of the total precipitation occurs from June to August [15].The average annual potential evaporation is 2900-3200 mm, and the mean annual potential temperature is 4-5 °C [12].The study area is divided by using the USGS/-NASA Shuttle Radar Topography Mission data in the spatial resolution of 30 m and Arc Hydro Tools in ArcGIS software, which includes procedures such as Digital Elevation Model (DEM) pretreatment, determination of the direction of flow, extraction of flow accumulation, channel network, watersheds, and watershed division.Very small fluxes are present in the parts of northeastern and southern boundaries.According to the studies [11], the flux at the boundaries of the study area is very small when compared with precipitation and the variations of the flux are negligible, and thus, the study area is considered to be a drainage basin.

Data
2.2.1.Precipitation Products.Different precipitation data sources are available to verify their consistency and calculate the total water storage change (TWSC) based on the water balance method.Precipitation (P) products, including GPCP-2, TRMM 3B43, and weather observation datasets, were evaluated in this study.GPCP-2 incorporates precipitation estimates derived from low-orbit microwave data, geosynchronous infrared data, and surface rain gauge observations and is obtained from NOAA.TRMM 3B43 is a standard monthly precipitation product that incorporates the combination of precipitation datasets and is widely used in climatological application [16].Monthly weather precipitation products are provided by the China Meteorological Administration (CMA) (hereafter noted as P_CMA), which are spatially interpolated based on 2472 gauge stations in China [17].Two CMA stations are found within the Beishan area and many stations exist in the vicinity of the southern part of the study area (Figure 1).

GLEAM ET Data. The Global Land Evaporation
Amsterdam Model (GLEAM) is a set of algorithms driven by satellite-based observations that separately estimate daily global ET changes at 0.25 °resolution [18,19].Three datasets produced using GLEAM v3 are currently available.GLEAM_ v3.1a is a global dataset spanning the 37-year period of 1980-2016, which is used in this study.There are 10 variables available, including actual evaporation (E), potential evaporation (Ep), and surface soil moisture (SMsurf).For more detailed information about GLEAM data, see a previous study by Miralles et al. [20].

GLDAS Model.
The global land data assimilation system (GLDAS) is a global, high-resolution, offline terrestrial modeling system that incorporates satellite and ground-based observations in order to produce optimal fields of land surface states and fluxes in near-real time.It generates a series of global land surface states (e.g., soil moisture and surface temperature) and fluxes (e.g., evaporation and sensible heat flux) [21][22][23].In this study, monthly precipitation, evapotranspiration, surface runoff, and subsurface runoff from CLM, VIC, and Mosaic land surface models in GLDAS-1 and from the Noah model in GLDAS-2 are used to estimate TWS variations based on a water balance method over the Beishan area.The gridded products cover a total period of 156 months and span from January 2003 to December 2015 [24].Due to the sampling and postprocessing of GRACE observations, surface mass variations at small spatial scales tend to be attenuated.The scaling factor approach is used to restore GRACE TWS changes and has been applied to individual basins and regions globally.A separate file of scaling factors derived from the NCAR CLM 4.0 is also obtained to correct for GRACE signals during low-pass filtering (i.e., destriping, truncating, and filtering) [25].

GRACE Mascon Solutions.
To evaluate the monthly GRACE-derived water mass variability, newly released GRACE mascon (mass concentration) solutions from the Center for Space Research (CSR-M) are also used.CSR mascon solutions are computed on an equal area geodesic grid comprised of hexagonal tiles.Each mascon cell is related to range-rate observations via partials with respect to the spherical harmonic coefficient expansion and truncated to a degree and order of 120 [26].Mass anomalies in each mascon tile are computed from satellite range-rate observations via their partial derivatives.The resulting mascon solutions are provided at a 0.5 °× 0.5 °grid.No additional spatial or temporal constraint beyond regularization is applied to the CSR mascon data.In addition, the CSR-M solutions have no strip errors and capture all of the signals observed by GRACE within the measurement noise level.Therefore, no postprocessing filtering is necessary for the solution.A detailed process of the mascon solutions is in Save et al. [27].All data used in this study are listed in Table 1.where ΔS/Δt represents the change in terrestrial water storage ΔS for a given time period Δt ; P is the monthly precipitation (mm/month); ET is evapotranspiration (mm/month) and R is the streamflow, which includes both surface water and subsurface water [28].In this study, the time interval is set as a month.P, R, and ET are obtained with various satellite-based products, LSM products, and in situ measurements, which are explained in Section 2.2.However, TWS changes inferred from this water balance approach are subject to uncertainties in the flux variables (particularly, uncertainties in evapotranspiration and precipitation) [29].Alternatively, streamflow records are difficult to obtain in most parts of the world, particularly over underdeveloped regions and transboundary river basins with sensitive water issues.The time derivative of water storage (TWSC) in Equation ( 1) can be estimated from 3 Geofluids the centered difference based on GRACE-derived TWS anomalies (TWSA), as shown in Firstly, monthly and annual multiple datasets, which includes precipitation, evaporation, and runoff from different remote sensing products, are compared to find the optimal data sources in our study region.Two water budget estimates of TWS variations are calculated based on different combinations from 2003 to 2015, noted as WB-CMA and WB-Noah, respectively.Secondly, for verification of the approach, the results from WB-Noah estimate are compared with the sum of three components from GLDAS Noah model, which includes soil moisture (SM), snow water equivalent (SWE), and plant canopy surface water (PCSW) within the time from 2003 to 2015.Thirdly, GRACE-derived TWS anomalies, which are used as a reference of TWS changes by previous studies [10,30], are compared with results from WB-CMA estimate.Lastly, changes of TWS variations in the past 35 years are reconstructed.The overall workflow is shown in Figure 2.

Multiple Linear Regression (MLR) Analysis of
Time-Series Data.Since water storage changes have seasonal and secular signals, a multiple linear regression analysis (MLRA) can be applied to examine the temporal variability of TWS [6].For a given time series, the regression model used can be expressed as follows:   6)-( 8)) were adopted to quantitatively compare different datasets.
The bias reflects the extent of the deviation from one dataset to a reference dataset [31].The NSE indicates the amount of agreement between two datasets [32].The R 2 measures the degree of correlation between different datasets.The higher the R 2 is, the higher the degree of correlation.There is a statistically significant correlation if R 2 was >0.7 (which is also in the acceptable range).

bias = ∑
where X i represents the reference-measured datasets, Y i represents the satellite-derived datasets, X is the average value over N for X, and N is the total number of samples from the satellite-based product.Uncertainties in the change of terrestrial water storage are obtained from each component of water balance method.
If the components are independent of each other (no covariance between any two components), the uncertainties can be simplified as Gaussian error propagation [33].
where σ is the error estimated to each component.

Comparison of Different Land Surface
Models and Remote Sensing Datasets Figure 3 shows evapotranspiration (ET) and streamflow products from four LSMs in the GLDAS at monthly and seasonal scales.It is noted that the precipitation products are the identical forcing variables among the four LSMs, which is the sum of rain and snow precipitation rates.Different ET products show good agreement in timing in monthly scales (Figure 3(a)).However, the amplitude of Noah ET variation is considerably larger than other LSM results.
In the GLDAS products, streamflow represents the total runoff for both surface and subsurface flow.The magnitude of the streamflow (R) varies up to 3 mm, which is much smaller than that of precipitation and ET.It is also observed that the streamflow in the CLM model is significantly overestimated during the period from 2003 to 2015 due to different model structures (Figure 3(c)), which is consistent with the conclusions from the previous study [34].The monthly maximum values are 0.61, 0.65, 2.95, and 0.28 mm for the Noah,  5 Geofluids VIC, CLM, and Mosaic model, respectively.Generally, the runoff is expected to be only a couple of percentage of ET.
In terms of seasonal scale, it can be observed that the four LSMs exhibit an obvious seasonal cycle; the dry season occurs from November of the current year to April of the following year.More than 65% of the seasonal evaporation occurs from June to September.In addition, discrepancies among the four LSMs are generally larger during the warm seasons than those in the cold seasons.To summarize, the phase and amplitude of ET among the LSMs are consistent, except the fact that ET from the Noah model is larger than the ones from other models in the Beishan area.
Unfortunately, there is no runoff station within the study area.We validated the Noah runoff model results only at the nearest stations instead.The Heihe River is the largest inland river in Gansu Province, which is located in the vicinity of the eastern part of the study area.Two national hydrological stations in the Heihe River are chosen and their annual runoff data were compared with the model results.The detailed information and locations of these 2 hydrological stations can be seen in Table 2 and Figure 1.
As shown in Figure 4, the observed runoff data in Gaoya and Zhengyixia stations agree well with Noah-derived results from 1980 to 2007 in these two stations, especially in recent years.The magnitude of annual runoff from hydrological stations and Noah model is consistent.Differences between these results may be caused by the reason that Noah results represent regional scale data (1 °× 1 °), and the observed results are at point scales.Based on the above discussions, we believe that the GLDAS/Noah model also can produce reasonable runoff data in the Beishan area.

Evaluation of Precipitation Products
. GLDAS Version 1 model (VIC, CLM, and Mosaic) produced unnatural trends in the simulated variables because highly uncertain forcing fields in 1995-1997 were introduced [35].GLDAS Version 2 Noah model overcame these shortcomings and created more climatologically consistent datasets using the Princeton forcing dataset [36].We also examine different precipitation products from CMA, TRMM, and GPCP in addition to GLDAS/Noah, as summarized in Table 1.
The P_CMA data are spatially interpolated based on 2472 stations in China, and many studies have found that the dataset matches well with in situ observations [37].Previous studies have illustrated that the TRMM 3B43 maintains   6 Geofluids the best consistency with rain gauge observations compared to other remote sensing precipitation products in regions of northwest China [23].
The most important observation is that, at the annual and interannual scales, the GLDAS/Noah precipitation is larger than other datasets over the period from 2003 to 2015, with biases of 65% and 33% with respect to the TRMM and GPCP, respectively.The TRMM and CMA are most consistent among others with the correlation coefficient 0.74 and 0.93 for interannual and monthly scales, respectively (Figures 5(b) and 5(c)).The mean annual precipitation with standard deviations estimated from the CMA, TRMM, GPCP, and GLDAS/Noah is 62.43 ± 18.72 mm, 66.06 ± 16.12 mm, 81.81 ± 18.13 mm, and 108.97 ± 22.21 mm, respectively.In general, the standard deviation of different monthly precipitation datasets is about 2.76 mm.
In terms of seasonal cycles, all precipitation products show peaks occurring in July (Figure 5(c)).Precipitation mainly lasts from June to September, accounting for approximately 70% of the annual precipitation, which is consistent with the CMA observations.In addition, differences in precipitation estimates among these four different products are generally larger in warm seasons than in cold seasons, with the largest difference value occurring in July (10.45 mm).
Among all satellite precipitation estimates, the seasonal and annual cycle of the TRMM dataset follows most closely with the CMA and has a comparatively low bias and RMSE values.The results are consistent with Yang et al. [6].However, the time series of the TRMM 3B43 covers the period from 1998 to 2015, which is much shorter than other products.Besides, TRMM 3B43 only contains rainfall component, and snowfall component is not taken into account.Therefore, the CMA precipitation datasets are selected for further discussion.

Evaluation of Evapotranspiration
Products.The Noah ET results are overestimated in comparison to other LSM results and this is likely caused by the overestimated P compared to CMA and TRMM [36,38].We also evaluated the Noah ET with the independent remote sensing results.Figure 6 illustrates the ET from GLEAM and Noah model at different time scales.It can be seen that the monthly ET products are basically consistent in timing, with a correlation coefficient of 0.86 (Figure 6(a)).However, Noah ET is always larger than GLEAM ET with a bias of 71% during 2003 to 2015 (Figure 6(b)).Noah model reached its maximum annual ET in 2010, with the value of 141.02 mm.The maximum ET occurred in 2012 for GLEAM, with the value of 86.33 mm.The larger ET in the Noah model is also observed in the comparison with other GLDAS LSM models (Figure 3).The GLEAM ET data are more consistent with CLM, MOS, and VIC ET results.The standard deviation of different monthly ET datasets is about 2.71 mm.
For seasonal scales, both ET products show peaks occurring in JJA (June, July, and August) and the lowest occurring in NDJ (November, December, and January).Previous studies have evaluated daily GLEAM ET products with in situ eddy covariance ET data as a benchmark from eight sites in the Chinese Flux Observation and Research Network (ChinaFLUX), and the results show that the accuracy is higher in specific river basins across China (e.g., the Liao River Basin, the Yellow River Basin, and northwestern river basins) [39].This study area belongs to northwestern river basins; therefore, ET products from the GLEAM model can be applied in this study.

Results and Discussion
The study area is located in the northwestern remote region of China.Field investigations in the Beishan area just began after 1983, and observation wells were drilled after 2010.Therefore, it is hard to obtain long-term time series of ground-measured water level data despite its importance regarding ground water contamination associated with HLRW disposal.We present historical and contemporary water storage variation from remote sensing data (such as GRACE) and water balance approach and also examine the uncertainties of the reconstructed water storage associated with data and model errors.7 Geofluids comparison of these monthly TWS variations is shown in Figure 7.The TWS variations from the GRACE data with a scaling factor applied are a few times smaller than other GRACE data but still correlated with the others.This implies that the empirical scaling factor is significantly underestimated in this Beishan region, possibly due to the incorrect representation of TWS in the CLM 4.0 model.The GRACE-filtered data (without the scaling factor) agree better with the CSR mascon solution (Figure 7).Over the period from 2003 to 2015, all TWS variations represent a decreasing trend, as quantified in Table 3.Although no empirical postprocessing is required, mascon solutions can still provide comparable TWS estimates as traditional spherical harmonic products [26,32].Therefore, the CSR mason solutions are selected to describe the characteristics of TWS variations in the following discussions.The GLDAS/Noah soil moisture (SM), snow water equivalent (SWE), and plant canopy water storage (PCSW) were jointly used to calculate the TWS variations (noted as SM + SWE + PCSW) [2,40].In order to verify the reconstruction method, we added up GLDAS Noah data related to total column SM, SWE, and PCSW within the time frame from 2003 to 2015 and then compared SM + SWE + PCSW with the WB-Noah estimate.As shown in Figure 8, the results from the WB-Noah are basically consistent with TWS variations from GLDAS model (SM + SWE + PCSW), confirming the water balance imposed in the LSM computation.
The most notable observation is that the WB-CMA estimates of TWS variations show a decreasing trend of −0.94 mm/year, which is consistent with the results of CSR mascon solutions within the solution uncertainty, while the WB-Noah presents the opposite increasing trend of 1.28 mm/year for the same period.However, the seasonal variations of GRACE TWS are larger than those of WB-CMA and WB-Noah.This may elucidate a possible contribution of groundwater component which is not considered in the Noah model.The uncertainty in the data sources will 8 Geofluids also lead to some discrepancies in results, including CMA P, GLEAM ET, and GRACE-derived TWS.CMA P data used in this paper are interpolated from in situ data in gauging stations; however, the in situ stations are limited, and the interpolation may cause errors.Moreover, CSR mascon solutions are also contaminated with satellite data error and inherent regularization effect [27].

Reconstruction of Long-Term TWS Variations.
The uncertainties in remote sensing precipitation, ET, and streamflow data consequently result in uncertainties in water budget estimates of TWS changes.However, as discussed in Long et al. [29], the magnitude of these uncertainties seems to be small such that meaningful TWS variations can be determined at the basin scale.It is consistently found that Noah P and ET flux data are overestimated in comparison to other independent LSMs and datasets, and also, the TWS from Noah does not agree with GRACE TWS particularly in the interannual change.Instead, the agreement is found from the balance between CMA precipitation and GLEAM ET datasets.We choose the monthly CMA precipitation, GLEAM ET, and Noah total runoff (albeit very small) over the Beishan   The result is consistent with the conclusion of previous study [41].The reconstructed TWS variations are basically consistent with CSR mascon between 2003 and 2015, which prove that the reconstructed datasets are reasonable.It also can be seen that the yearly WB-Noah shows the opposite trends (Figure 9(b)).Overall, the decreasing precipitation and TWS over the last 35 years and the annual changes of TWS within ~10 mm favorably suggest the Beishan area as a candidate site for the safety disposal of HLRW in terms of water storage variations.
As a disposal repository, numerical simulations are helpful to quantitatively evaluate the risk of radionuclide transport.The accuracy of the model is restricted by the lack of measured data.The reconstructed long-term time series of TWS variations can be used to calibrate the regional groundwater model so that the calibrated model is used to evaluate the influences of extreme climate and regional faults on the groundwater flow pattern.
4.4.Uncertainty Analysis and Limitation.Uncertainties in the GRACE-derived TWS anomalies include (1) inherent GRACE data errors and (2) propagation of bias/leakage corrections.In this paper, the scaled TWS anomalies were computed using the scaling factor provided by GRACE TELLUS.The measurement error and leakage error for the scaled TWS anomalies over the Beishan area were computed using the pseudocode and error estimates provided by GRACE TELLUS (https://grace.jpl.nasa.gov/data/get-data/monthlymass-grids-land/).It is calculated that the GRACE-derived TWS anomalies have a leakage error of 21.16 mm and a measurement error of 11.52 mm.As to the uncertainty of CSR mascon, a rigorous uncertainty analysis of these solutions is still in progress.In the meantime, the users are recommended to use an uncertainty of 2 cm in water height in each 0.5 °grid globally.Therefore, the uncertainty of the study area is as low as 6 mm for the case of spatially uncorrelated errors.
As for uncertainties in hydrological fluxes, the errors are calculated as one standard deviation from different simulations.For example, uncertainty in ET was estimated by the standard deviation of all ET products being used in this study.Then, uncertainties in P, R, and ET were then in quadrature to obtain the total uncertainty estimates of TWSC from the water budget.It is calculated that the error of precipitation is 2.76 mm; the error of streamflow and evaporation is 0.19 and 2.71 mm, respectively.Therefore, it is calculated that the uncertainties in TWSC are 3.87 mm.At present, there are limitations in this research.The Beishan area is located in the northwestern remote region of China.Only two meteorological stations and no hydrological station are located within the study area, which bring certain difficulties in obtaining observed data.Although the GRACE has provided the only globally available solution for measuring the TWS variations, the short lifetime of GRACE is destined for its weakness in providing long-term TWS datasets for better understanding TWS variability.This research reconstructed long-term time series of TWS variation based on appropriate product combinations and proven water balance method.If more observed data are collected in the future, we

Conclusions
Based on TWS time series acquired from the GRACE data, in association with other remote sensing products, we reconstructed the monthly, seasonal, and interannual variability of TWS variations during 1980-2015 based on the water balance method over the Beishan area.Satellite products have widespread application potential for fields including water resource managements, climate change research, and hydrological modeling, due to their wide coverage and high temporal-spatial resolution.Therefore, the precision of satellite estimates needs to be evaluated before application.The major conclusions drawn throughout this study can be summarized as follows. ( The GLEAM ET datasets are basically consistent most of GLDAS/LSMs ET with a correlation coefficient of 0.86.However, the Noah ET is always larger than that from GLEAM, while the differences somewhat increased after 2008, with a bias of 71% (4) Amplitudes of TWS variations derived from GRACE are greatly reduced after consideration of the scaling factor (likely poorly estimated in this study area), with annual changes from 13.20 ± 2.78 mm to 3.83 ± 0.83 mm.The GRACE-filtered solutions without the scaling factor applied and the independent mascon solutions agree with each other with a correlation coefficient of ~0.9 and both present a decreasing trend of −2.30 and −1.52 mm/year, respectively, in the Beishan area (5) Our water budget estimates WB-CMA show the similar decreasing trend with the CSR mascon results, with a slope of −0.94 mm/year.On the contrary, the WB-Noah estimates exhibit the opposite, increasing  11 Geofluids trend of 1.28 mm/year, likely due to overly estimated precipitation and evapotranspiration as also shown from the comparison with other data (6) The WB-CMA exhibits a decreasing trend during the extended period from 1980 to 2000, with a slope of −2.40 mm/year.The amplitude of the decreasing trend weakens in recent years, with a slope of −0.94 mm/year, which is due to the more balanced states between precipitation and evapotranspiration in recent years.As a whole, the estimated TWS variations are only within 10 mm since 2001, which suggests that TWS in the Beishan area almost remains stable, being favorable for site selection of HLRW Our 35-year long TWS time series, reconstructed with CMA and GLEAM datasets, agree with the GRACE observation for the last 13 years.They offer a unique opportunity for understanding the long-term historical variation of TWS variations in this large river basin during the past several decades.Besides, the research also provides reference for choosing the most appropriate methodology and datasets to reconstruct long-term TWS variations in data-sparse regions.Moreover, these reconstructed datasets can also be used as auxiliary data to calibrate and validate the regional groundwater flow model.
The validation of constructed results is a very difficult work because few observation data are available.We firstly validate the reliability of the method in the period from 2003 to 2015 and then reconstructed TWS variations from 1980 to 2015.The reliability of results is subject to the accuracy and resolution of GRACE data and the other P and ET data sources.However, this study will provide a preliminary reference to prove the Beishan area as a candidate site for the safety disposal of HLRW.

2 Geofluids 2 . 2 . 4 .
GRACE Gridded Data.The monthly 1 °gridded GRACE TWS datasets are available from the RL05 time-variable gravity field model, which is provided by the Center for Space Research (CSR) at the University of Texas (i.e., GRACE-filtered).

2. 3 . Methods 2 . 3 . 1 .
Water Balance Method.The equation for evaluating TWS changes using hydrological flux variables was based on a water balance equation as shown in ΔS Δt = P − ET − R, 1

Figure 1 :
Figure 1: Location and a digital elevation of the Beishan area and distribution of meteorological and hydrological stations across the region.

3. 1 .
Evaluation of Different Data from Four LSM Products.
Multi-source datasets include Precipitation (obtained from CMA, GPCP, TRMM, GLDAS four models) Evaporation (obtained from GLEAM, GLDAS four models) Runoff (obtained from GLDAS four models) Two water budget estimates obtained based on selected datasets: WB-Noah (Precipitation, evaporation and runoff from Noah) WB-CMA (Precipitation from CMA, ET from GLEAM and runoff from Noah) WB-Noah WB-CMA Prolong the time span of WB-CMA from 2003-2015 to 1980-2015 based on proven datasets and reconstruction method Calculate TWS based on SM, SWE and PCSW from GLDAS Noah model Calculate TWS derived from GRACE CSR mascon e first step: analyze multi-source datasets, and select the optimal datasets to calculate TWS based on water balance method e second step: validate WB-Noah with GLDAS-TWS to prove the reliability of water balance method e third step: validate WB-CMA with GRACE-derived TWS to prove the accuracy of WB-CMA e fourth step: reconstruct long-term time series of TWS variations from 1980 to 2015

Figure 2 :
Figure 2: Workflow of the water balance method in this study.

Figure 3 :
Figure 3: Evapotranspiration and streamflow from the GLDAS four land surface models; (a) and (c) represent monthly scales, respectively, and (b) and (d) represent mean seasonal cycles, respectively.

4. 2 .
Comparisons among TWS Variations.Two different estimates of TWS variations are calculated based on selected remote sensing products through a water balance equation.The first solution is derived by combining all hydrological flux products (e.g., precipitation, ET, and runoff) from the GLDAS/Noah model (termed as WB-Noah).The other solution is obtained by combining the CMA precipitation data with GLEAM evaporation and Noah runoff (termed as WB-CMA).Estimates of TWS variations are validated with GRACE observations during 2003-2016.

Figure 7 :
Figure 7: Monthly variations of terrestrial water storage derived from GRACE and CSR mascon data products.The orange and blue shadows represent the uncertainties in TWS variations from GRACE and CSR mascon.The uncertainties of CSR mascon solutions and the GRACE TWS with the scaling factor are computed to be around 6 mm and 12 mm, respectively (see Section 4.4).

Figure 6 :
Figure 6: The 2003-2015 (a) monthly time series, (b) annual time series, and (c) mean seasonal cycles of ET from four GLDAS Noah model and remote sensing-based products (GLEAM).

Figure 8 :
Figure 8: Comparisons of monthly TWS variations from water budget method and GRACE during the period of January 2003 to December 2015.

Figure 9 :
Figure 9: Reconstructed annual TWS variations and related hydrological fluxes from 1980 to 2015.(a) Change of related hydrological fluxes with time; (b) change of TWS variations and related hydrological fluxes difference with time.The shadows represent the uncertainties in reconstructed datasets.

Table 1 :
Data sources used in this study.
2.3.3.Data Comparison and Error Analysis.Because different remote sensing datasets were used to calculate TWS changes based on Equation (1), three indexes (i.e., the relative mean error (bias), the Nash coefficient (NSE), and the correlation coefficient R 2 , which were expressed in Equations (

Table 2 :
Information of the hydrological stations.
a Catchment area upstream of the hydrological station.

Table 3 :
Annual amplitude and phase of terrestrial water storage (TWS) from GRACE, mascon, and GLDAS from 2003 to 2014.
) As for hydrological flux outputs from the four LSMs in the Beishan area, the phase and amplitude of three ET products (VIC, CLM, and Mosaic) are basically consistent at the monthly and seasonal scale, while Noah ET is larger than others.There are large differences in streamflow products among the four LSMs.