The scenario-based variations and causes of future surface soil moisture across China in the twenty-first century

Surface soil moisture (SSM) is a key factor for water and heat exchanges between land surface and the atmosphere. It is also important to water resources, agriculture, and ecosystems. In the backdrop of global warming, SSM variations and potential causes are not well-known at regional scales. Based on soil moisture (SM) data from GLDAS-Noah and 16 global climate models (GCMs) selected from 25 GCMs in CMIP5, we analyzed spatial distribution and temporal changes of SSM in China and quantified fractional contributions of four meteorological factors to the SSM variations. The selected models have the same direction of historic trends in SSM during 1981–2005 as those in the GLDAS SSM data which were also further used to calibrate the trends simulated by the 16 GCMs. Based on the calibration results for the 16 GCMs, future SSMs for nine regions were analyzed in mainland China under four Intergovernmental Panel on Climate Change emission scenarios. No significant changes were identified in SSM across most regions of mainland China under RCP2.6 scenario. However, there is a general wetting tendency in the arid regions and drying tendency across the humid regions under all the scenarios except RCP2.6. In general, the higher the global temperature raises, the more grids with significant increase or significant decrease in SSM. These findings contradicted prevailing view that wet regions get wetter and dry regions get drier. Attribution analysis indicates that precipitation acts as the major driver for SSM variations and contributes up to 43.4% of SSM variations across China. These results provide new insights into future SSM response to climate warming and a scientific basis to mitigation and adaptation works related to SSM in the future.


Introduction
Surface soil moisture (SSM) is a pivotal variable of the terrestrial system and plays a critical role in energy exchange and hydrological cycle (Brocca et al 2012, Miralles et al 2014, Berg et al 2017, McColl et al 2017, mainly through its control on the partitioning of radiation and evapotranspiration at the surface (Albergel et al 2013. Furthermore, SSM also plays a critical role in hydro-climatic extreme events (Miralles et al 2019), vegetation changes (Chen et al 2014) and physical properties of soil (Gunda et al 2017). Soil moisture (SM) is also a state variable controlling surface runoff, soil drainage, and soil-freezethaw status (Zhang et al 2015), and for other hydrological, meteorological and ecological applications (Cui et al 2019). As a consequence, the Global Climate Observing System (GCOS) Programme recognizes SM as an essential climate variable (ECV) (Albergel et al 2013.
Recent years witnessed an increasing availability of datasets of SM sourced from satellite remote sensing and assimilation technologies (Miralles et al 2010, Zhang et al 2018, Gu et al 2019. The remotely sensed data products such as those from the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E and AMSR-2), the Advanced Scatterometer (ASCAT), the Soil Moisture Ocean Salinity or the Soil Moisture Active Passive, and from the European Space Agency Climate Change Initiative were widely used. A limitation of these datasets is that they provide only near-surface (∼0-5 cm) SM estimates (Zohaib et al 2017, Zhang et al 2018. For this reason, assimilation technologies were employed to derive SM estimates from multiple soil layers, such as the European Centre for Medium-Range Weather Forecasts atmospheric reanalysis (ERA), The Global Land Data Assimilation System (GLDAS), the Modern-Era Retrospective analysis for Research and Application (MERRA), which were also widely used as alternatives for 'true value' (Deng et al 2020). In addition, GLDAS has been used to study global and regional SM trends (Dorigo et al 2012, Zhang et al 2018, Gu et al 2019.
SM changes can be influenced by numerous factors such as human activities (Samaniego et al 2018) like irrigation  and climate changes such as changes in precipitation and temperature (Feng and Liu 2015, Gunda et al 2017, Guillod et al 2015, Hauser et al 2016. Temperature directly affects the loss of SM to the atmosphere by influencing evapotranspiration processes, while precipitation provides inputs to SM (Varallyay 2010, Gunda et al 2017). According to the Fifth Assessment Report (AR5) by the Intergovernmental Panel on Climate Change (IPCC), global warming has become indisputable and is caused mainly by human activities (Stocker et al 2013). Global warming heavily affects hydrological cycles and water balances at different spatial scales, modifying spatiotemporal patterns of water resources (Berg et al 2017). As an example, mounting evidences indicate that increased continental precipitation and evapotranspiration most likely lead to decreased SM in a warming climate, although regional differences exist (Feng and Liu 2015, Zhao and Dai 2015. SM can influence crop yields directly by modulating the plant-available water (Wang et al 2011). China is the world's largest agricultural producer and has the world's largest population. Because of the importance of water resources and agricultural development to China's economy and people, the relationships between extreme weather events, floods, and droughts, and agricultural production have been intensively studied (Zhang et al 2015(Zhang et al , 2017. In particular, it is critical for China to bridge the knowledge gap between climate changes and SM variations in both space and time (Koster et al 2004Fan et al 2019 Characteristics of future SM are rarely explored in previous studies. Besides, it still remains a challenge to quantify how SM changes at regional scale due to substantial spatial heterogeneity of SM variations (e.g. Wang 2005). Therefore, in this study, we first used GLDAS-Noah SM dataset to assess the historical simulations of global climate models (GCMs) in CMIP5 and selected GCMs with the consistent trend to GLDAS-Noah for better reliability of future projections. Then we employed the SM median of chosen GCMs in CMIP5 as projected SM to investigate the characteristics and causes of future SM changes that can help predict agricultural drought across China during historical and future periods.

Study region and data
Given the distinct spatial heterogeneity of meteorological variables and SM over mainland China (CHN) was subdivided into nine regions (figure 1), i.e. Northeast China (NEC); North China (NCH); Central China (CCH); South China (SCH); Southwest China (SWC); the eastern part of Northwest China (ENW); the central part of Northwest China (CNW); the western part of Northwest China (WNW); Tibet Plateau (TPC). In order to investigate the effects of potential driving factors on SSM at the regional scale, we divided the area into regular rectangles to further minimize the errors sourced from the reduced number of grids in marginal areas (table 1, totally 1062  grids across China). Generally, NEC, NCH, CCH, SCH, SEC mostly locate in the monsoon area in the eastern China; ENW, CNW, WNW mostly situate in the arid and semi-arid areas in the northwestern China, and TPC is approximately in the Tibetan Plateau high-cold area (Xiao et al 2013).
We selected SM outputs of 25 GCM models from Coupled Model Intercomparison Project Phase 5 (CMIP5) in historical and future periods under historical forcing and RCP2.6, RCP4.5, RCP6.0 and RCP8.5 scenarios (table 2). The monthly SM dataset was sourced from the ESGF (the Earth System Grid Federation) node (https://esgfdata.dkrz.de/search/cmip5-dkrz/). The outputs of models considered in this study include at least two future scenarios in both the uppermost 10 cm soil layer and root-zone soil layers. Meanwhile, the models chosen in this study are with the ensemble of r1i1p1 and only the data from output1 was accepted (Taylor et al 2012). Moreover, the monthly data  of climate variables, including precipitation, 2 m air temperature, wind speed and relative humidity from the CMIP5 outputs were also analyzed to quantify climatic effects on SM. The Global Land Data Assimilation System (GLDAS) was developed to generate optimal fields of land surface states and fluxes by integrating satellite-and ground-based observed data products, using land surface modeling and data assimilation techniques (Rodell et al 2004). GLDAS drives four models, including Noah, Mosaic, VIC and CLM, which are sourced from the Goddard Earth Sciences Data and Information Services Center (http://disc.sci.gsfc.nasa.gov/hydrology/dataholdings). Noah simulates SM outputs of four layers (0-10 cm, 20-40 cm, 40-100 cm and 100-200 cm). In this study, Noah V2.0 with spatial resolution of 0.25 • × 0.25 • and time interval of 1948-2010 can Table 2. 25 GCM models from CMIP5 with modelling results of the soil moisture under historical forcing and future scenarios. These models include at least two future scenarios. The date refers to the time range covering the whole data sets at monthly scale but only the period of up to 2100 years was analyzed. The resolution is referred to as the number of grids along the longitude and latitude. Asterisk ( * ): the model has been selected to conduct further analysis. Cross symbol (×): data are not available. meet the requirement in spatial resolution at the regional scale and time length of validating modelled SM data in CMIP5. The Noah V2.0 used in this study covered the period of 1981-2005 and has been used widely in SM analysis (Zhang et al 2018, Gu et al 2019.

Pre-processing of the SM
In this study, hydrologically active soil depth varies from 3 to 14 m for GCMs (table 2) and only the first ten layers, down to 3.8 m for NorESM1 model are hydrological active (Lawrence et al 2011, Berg et al 2017. For SSM, GLDAS-Noah and the outputs of GCMs in CMIP5, which have been integrated over uppermost 10 cm, has the same surface soil depth. As a result, SSM is quantitatively comparable. Correspondingly, the deeper SM content varies widely due to obviously different SM depth. Given that GLDAS-Noah have maximum depth of 2 m and we focused more on SM changes of the upper SM column mostly accessible to ecosystems, we used SM outputs from CMIP5 to compute a 2 m SM content, which corresponds to the root-zone soil depth. Meanwhile, SM outputs were linearly interpolated to integrate the 2 m SM content for GCMs with the units of m 3 m −3 (Berg et al 2016). To reduce uncertainties and capture spatial patterns at regional scale, all SM variables were also resampled to a 1 • × 1 • grid by bilinear interpolation before further analysis. All monthly data has been integrated into annual scale to explore the variation during the whole period. Then, we analyzed the relative SM changes in terms of the ratio (%) of the future (2005-2100) simulated mean SM change and the baseline mean SM during 1981-2005 rather than the absolute SM changes since that the SM changes are heavily model-dependent. Meantime, we normalized future SM where future-minus-present differences were divided by present simulated SM (Koster et al 2009, Berg et al 2016.

Verification and selection of GCMs in CMIP5
In this study, we adopted GLDAS-Noah SM data which has been widely used in SM analysis worldwide and for China as well (Dorigo et al 2012, Gu et al 2019. Besides, we proposed the following indicator to show direction of trends at grid scales: Therefore, if Trend i1 and Trend i2 are positive or negative simultaneously, Ratio i is positive, which indicates the prediction ability of the GCM in catching the direction of trend in the ith grid. Sixteen models with positive ratios in more than half of the total 1062 grids covering the entire study region were chosen for SSM, and the median SM of the simulated SM was used to explore SM variations.
Besides, magnitude of trends should also be normalized keeping consistency between magnitude of trends in SM by models and that by GLDAS-Noah dataset. In this case, we proposed a correction factor for this purpose, i.e. C trend : wherein, Trend ic indicates the calibrated SM trend of the ith grid for GCMs, Trend im refers to the trend of the median SM in the ith grid for the chosen GCMs. The correlation factor is further used in the analysis under future scenarios. C trend is calculated as: wherein, Trend in refers to the trend of the median SM in the ith grid for the GLDAS-Noah SM dataset. Abs refers to the absolute value function.

Attribution analysis with the geographic detector method
In this study, the attribution analysis was done using the geographic detector method that is a set of statistical methods that detect spatial differentiation and reveal the driving forces behind (Wang et al 2010).
The major assumption behind geographic detector is that if an independent variable has an important influence on a dependent variable, the spatial pattern of the independent variable should be similar to that of the dependent variable. Geographic detectors method is adept at analyzing type quantities, and for sequential Quantities, ratios or intervals, as long as they are appropriately discretized (Cao et al 2013). Another unique advantage of Geodetector is to detect impacts of the interaction of two factors on the dependent variable. The general identification method of interaction is to add the product of two factors to the regression model to test its statistical significance. That is why we used Geodetector to quantify the effects of climate variables on SSM. The geographic detector can determine whether there is interaction between two variables, and the strength, direction, linearity, or non-linearity of the interaction by calculating and comparing the single factor and the factor after the two factors are superimposed. The two-factor superposition includes both the multiplicative relationship and other relationships. As long as there is a relationship standing between two variables and this relationship can be evaluated (Wang et al 2010). More detailed information about the algorithms can be found in Wang et al (2010).
where i = 1, 2, …, m is each unit of influencing factors; n D,i is the number of units; σ G 2 and σ D,i 2 are the variance of observed region and each unit. P D,G locates among 0-1, and the more the value of P D,G is, the greater the influence of the factor D on SM).
When the P D,G equal closely to 1, that is, is near to 0, that means the influencing factor has the same spatial patterns with the SM (Wu et al 2016). Figure 2 shows the ratio of the calibrated trend of historical SM from 25 GCMs to the trend of the GLDAS-Noah SM by the Sen's slope method during 1981-2005 and relevant spatial patterns of SSM trends. In general, the SSM in CMIP5 can be used to further explore the future variation, which performs better than the RZSM. There are 16 models of which the median of the ratios is positive for SSM (figure 2(a)), implying consistent trend directions of SSM in GCMs and GLDAS-Noah in more than half of the grids. Therefore, the median of 16 GCMs was calculated to show the calibrated SSM ( figure 2(d)). For the root-zone SM (RZSM, figure 2(b)), there are only nine GCMs for which consistent trend directions were identified at more than half of the grids, which indicates that GCMs have it harder to depict the trends in the RZSM than for SSM. To some extent, the RZSM is difficult to be simulated using modeling techniques such as GCMs in CMIP5 and GLDAS-Noah as well. Meanwhile, the modeled soil depth is heavily model-dependent and varies widely from GCMs. Berg et al (2016) found that the maximum monthly SM over the hydrological active layers varies from ∼1400 kg m −2 for ACCESS1-0 to ∼7000 kg m −2 for MIROC-ESM. The processing method to integrate RZSM to 2 m depth and the hypothesis of linear variation in the soil layers may also introduce uncertainties and even errors. The magnitude of trends cannot be well quantified, even the trend direction can be well captured by SSM of GLDAS-Noah and GCMs. The correction factors are 5.64 and 10.32 for the SSM and the RZSM respectively based on the chosen models (16 for SSM; 9 for RZSM, equation (3)). This indicates that the magnitude of the SSM can be predicted with higher accuracy than the RZSM and the GCMs usually underestimate the magnitude of SM change relative to the DLDAS-Noah data. Figure 2 Figure 3 illustrates the annual change of the volumetric moisture content for SSM under four scenarios: RCP2.6, RCP4.5, RCP6.0 and RCP8.5 covering 2006-2100, which has been calculated by moving average. Projection results indicate that SSM outputs of more than half of 16 models are in drying trend for all scenarios considered in this study, by which the drying trend is acceptable in the future. Feng and Fu (2013) and Dai (2013) advocated that, in the backdrop of global warming, the arid areas in many lands have expanded and will continue to be expanding in the next century. Meanwhile, the results of different models are discrepant, even the GCMs have been selected by certain criteria and the SSM based on CMIP5 outputs is comparable in quantity (Berg et al 2016). The averaged SSM based on most GCMs outputs is around 0.20-0.25 m 3 m −3 . However, SSM outputs by FGOALS-s2 and INM-CM4 predicts the SSM of ∼0.33 m 3 m −3 above the mean value, and SSM outputs by IPSL-CM5A-LR and IPSL-CM5A-MR predict the SSM of ∼0.10 m 3 m −3 below the mean value. Lu et al (2019) found that the uncertainty of future SM projection is mainly determined by models, compared with the uncertainty from the global decadal annual temperature changes (Hawkins and Sutton 2009). Under RCP8.5 scenario, SSM of 12 GCMs is in obvious drying tendency. Nearly half of models identify that, under RCP8.5 scenario, the surface soil layer displays less mean SM content than other scenarios, as a contrast, and the models display different features for different scenarios. Generally, the warmer the climate, the drier the SSM. Lu et al (2019) found that the average annual SM in the 21st century shows significantly large-scale drying and wetting tendency in a limited area under all scenarios, and the stronger radiative forcing brings stronger drying tendency, which is consistent with our findings.

Future SSM changes based on GCM outputs
Throughout the whole future periods (2006-2100) in CMIP5 simulations, the spatial patterns of SSM changes are basically similar, that is, wetting SSM in WNW, CNW, ENW, NCH and south of NEC, and drying SSM in the other regions (figure 4). From lower to higher emission scenarios, the magnitude of SSM changes gradually increases and more grids are characterized by significant SSM changes. But SSM changes under RCP6.0 scenario do not conform to this pattern. Under RCP2.6 scenario, the magnitude of SSM change is not evident and a drying SSM trend can be found in 66.8% of mainland China (15.4% with <0.05 P-value, 7.9% with <0.01 P-value). Comparatively, wetting SSM trends were identified in 33.2% of mainland China (2.8% with <0.05 P-value, 1.2% with <0.01 P-value). Also, substantial spatial heterogeneity of SSM changes across China was found, implying no obviously consistent SSM changes under RCP2.6 scenario relative to those under other scenarios. Under RCP8.5 scenario, drying SSM trends were detected across 56.9% of China (44.2% with <0.05 P-value, 38.0% with <0.01 P-value). Contrastively, wetting SSM trends were observed across 43.1% of China (29.4% with <0.05 P-value, 25.1% with <0.01 P-value). Considerable spatial heterogeneity in SSM changes can also be observed across different regions of China. However, confirmative and discernable spatial patterns of SSM changes under RCP8.5 scenario can also be identified with exceptions of CNW and NCH. These results indicated larger area of regions with significant SSM changes given higher temperature. These results are consistent with the findings that the south regions are getting dryer and the north regions except northeast regions are getting wetter in China (Berg et al 2016). Moreover, Douville and Plazzotta (2017) found that a summer mid-latitude drying on the northern continents recently appeared at the end of 20th century, which has been attributed to anthropogenic climate change. Figure 5 shows the evolution of the multi-model ensemble median of the percentage of SSM change relative to the reference time , which is generally used to explore the past and future SM variations (Rodrigo 2015;Wu et al 2015). Generally, SSM changes during the periods of 2006-2030, 2041-2065 and 2076-2100 indicated wetting tendency in arid regions and drying tendency in humid regions. Specifically, most regions in WNW are getting wetter, and wetting tendency can also be found in the north parts of CNW and the northwest parts of ENW. The south parts of China are getting drier where there are naturally humid regions.

Evolution of relative percentage of SSM under different scenarios
Under RCP2.6 scenario, the north China including WNW, TPC, NEC, ENW and north of CNW are getting wetter, and in WNW particularly during 2006-2030. The period of 2041-2065 witnessed shrinking regions with wetting tendency to north and west such as WNW, ENW and north of CNW. The period of 2076-2100 was characterized by drying tendency across the entire China. Relative to area and magnitude of wetting tendency during 2041-2065, the area and magnitude of wetting tendency shrink during 2076-2100. Meanwhile, the bounds and magnitude of drying tendency are expanding. Wherein, the obviously drying regions can be observed in the northwestern part of CCH. Under RCP4.5 scenario, the wetting tendency can be found in WNW during 2006-2030 and 2041-2065 and gradually extends to CNW during 2076-2100. However, regions with appreciable drying tendency move westwards till to the TPC. Under RCP6.0 scenario, the WNW is dominated by wetting tendency. The north China is dominated by wetting tendency during 2006-2030, by drying tendency during 2041-2065 and again by wetting tendency during 2076-2100. Consistent drying tendency can be found in the south China. Under RCP8.5 scenario, magnitude of SSM change continues to increase, i.e. the dry-wetter and wet-drier scheme.
For cold regions like NEC and TPC, wetting trend is detected during 2006-2030 and followed by a sharp drying trend under RCP6.0 and RCP8.5 scenarios. Under these scenarios, the persistently rising air temperature gives rise to melting of glaciers, snow cover and permafrost in NEC and TPC. In WNW, the increased SSM is observed during historical period relative to the reference period. The discrepancy in the SSM projections under these four scenarios considered in this study is quite small during the period of 2006-2030 due to the similar radiative forcing during this period (Meinshausen et al 2011). During 2041-2065.0 to RCP8.5 scenarios, the drying amplitude of the SSM gradually increases, drying amplitude of SSM under RCP6.0 scenario is smaller than that under RCP4.5 scenario which is related to the radiative forcings under different scenarios. RCP8.5 is referred to the highest radiative forcing scenario, and RCP2.6 is the lowest radiative forcing scenario in the following century. But in the near future, approximately by the year of 2050, the radiative forcing under RCP4.5 is higher than that under RCP6.0, that is adverse in the longterm future (Masui et al 2011, Nazarenko et al 2015. Here we also presented the evolution process of SSM to explore the SM variation at shorter time step under RCP8.5 scenario and the SM variation of each 10 years was displayed (figure 6). RCP8.5, as the highest radiative forcing, will bring to a strongest global warming related with other scenarios (Riahi et al 2011). But we found that the trends of SM variation in the same one place during different future periods are different even under the highest radiative forcing scenarios, RCP8.5. It can be seen from figure 6 that SSM at most of the grids is in consistent changing properties during the entire study period. However, the entire NEC is dominated by wetting SSM during 2011-2030, then by drying SSM during 2030-2080. During 2081-2100, SSM variation is subject to discernable spatial heterogeneity in NEC, SSM in most grids continues to decrease and increased SSM can be observed in the southwest of NEC. Figure 7 shows the temporal variations of the standardized SSM during 1981-2100. SSM overall is in a downward trend across the entire mainland China under all scenarios considered in this study, indicating a drying tendency in the future under RCP8.5 scenario in particular. In WNW, increased SSM during in 2005-2035 is observed. A wetting trend in SSM variability can be identified during 2036-2100, highlighting the possibility for increased frequencies of extremely dry and wet spells. In ENW, there is a slightly wetting SSM trend in the future. Besides, NEC, CCH, SCH, SWC and TPC witness a significant drying SSM in the future ( figure 6). Interesting is that the period of 2005-2100 shows no obvious trend in NCH and CNW; however, the variabilities are increasing. Figures 5 and 6 illustrate significant spatial heterogeneity of SSM in NCH and CNW and nearly half of the area shows different changing features of SSM.

Attribution of meteorological variables to SSM variations
Meteorological variables having potential impacts on SSM are precipitation, temperature, relative humidity and wind speed (Deng et al 2020). Precipitation amount has direct impacts on SSM changes, with the relative contribution of up to 43.4% to SSM variations (table 3) due to the most important supply from precipitation for most regions which is consistent with the previous findings all over the world in the past period (e.g. Deng et al 2020  We found different impacts of these meteorological factors on SSM changes. Specifically, SSM in the WNW is mainly influenced by precipitation changes with relative contribution of 51.1% (table 3). SSM in NCH is affected by the temperature changes with relative contribution of 28.5% (table 3) due to the fact that a sharp rise of temperature can affect SSM content by modifying evapotranspiration processes. Meanwhile, evapotranspiration also affects precipitation by changing water vapor transportation to the atmosphere (Oki and Kanae 2006). It was evidenced that SSM is in negative relation with temperature Huang 2016, Deng et al 2020). Besides, SSM changes in CNW and CCH are affected mainly by the wind speed and the relative humidity with relative contribution of 21.8% and 24.7% respectively to SSM changes. The results show that the SSM change is affected by the combination of meteorological factors, except these, the vegetation and agricultural activities in the future also have great impacts on SSM changes , Deng et al 2020.

Conclusions
We thoroughly analyzed spatiotemporal patterns of SSM across mainland China and the relevant influencing factors behind the variations in SSM. The following findings and conclusions are obtained: (a) The direction of the historic trends of SSM during 1981 and 2005 can be well captured by 16 out of the 25 CMIP5 GCMs. The historic simulation and future projections of SSM for these 16 models provides the basis for further analysis. The SSM data for the historic period by the GLDAS-Noah was used to calibrate the magnitudes of the corresponding SSM trends estimated by the selected 16 models with satisfactory performance.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.