Impact of alternative soil data sources on the uncertainties in simulated land-atmosphere interactions

Numerical weather and climate prediction models rely on soil data to accurately model land surface processes. However, as soil data are produced using soil profiles and maps with multiple sources of uncertainty, wide discrepancies prevail in global soil datasets. Comparison of four commonly used soil datasets in Earth system climate models, i


Introduction
The land surface, a key component of the Earth system, plays a critical role in the terrestrial and climate systems. Land surface characteristics are relevant to a wide variety of processes within the boundary layer, and strongly influence land-atmosphere interactions and coupling (Fatichi et al., 2020; Seneviratne et al., 2010). The pedosphere, i.e. the soil, although usually treated as static boundary information in land surface models, exerts significant feedback on the weather and climate system because a large amount of freshwater is stored in this sphere (Dennis and Berbery, 2021; Greve et al., 2013; Santanello et al., 2018). In particular, as the soil is not as readily observable as land use and land cover, it has long been considered one of the least developed global environmental layers, with limited data accuracy (Dai et al., 2019; Hengl et al., 2017). In recent decades, by organizing and harmonizing vast amounts of soil survey information with different soil mapping approaches, researchers have produced various geospatial soil datasets from regional to global scales (Batjes, 2016; FAO, 2013; Hengl et al., 2017; Miller and White, 1998; Shangguan et al., 2014). Nevertheless, different soil datasets are used side by side in Earth system climate model simulations, and some older datasets are still widely in use, although the data may be out of date and their resolution too coarse to match the model grid (Dai et al., 2019; DY and Fung, 2016). Understanding the impact of the soil datasets commonly used in Earth system climate models can therefore support more robust modeling.
Land surface models (LSMs) have grown in complexity to contain a wide range of physical functionalities, incorporating hydrological, biophysical, chemical, and radiative aspects, and are integrated with weather and climate simulations (Lawrence et al., 2007; Lin and Cheng, 2016; Niu et al., 2011). Soil data are required in LSMs, and in coupled land surface-atmosphere modeling they are expected to yield more reliable model predictions by accounting for feedback in the soil-vegetation-atmosphere continuum (Dirmeyer et al., 2016; Fisher and Koven, 2020; Santanello et al., 2013; Wei et al., 2021). For instance, Dennis and Berbery (2021) employed the Weather Research and Forecasting (WRF) model with the Community Land Model (CLM) to investigate the dependence of modeled climate on soil maps over the continental United States. Their results demonstrated that widespread differences in soil texture affect the soil moisture content, eventually leading to regional climate differences. Over an inland river basin in northwest China, Gao et al. (2008) replaced the soil information in the coupled fifth-generation non-hydrostatic Mesoscale Model (MM5) with a local soil map, and found that the area mean bias of simulated monthly precipitation was greatly reduced. In the same region, Zhang et al. (2019) further found that the streamflow simulation was largely improved by updating the soil map in a fully coupled atmospheric-hydrological modeling system. In a case study, Pedruzzi et al. (2022) illustrated the impact of erroneous or outdated land cover and soil texture data in the numerical weather prediction model WRF for São Paulo, Brazil, and recommended an update of local land surface data for improved weather and air quality modeling. DY and Fung (2016) compared the default global soil map generated by the Food and Agriculture Organization of the United Nations with an updated global soil map from Beijing Normal University for the WRF model, and examined the differences in near-surface temperature and humidity in model simulations using the Noah LSM. They highlighted that the soil hydrological properties have a strong effect on soil moisture content over a long period of time, and found an improved prediction of surface air temperature and relative humidity with the updated soil map. Many studies have confirmed the broad range of soil moisture impacts on variations in atmospheric conditions, including temperature and precipitation (Lin and Cheng, 2016; Seneviratne et al., 2010; Song et al., 2016), depending on the spatiotemporal scales, synoptic regimes, seasons, and regions considered.
Soils in southern Africa are under threat of degradation due to soil erosion, reduction of soil nutrients and organic matter, and loss of soil biodiversity (du Preez and van Huyssteen, 2020; Gomiero, 2016; Tamene et al., 2019). Soil profiles in this area have thus been extensively investigated in situ and collected through many soil survey programs (Dewitte et al., 2013; Leenaars et al., 2014), resulting in comparatively detailed soil data (Batjes, 2016). However, inconsistencies among different global soil datasets are also evident in southern Africa (Dai et al., 2019; DY and Fung, 2016). Modeling sensitivity studies of LSMs and weather and climate models for southern Africa mainly focus on land use/land cover change descriptions (Glotfelty et al., 2021), model physical parameterizations (Crétat et al., 2012; Laux et al., 2021), and atmospheric initial conditions (Crétat et al., 2011). High-resolution climate modeling studies and applications in southern Africa are still using outdated and coarse-resolution soil data, neglecting the uncertainties associated with soil data. Additionally, as more sophisticated hydrological processes are included in regional land surface-atmosphere models, i.e., coupled land surface-hydrological-atmospheric modeling, soil data as well as their hydrophysical properties are expected to have a larger impact on model results. Accordingly, assessing the effect of soil data uncertainty is a key step towards improving the capability of current state-of-the-art coupled models to reproduce fundamental features of key climatic variables.
To this end, this study assesses the uncertainties in climate variables attributable to soil data in a high-resolution regional land surface-hydrological-atmospheric coupled model in southern Africa. The fundamental hypothesis is that the choice among commonly used global soil datasets has an apparent and direct impact on the simulations. In this study, we consider the following four digital global soil datasets: the Food and Agriculture Organization (FAO) Digitized Soil Map of the World (FAO, 2013), the Harmonized World Soil Database (HWSD, FAO/IIASA/ISRIC/ISS-CAS/JRC, 2012), the Global Soil Dataset for Earth System Model (GSDE, Shangguan et al., 2014), and the global gridded soil information system SoilGrids (Hengl et al., 2017), implemented in the Advanced Research version of the WRF Hydrological Modeling system (WRF-Hydro). These global soil datasets are selected because they are widely accepted and used in climate and weather simulation applications. By conducting coupled simulations for southern Africa, south of 19°S, we quantified the internal variability of simulation variables that link soil texture and its properties to the surface and near-surface states and fluxes. To our knowledge, this is the first effort for southern Africa that addresses coupled land surface-hydrological-atmosphere modeling uncertainties originating from different soil data. The remainder of this article is organized as follows: Section 2 describes the methods and materials, including a model description, the experiment design with the selected soil data, the reference data, and the analytical methods. The results from the coupled modeling are shown in Section 3, and the conclusions are discussed in Section 4.

Regional land surface-atmosphere coupled model setup
The fully coupled Weather Research and Forecasting Hydrological Modeling system, named WRF-Hydro, is used as the regional land-atmosphere coupled model to assess the role of soil texture and associated hydrophysical parameters in regional land-atmosphere interactions. WRF-Hydro is a community research model (Gochis et al., 2020) developed by the National Center for Atmospheric Research (NCAR), aiming to accurately represent the physical processes of the land surface and the atmosphere as well as their two-way interactions. In comparison to the conventional numerical weather prediction model WRF, WRF-Hydro further considers the lateral terrestrial hydrological processes and their atmospheric feedback, thereby improving the realism of Earth system interactions. As a consequence, WRF-Hydro has been commonly used in recent research applications (Arnault et al., 2021a; Quenum et al., 2022; Zhang et al., 2022). A detailed description of the WRF-Hydro model can be found in Gochis et al. (2020).
The Advanced Research WRF model version 4.2 with the hydrological module of WRF-Hydro version 5.1 is configured for this study. The model domain is configured with a convection-permitting horizontal resolution of 4 km and 650 × 500 grid points, covering the southernmost part of Africa, including, e.g., South Africa, Lesotho, and Eswatini (the domain extent is shown in Fig. 1). The WRF-Hydro model is directly forced by the ERA5 reanalysis at 3-hourly intervals (Hersbach et al., 2020). Downscaling simulations are performed using a single-domain approach at the convection-permitting resolution, which excludes the uncertainties associated with multiple nesting and with cumulus parameterization in an outer domain. The underlying static soil texture fields differ among the designed experiments, as detailed in Section 2.2. The land use and land cover map is the same in all simulations and is based on the Moderate Resolution Imaging Spectroradiometer (MODIS) 21-class dataset.
Based on previous dynamical climate downscaling applications over southern Africa (e.g., Abba Omar and Abiodun, 2021; Crétat et al., 2012; Ratna et al., 2014; Ratnam et al., 2013), the selected set of atmospheric physics schemes includes the shortwave and longwave radiation schemes of Dudhia (1989) and Mlawer et al. (1997), the WRF single-moment 6-class microphysics scheme of Hong and Lim (2006), and the Yonsei University planetary boundary layer scheme (Hong et al., 2006) along with the revised Monin-Obukhov surface layer scheme (Jiménez et al., 2012). As our model grid spacing lies within the convection-permitting range (Prein et al., 2015), no cumulus parameterization scheme is activated. Previous studies also confirmed that the subtropical convection over southern Africa can be reproduced with a 4 km grid spacing (Kendon et al., 2019; Senior et al., 2021).
The community land surface model Noah with multiparameterizations (Noah-MP) is used for simulating momentum, heat, and water exchanges at the land surface within the WRF-Hydro model system. The Noah-MP LSM has four vertical soil layers within a total soil depth of 2 m, and parameterizes vertical water and heat transport with the diffusive form of Richards' equation and the thermal diffusion equation. Details about the Noah-MP land surface model can be found in Niu et al. (2011). In addition, WRF-Hydro handles lateral terrestrial water flow to parameterize horizontal water movement at the land surface. The overland and subsurface lateral flow routing is computed on a separate 400-m subgrid, which represents the refined terrain gradient. The 4-km WRF-Hydro/Noah-MP grid and the 400-m subgrid interact through a disaggregation-aggregation procedure to map surface hydrological conditions, so that the lateral terrestrial water flow also feeds back to the atmosphere in the modeling system (Gochis et al., 2020). Note that this study aims to compare model simulations in order to investigate the role of soil data in regional land-atmosphere coupled modeling, rather than to optimize the Noah-MP model schemes or calibrate hydrological parameters. We therefore keep the default schemes and parameter values in Noah-MP and in the WRF-Hydro water routing modules, and do not activate the channel routing and baseflow bucket modules.
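The exchange between the 4-km land surface grid and the 400-m routing subgrid can be pictured with a minimal sketch. It assumes a uniform factor of 10 between the two grids (4 km / 400 m) and a plain copy/block-average mapping; the function names and the sample values are hypothetical, and the actual WRF-Hydro procedure is more elaborate (see Gochis et al., 2020).

```python
import numpy as np

AGG_FACTOR = 10  # 4 km LSM cell / 400 m routing cell, as stated in the text


def disaggregate(coarse: np.ndarray, factor: int = AGG_FACTOR) -> np.ndarray:
    """Copy each coarse-cell value onto its factor x factor block of fine cells."""
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)


def aggregate(fine: np.ndarray, factor: int = AGG_FACTOR) -> np.ndarray:
    """Average each factor x factor block of fine cells back to one coarse cell."""
    ny, nx = fine.shape
    if ny % factor or nx % factor:
        raise ValueError("fine grid shape must be a multiple of the factor")
    blocks = fine.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))


# Hypothetical soil moisture (m3/m3) on a tiny 2 x 3 coarse grid
coarse_sm = np.array([[0.10, 0.20, 0.30],
                      [0.15, 0.25, 0.35]])
fine_sm = disaggregate(coarse_sm)   # routing would modify these fine-grid values
roundtrip = aggregate(fine_sm)      # state handed back to the 4-km grid
```

Without any routing step in between, the round trip returns the original coarse field, which is a quick consistency check for this simplified mapping.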
The model simulations are run for the period from January 2015 to June 2016. The first six months of the simulations are designated for model spin-up, and the entire austral year from July 2015 to June 2016 is chosen for analysis. As for the spin-up time, previous studies have commonly employed a period of 1-2 months for model spin-up in southern African regions (e.g., Crétat et al., 2012; Ratna et al., 2014). Accordingly, our choice of 6 months is considered sufficient for the model to reach an equilibrium state of surface conditions.

Soil datasets
Four commonly used digital global soil datasets were used in the study: the Food and Agriculture Organization (FAO) soil dataset, the Harmonized World Soil Database (HWSD) version 1.2, the Global Soil Dataset for Earth System Model (GSDE), and the global gridded soil information system SoilGrids. The default soil data provided by the WRF/WRF-Hydro preprocessing system are generated from the State Soil Geographic dataset (STATSGO) for the United States and from the FAO soil dataset for the rest of the world (FAO, 2013; Miller and White, 1998). Although the FAO soil dataset is ~10-30 years old and may not accurately reflect the soil state at the land surface, it remains the most widely used in current weather and climate model applications. The FAO soil data have a grid spacing of 5 arc-minutes (~9 km). HWSD is built by harmonizing the existing globally available regional and national soil databases within the 1:5,000,000 scale FAO soil map of the world using a standardized structure (FAO/IIASA/ISRIC/ISS-CAS/JRC, 2012). To further meet the needs of various types of Earth system models, the GSDE dataset, designed as an improvement on HWSD, was developed by incorporating more local soil maps and the soil profiles related to them, and by providing more soil properties (Dai et al., 2019; Shangguan et al., 2014). Both HWSD and GSDE have a grid spacing of 30 arc-seconds (~1 km). SoilGrids is a recent global soil information system, produced by fitting machine-learning prediction models to more than 230 000 soil profile observations (Hengl et al., 2017). With a resolution of 250 m, SoilGrids provides the most spatially detailed estimates of the global soil distribution.
In modeling practice, the Noah-MP LSM requires empirically derived soil hydrophysical parameters paired with a prescribed soil texture for the physical parameterization of soil thermodynamic and hydrological processes. To ensure comparability among the four soil datasets with different resolutions and depths, all simulations use the default option of dominant soil texture. Soil texture is determined according to the United States Department of Agriculture (USDA) classification system, which classifies soils by their percentages of silt, sand, and clay (Soil Survey Staff, 2012). The criteria for classifying the twelve soil texture types are shown in Fig. S1b (supplement), along with four additional soil categories: organic material, water, bedrock, and land-ice. Table S1 provides the relevant soil hydrophysical parameters of each soil texture, including porosity, saturated matric potential, saturated hydraulic conductivity, and retention curve slope. These parameters are set as defaults in both the WRF-Hydro and Noah-MP models. We emphasize that the configuration of the soil texture compositions and their corresponding hydrophysical parameters is determined by their particular use for land surface modeling in our approach. The accuracy of the soil texture classification in global soil data is often questionable, and the soil hydrophysical parameters are empirically derived, predominantly from soil samples collected in the United States (Cosby et al., 1984; Kishné et al., 2017).
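For orientation, the USDA texture triangle can be written as a small set of threshold rules on the sand, silt, and clay percentages. The sketch below uses the commonly published USDA boundaries as we recall them, not values read from Fig. S1b, and the function name `usda_texture` is hypothetical. The model itself reads precomputed texture category maps and does not perform this classification at run time.

```python
def usda_texture(sand: float, silt: float, clay: float) -> str:
    """Classify a soil by USDA texture class from percentages that sum to 100."""
    if abs(sand + silt + clay - 100.0) > 0.5:
        raise ValueError("sand, silt and clay must sum to 100 %")
    if silt + 1.5 * clay < 15:
        return "sand"
    if silt + 2 * clay < 30:
        return "loamy sand"
    if (7 <= clay < 20 and sand > 52) or (clay < 7 and silt < 50):
        return "sandy loam"
    if 7 <= clay < 27 and 28 <= silt < 50 and sand <= 52:
        return "loam"
    if (silt >= 50 and 12 <= clay < 27) or (50 <= silt < 80 and clay < 12):
        return "silt loam"
    if silt >= 80 and clay < 12:
        return "silt"
    if 20 <= clay < 35 and silt < 28 and sand > 45:
        return "sandy clay loam"
    if 27 <= clay < 40 and 20 < sand <= 45:
        return "clay loam"
    if 27 <= clay < 40 and sand <= 20:
        return "silty clay loam"
    if clay >= 35 and sand > 45:
        return "sandy clay"
    if clay >= 40 and silt >= 40:
        return "silty clay"
    if clay >= 40 and sand <= 45 and silt < 40:
        return "clay"
    raise ValueError("composition not covered by the rule set")


# Hypothetical compositions (sand %, silt %, clay %)
print(usda_texture(90, 5, 5))    # coarse, sand-dominated sample
print(usda_texture(60, 15, 25))  # intermediate sample
```

The rules are evaluated in order, so the first two tests (sand and loamy sand) guarantee that later branches only see finer compositions.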

Reference dataset and evaluation protocol
To validate the performance of the simulations, four global gridded datasets of near-surface temperature, precipitation, land surface evapotranspiration, and surface soil moisture are used. These include the temperature dataset of the Climatic Research Unit gridded dataset v4 (CRU) with a spatial resolution of 0.5° (Harris et al., 2020), the precipitation of the Climate Hazards Group InfraRed Precipitation with Station data v2 (CHIRPS) with a spatial resolution of 0.05° (Funk et al., 2015), the land evapotranspiration dataset of the Global Land Evaporation Amsterdam Model v3.5 (GLEAM) with a spatial resolution of 0.25° (Martens et al., 2017), and the surface soil moisture from the European Space Agency Climate Change Initiative (ESA-CCI) with a spatial resolution of 0.25° (Dorigo et al., 2017). These datasets are chosen because they are observation-based and produced using data assimilation and physical algorithms. They have been demonstrated to represent the spatial variability of climate variables in the southern African region (Al-Yaari et al., 2019; Khosa et al., 2019; Landman et al., 2018; Pitman and Bailey, 2021). To enable direct comparison and the calculation of biases and correlations, the simulated variables are regridded to the grids of the corresponding reference datasets using bilinear interpolation.
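The regridding and bias calculation can be sketched as follows. This is a minimal illustration that assumes regular latitude-longitude grids with increasing coordinates, so that bilinear interpolation reduces to two passes of 1-D linear interpolation. The grid spacings, the synthetic field, and all function names are hypothetical and do not represent the study's actual processing chain.

```python
import numpy as np


def bilinear_regrid(src_lat, src_lon, field, dst_lat, dst_lon):
    """Bilinear interpolation of field[lat, lon] onto a target rectilinear grid."""
    # Pass 1: interpolate along longitude for every source latitude row.
    tmp = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # Pass 2: interpolate along latitude for every target longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in tmp.T]).T


def mean_bias(sim, ref):
    """Mean of (simulation - reference) over cells where both are finite."""
    ok = np.isfinite(sim) & np.isfinite(ref)
    return float(np.mean(sim[ok] - ref[ok]))


def percent_bias(sim, ref):
    """Accumulated bias relative to the accumulated reference, in percent."""
    ok = np.isfinite(sim) & np.isfinite(ref)
    return float(100.0 * np.sum(sim[ok] - ref[ok]) / np.sum(ref[ok]))


# Synthetic "simulation" on a fine grid: a field that is linear in lat and lon
src_lat = np.linspace(-35.0, -20.0, 376)
src_lon = np.linspace(15.0, 35.0, 501)
sim = 2.0 * src_lon[None, :] + 3.0 * src_lat[:, None]

# Coarser hypothetical reference grid (0.5 degree spacing)
dst_lat = np.arange(-34.5, -20.0, 0.5)
dst_lon = np.arange(15.5, 35.0, 0.5)
sim_on_ref = bilinear_regrid(src_lat, src_lon, sim, dst_lat, dst_lon)

# Synthetic "reference": the same linear field shifted by +1
exact = 2.0 * dst_lon[None, :] + 3.0 * dst_lat[:, None]
ref = exact + 1.0
bias = mean_bias(sim_on_ref, ref)
```

Because linear interpolation reproduces a linear field exactly, the regridded values match `exact` and the bias against the shifted reference equals the imposed offset. Note that `np.interp` holds the edge value outside the source range, so target points should lie inside the source domain.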
In-situ observations of near-surface soil moisture from the International Soil Moisture Network (ISMN; Dorigo et al., 2021) are further used for soil moisture validation at the point scale. In total, data from eleven stations throughout the model domain with records covering the research period are used. These soil moisture records are retrieved by cosmic-ray probes and GPS receivers. The hourly raw records are averaged to daily values and then compared to the model results.
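A minimal sketch of this station-scale comparison is given below: hourly records are averaged to daily values and then scored against the simulation with the root mean square error and the Pearson correlation. The function names and the three-day sample series are hypothetical.

```python
import numpy as np


def daily_mean(hourly):
    """Average an hourly series to daily values, ignoring missing (NaN) hours.

    Trailing hours that do not fill a complete day are dropped.
    """
    hourly = np.asarray(hourly, dtype=float)
    n_days = hourly.size // 24
    return np.nanmean(hourly[: n_days * 24].reshape(n_days, 24), axis=1)


def rmse(sim, obs):
    """Root mean square error over days where both series are finite."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    ok = np.isfinite(sim) & np.isfinite(obs)
    return float(np.sqrt(np.mean((sim[ok] - obs[ok]) ** 2)))


def pearson_r(sim, obs):
    """Pearson correlation over days where both series are finite."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    ok = np.isfinite(sim) & np.isfinite(obs)
    return float(np.corrcoef(sim[ok], obs[ok])[0, 1])


# Hypothetical station record: three days of hourly soil moisture (m3/m3)
obs_hourly = np.repeat([0.10, 0.20, 0.15], 24)
obs_daily = daily_mean(obs_hourly)
sim_daily = np.array([0.12, 0.22, 0.17])  # hypothetical model output

error = rmse(sim_daily, obs_daily)
corr = pearson_r(sim_daily, obs_daily)
```

In this constructed case the simulation is offset from the observations by a constant, so the correlation is perfect while the RMSE equals the offset, which illustrates why both scores are reported.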
Uncertainties in the model ensemble arising from the different soil datasets are quantified by estimating the internal variability (Alexandru et al., 2007; Lucas-Picher et al., 2008). It is measured by the spread among the ensemble members during the time of integration. The spread is calculated as the standard deviation between the members of the ensemble, based on the inter-member variance

$$\sigma_X^2(i,j,t) = \frac{1}{N-1}\sum_{n=1}^{N}\left[X(i,j,t,n) - \langle X\rangle(i,j,t)\right]^2 \qquad (1)$$

where $X(i,j,t,n)$ refers to the value of variable $X$ on grid point $(i,j)$ at time $t$ for member $n$ of the ensemble, and $N$ is the total number of ensemble members, here $N = 4$. The term $\langle X\rangle$ is the ensemble mean, calculated as

$$\langle X\rangle(i,j,t) = \frac{1}{N}\sum_{n=1}^{N} X(i,j,t,n) \qquad (2)$$

The inter-member variance $\sigma_X^2(i,j,t)$ is calculated for all grid points in space and archived for all time steps. To describe the spatial distribution of internal variability over the whole simulation period, the measure of internal variability is further calculated as the square root of the time-averaged inter-member variance,

$$\sqrt{\overline{\sigma_X^2}^{\,t}(i,j)} = \sqrt{\frac{1}{N_t}\sum_{t=1}^{N_t}\sigma_X^2(i,j,t)} \qquad (3)$$

where $N_t$ is the number of all time steps. $\overline{\sigma_X^2}^{\,t}(i,j)$ represents the variability of the simulated variable over a given period at a given location $(i,j)$.
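The internal variability measure of Eqs. (1)-(3) can be computed in a few lines. The sketch below assumes that the ensemble is stored as an array of shape (members, time steps, rows, columns) and that the inter-member variance uses the unbiased $N-1$ normalization; if a $1/N$ normalization is intended, `ddof=0` should be used instead. The function name and the synthetic ensemble are hypothetical.

```python
import numpy as np


def internal_variability(ens: np.ndarray, ddof: int = 1) -> np.ndarray:
    """Square root of the time-averaged inter-member variance.

    ens has shape (N, T, ny, nx): ensemble members, time steps, grid rows,
    grid columns. Returns an array of shape (ny, nx).
    """
    # Inter-member variance at every time step and grid point
    var_t = ens.var(axis=0, ddof=ddof)
    # Time average of the variance, then square root
    return np.sqrt(var_t.mean(axis=0))


# Synthetic 4-member ensemble: member n is offset by n at every time and point
n_members, n_times, ny, nx = 4, 5, 2, 3
offsets = np.arange(n_members, dtype=float).reshape(n_members, 1, 1, 1)
ens = np.zeros((n_members, n_times, ny, nx)) + offsets

iv = internal_variability(ens)
```

For this synthetic case the member values 0, 1, 2, 3 give a variance of 5/3 at every point, so the result is the constant field sqrt(5/3). Averaging the variance before taking the square root, rather than averaging standard deviations, follows the wording of the text.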

Comparison of soil texture at lower boundary conditions
The dominant category of top-layer soil texture in each model grid cell from the four datasets is visualized in Fig. 2, and Table 1 and Fig. S1a further summarize the percentage of each soil texture category in the study domain. Differences in soil texture between the four soil datasets are pronounced. In general, the soil texture classes are distributed in large contiguous blocks across the model domain. As shown in Fig. 2 and Table 1, the dominant soil textures in FAO and HWSD are sandy loam and sand, comprising 46.6% and 31.0% of the model domain grid, respectively. Sandy clay loam is the dominant soil texture in both GSDE and SoilGrids, comprising 22.7% and 44.1% of the domain, respectively. Spatially, FAO gives the most simplified soil texture, mainly classifying the study area as sandy loam, loam, and sandy clay loam (Fig. 2 and Fig. S1a). SoilGrids also simplifies the soil texture mainly into three categories: loamy sand, sandy loam, and sandy clay loam. The soil textures classified in HWSD and GSDE are somewhat similar, partly because the soil data sources are the same and the soil mapping approaches are similar (Dai et al., 2019). According to Shangguan et al. (2014), GSDE incorporated additional soil profiles and soil maps compared to HWSD, and the associated improvements are visible in Fig. 2. For example, the soil texture at the border between Namibia and South Africa is shown continuously as loam in GSDE, instead of being sharply divided into sandy loam and clay loam by the country border as in HWSD. These two soil texture categories are not adjacent to each other in the USDA soil texture triangle (Fig. S1b). Also, the soil texture over Lesotho is differentiated in GSDE rather than assigned a single class as in HWSD, and the sand area in southern Botswana in HWSD is further subdivided in GSDE.
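As an illustration of how a dominant texture per model cell and the domain-wide category shares can be derived from a finer categorical map, consider the sketch below. The category IDs, the tiny 4 × 4 map, and the function names are hypothetical, and ties are resolved in favor of the lowest ID, which is an arbitrary choice made here and not taken from the preprocessing system.

```python
import numpy as np


def dominant_category(fine: np.ndarray, factor: int, n_cat: int) -> np.ndarray:
    """Most frequent category in each factor x factor block of a categorical map."""
    ny, nx = fine.shape
    blocks = (fine.reshape(ny // factor, factor, nx // factor, factor)
                  .transpose(0, 2, 1, 3)
                  .reshape(ny // factor, nx // factor, -1))
    # Count the occurrences of every category inside every block
    counts = np.stack([(blocks == c).sum(axis=-1) for c in range(n_cat)], axis=-1)
    return counts.argmax(axis=-1)  # ties go to the lowest category ID


def category_shares(cat_map: np.ndarray) -> dict:
    """Percentage of grid cells occupied by each category present in the map."""
    ids, counts = np.unique(cat_map, return_counts=True)
    return {int(i): 100.0 * c / cat_map.size for i, c in zip(ids, counts)}


# Hypothetical fine-resolution category map (IDs are illustrative only)
fine_map = np.array([[2, 2, 6, 6],
                     [2, 3, 6, 2],
                     [3, 3, 2, 2],
                     [3, 2, 2, 6]])
dominant = dominant_category(fine_map, factor=2, n_cat=7)
shares = category_shares(dominant)
```

Because only the most frequent class survives in each block, minority classes in the fine map can vanish from the coarse map, which is one reason why datasets with different native resolutions may yield different texture shares on the same model grid.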
In terms of soil grain size, HWSD and GSDE generally describe a coarser grain size composition than FAO and SoilGrids across the study area, e.g., sand instead of sandy loam over the arid northwest. Spatially, SoilGrids documents an overall continuous decrease in soil grain size from northwest to southeast. Dai et al. (2019) and Hengl et al. (2017) suggest that SoilGrids largely improves the representation of spatial variations in soil properties at a very high grid resolution. Overall, apparent differences in soil texture are observed among the four selected global soil datasets over the study area. Given that the soil hydrophysical parameters are assigned according to the prescribed soil textures, further variations in the simulated hydrometeorological fields at the land-atmosphere interface can be expected from the model ensemble.

Evaluation of model simulations
The simulation results of the dynamically downscaled hydrometeorological fields are evaluated first. The 2-meter air temperature and precipitation of the simulation ensemble are compared with the reference datasets in Figs. 3 and 4, respectively. The observed temperature field shows cool conditions along the Drakensberg mountains, stretching to the Cape Fold Mountains in the south of South Africa. The coolest temperatures appear in northern Lesotho as a result of the pronounced terrain. The coastal area and the interior areas surrounding the Namib and Kalahari deserts are warmer, with the highest temperatures over southern Mozambique (Fig. 3a). The simulated precipitation from the model ensemble is also comparable with the CHIRPS precipitation (Fig. 4a-b). The spatial patterns are well captured, with more precipitation toward the eastern range of the Drakensberg Mountains over Mpumalanga and KwaZulu-Natal, and over the southern parts of the Eastern Cape and Western Cape. This indicates that both the austral summer precipitation in the east of South Africa and the winter precipitation of the Mediterranean climate in the southwest coastal area are well represented. Additionally, the precipitation gradient in the east-west direction is quite comparable between the simulations and the CHIRPS observations. Regarding the precipitation biases, a dry bias occurs along the west coast, with the highest percentage biases over the Namib Desert (Fig. 4c). Nevertheless, the absolute bias over the desert is very small, less than 12 mm over the year. The wet precipitation bias is mostly distributed over the Drakensberg Mountains, alongside the mountain peaks (Fig. 4c). Overall, the model overestimates the mean land precipitation by about 30.2 mm compared to the CHIRPS precipitation of 342.4 mm, with a percentage bias of 8.7%. Compared with previous regional dynamical climate downscaling cases (e.g., Arnault, Jung, et al., 2021; Crétat et al., 2012; Crétat and Pohl, 2012; Ratna et al., 2014), these high-resolution WRF-Hydro simulations successfully reproduce the key variables of temperature and precipitation and, in addition, show somewhat improved results in terms of simulation biases, since the convection-permitting resolution excludes the uncertainties and biases of a convective scheme (e.g., Prein et al., 2015; Senior et al., 2021; Zhang et al., 2021).
Figs. 5 and 6 present spatial comparisons of land evapotranspiration and surface soil moisture, respectively. The patterns of both variables show slightly higher values in the eastern mountains and low values in the interior and western regions, similar to the patterns of precipitation. Compared to the reference datasets, the WRF-Hydro ensemble generally exhibits negative biases in evapotranspiration and soil moisture over the western arid area, with small absolute bias values of −0.015 mm/day and −0.05 m³/m³, respectively. The evapotranspiration and surface soil moisture values over the eastern wetter region are comparable, with small mixed biases distributed spatially.

Internal variabilities in model ensemble simulations
The internal variability of each simulated hydrometeorological field is estimated using the square root of the time-averaged variance ($\sqrt{\overline{\sigma_X^2}^{\,t}}$) among ensemble members following Eq. (3). For the 2-meter temperature shown in Fig. 3d, a clear increase in internal variability is found in the interior region of the study area. This region primarily consists of soil textures dominated by sand, loamy sand, and sandy loam, located at the lower-left corner of the USDA soil texture triangle (Fig. S1b) and characterized by the largest grain size. For other areas with soil textures dominated by loam and sandy clay loam, the internal variability is very small. Although the spatial pattern of temperature is directly influenced by terrain elevation (Fig. 3b), topography does not seem to have a large effect on the internal variability of simulated temperature. The uncertainties in temperature are thus closely related to the soil texture description.
The internal variabilities of precipitation and evapotranspiration related to soil texture show similar spatial distributions over the entire domain. Higher internal variability is visible in the northern and eastern parts of the interior, whereas lower variability characterizes the western and southern coastal areas. Overall, the internal variability of the model is higher in the interior than in the coastal areas, showing a west-east and south-north gradient. In the model simulations, the lateral boundary conditions and sea surface temperature are prescribed identically, and the prevailing winds carry moisture from the coast to the land, where the air masses interact with the land surface and are increasingly perturbed by soil texture disparities. These perturbations are largest in the inland region, where they may lead to greater differences in the representation of precipitation. Precipitation variability is closely linked to the intensity of rainy days as well as to precipitation location and amount, which are jointly triggered by regional-scale convective processes and large-scale synoptic variability. Consequently, higher internal variability of precipitation is clustered in the eastern inland area and the subtropical latitudes, including the complex terrain of the Drakensberg Mountains, due to the strong effect of topography on convection triggering. In contrast, coastal areas in the west and south exhibit the lowest variability of precipitation (Fig. 4d), even though the southern coast experiences high precipitation due to the Mediterranean climate (Fig. 4b). The southeastern coastal areas receive abundant precipitation but exhibit less variability, which is also attributed to the fact that the moisture sources are mostly unperturbed. Overall, simulated evapotranspiration and surface soil moisture have low internal variabilities (Figs. 5d and 6d), reflecting the long persistence of soil moisture and the fact that the coupled WRF-Hydro model spatially redistributes soil moisture from higher to lower areas through lateral flow processes. The high internal variability of evapotranspiration is shown to be partly related to the areas with high internal variability of soil moisture, such as the southern coastal areas of Mozambique and the areas close to the border of Zimbabwe. As shown in Fig. 6d and Fig. 2, the soil textures of sand and loamy sand usually exert a larger impact on the internal variability of soil moisture. The comparison of spatial patterns of simulated precipitation minus evapotranspiration (P-E), which represents the terrestrial water availability, is shown in Fig. 7.
Overall, common features are represented among all model ensemble members. The eastern mountain regions exhibit a large amount of excess precipitation that forms runoff, while the western and central highveld has very small net precipitation. The regions with negative net precipitation represent dry land conditions, such as the dryness in the lower Limpopo River basin stretching to Mozambique, and the significant impact of forestry plantations on evapotranspiration. Many studies have shown that forest plantations in KwaZulu-Natal and the Eastern Cape of South Africa led to a reduction of water resources (e.g., Meijninger and Jarmain, 2014; Tuswa et al., 2019). These spatial characteristics are well captured by the coupled model simulations, although evapotranspiration in Noah-MP and WRF-Hydro is accounted for only within a 2-meter soil depth. The responses of available water to differences in soil texture are visible among the ensemble members, clustered in mountainous areas such as Lesotho and the eastern flanks of the Drakensberg (Fig. 7). These differences arise from variations in precipitation, soil wetness, and the redistribution of water through lateral flow processes. In the HWSD and GSDE experiments, high net precipitation is evident along the southern coasts of Mozambique, associated with the sand texture present in the area.

Model sensitivity on spatially distributed soil moisture
A daily time series comparison is conducted to evaluate the simulated topsoil moisture against both in-situ observed and satellite-retrieved surface soil moisture at 11 sites, as shown in Fig. 8. Some gaps are present where in-situ measurements and satellite-retrieved data are missing. As soil texture varies among the simulation members, the detailed inventory of soil texture in each simulation is listed in Table 2. Similar to the findings of previous studies that compared numerical models and observations (e.g., Fersch et al., 2020; Greve et al., 2013; Lin and Cheng, 2016; Massey et al., 2016; Zhang et al., 2019), systematic differences in soil moisture content are observed at some sites. These are related to the measurement depth of soil wetness on the one hand, and to the physical properties of the soil on the other. Nevertheless, the simulated soil wetness and its variability are consistent with the observations, with the simulation results showing significant correlations with the observations (statistically significant at the 1% level, except for the case of SoilGrids at the Gobabeb site). Additionally, the root mean square errors (RMSE) are less than 0.1 m³/m³ at 9 out of 11 station sites (summarized in Table S2). The occurrence and trends of both rapid increases and decreases in soil moisture content are in agreement across the simulated time series, indicating that the simulated precipitation events are spatially reasonable. In general, the simulation results are found to be in better agreement with the ESA-CCI satellite observations. Regarding the intercomparison among the four model members, soil texture is found to have a direct impact on soil moisture content. Specifically, the assigned field capacity and wilting point values for different soil textures (Table S1) directly affect the overall soil wetness. For instance, overall soil moisture is higher for clay loam than for sandy clay loam and loam, as seen at the Cathedral, Baynesfield, and Sutherland in-situ sites. At the hyper-arid Gobabeb site, differences are particularly remarkable, highlighting the distinct impact of soil texture (Fig. 8). Additionally, soil textures with more sand tend to respond rapidly to rainfall, with earlier peaks and quicker drainage. This is partly evident at the Mapungubwe and Mafikeng sites, where soil moisture in HWSD responds faster due to the soil texture of sand. Generally, soils with more clay tend to have a high water-holding capacity and dry out slowly, while soils with more sand behave in the opposite way. These characteristics can be derived from the site-scale comparisons, and are comparable to results from standalone LSMs and climate modeling (e.g., de Lannoy et al., 2014; DY and Fung, 2016; Lin and Cheng, 2016; Yang et al., 2011). It is worth emphasizing that the WRF-Hydro model used in this study considers not only vertical soil water diffusion and surface evapotranspiration, but also horizontal soil water redistribution based on saturated soil exfiltration and the reinfiltration of overland flow (Gochis et al., 2020). These results therefore underscore that soil texture exerts a distinct influence on soil moisture variability even in complex models that account for detailed hydrological processes.
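The contrasting behavior of sandy and clayey soils can be illustrated with power-law relations of the Clapp and Hornberger type, which use the four parameters named in the text (porosity, saturated matric potential, saturated hydraulic conductivity, and retention curve slope b). The sketch below assumes this functional form. The parameter values are illustrative magnitudes only and are not taken from Table S1; the function and variable names are hypothetical.

```python
def matric_potential(theta, theta_sat, psi_sat, b):
    """Suction (m, positive magnitude) at volumetric water content theta."""
    return psi_sat * (theta / theta_sat) ** (-b)


def hydraulic_conductivity(theta, theta_sat, k_sat, b):
    """Unsaturated hydraulic conductivity (m/s) at water content theta."""
    return k_sat * (theta / theta_sat) ** (2 * b + 3)


# Illustrative parameter sets (assumed values, for demonstration only)
sandy = {"theta_sat": 0.34, "psi_sat": 0.07, "k_sat": 4.7e-5, "b": 2.8}
clayey = {"theta_sat": 0.47, "psi_sat": 0.26, "k_sat": 2.0e-6, "b": 8.2}

# Compare both soils at 80 % of saturation
k_sandy = hydraulic_conductivity(0.8 * sandy["theta_sat"], sandy["theta_sat"],
                                 sandy["k_sat"], sandy["b"])
k_clayey = hydraulic_conductivity(0.8 * clayey["theta_sat"], clayey["theta_sat"],
                                  clayey["k_sat"], clayey["b"])
psi_wet = matric_potential(0.40, clayey["theta_sat"], clayey["psi_sat"], clayey["b"])
psi_dry = matric_potential(0.20, clayey["theta_sat"], clayey["psi_sat"], clayey["b"])
```

With these assumed values, the sandy soil conducts water far more readily than the clayey soil at the same relative saturation, which is consistent with earlier peaks and faster drainage, while the large exponent of the clayey soil makes suction rise steeply as it dries, which is consistent with stronger water retention.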

Effects of soil texture differences on land-atmosphere interfaces
The impact of variations in soil hydrophysical properties on land-atmosphere interactions is investigated through a thorough comparison of two simulation cases. We considered GSDE and HWSD for the comparison because they were developed using a similar framework and because GSDE was developed with the purpose of improving HWSD's protocol (Shangguan et al., 2014). Our analysis indicates that there are still considerable differences in soil texture between the two datasets, despite their similar development process (as shown in Fig. 9d). Additionally, Fig. 9a-c illustrates the differences in selected soil hydrophysical parameters assigned from the lookup table in the model simulations. It is apparent that the hydrophysical parameters differ in space following the difference in soil texture. In general, soil moisture at saturation (porosity) and hydraulic conductivity are slightly higher in GSDE for South Africa and slightly lower for Lesotho (Figs. 9a, c). The parameters of wilting point and field capacity (not shown, but spatially similar to the wilting point) generally decrease as the soil grain size increases, mainly occurring in the Northwest of South Africa (Fig. 9a, d).
Because of the large seasonal differences in precipitation and temperature in the study area (Fig. S2), the impact on variables crucial to land surface-atmosphere interactions is investigated for the austral summer and winter months, shown in Figs. 10 and 11, respectively. As shown in Fig. 10, obvious differences in the variables can be distinguished. In terms of surface air temperature and surface skin temperature, lower temperatures in GSDE can be identified over the middle of the northern part of the domain (MND) and the Northwest of South Africa (NSA) (Fig. 10a). Yet the hydrothermal processes associated with them are different. Over the area of MND, the soil texture transitions from Sand to Loamy Sand and Sandy Loam, increasing the soil's water-holding capacity (as shown in Fig. 9). Summer rainfall in this area (spatially ca. 100-300 mm, Fig. S2a) enhances soil moisture (Fig. 10d) and evapotranspiration/latent heat (Fig. 10g), leading to an enhanced evaporative cooling effect (as seen in Fig. 10c) and thus decreasing the temperature at the surface. Over the area of NSA, summer temperature is very high and rainfall is very low (< 30 mm) (Fig. S2a, c). Under such dry conditions, in most cases, surface soil moisture decreases to the wilting point, which can be identified at the Upington and Sutherland sites in Fig. 8. The soil moisture in GSDE is lower than in HWSD (Fig. 10d) due to the decrease in soil parameters representing the wilting point (Fig. 9). The drier soil in GSDE decreases the thermal conductivity, preventing the temperature increase in the soil and leading to lower temperatures than in HWSD. In the area of NSA, as the input energy fails to remove more water from the soil, the difference in latent heat follows the precipitation difference, and sensible heat increases slightly in GSDE (Fig. 10g-h). For the eastern part of the model domain with more precipitation, differences in surface variables are also jointly impacted by the precipitation difference. The differences in runoff mainly occur in temperate areas with complex topography (Fig. 10f). As shown in Fig. 10a, the difference in planetary boundary layer height (PBLH) is impacted by the surface temperature and moisture conditions.
These results indicate that the spatial variations in predicted soil moisture and skin temperature are closely associated with differences in soil texture. Previous studies on standalone land surface modeling have also demonstrated the significant sensitivity of simulated terrestrial water components to input soil texture data (e.g., DY and Fung, 2016; Li et al., 2018; Zheng and Yang, 2016). While surface moisture variables are highly sensitive to water-flux-related physical processes, such as the runoff scheme, the groundwater scheme and lateral flow (e.g., Gan et al., 2019; Niu et al., 2011; Yang et al., 2021), the present results clearly indicate that the spatial characteristics of soil moisture primarily depend on the water-holding capacity of the soil, as determined by hydrophysical parameters. The changes that soil texture induces in the fluxes to the atmosphere (i.e., sensible and latent heat fluxes) are further modulated by local climate and moisture conditions, showing decreased influences over arid regions, which is comparable to the finding of Zheng and Yang (2016).
Regarding atmospheric feedbacks, Fig. 12 illustrates the differences in moisture variables and winds in a vertical cross-section near 28°S (position indicated in Figs. 10d and 11d). Notable differences in water vapor and in horizontal and vertical wind are observed below 500 hPa, as depicted in Fig. 12a. Changes in atmospheric water vapor below 850 hPa are closely related to surface soil moisture. Higher soil moisture increases water vapor and causes remarkable wind differences in the near-surface layer. In the Highveld in the eastern interior region, where summer precipitation is mainly triggered by deep convection, changes in atmospheric water vapor have a significant impact on convective instability, which in turn induces large differences in precipitation, as demonstrated in Fig. 10b. Moreover, changes in atmospheric horizontal wind suggest that the moisture and energy changes in surface soil can further impact atmospheric circulation, and the distinctly altered low-level wind fields indicate changes in moisture convergence and atmospheric water transport through land-atmosphere coupling.
In austral winter months, the difference in soil texture, i.e., Sand in GSDE to Loamy Sand in HWSD, in the dry land of the interior (Fig. S2b, d) enhances the coldness and dryness of the surface (Fig. 11a, d-e). This difference also results in a slight decrease in atmospheric moisture in the lower atmosphere (Fig. 11c, 12b). The surface heat fluxes are slightly increased, which in turn affects the thickness of the planetary boundary layer (Fig. 11g-i). Precipitation remains largely unaffected in the interior region (Fig. 11b), owing to the slight change in atmospheric water vapor (Fig. 12b). In contrast, the small and fragmented soil texture differences in the temperate and subtropical coastal area have a noticeable impact on moisture and heat flux variables, which further correspond to model-produced runoff differences. Also, a correspondence between changes in PBLH and changes in precipitation can be partially detected over the coastal area. The above results highlight the impact on land-atmosphere interactions of changes in surface water and heat fluxes caused by changes in soil texture; these impacts are closely related to local climate and moisture conditions and are characterized by inter-seasonal differences.
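The seasonal member differences discussed in this section (GSDE minus HWSD, summer versus winter) can be sketched for a single grid cell as follows. The month sets for austral summer (December to February) and winter (June to August) are an assumption made for illustration, since the text does not list the exact months, and the function names are hypothetical.

```python
# Assumed season definitions (not stated in the text): austral summer = DJF, winter = JJA.
AUSTRAL_SUMMER = (12, 1, 2)
AUSTRAL_WINTER = (6, 7, 8)


def seasonal_mean(records, months):
    """Mean of the values whose calendar month falls in `months`.

    `records` is a list of (month, value) pairs for one grid cell.
    """
    values = [value for month, value in records if month in months]
    if not values:
        raise ValueError("no records in the requested season")
    return sum(values) / len(values)


def seasonal_difference(member_a, member_b, months):
    """Seasonal mean of member_a minus seasonal mean of member_b."""
    return seasonal_mean(member_a, months) - seasonal_mean(member_b, months)
```

With this sign convention a negative value for soil moisture means that the first member (for example GSDE) is drier than the second (for example HWSD) in that season.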

Discussions and conclusions
Global soil datasets, representing one of the boundary conditions of Earth system models, have been found to exhibit large disparities in many regions around the world. This study examines simulated land-atmosphere interface variables in the context of the uncertainties in global soil datasets over Southern Africa. Four commonly used global soil datasets, i.e. FAO, HWSD, GSDE, and SoilGrids, were implemented in the fully coupled regional land surface-hydrological-atmosphere model WRF-Hydro with the aim to (1) quantify the internal variability of simulation variables introduced by soil data variations, and (2) investigate the impact on land-atmosphere interactions associated with differences in soil texture and its hydrophysical properties. All the simulations were performed at a convection-permitting resolution of 4 km for the period of January 2015 to June 2016, and with enhanced lateral hydrological processes compared to the conventional weather forecasting model.
Upon comparing the ensemble simulations with observation-based datasets, evaluation results show that the coupled WRF-Hydro model represents the spatial patterns of land surface hydrometeorological fields reasonably well. Specifically, the overall biases for precipitation and air temperature are 0.084 mm/day and −0.56 °C, and for surface evapotranspiration and soil moisture are −0.015 mm/day and −0.05 m³/m³, respectively. Comparison with in-situ soil moisture observations reveals plausible spatiotemporal variations of surface moisture conditions. These results highlight the model's applicability in investigating land-atmosphere interactions, and in particular, indicate that the model performs well compared to previous modeling studies (e.g., Arnault et al., 2021b; Crétat and Pohl, 2012; Ratna et al., 2014; Ratnam et al., 2013). This is partly attributable to the fact that the convection-permitting WRF-Hydro on the one hand removes the uncertainties associated with cumulus schemes (Prein et al., 2015), and on the other hand improves the realism of the representation of terrestrial hydrological processes (Gochis et al., 2020; Zhang et al., 2022).
Concerning the impacts of different global soil datasets on simulated hydrometeorological variables, the results indicate that the internal variability of precipitation is more pronounced in the inland northern and eastern areas. This can be attributed to increased model perturbations from coastal to inland regions, as well as the topography-induced effects that enhance convection triggering. Similarly, the actual evapotranspiration demonstrates a pattern of internal variability comparable to that of precipitation, as it is constrained by the availability of precipitation. Uncertainties in temperature and soil moisture arise from differences in soil texture. Higher temperature variability is found over the arid interior characterized by coarser soil texture. Larger soil moisture variability is mainly associated with high soil wetness and large differences in soil hydrophysical parameters. An explicit comparison of the two simulations using GSDE and HWSD soil data for the austral summer and winter seasons shows that differences in surface variables are associated with differences in soil texture and assigned hydrophysical properties. The impact on local climate varies seasonally and differs further depending on local climatic conditions. For instance, patches where the assigned soil texture changes from Sand to Loamy Sand and Sandy Loam result in cold and wet effects during austral summer and cold and dry effects during winter over the semiarid and arid interior areas. Changes in surface energy fluxes also affect atmospheric processes between seasons, as seen in the impact of surface conditions on the atmospheric water vapor and on the planetary boundary layer height, which is a function of turbulent eddy growth.
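One common way to express the internal variability discussed above is the spread across ensemble members at each time step, averaged over time. The sketch below uses the sample standard deviation for a single grid cell; this definition is an assumption made for illustration and may differ from the exact metric used in the study, and the function name is hypothetical.

```python
import statistics


def internal_variability(members):
    """Time-averaged across-member sample standard deviation for one grid cell.

    `members` maps a member name (e.g. a soil dataset) to its list of values over time.
    The spread definition is an assumption, not necessarily the study's metric.
    """
    series = list(members.values())
    if len(series) < 2:
        raise ValueError("need at least two ensemble members")
    n_times = len(series[0])
    if n_times == 0 or any(len(s) != n_times for s in series):
        raise ValueError("all members must share the same non-empty time axis")
    # zip(*series) yields, for each time step, the values of all members.
    spread = [statistics.stdev(values) for values in zip(*series)]
    return sum(spread) / n_times
```

Computing the spread per time step before averaging keeps short-lived disagreement between members (for example during individual rainfall events) from being cancelled out by the time mean.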
The results of this sensitivity study emphasize the critical role of accurate global soil data in modeling land-atmosphere interactions and underscore the need for continued improvements in soil data quality and consistency. It is worth noting that the availability of regional datasets may offer more realistic land surface characteristics and lead to some improvements in regional modeling results (e.g., Gao et al., 2008; Lin and Cheng, 2016; Pedruzzi et al., 2022). However, generalized global soil datasets continue to be relied upon in weather and climate modeling due to their wide availability, standardized information, and consistent representation. While the primary objective of this sensitivity study was to better understand the influence of soil data variations on modeling uncertainties rather than to identify a superior soil dataset, utilizing recently developed global soil data (e.g. GSDE, SoilGrids) is more likely to be a favorable option for regional modeling approaches (Dai et al., 2019). These datasets integrate more soil observations and describe the soil properties and characteristics at high spatial resolution, which is particularly important for model simulations at very high resolution (<4 km). However, it is important to recognize that uncertainties related to soil data do not solely determine land-atmosphere interactions. Regional characteristics are also influenced by additional factors such as vegetation and the overlying local climate. Vegetation types and coverage significantly impact albedo, shading of the soil surface, rainfall interception and regulation of root water uptake, playing a vital role in modulating the radiation budget and water cycle at the local level and even extending to surrounding areas (e.g., Boisier et al., 2012; Wang et al., 2023). Moreover, the interlinkage of vegetation dynamics with soil texture and properties is not adequately represented in current weather and climate modeling processes. Therefore, further investigation is required to better understand the combined effects of soil and vegetation on land-atmosphere interactions. Additionally, soil thermal and hydrological processes are parameterized differently in various land surface models (e.g., Gan et al., 2019; Van Den Broeke et al., 2018; Zhuo et al., 2019), and the choice of land surface model in coupled modeling may yield different responses regarding soil data uncertainties. Since this is a specific regional coupled modeling study, it is important to consider the location of the study area as well as the overlying climate, as they inherently determine the baseline characteristics of land-atmosphere interactions, and the uncertainties are also influenced by interannual variability. By considering these aspects, a more holistic understanding of land-atmosphere interactions can be achieved, leading to improved accuracy and reliability in regional climate modeling.
Overall, our study underscores the non-negligible influence of soil texture on model-predicted variables at the land-atmosphere interface. It is well acknowledged that soil moisture and its memory have substantial impacts on climate simulations at both regional and global scales (Dirmeyer et al., 2006; Menéndez et al., 2019; Schär et al., 1999; Seneviratne et al., 2006), with signals in soil moisture being transmitted to and manifesting in atmospheric states and processes through land-atmosphere coupling. At a regional scale, the magnitude of the responses could be large, due to the strong coupling strength over different areas, i.e. hot spot regions (Knist et al., 2017; Koster et al., 2002; Santanello et al., 2011). Specifically, coupling experiment projects have suggested a strong coupling between soil moisture and precipitation over the region of southern Africa (Guo et al., 2006; Seneviratne et al., 2006), and studies have identified that soil moisture exerts both positive and negative feedbacks on precipitation over the dryland area in the study area (Cook et al., 2006; Yang et al., 2018; Zhou et al., 2021). Therefore, the implications and uncertainties resulting from soil data differences extend beyond the surface water and energy fluxes shown in this study. A recent study by Dennis and Berbery (2022) indicated the effects of changes in soil hydrophysical properties on the simulation of the North American atmospheric water budget in summer months. Building upon their research, our sensitivity experiments have demonstrated the potential impacts of soil datasets on atmospheric circulation, water budgets as well as atmospheric instability, emphasizing the importance of correctly setting soil information in southern Africa for a climate simulation. To conduct a more comprehensive investigation into soil uncertainties, particularly regarding precipitation simulations, it would be beneficial to implement large ensemble modeling in long-term climate studies, provided sufficient computational resources are available.

Fig. 1. WRF-Hydro model domain covering southern Africa and the dominating land use and land cover from MODIS dataset. The gray dots show the location of the soil moisture observation stations from ISMN.
). Such temperature patterns are well reproduced by the model simulations, showing the added value associated with the very high model resolution. By conservatively interpolating the simulated temperature onto the coarser grid of the CRU dataset, the temperature bias varies slightly from −3.4 to 3.3 °C over the model domain, with a very small overall bias of −0.56 °C. The temperature discrepancies in the mountainous area are relatively large, particularly in Lesotho, exhibiting large variations with mixed distribution patterns. Apart from the inherent biases from model simulations, the regridding process introduces additional biases due to the abrupt changes in orography in the region, exacerbating the challenges of interpolation.
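The regridding and bias steps described above can be sketched in simplified form. The example assumes equal-area cells and an integer refinement ratio between the fine and coarse grids, so that conservative interpolation reduces to a plain block average; actual conservative remapping uses area weights on latitude-longitude grids. All function names are hypothetical.

```python
def block_mean(fine, factor):
    """Average non-overlapping factor x factor blocks of a 2-D list.

    Simplifying assumptions: equal-area cells and grid sizes divisible by `factor`.
    """
    n_rows, n_cols = len(fine), len(fine[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be divisible by the factor")
    coarse = []
    for i in range(0, n_rows, factor):
        row = []
        for j in range(0, n_cols, factor):
            block = [fine[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def bias_field(sim_coarse, ref):
    """Cell-wise bias (simulation minus reference) on the coarse grid."""
    return [[s - r for s, r in zip(sim_row, ref_row)]
            for sim_row, ref_row in zip(sim_coarse, ref)]


def field_mean(field):
    """Unweighted mean over all cells of a 2-D list."""
    values = [v for row in field for v in row]
    return sum(values) / len(values)
```

Averaging to the coarse grid before differencing means that the bias is evaluated at the resolution of the reference, which is also where sharp orographic gradients within a coarse cell are smoothed out.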

Fig. 3. Spatial comparison for mean 2-meter air temperature between (a) CRU-TS reference and (b) WRF-Hydro ensemble mean for the period July 2015 to June 2016. (c) WRF-Hydro ensemble bias regarding CRU-TS reference. (d) Time-averaged variability of 2-meter temperature

Fig. 4. Spatial comparison for daily-averaged precipitation between (a) CHIRPS reference and (b) WRF-Hydro ensemble mean for the period July 2015 to June 2016. (c) WRF-Hydro ensemble bias regarding CHIRPS reference. (d) Time-averaged variability of precipitation

Fig. 5. Spatial comparison for daily-averaged evapotranspiration between (a) GLEAM reference and (b) WRF-Hydro ensemble mean for the period July 2015 to June 2016. (c) WRF-Hydro ensemble bias regarding GLEAM reference. (d) Time-averaged variability of evapotranspiration

Fig. 6. Spatial comparison for daily-averaged surface soil moisture between (a) ESA-CCI reference and (b) WRF-Hydro ensemble mean for the period July 2015 to June 2016. (c) WRF-Hydro ensemble bias regarding ESA-CCI reference. (d) Time-averaged variability of surface soil moisture

Fig. 7. Spatial patterns of simulated water availability (P minus ET) of each WRF-Hydro member.

Fig. 8. Comparison of time series of daily surface soil moisture between in-situ observation, remote sensing observation and WRF-Hydro ensemble members at 11 ISMN station sites.

Fig. 9. Difference of assigned soil parameters of (a) porosity, (b) wilting point and (c) saturation hydraulic conductivity between selected two WRF-Hydro members (GSDE minus HWSD).(d) The most common soil texture transitions from WRF-Hydro members HWSD to GSDE.

Fig. 12. Vertical West-East cross-sections at around 28°S of the water vapor content (black line represents the land surface) and soil moisture differences between WRF-Hydro members GSDE and HWSD for the austral (a) summer and (b) winter. The vectors represent the zonal and vertical wind differences.

Table 1
Averaged percentage of each soil texture category in the four soil datasets within the WRF-Hydro simulation domain, i.e. in southern Africa.
Z.Zhang et al.

Table 2
Station name, location, and soil texture category as represented in four global soil datasets at the location of 11 ISMN station sites.