A New Soil Moisture Downscaling Approach for SMAP, SMOS, and ASCAT by Predicting Sub-Grid Variability

Abstract: Several studies currently strive to improve the spatial resolution of the coarse-scale, high temporal resolution global soil moisture products of SMOS, SMAP, and ASCAT. Soil texture heterogeneity is known to be one of the main sources of soil moisture spatial variability. With the recent development of high resolution maps of basic soil properties such as soil texture and bulk density, relevant information to estimate soil moisture variability within a satellite product grid cell is available. We use this information for the prediction of the sub-grid soil moisture variability for each SMOS, SMAP, and ASCAT grid cell. The approach is based on a method that predicts the soil moisture standard deviation as a function of the mean soil moisture based on soil texture information. It is a closed-form expression using stochastic analysis of 1D unsaturated gravitational flow in an infinitely long vertical profile based on the Mualem-van Genuchten model and first-order Taylor expansions. We provide a look-up table that indicates the soil moisture standard deviation for any given soil moisture mean, available at https://doi.org/10.1594/PANGAEA.878889. The resulting data set helps identify adequate regions to validate coarse scale soil moisture products by providing a measure of representativeness of small-scale measurements for the coarse grid cell. Moreover, it contains important information for downscaling coarse soil moisture observations of the SMOS, SMAP, and ASCAT missions. In this study, we present a simple application of the estimated sub-grid soil moisture heterogeneity by scaling down SMAP soil moisture to 1 km resolution. Validation results in the TERENO and REMEDHUS soil moisture monitoring networks in Germany and Spain, respectively, indicate a similar or slightly improved accuracy for downscaled and original SMAP soil moisture in the time domain for the year 2016, but with a much higher spatial resolution.


Introduction
Soil moisture is an important driver for the control of weather and climate feedbacks [1]. The scientific community has well recognized the very important role of soil moisture in Earth science applications, and innovative approaches and techniques for monitoring, modelling, and using soil moisture data have been developed [2]. Global observations of soil moisture are available from several spaceborne sensors, such as the Soil Moisture Ocean Salinity (SMOS) [3], the Soil Moisture Active Passive (SMAP) [4], and the Advanced Scatterometer (ASCAT) [5] missions. However, the spatial scale of these mission data is several tens of kilometers, which is too coarse for a large variety of applications, especially when referring to the emerging hyper-resolution modelling trend [6,7] that requires more highly resolved observations of critical hydrological variables. This is a demand we currently seem ill-prepared to meet [8].
Recently, several reviews investigated different aspects related to soil moisture observation [2,9-13], as well as soil moisture downscaling [14]. The main challenge is the very high spatio-temporal variability of soil moisture [15][16][17][18]. In the last 40 years, a large number of studies have attempted to understand the spatial and temporal variability of soil moisture from the local to the global scale [12,19,20]. Indeed, several researchers carried out detailed studies in different environmental and climatic settings to assess soil moisture variability. Since Rodriguez-Iturbe et al. [21] described the soil moisture variance as a power law decay function of the observation scale, different methods have been applied [22,23], including statistical [24][25][26] and geostatistical approaches [27][28][29][30], temporal stability analysis [31][32][33], and wavelet techniques [34][35][36][37][38]. Particularly noteworthy is the study by Famiglietti et al. [15], who analyzed more than 36,000 soil moisture measurements in the central US. In addition to a fractal scaling rule, they found that soil moisture standard deviation versus mean moisture in this humid climate exhibited a convex-upward relationship, i.e., the standard deviation increases until mean soil moisture reaches around 0.2 m³ m⁻³ and decreases beyond that. Further studies also found a steadily increasing behavior for other regions with modifications regarding soil, climate, and vegetation [39][40][41][42][43][44]. However, Vereecken et al. [45] explained the general shape of the soil moisture standard deviation versus mean moisture as a function of hydraulic parameter variation.
According to Salvucci [46], as well as Lawrence and Hornberger [47], the origin of soil moisture heterogeneity can be found in meteorology [48,49], vegetation characteristics [39], and groundwater [41], as well as landscape attributes such as topography [50] and soil texture [51]. Not fully explored is the vegetation-soil moisture variability relationship [52,53]. In arid and semiarid regions, vegetation strongly influences soil moisture temporal variation immediately after a rain event due to interception, but also later due to different solar irradiance and resulting soil evaporation. However, these differences become less pronounced as mean annual precipitation increases [54]. The spatial variability of soil moisture is therefore related to the spatial heterogeneity of vegetation. However, Teuling and Troch [55] showed that the main discriminating factor between increasing or decreasing spatial variance of soil moisture with soil moisture mean is whether or not the soil dries below the critical moisture content that defines the transition between unstressed and stressed transpiration. To a large extent, this depends on the soil texture. Similarly, Riley and Shen [44] found that the reduction in soil moisture variance with increasing mean past a particular intermediate value of the mean depends on the magnitude of evapotranspiration due to partial water stress limitation. In addition, Clapp et al. [56] found that approximately 75% of the standard deviation of soil moisture measured at the field scale could be accounted for by analyzing soil texture. Gwak and Kim [51] showed that soil texture was a dominant factor in soil moisture distribution, and Crow et al. [57] indicated that soil texture is a dominant physical control of soil moisture at finer scales. Additionally, Wang et al. [58] indicate that soil texture plays an important role in soil moisture variability, where vegetation dampens the relationship between the mean and standard deviation of soil moisture. 
Vereecken et al. [45] demonstrated that the shape of the soil moisture variance over mean curve can be explained to a large extent by the spatial variance of soil hydraulic properties, indicating that soil texture is an important driver of soil moisture spatio-temporal variability. The general perception in the scientific community is that the soil moisture variance over mean curve has to be convex, with lower variance at the dry and wet ends of the curve and higher variance at intermediate conditions. Some soil moisture observations support this view, whilst others show steadily increasing soil moisture variance with mean soil moisture (see examples in Qu et al. [59]). Vereecken et al. [45] elaborated on the shape of the respective curves as a result of soil texture with the conclusion that the peak at intermediate mean soil moisture occurs for finer textured soils only. They related the curves to Brooks and Corey hydraulic parameters and found that convex shapes occur only with dominant variance in the soil pore distribution index (i.e., the Brooks and Corey parameter β). It is important to note that a dominant variability in the air entry value (α), in saturated hydraulic conductivity (ln K_s), or in the vertical correlation length of the Brooks-Corey parameters would lead to a steady decrease in soil moisture variability from wet to dry mean soil moisture conditions.
Thus far, a global higher resolution (~1 km) soil moisture product from SAR sensors has not been made available [2], and the disaggregation of coarse resolution products remains the focus [60][61][62][63]. Validation activities for SMAP [64][65][66][67][68][69], SMOS [65,70-74], and ASCAT [5,73,75] generally found good agreement with in situ measurements. Therefore, current research strives to increase the spatio-temporal resolution of these products by auxiliary data.
With thermal infrared data from multispectral satellites, a downscaling to 100 m has been achieved, exploiting the ratio of actual to potential evaporation [76]. Similarly, machine learning methods have been used to downscale SMOS by MODIS land surface temperature [77]. For downscaling, Verhoest et al. [78] used a copula-based probability distribution function that reflects the expected distribution of modelled soil moisture for a given SMOS observation. The combined use of observations from active and passive satellite microwave instruments in a soil moisture data assimilation system with a spatially distributed 3D ensemble Kalman filter has been shown to improve soil moisture results [79]. Driven by the SMAP concept [80] and the launch of the Sentinel-1 satellites [81], downscaling by radar is gaining importance [82]. Recently, a first combined SMAP-Sentinel-1 product has been made available at a 3 km resolution, and 1 km resolution data is available for scientific analysis [83]. Downscaling by backscatter, applied already at the brightness temperature level, is based on time series statistical analysis of the radar-radiometer data relationships [84,85], but can also be forward calculated based on physical approaches [86]. All these methods identify proxy data for downscaling and need to statistically estimate the magnitude of the disaggregation, but they neglect soil texture as an important information source for the sub-grid variability of soil moisture.
Qu et al. [59] developed a method to predict sub-grid variability of soil moisture based on basic soil data, such as texture. They derived a closed-form expression to describe how soil moisture variability depends on mean soil moisture ($\sigma_\theta(\langle\theta\rangle)$) using stochastic analysis of 1D unsaturated gravitational flow based on the Mualem-van Genuchten (MvG) model. The method has already been proven to reliably predict soil moisture variability of small catchments [59]. Qu et al. [59] showed that their method is able to predict convex $\sigma_\theta(\langle\theta\rangle)$ functions and that this specific relationship is related to the variability of the MvG parameter n. In recent times, high resolution data on soil properties for the whole globe has become increasingly available [87,88], e.g., the SoilGrids data sets at a 1 km [89] and 250 m [90] resolution, respectively. By combining the method of Qu et al. [59] and the SoilGrids data set [89], the variance over mean soil moisture relationship can be predicted for each grid cell of coarse scale satellite soil moisture products. In this study, we adapt this method to be used on global scale soil moisture products such as those of SMAP, SMOS, and ASCAT. To this end, a look-up table was developed that indicates the sub-grid soil moisture standard deviation as a function of mean soil moisture. This information can be used for downscaling coarse resolution soil moisture products. As an example, we scaled the SMAP soil moisture data product from its original resolution down to 1 km resolution by using field capacity as a proxy for soil moisture heterogeneity. The results were validated using two in situ soil moisture networks in Germany and Spain for the year 2016.

Soil Data Base and Moisture Variability Estimation Methods
The general procedure of the method proposed here is based on the SoilGrids data set, which is transferred to hydraulic parameters for the MvG model of unsaturated gravitational water flow. After identifying the high resolution pixels of the hydraulic parameters comprised in each coarse satellite grid cell, we related the covariance of soil water content and pressure head to the variance and covariance of MvG parameters based on the method by Qu et al. [59].

A Closed Form Expression to Estimate Soil Moisture Variability
The basis of the method proposed by Qu et al. [59] is the MvG model, which describes the water retention curve as

$$\theta(h) = \theta_r + (\theta_s - \theta_r)\,S_e(h) \qquad (1)$$

$$S_e(h) = \left[1 + (\alpha |h|)^{n}\right]^{-m} \qquad (2)$$

and the hydraulic conductivity curve as

$$K(S_e) = K_s\,S_e^{L}\left[1 - \left(1 - S_e^{1/m}\right)^{m}\right]^{2} \qquad (3)$$

In the water retention function, $h$ is the pressure head (cm), $S_e$ is the effective saturation (-), $\theta_s$ (cm³ cm⁻³) is the saturated water content, $\theta_r$ (cm³ cm⁻³) is the residual water content, and $\theta$ (cm³ cm⁻³) is the actual soil moisture. $\alpha$ (cm⁻¹), $n$ (-), and $m$ (-) ($m = 1 - 1/n$) are shape parameters. The hydraulic conductivity function is directly related to the water retention function by $S_e$, where $K_s$ is the saturated hydraulic conductivity (cm d⁻¹) and $K$ (cm d⁻¹) is the hydraulic conductivity. $L$ is the pore connectivity parameter. The method proposed by Qu et al. [59] is based on the approach of Zhang et al. [91], who developed a first-order stochastic model for gravity-dominated flow in second-order stationary media. In this respect, a second-order stationary stochastic process is either a linear process or a non-linear process that can be transformed to a linear process by subtracting a deterministic component. Second-order stationarity means that spatial heterogeneity can be fully described by the first two moments of a Gaussian distribution: the mean and the variance. Stationarity refers to the fact that the mean does not show a spatial trend (at each location, the expected mean is identical). The main idea is to explain the $\sigma_\theta(\langle\theta\rangle)$ relationship as a function of the mean and standard deviation of the soil hydraulic parameters. The decomposition of the stochastic process into mean and perturbations of the parameters and variables can be performed by an infinite sum of terms, but here it is approximated by a finite number of terms of its Taylor series. By keeping the first-order terms only, Qu et al. [59] derived a relationship that expresses the standard deviation of soil water content as a function of the standard deviation in MvG model parameters.
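The MvG retention and conductivity functions described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names and the example parameter values are hypothetical, and only L = 0.5 and the pF 2.5 pressure head are taken from the text.

```python
import numpy as np


def effective_saturation(h, alpha, n):
    """Effective saturation S_e (-) at pressure head h (cm), with m = 1 - 1/n."""
    m = 1.0 - 1.0 / n
    return (1.0 + (alpha * np.abs(h)) ** n) ** (-m)


def water_content(h, theta_r, theta_s, alpha, n):
    """Volumetric water content (cm3 cm-3) from the MvG retention curve."""
    return theta_r + (theta_s - theta_r) * effective_saturation(h, alpha, n)


def hydraulic_conductivity(s_e, k_s, n, L=0.5):
    """Mualem conductivity K (cm d-1) as a function of effective saturation."""
    m = 1.0 - 1.0 / n
    return k_s * s_e ** L * (1.0 - (1.0 - s_e ** (1.0 / m)) ** m) ** 2


# Hypothetical parameters for a single fine-scale pixel (illustration only).
params = dict(theta_r=0.041, theta_s=0.45, alpha=0.02, n=1.4)

# Water content at pF 2.5 (h = 10^2.5 cm), the field capacity proxy used later.
theta_fc = water_content(10.0 ** 2.5, **params)
```

At saturation (h = 0) the retention function returns θ_s, and at S_e = 1 the conductivity function returns K_s, which is a quick sanity check on any parameter set.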
More specifically, they related the covariance of soil moisture and pressure head to the variance and covariance of MvG parameters ($K_s$, $\theta_s$, $\alpha$, and $n$) using Equations (1) and (3) to derive the following closed-form expression describing the $\sigma_\theta(\langle\theta\rangle)$ relationship: $f$ is the log-transformed saturated hydraulic conductivity ($\ln K_s$) used here for mathematical convenience. $\rho$ is the vertical correlation length of the respective parameters. The coefficients $a_1$ to $a_3$ and $b_0$ to $b_4$ contain the mean of the parameters: For more details about the derivation, we refer to the supplementary information given by Qu et al. [59]. Both Qu et al. [59] and Toth et al. [92] assume $L$ to be 0.5 [93]. Please note that Qu et al. [59] assumed $\theta_r$ to be constant, but here we calculate the spatial average of $\theta_r$ for each grid cell ($\langle\theta_r\rangle$). To describe the full $\sigma_\theta(\langle\theta\rangle)$ function, we employed the pressure head vector $h = 10^{\{0.02:0.02:15\}}$. In order to transform $h$ into $\langle\theta\rangle$, we used the following equation: Note that the relationship between $h$ and $\langle\theta\rangle$ changes for every individual grid cell. Therefore, we provide a global map of $\langle\theta\rangle$ including a pixel-based look-up table for specific $\langle\theta\rangle$. A linear interpolation from high accuracy $\langle\theta\rangle$ with several digits after the decimal point to $\theta_{table} = \{0.01 : 0.01 : 0.6\}$ was performed. The final results are provided in the same dimensions as the original soil moisture satellite products, i.e., the results for SMAP and SMOS are generated on 406 × 964 and 584 × 1388 grids, respectively. As the ASCAT product is provided as time series data, the sub-grid soil moisture variability is made available in table form where the location can be identified by the ASCAT grid point index.
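The construction of the look-up table for one grid cell can be sketched as follows. The sketch assumes that mean soil moisture is obtained from the pressure head vector via the MvG retention curve evaluated with grid-cell mean parameters. The closed-form expression of Qu et al. [59] is not reproduced here, so a placeholder curve stands in for the predicted standard deviation. All names and parameter values are hypothetical.

```python
import numpy as np


def build_lookup(theta_curve, sigma_curve, theta_table=None):
    """Linearly interpolate a sigma(theta) curve onto a regular theta table."""
    if theta_table is None:
        # theta_table = {0.01 : 0.01 : 0.6}
        theta_table = np.round(np.arange(1, 61) * 0.01, 2)
    order = np.argsort(theta_curve)  # np.interp requires increasing x values
    x = np.asarray(theta_curve, dtype=float)[order]
    y = np.asarray(sigma_curve, dtype=float)[order]
    # Table entries outside the range covered by the curve are set to NaN.
    sigma_table = np.interp(theta_table, x, y, left=np.nan, right=np.nan)
    return theta_table, sigma_table


# Pressure head vector h = 10^{0.02:0.02:15} (750 values).
h = 10.0 ** (np.arange(1, 751) * 0.02)

# Hypothetical grid-cell mean MvG parameters (illustration only).
theta_r, theta_s, alpha, n = 0.041, 0.45, 0.02, 1.4
m = 1.0 - 1.0 / n
s_e = (1.0 + (alpha * h) ** n) ** (-m)
theta_curve = theta_r + (theta_s - theta_r) * s_e

# Placeholder for the predicted standard deviation; NOT the closed-form expression.
sigma_curve = 0.05 * np.sin(np.pi * s_e)

theta_table, sigma_table = build_lookup(theta_curve, sigma_curve)
```

With these example parameters, table entries below the residual water content or above the wettest point of the curve come out as NaN, which mirrors the valid soil moisture range discussed later for the published data set.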

SoilGrids
Publicly available soil profile data, such as that of the US National Cooperative Soil Survey Soil Characterization database (NCSS), of the Land Use/Cover Area frame Statistical Survey LUCAS [94], and of the Soil and Terrain Database (SOTER) [95], in addition to further national soil databases, provide the basis for the SoilGrids data set. Additional proxy information, e.g., the Shuttle Radar Topography Mission (SRTM) digital elevation model and Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite imagery, was used to derive highly resolved spatial patterns of soil properties. These covariates were converted to principal components in order to reduce noise and artefacts. The automated mapping procedure makes use of three general methods, i.e., multiple linear regression for predicting pH, texture (%), and bulk density (kg m⁻³), general linear models with a log-link function for predicting cation exchange capacity (cmol+ kg⁻¹) and organic carbon content (g kg⁻¹), and zero-inflated models for predicting coarse soil fragments (%) and depth to bedrock (cm). Each soil depth is modelled using a separate model that includes different combinations of covariates, resulting in soil properties at seven predefined depths (0, 5, 15, 30, 60, 100, and 200 cm). For the study at hand, we used the top soil layer (0 cm), which provides the largest contribution to a microwave signal recorded by SMAP, SMOS, and ASCAT. Additionally, we decided to use the SoilGrids data base at a 1 km resolution instead of the higher resolution data base at 250 m, because recent studies have shown a trend for satellite product disaggregation towards 1 km grids [65]. It should be noted that SoilGrids is stored in a World Geodetic System 84 (WGS84) regular grid with a 1 km resolution at the equator, i.e., the resolution at other latitudes is higher.

Toth Pedotransfer Function for MvG Model Parameterization
In order to provide MvG parameters from soil information of SoilGrids, we used the pedotransfer function by Toth et al. [92]. The advantage of this parametric pedotransfer function is that it is based on a continental scale soil data base, the European Hydropedological Data Inventory (EU-HYDI, a successor of the HYPRES data base [96]), rather than a national data base [97,98]. In addition, Toth et al. [92] provide a large variety of model approaches for different input parameters. However, other pedotransfer functions for the MvG model could be applied in a similar way.
In more detail, a linear regression based on pH, clay, silt, and cation exchange capacity, as well as information about top soil or subsoil vertical location, is established for predicting $K_s$ in their model 17, where $K_s$ has been log-transformed before ($\log_{10} K_s$). Model 21 of their supplement is used for the moisture retention curve parameters, where a regression tree predicts residual water content $\theta_r$ by sand fraction, and linear regressions predict $\theta_s$ from texture and bulk density, $\alpha$ from texture, bulk organic carbon content, and top soil or subsoil location, and $n$ from the same predictors. $\alpha$ and $n$ were log-transformed prior to prediction ($\log_{10} \alpha$ and $\log_{10} n$).
For each individual SoilGrids pixel, the pedotransfer function was applied, and therefore the initial retention parameter maps have the same spatial extent as the original SoilGrids data base. Thus, in the WGS84 projection, no data is available farther south than 62° S.

Satellite Soil Moisture Data Products
The aim of this study is to predict the sub-grid soil moisture variability of SMOS, SMAP, and ASCAT, the main systems providing global soil moisture records. To this end, the widely used higher level SMAP L3 Radiometer Global Daily 36 km product [99], the CATDS SMOS L3 Global Daily ~25 km product [100], and the ASCAT H110 12.5 km product [5] have been used. The original SMOS and SMAP data grids were resampled to the global, cylindrical Equal-Area Scalable Earth Grid, Version 2.0 (EASE 2.0). This implies that the maximum number of fine pixels from the SoilGrids data base changes from the equator to the poles. The number of pixels used to calculate the soil moisture mean and standard deviation will be provided with the final data set. Moreover, the H109 or H110 ASCAT product grid distributed by the EUMETSAT Satellite Application Facility on Support to Operational Hydrology and Water Management (H-SAF) with 12.5 km basic sampling has been used. For this dataset, the maximum number of fine resolution pixels per coarse grid cell also changes from the equator to the poles. The maximum number of fine pixels per coarse grid cell can be reduced by water pixels, which are excluded by the algorithm. The spatial link of fine grid pixels to coarse grid IDs has been identified within ArcGIS.
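Once every fine pixel carries the ID of its coarse grid cell, per-cell statistics and the number of valid pixels can be computed in a few lines. The following sketch assumes that water pixels are marked as NaN and uses the population standard deviation. The function name and the toy values are hypothetical, and the spatial link itself (done in ArcGIS in this study) is taken as given.

```python
import numpy as np


def aggregate_to_coarse(fine_values, coarse_ids):
    """Mean, standard deviation, and valid-pixel count of fine pixels per coarse cell.

    NaN entries (e.g., water pixels) are excluded. Coarse cells without any
    valid fine pixel do not appear in the output.
    """
    fine_values = np.asarray(fine_values, dtype=float).ravel()
    coarse_ids = np.asarray(coarse_ids).ravel()
    valid = ~np.isnan(fine_values)
    ids, inverse = np.unique(coarse_ids[valid], return_inverse=True)
    vals = fine_values[valid]
    count = np.bincount(inverse, minlength=ids.size)
    mean = np.bincount(inverse, weights=vals, minlength=ids.size) / count
    squared_dev = (vals - mean[inverse]) ** 2
    std = np.sqrt(np.bincount(inverse, weights=squared_dev, minlength=ids.size) / count)
    return ids, mean, std, count


# Toy example: six fine pixels in two coarse cells, one water pixel (NaN).
fine = [0.1, 0.3, np.nan, 0.2, 0.2, 0.2]
cell = [1, 1, 1, 2, 2, 2]
ids, mean, std, count = aggregate_to_coarse(fine, cell)
```

In the toy example, cell 1 keeps two valid pixels and cell 2 keeps three, which illustrates how water pixels reduce the pixel count reported with the data set.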

How to Use the Estimated Sub-Grid Soil Moisture Variability Data for Downscaling?
In addition to the general investigation of sub-grid heterogeneity for identifying adequate (homogeneous) validation sites and evaluating the representativeness of single soil moisture sensors for large areas covered by a satellite pixel, the estimated $\sigma_\theta(\langle\theta\rangle)$ function in the final data set can be used for soil moisture disaggregation using proxy information which describes the spatial patterns within each coarse scale satellite pixel. Proxy data can be higher resolution Earth observation data with a proven relationship to soil moisture variability such as surface temperature [101,102], vegetation [103], a combination of both [104], radar backscatter [84], or even soil texture [105]. The proxy data need to be normalized for each individual grid cell. By multiplication with the provided soil moisture standard deviation at the given mean soil moisture, a disaggregation can be performed:

$$\theta_{i,j} = \langle\theta\rangle + \sigma_\theta(\langle\theta\rangle)\,\frac{P_{i,j} - \langle P\rangle}{\sigma_P}$$

where $P_{i,j}$ is the proxy data at fine scale sub-grid y-location $i$ and x-location $j$, $\langle P\rangle$ is the mean of the proxy, and $\sigma_P$ is the standard deviation of the proxy. This means that we disaggregate by the standard score (also called z-score or normal score) of the proxy multiplied with $\sigma_\theta(\langle\theta\rangle)$. $\theta_{i,j}$ is the predicted soil moisture at this fine scale location. Note that the resolution of the fine scale proxy data should be similar to the soil data set used to estimate $\sigma_\theta(\langle\theta\rangle)$. In order to present the principle of the downscaling method, the SMAP soil moisture is downscaled by using a soil field capacity (FC) map as the proxy, which is taken as a representation of the variability of basic soil properties calculated by Equations (1) and (2) with $h = 10^{2.5}$ cm (pF 2.5).
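The disaggregation step described above, i.e., the standard score of the proxy scaled by the sub-grid standard deviation and centered on the coarse mean, can be sketched for a single coarse grid cell as follows. The function name and the numbers are hypothetical. Returning the coarse mean for a proxy without spatial variation is an assumption of this sketch, not part of the described method.

```python
import numpy as np


def downscale_cell(theta_mean, sigma_theta, proxy):
    """Disaggregate one coarse soil moisture value using a fine-scale proxy.

    theta_mean  : coarse-scale soil moisture observation (cm3 cm-3)
    sigma_theta : sub-grid standard deviation from the look-up table (cm3 cm-3)
    proxy       : 2D array of proxy values (e.g., field capacity), NaN = no data
    """
    proxy = np.asarray(proxy, dtype=float)
    p_mean = np.nanmean(proxy)
    p_std = np.nanstd(proxy)
    if not p_std > 0:
        # No proxy variability: fall back to the coarse mean (assumption).
        return np.where(np.isnan(proxy), np.nan, theta_mean)
    z_score = (proxy - p_mean) / p_std
    return theta_mean + sigma_theta * z_score


# Toy field capacity map of one coarse cell (values are illustrative only).
fc = np.array([[0.20, 0.30],
               [0.25, 0.35]])
theta_fine = downscale_cell(theta_mean=0.25, sigma_theta=0.04, proxy=fc)
```

By construction, the fine-scale values preserve the coarse mean and reproduce the prescribed standard deviation, so the proxy only determines the spatial pattern and not its magnitude.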
Validation of the downscaled products was performed at two soil moisture networks deployed in Germany (Terrestrial Environmental Observatories, TERENO; Rur catchment) and Spain (Soil Moisture Measurement Stations Network of the University of Salamanca, REMEDHUS). In both regions, several soil moisture validation studies were conducted before [61,65,66,70,73,74,82,106,107]. The Rur catchment comprises a heterogeneous landscape with a hilly, high precipitation region covered by forest and grassland in the south, and a flat, relatively drier, agriculture-dominated loess region in the north [82,108,109]. The general heterogeneity also includes the soil texture [110][111][112]. Within the TERENO observatory, several field-scale wireless soil moisture sensor networks and cosmic ray neutron probes were installed [113][114][115]. Here, we use time domain reflectometry (TDR) sensors at a 5 cm depth at four locations for validation, namely Gevenich, Merzenhausen, Ruraue, and Schöneseiffen. The REMEDHUS network includes 22 stations equipped with Hydra probes (Stevens® Water Monitoring System, Inc., Portland, OR, USA) that measure hourly soil moisture in the top 5 cm of the soil. The land use of the REMEDHUS region is mainly rainfed cereals, fallow, vineyard, and forest-pasture. In contrast to the TERENO observatory, the REMEDHUS area is relatively homogeneous in terms of topography. The variability in soil characteristics is somewhat lower compared to the TERENO area [107]. These different site characteristics make the two networks well suited to test the downscaling algorithm.
In order to evaluate the ability of the different products to capture spatial patterns of the reference dataset (TERENO and REMEDHUS), Pearson's correlation coefficient is computed between the original SMAP, as well as the two downscaled SMAP products, and the reference data, yielding one correlation value per day. This method, described by Kolassa et al. [116] and already implemented in the performance evaluation of multi-scale soil moisture data assimilation [117], suggests using at least ten data points to calculate a spatial correlation at a given time step. Therefore, the robust correlation analysis is performed for the REMEDHUS network only.

Specific Analysis Based on Selected Grid Points
We selected four individual pixels to discuss the results in detail, as seen in Figure 1. The selection of these pixels was driven by the intention to characterize two very different $\sigma_\theta(\langle\theta\rangle)$ characteristics, i.e., one relatively homogeneous and one relatively heterogeneous, for each of two regions. First, two coarse scale pixels were selected, located in Germany covering the North of  [118], the first is characterized by soils evolved from Pleistocene morainal plains overlying Jurassic and Triassic rocks (heterogeneous), whereas the latter soil developed from Devonian sediments as well as fluviatile sediments from the Rhine river system (homogeneous). Then, two coarse scale pixels located in the United States covering central Oregon (OR, 43.72° N, 121.18° W) and South Iowa (IA, 41.06° N, 93.92° W) were selected. The first covers an area where sediments alternate with volcanic material of different ages (heterogeneous), and the latter is located in the Southern Iowa drift plain covered by loess (homogeneous) [119]. In all regions, the geologic parent material dominates soil evolution towards textural homogeneity or heterogeneity of the coarse pixels.
The clear differentiation in degree of soil textural heterogeneity is visible in Figure 1 for both the US and the German regions. Moreover, the different satellite product grids show large similarities, but the ASCAT grid has a finer resolution, so local heterogeneities in soil texture have a stronger impact on the $\sigma_\theta(\langle\theta\rangle)$ characteristics. This is especially the case for the relatively homogeneous pixels of RLP and IA. For example, at $\langle\theta\rangle = 0.3$, $\sigma_\theta$ is larger than 0.06 for both NRW and OR and all satellite grids, whereas for RLP and IA, SMOS and SMAP show low $\sigma_\theta$ of less than 0.02 in contrast to ASCAT, with $\sigma_\theta(\langle\theta\rangle = 0.3) \approx 0.03$ denoting an intermediate soil moisture standard deviation. By using satellite soil moisture data, the time series of soil moisture standard deviation for each product can be calculated. Figure 2 visualizes the respective time series for the SMAP grids of NRW, RLP, OR, and IA for the year 2016. ASCAT and SMOS time series appear similar. The SMAP mean soil moisture records for the close-by grid cells NRW and RLP are relatively similar with a small offset during the summer period. However, due to the different spatial soil texture heterogeneity, the soil moisture standard deviation is completely different. The heterogeneous NRW grid cell has a $\sigma_\theta$ with strong temporal dynamics ranging between 0.02 and 0.07, whereas the RLP grid cell is characterized by a stable $\sigma_\theta$ over time of about 0.015. This corresponds to the SMAP $\sigma_\theta(\langle\theta\rangle)$ functions in Figure 1. For the US grid cells, the differentiation in homogeneous and heterogeneous cells is similar, with a more pronounced difference in the general soil moisture magnitude. That is, soil moisture conditions are generally 0.2 m³ m⁻³ wetter in IA than in OR. Here, the drier OR grid cell is characterized by higher soil moisture spatial variability throughout the year than IA. This corresponds to the SMAP $\sigma_\theta(\langle\theta\rangle)$ functions in Figure 1.
The very flat $\sigma_\theta(\langle\theta\rangle)$ curve for IA indicates that Iowa is a suitable target region for coarse scale soil moisture validation studies, which has already been reported and utilized in several studies [120][121][122][123].

Discussion of Global Heterogeneity Maps
For reasons of brevity, the discussion of the spatial results is performed for the SMAP data set in the following. SMOS and ASCAT show similar patterns of the $\sigma_\theta(\langle\theta\rangle)$ relationship, but with slightly different absolute values related to the spatial extent of individual pixels. Figure 3 presents the $\sigma_\theta(\langle\theta\rangle)$ relationship for specific values of $\langle\theta\rangle$ ($\langle\theta\rangle$ = 0.1, 0.2, 0.3, 0.4 cm³ cm⁻³). At $\sigma_\theta(\langle\theta\rangle = 0.1)$, generally low $\sigma_\theta$ values exist in accordance with the curves in Figure 1, but several regions can already be characterized by elevated $\sigma_\theta$. In particular, domains in the vicinity of large rivers such as the Nile and Amazon gained diversified texture by the power of large water masses. Similarly, in the Sahel, the climate is the main reason for elevated $\sigma_\theta$, because it is a climate region with strong seasonal changes influencing different forms of soil texture development. Volcanic activity and high topography in South East Asia, the Himalaya, and the Andes favor large diversity in soil texture and therefore also elevated $\sigma_\theta$. In Northern Europe and Central Canada, where Pleistocene morainal plains cover parental rock material, soil development exhibits small-scale texture differences increasing $\sigma_\theta$. These general patterns intensify with increasing $\langle\theta\rangle$. The overall patterns of sub-grid soil moisture standard deviation shown in Figure 3 compare well with the results of other methods to identify sub-grid variability, such as the Miller-Miller scaling approach of Montzka et al. [118].
In order to identify the impact of the proposed closed-form expression to describe $\sigma_\theta(\langle\theta\rangle)$ for satellite data, we calculated the annual mean soil moisture for the year 2016 for the SMAP product. The result is presented in Figure 4a. The overall soil moisture patterns follow the climatic division with very wet soils in the west-continental tropics, very dry conditions in the subtropics, and moderate soil moisture in temperate regions. Using this observed soil moisture as areal mean soil moisture ($\langle\theta\rangle_{2016}$), the provided look-up table directly outputs the soil moisture standard deviation ($\sigma_\theta(\langle\theta\rangle_{2016})$) for the given grid cell, shown in Figure 4b. In most regions of the world, soil texture drives soil moisture heterogeneity, but for some regions, a consideration of soil texture variability in soil moisture downscaling is of the utmost importance, visible in the particularly high $\sigma_\theta(\langle\theta\rangle_{2016})$. For SMAP, these regions are: The mentioned high textural variability in the Sahel results in moderate $\sigma_\theta(\langle\theta\rangle_{2016})$ only, because the annual soil moisture mean is relatively low here. Regions where considering soil texture in SMAP downscaling might be secondary are of course the deserts, but also steppe regions such as (Inner) Mongolia, Mexico and the US Great Plains, the Northern Brazilian Highlands including the Serras de Borborema and do Espinhaço, and the Argentinian Pampas, due to very low soil moisture and/or low textural variability. There, simple resampling of coarse soil moisture observations might be sufficient to obtain adequate higher spatial resolution soil moisture maps. Figure 4c shows the saturated soil moisture or porosity $\theta_s$ to indicate the wet end up to which the soil moisture input gives reasonable $\sigma_\theta$ estimates. Especially in the tropics, for individual points in time, $\langle\theta\rangle$ can be larger than $\theta_s$ due to the different basis data sets so that a reasonable prediction of $\sigma_\theta$ is not possible.
At the dry end, i.e., in the desert regions of the Sahara, the Arabian Peninsula, the Namib, and central Australia, the SoilGrids data base identified very high sand fractions, which is reasonable. However, as already mentioned, the Toth et al. [92] pedotransfer function uses a constant value of 0.041 for $\theta_r$ for almost all soil textures (sand fraction larger than 2%). Therefore, the SMAP mean soil moisture $\langle\theta\rangle_{2016}$ for desert soils can be smaller than $\theta_r$, precluding the prediction of sub-grid soil moisture standard deviation. This problem can also occur when using other pedotransfer functions providing $\theta_r$ such as the ones of Schaap et al. [124] and Vereecken et al. [125]. On the other hand, pedotransfer functions assuming $\theta_r = 0$ or $\theta_r \cong 0$ such as Weynants et al. [126] would solve this problem, but at the cost of larger uncertainties for moderate and wet soils. The pedotransfer function of Wösten et al. [96] uses $\theta_r = 0.01$, which can be seen as a compromise and would reduce the problem of unidentified sub-grid soil moisture standard deviation for very dry soil conditions.

Publication of the Sub-Grid Heterogeneity Product
The final data set is provided in NetCDF format at https://doi.org/10.1594/PANGAEA.878889 and contains the information given in Table 1. Latitude, longitude, number of valid pixels, and the mean soil moisture to be matched with the satellite soil moisture observation are given. θr and θs are also stored in the data set to identify the valid range of soil moisture for the respective pixel. Because the SMAP and SMOS soil moisture retrievals use soil texture information that differs from the one utilized in this study, observations may fall outside the valid range indicated here. We recommend removing these observations from the estimation of sub-grid soil moisture variability, because validity may not be given. For ASCAT, the grid point index and cell number are also provided. Here, θr, θs, mean soil moisture, and soil moisture standard deviation are given in cm³ cm⁻³, although original ASCAT data are provided as the degree of saturation ranging between 0 and 1. A multiplication with the porosity provided with ASCAT data based on NASA's Land Data Assimilation System [127] or the Harmonized World Soil Database [128] using the Saxton and Rawls [129] pedotransfer function could convert ASCAT data into volumetric units (cm³ cm⁻³). However, when estimating the soil moisture sub-grid variability for further use, we recommend multiplication with the porosity θs provided here for consistency reasons.
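The recommended unit conversion for ASCAT is a simple multiplication, sketched here under the assumption that the degree of saturation and the porosity θs from the data set have already been read into arrays; the function and variable names are hypothetical.

```python
import numpy as np

def saturation_to_volumetric(saturation, theta_s):
    """Convert ASCAT degree of saturation (0 to 1) to cm3 cm-3.

    theta_s is the porosity of the respective grid point; using the value
    stored in the sub-grid heterogeneity data set keeps the result
    consistent with the tabulated standard deviations.
    """
    saturation = np.asarray(saturation, dtype=float)
    # NaN entries pass this check and simply stay NaN in the result.
    if np.any((saturation < 0.0) | (saturation > 1.0)):
        raise ValueError("degree of saturation must lie between 0 and 1")
    return saturation * theta_s

# Illustrative values only.
theta_vol = saturation_to_volumetric([0.0, 0.5, 1.0], theta_s=0.4)
```

The converted values can then be matched against the mean soil moisture axis of the look-up table.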

Downscaling Results
In order to present and discuss the application of the predicted σθ(⟨θ⟩) for downscaling, we selected the region around the Upper Rhine valley in southwest Germany, France, and Switzerland. Figure 5 shows the field capacity (FC) map calculated from surface level soil texture taken from the SoilGrids data base. The low FC values within the Upper Rhine valley clearly differentiate it from the high FC values of organic-rich top soils of the Black Forest, the Vosges, and the German low mountain ranges. This FC map is now used as a proxy for soil moisture downscaling. Figure 6a shows the soil moisture mean values for 2016 at the original SMAP grid for the region around the Upper Rhine valley. At this very coarse resolution, no real soil moisture patterns can be identified. Figure 6b shows the downscaled soil moisture mean at a 1 km resolution. Neighboring coarse pixels with similar soil moisture show similar downscaled values, now at a high resolution with added sub-grid pattern information, which demonstrates the potential of downscaling based on the predicted σθ(⟨θ⟩) relationship. However, the different mean soil moisture values of the coarse grid are still clearly visible in the high resolution downscaled soil moisture map, which makes it inadequate for further utilization. Similar patterns also emerge when using other downscaling methods such as those of Das et al. [84], Merlin et al. [76], and Molero et al. [62] when the spatial discrepancy between coarse and fine resolution is high (e.g., 36 km to ≤ 1 km). Thus, further methods need to be applied in order to smooth the sharp edges at the grid borders.
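The per-cell scaling discussed above can be sketched as follows. This is a minimal illustration under a stated assumption, not the exact implementation of this study: each fine pixel is assumed to equal the coarse mean plus the FC normal score of that pixel multiplied by the predicted standard deviation. The function name and the sample values are hypothetical.

```python
import numpy as np

def downscale_cell(theta_mean, sigma_theta, fc_fine):
    """Distribute one coarse soil moisture value over its fine pixels.

    theta_mean:  coarse grid soil moisture (cm3 cm-3)
    sigma_theta: predicted sub-grid standard deviation for theta_mean
    fc_fine:     FC proxy values of the fine pixels inside the coarse cell;
                 NaN marks no-data pixels such as water bodies
    """
    fc = np.asarray(fc_fine, dtype=float)
    mu = np.nanmean(fc)
    sd = np.nanstd(fc)
    if not np.isfinite(sd) or sd == 0.0:
        # Homogeneous proxy: no sub-grid pattern can be added.
        z = np.zeros_like(fc)
        z[np.isnan(fc)] = np.nan
    else:
        # Normal scores of the proxy within the coarse cell.
        z = (fc - mu) / sd
    # Assumed form: the mean is preserved, the spread equals sigma_theta.
    return theta_mean + sigma_theta * z

# Illustrative 2 x 2 block of fine pixels with one water pixel.
fc_block = np.array([[0.2, 0.3], [0.4, np.nan]])
fine = downscale_cell(0.25, 0.04, fc_block)
```

By construction, the fine pixels of a cell average to the coarse value, which is why differing coarse means remain visible as blocks in the downscaled map.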
Here, based on the valid assumption that radiometer footprints follow a Gaussian contribution pattern and the SMAP Level 3 grid is an interpolated result of the raw data, we apply a simple interpolation to grade the soil moisture mean θ I and the soil moisture standard deviation σ I,θ between grid centers, i.e.: Figure 6c shows the downscaled soil moisture based on interpolation (θ I,i,j). Here, sharp grid borders are much less pronounced and the remaining edges originate from the proxy normal scores. Now, the soil property pattern is clearly visible in the soil moisture patterns, which is a large improvement compared to the original SMAP data. Larger water bodies such as Lake Constance and other pre-Alpine lakes have a clear impact on soil moisture observations from SMAP, as the affected pixels are generally wetter than surrounding pixels. Here, an adaptation of the water mask for the SMAP data processing may help reduce this impact. Table 2 lists the temporal validation metrics of the original SMAP product compared to the downscaled and downscaled/interpolated results for the sites within the TERENO and REMEDHUS networks. Except for Ruraue, the bias is relatively low for the TERENO stations, and for all sites, the comparison to the products evaluated here results in adequate unbiased root-mean-squared deviation (ubRMSD) and high correlation coefficients (R). There is a small but not always significant tendency toward improved ubRMSD and R for the downscaled products. For example, for the site Merzenhausen, ubRMSD decreases from 0.043 to 0.037 for the downscaled/interpolated result. Similarly, R increases for that site from 0.803 to 0.817. The results in REMEDHUS confirm the good performance of the downscaling procedure. The downscaled product improves or at least maintains the standards of the original SMAP product.
However, the differences between the original and the downscaled SMAP are in general less marked than in the TERENO network, probably due to the smaller variability of the texture characteristics in the REMEDHUS area, where the soils are in general very sandy [130] and thus less influenced by the texture-based nature of the downscaling algorithm. As found in similar validation experiments [107], the highest bias and errors occur at H9, because this station is located in a valley area that is occasionally flooded. Regarding the validation of spatial patterns, the annual mean correlation coefficient is very similar for the three products in the REMEDHUS region (SMAP original: 0.3204, SMAP downscaled: 0.3331, SMAP downscaled/interpolated: 0.3187), suggesting that their performance in capturing the spatial patterns is similar. However, the smaller spatial R in comparison with the temporal one should be noted, as also shown in other studies that used soil moisture data from passive satellites, even in the case of downscaled data products [61,131]. The temporal course of the spatial correlation of the three products for the REMEDHUS network provides insights into the downscaling results that go beyond the temporal correlation analysis, where no remarkable differences were found for the original SMAP, the downscaled, and the downscaled/interpolated series (Table 2). In the case of the spatial patterns (Figure 7), although the three series follow similar trends, the downscaled/interpolated product showed remarkably higher values on many dates (14 occasions of spatial correlation R > 0.5, compared to two for both SMAP original and SMAP downscaled). Examining the spatial patterns resulting from each method in Figure 6, it can easily be seen that the interpolated map removed the blocky structure of both the original and the downscaled maps, whilst ingesting the sub-grid spatial variability from the soil map.
Thus, even in a relatively homogeneous soil area like REMEDHUS, the instantaneous soil moisture pattern of the 19 stations is better captured in many cases. The spatial correlation is generally at a level of about 0.4 during spring, late autumn, and winter, and decreases during the summer months for all products. During these very dry conditions with soil moisture below 0.1 m³ m⁻³, the spatial variability of soil moisture is further decreased, thus reducing the correlations. Further implementation of alternative proxies obtained from weather radar networks may support the downscaling procedure. Another explanation is the sampling of SMAP data towards the fixed 36 km grid, whereas the real −3 dB radiometer footprint at a 40° incidence angle is an ellipse with a ~40 km mean radius. Both explanations indicate the potential of the recently published 9 km SMAP radiometer data as a basis for downscaling towards a 1 km resolution.
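The temporal validation metrics reported in Table 2 can be computed as in the following minimal sketch. It assumes the commonly used definitions of bias, ubRMSD, and Pearson R applied to collocated time series of a satellite product and an in situ reference; the exact implementation used for Table 2 is not given here, and the function name and sample values are hypothetical.

```python
import numpy as np

def validation_metrics(product, reference):
    """Return (bias, ubRMSD, R) for two collocated time series."""
    p = np.asarray(product, dtype=float)
    r = np.asarray(reference, dtype=float)
    # Use only time steps where both series have valid values.
    ok = np.isfinite(p) & np.isfinite(r)
    p, r = p[ok], r[ok]
    bias = np.mean(p - r)
    # Remove the mean of each series before computing the RMSD.
    ubrmsd = np.sqrt(np.mean(((p - p.mean()) - (r - r.mean())) ** 2))
    corr = np.corrcoef(p, r)[0, 1]
    return bias, ubrmsd, corr

# Illustrative series: the product equals the reference plus a constant offset.
reference = np.array([0.10, 0.20, 0.30, 0.40])
product = reference + 0.05
bias, ubrmsd, corr = validation_metrics(product, reference)
```

The example separates the two error components: a constant offset appears only in the bias, while ubRMSD and R are unaffected. The same correlation function applied across stations at a single date yields the spatial correlation discussed for Figure 7.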

Validity of the Approach
The first set of results of this study is the sub-grid soil moisture standard deviation maps for SMAP, SMOS, and ASCAT. As the soil moisture standard deviation prediction is fully based on texture information, the accuracy of this first result depends on the accuracy of the SoilGrids1km data set. The ~110,000 soil profiles used to fit global spatial prediction models per soil variable are not evenly distributed over the globe. Very dense sampling is available in the US, Mexico, and Europe, whereas very sparse information is available for Central Asia, Australia, and North Africa [89]. In dense sampling regions, the model fitting procedure achieves high performance, whereas in sparse sampling regions, the fitting might not cover the full soil heterogeneity. Hengl et al. [89] mentioned that their regression models account for ca. 20-50% of observed variability in the target variables. They assume that it is unlikely that any effort to map the distribution of soils at a resolution of 1 km could explain a much larger proportion of the total variation in soil properties, as much of this variation occurs over distances of less than 1 km [39]. This explanation is supported by a higher resolution study that generated SoilGrids250m by the same method and achieved a much higher performance [90]. As stated in the introduction, soil texture can be a dominant feature for soil moisture spatial heterogeneity, but it is not the only one. Further heterogeneity is introduced by the spatial variability of topography, vegetation, land use, and rainfall. Therefore, the presented data of soil moisture sub-grid variability only provide first insights into the actual heterogeneity of the region. In addition, it has to be noted that the continuous spatial maps of SoilGrids1km have been generated from soil sample point data that were spatially distributed using spatial predictors such as topography from SRTM and vegetation information from MODIS.
For example, in SoilGrids1km, the distribution of sand, silt, and clay fractions, i.e., the main contributions to the pedotransfer function, is mainly controlled by the predictors topography and lithology [89]. Therefore, the individual contributors cannot be disentangled, and the SoilGrids1km texture information implicitly contains topography and vegetation information as well.
The second set of results of this study shows the exemplified application of the sub-grid soil moisture standard deviation maps for SMAP downscaling towards a 1 km resolution. The validation of the downscaling procedure in Section 3.4 hints at the potential of using the predicted sub-grid soil moisture variability for soil moisture downscaling not only for the SMAP mission, but also for the SMOS and ASCAT missions. In addition to the issues mentioned earlier in this section, the downscaling proxy needs to show the spatial patterns of soil moisture at the sub-grid scale. FC used in this study for downscaling is a static proxy, i.e., the FC normal scores do not vary with time, whereas the high resolution spatial soil moisture patterns may vary with moisture conditions. Here, applying alternative temporally dynamic proxy data such as backscatter [84] or soil evaporative efficiency [76] could improve the results of this study.
As the presented downscaling approach uses soil moisture estimates with a coarse resolution, a water mask is already applied for SMAP, SMOS, and ASCAT. Therefore, water pixels in the SoilGrids1km data base, indicated with no data (see for example Lake Constance in Figure 6), are neglected and do not contribute to the spatial statistics during downscaling to avoid twofold application. This is in contrast to downscaling algorithms starting at the brightness temperature level, e.g., the SMAP-Sentinel-1 product, where the water mask based on MODIS 250 m data is implemented during the soil moisture retrieval of already downscaled brightness temperature.

Conclusions and Outlook
In this study, we adapted the method of Qu et al. [59] for predicting the sub-grid variability of soil moisture to the global scale soil moisture product grids of SMAP, SMOS, and ASCAT. The method uses a closed-form expression to describe how soil moisture variability depends on mean soil moisture using stochastic analysis of 1D unsaturated gravitational flow based on the Mualem-van Genuchten (MvG) model. By implementing the high resolution soil properties data provided by SoilGrids at 1 km [89], it is possible to predict the relationship between standard deviation and mean soil moisture for each coarse scale grid cell. The final result is a look-up table map that indicates the sub-grid soil moisture standard deviation of a SMAP, SMOS, or ASCAT pixel when providing the coarse grid soil moisture. It is made available at https://doi.org/10.1594/PANGAEA.878889.
Results were analyzed temporally for specific coarse pixels over Germany and the US, indicating feasible prediction of the curves of soil moisture standard deviation over mean, with strong differences between homogeneous and heterogeneous grid cells with respect to basic soil properties. The application to SMAP soil moisture time series for the year 2016 provided the alternating sub-grid soil moisture standard deviation for the respective pixels, which is in good agreement with the SMAP σθ(⟨θ⟩) functions. Referring to the US pixels, Iowa is generally wetter than Oregon, but the drier Oregon grid cell is characterized by higher soil moisture spatial variability throughout the year than Iowa. The latter shows a very low and flat σθ(⟨θ⟩) curve, which makes Iowa a perfect target region for coarse scale soil moisture validation. The spatial analysis was performed on a global scale. We discuss the spatial patterns of the standard deviation over mean relationship for specific mean soil moisture values, and additionally for the SMAP mean soil moisture of the year 2016. Due to generally low soil moisture, the sub-grid soil moisture variability is negligible in deserts and steppe regions. In contrast, medium wet regions in the vicinity of large rivers, with strong seasonal climatic variation, Pleistocene morainal plains, high volcanic activity, or high topographical dynamics are characterized by high sub-grid soil moisture variability.
The disaggregation of coarse resolution soil moisture products is an important task to enable their usage for regional applications. Several disaggregation methods need information on the scaling magnitude of a disaggregation proxy, i.e., on how large the deviation of individual fine pixels from the areal mean soil moisture is. This is typically done by time series analysis or specific models. Here, we provide the statistics to disaggregate soil moisture to finer resolutions, where the soil moisture patterns can be provided by surface temperature observations or radar backscatter. In order to illustrate that the predicted sub-grid heterogeneity can be used for soil moisture downscaling, we generated a higher resolution soil moisture data set for SMAP using soil texture as a downscaling proxy. More specifically, the static field capacity information from the SoilGrids data set was used to scale SMAP soil moisture down to a 1 km resolution. Validation results in the TERENO and REMEDHUS networks indicate a similar or slightly improved accuracy for downscaled and original SMAP soil moisture in the time domain, but with a much higher spatial resolution. Applying temporally dynamic proxy data such as microwave backscatter or soil evaporative efficiency could improve these encouraging results. In order to reduce the blocky structure in downscaled SMAP soil moisture data due to the large scale discrepancy between the 36 km product and the 1 km product, we interpolated the coarse soil moisture prior to downscaling. Here, the use of the 9 km SMAP product might circumvent the need for interpolation, and in addition provide a better spatial performance compared to reference measurements.