Hybrid Optimal Design of the Eco-Hydrological Wireless Sensor Network in the Middle Reach of the Heihe River Basin, China

The eco-hydrological wireless sensor network (EHWSN) in the middle reaches of the Heihe River Basin in China is designed to capture the spatial and temporal variability and to estimate the ground truth for validating the remote sensing productions. However, there is no available prior information about a target variable. To meet both requirements, a hybrid model-based sampling method without any spatial autocorrelation assumptions is developed to optimize the distribution of EHWSN nodes based on geostatistics. This hybrid model incorporates two sub-criteria: one for the variogram modeling to represent the variability, another for improving the spatial prediction to evaluate remote sensing productions. The reasonability of the optimized EHWSN is validated from representativeness, the variogram modeling and the spatial accuracy through using 15 types of simulation fields generated with the unconditional geostatistical stochastic simulation. The sampling design shows good representativeness; variograms estimated by samples have less than 3% mean error relative to true variograms. Then, fields at multiple scales are predicted. As the scale increases, estimated fields have higher similarities to simulation fields at block sizes exceeding 240 m. The validations prove that this hybrid sampling method is effective for both objectives when we do not know the characteristics of an optimized variables.

Second, an optimal design focuses on the precision of spatial statistical inference. Optimization strategies mainly include minimizing the maximum or average kriging variance using known variogram models [17,18] or evenly distributing the samples in the study region to indirectly reduce the kriging variance. The latter strategy uses the free model without assuming variogram parameters, employing methods, such as minimizing the mean of shortest distances (MMSD) [19], maximum entropy [20], fractal dimension [21] and mean squared distance to sides, vertices and boundaries [22]. These methods, which improve spatial predictions, are only applied in ideal fields that meet the second-order stationary assumption. However, land surface variables sometimes possess stratified characteristics, especially in larger research areas, and the second-order stationary assumption does not apply in these cases. A sampling method to address the spatial stratification based on the MSN (means of surfaces with non-homogeneity) theory was proposed by Wang et al. [23] and Hu and Wang [24]. It estimates a variogram for each stratum and requires a larger number of samples.
Of the above methods for spatial sampling, one type for estimating variogram leads to sample clustering, which provides poor field coverage for the spatial prediction. Though the other type of method can improve spatial prediction accuracy, it is inappropriate for variogram modeling, due to the limited availability of sampling for short lag classes. To eliminate the deficiencies in both methods, the third type of hybrid criteria is proposed. It is divided into two categories. The first category is the combined use of the WM and MMSD methods [25,26]. In this category, either WM or MMSD is performed first, followed by the other, each with a specified number of samples. This is a sequential optimization process that decreases the utilization rate of the samples. It is difficult to reasonably allocate the number of samples for each sub-criterion. The second category is the simultaneous minimization of kriging variance and improving estimation of variogram parameters by assuming a known variogram [27]. However, in practice, it is difficult to know the variogram of a target variable, and the rationality of the assumption parameters cannot be evaluated.
The ENWSN is required not only to capture the spatial variability of the observed variables, but also to infer true values at the pixel scale to validate remote sensing products. Therefore, a hybrid criterion needs to be established. Because we do not know the characteristic of the specific target variable, the criterion should be based on the free model. Furthermore, to use samples more efficiently, all samples should make a contribution to each sub-criterion of the hybrid criterion. Therefore, each sub-criterion should not be performed independently, but instead, both sub-criteria should be performed together. The existing hybrid methods, however, are not suitable. To achieve both goals of the EHWSN, we develop a combination criterion without any assumptions of the spatial autocorrelation structure of surface variables. It is expressed as an integrated objective function that makes the sample distribution as uniform as possible in both geography and feature (lag distance) space. This paper is structured as follows. Section 2 introduces the requirements of the EHWSN and the study area. Section 3 describes the optimization criterion and assessment methods. Section 4 shows the tests of the developed hybrid criterion, the final results of the spatial EHWSN distribution and validates the results with a series of evaluation indexes, and Section 5 explains the merits and remaining questions associated with this hybrid criterion.

Objective of EHWSN
The EHWSN in the middle reaches of HRB aims to integrate a variety of distributed ecological and hydrological observations to capture the spatial-temporal variations of key eco-hydrological variables, including soil moisture, soil temperature and land surface temperature, and to obtain the ground truth for the validation of remote sensing products over a heterogeneous land surface. To better utilize multi-source satellite/airborne remote sensing data sets in studies of eco-hydrological processes, the EHWSN employs multi-scale validation for remote sensing products using kriging technology via nested and densely-distributed WSN nodes. Optimal spatial sampling of EHWSN nodes should be performed to help achieve the above objectives.

Experimental Area
The EHWSN is installed in a 5.5 km × 5.5 km observation matrix region located in the middle reaches of HRB, which covers both the Yingke and Daman irrigation districts of Zhangye oasis, in northwest China ( Figure 1). The main crop type is seed corn, covering approximately 75% of the total area. Other plants, such as wheat, vegetables and fruits, are also represented. There is a dense canal network with five types of canals that forms the area's irrigation system. This irrigation management is the main source of land surface heterogeneity. To effectively establish a nested WSN, the observation matrix is divided into three sub-regions: (A) the intensive region, with the same area as a MODIS pixel; (B) a 4 × 4 MODIS pixels region and (C) the surrounding region.

Arrangement of the EHWSN Nodes
A total of 180 EHWSN nodes will be installed, including 50 WATERNET nodes, 50 SoilNET nodes and 80 BNUNET nodes. The WATERNET nodes primarily observe the soil moisture, soil temperature and soil salinity at soil layer depths of 4 cm and 10 cm. The SoilNET nodes measure soil moisture and soil temperate at soil layer depths of 4 cm, 10 cm, 20 cm and 40 cm. The BNUNET nodes also observe soil moisture at 4 cm and soil temperature at 4 cm, 10 cm and 20 cm.
There are some artificial and natural condition limitations on the spatial distribution of EHWSN nodes. In total, 17 automatic meteorological stations (AMSs) have been installed in the observation matrix to measure the evapotranspiration over the heterogeneous land surface by observing the boundary layer conditions or the flux exchange between the atmosphere and land surface. Additionally, 17 BNUNET nodes have been artificially fixed in Region C, so that there is one node per production group. These pre-specified EHWSN nodes, together with the AMSs, are considered initial points during the optimization process ( Figure 2). The remaining 163 nodes are optimized in Regions A and B. Of these, 56 nodes (50 SoilNET nodes and 6 WATERNET nodes) are designed to reveal the spatial variation at a scale of hundreds of meters in Region A. The remaining 63 BNUNET nodes and 44 WATERNET nodes in Region B are used to capture the spatial variation at an approximately kilometer scale. All of the EHWSN nodes are deployed on vegetation-covered land, because the instruments cannot be installed on other land surface types, such as roads, residential buildings, wind-defended forest or irrigation channels ( Figure 2).

Hybrid Criterion
According to the requirements of EHWSN, we need to both effectively estimate the variogram parameters and ensure the spatial prediction accuracy when making inferences. Both requirements can be attributed to the optimal design of the sampling network; namely, how to simultaneously achieve the variogram estimation and minimizing the spatial estimation variance to the satisfied accuracy with the specified number of EHWSN nodes without assuming any variograms.
Based on the above goals, we establish an integrated hybrid model to simultaneously satisfy the two sub-criteria. This model is described by the following equation: where hybrid Φ is a weighted sum of two sub-criterions with weighted coefficients w1 and w2 and S is the optimized point set. EP represents a method that is effective for estimating variogram parameters, and SP is a method that is good for spatial prediction. Due to different dimensions, both and must be normalized in order to be added together.

Sub-Criterion to Estimate the Variogram Parameters
In geostatistics, the theoretical variogram 2 ( , ) x h γ is a function that describes the degree of spatial dependence of a random spatial field V [28]. The estimator is the arithmetic mean of the squared differences between measurements Z at points x and x + h. The classical estimator of the variogram is defined as follows [29]: (2) where N(h) is the number of experimental pairs [Z(x), Z(x + h)] with distance ℎ and Z(x) is the value at location x.
We choose the WM criterion [14] for the EP method, which relies only on the distances between the points and does not depend on assumptions of the spatial autocorrelation structure of a variable. A predefined distribution of the number of coupled pairs in all lag classes should be optimized to improve the accuracy of variogram modeling. The desired distribution can be based on expert judgment, and Russo [30] suggests that a uniform distribution can reduce the estimation uncertainty. The objective function is a simple standard deviation between the expected number of point pairs and the realized value in the i-th lag class : where S is a set of sampling points and np is the number of lag classes. The function is normalized as a coefficient of variation (CV) through division by : where the value can approximated by the following: where Dmax is the maximum distance in the study area and LS the lag size. A rule of thumb is to multiply the number of lag classes by the lag size, which should equal approximately half of the maximum distance in the region. Np denotes the total number of paired points and is given by , where N denotes the number of samples.

Sub-Criterion to Minimize the Estimation Variance
Normalization is difficult for the existing SP method based on a free model, such as MMSD, because we have little understanding of the statistical characteristics of objective function (e.g., mean, maximum and minimum). Thus, we need to develop a normalized SP criterion that can be compared with the WM criterion.
Yfantis et al. [31] confirmed that an equilateral triangle-shaped sampling network reduces estimation variance relative to a square or hexagonal network, and the best results are achieved when the regular point pattern forms equal-area Delaunay triangles. However, the measure of regularity is not sensitive to the area variance. With this in mind, a method to minimize the difference between the actual Delaunay triangle side length and the expected length (the MDS criterion) is proposed. The MDS is defined the same as in the WM criterion, i.e., as a standard deviation with the following form: (6) where is the i-th side length of the Delaunay triangle network generated by the point set S, and the Delaunay triangle diagram is calculated by Fortune [32]. is the desired side length. The MDS criterion is chosen for the SP method, and is defined as follows: (7) where is normalized as CV though division by . Because Thiessen polygons generated by equilateral Delaunay triangles are equal-area in an infinite region, we can approximate the value as follows: (8) where A denotes the area of study region and N is the total number of samples.
is the approximate area of the Thiessen polygons.

Determination of Weight Coefficients
Both and are converted to CV, and the weight coefficients w in Equation (1) are calculated with the following equation: where wi and CVi are the weight coefficient and the coefficient of variation of the i-th indicator, respectively, and is the number of indicators. Equation (9) implies that . The two weight values in Equation (1) are not constant; their values change with CVi during the optimization process, but their sum is equal to 1.

Optimization Algorithm
Our goal is to develop an optimal sampling scheme with a fixed number of sampling points via minimization of the value. It is necessary to find an effective way to optimize the objective function. In this paper, the simulated spatial annealing optimization algorithm (SSA) is employed to optimize a global sampling scheme [19,25,27,[33][34][35]. SSA is a probabilistic method based on the Metropolis selection criterion [36], which can be written as follows: where T represents the annealing temperature, a positive control parameter that decreases with the optimization process, and i is the number of iterations. The parameter T is calculated by the follow equation: (11) where is a parameter determined by users as a value slightly less than 1. In this paper, the value is 0.95.

Unconditional Geostatistical Stochastic Simulation
To evaluate the ability of this proposed hybrid criterion to describe the spatial distribution characteristics of a regional variable, a two-dimensional stationary and isotropic field with values defined on the grid (x, y) with is generated by stochastic simulation. The grid values are simulated using a sequential Gaussian simulation, in which ordinary kriging is used to estimate the local conditional probability distribution (LCPD) [37,38]. The simulation procedure requires the generation of spatial correlation values corresponding to a specified variogram or correlogram. Several models can be used for variogram modeling. In this study, we select the exponential model to express the spatial variation: where c0 is the nugget, c is the sill and a is the range.
Assuming that a regional variable obeys the normal distribution with mean and standard deviation , the simulated value in each grid is given by the following equation [38]: where mean(x, y) and std (x, y) are the kriging-based estimated mean and standard deviation for the grid (x, y), respectively. xp is drawn from the standard normal distribution and computed by the following equation: where gauinv is the numerically approximated inverse of the standard normal distribution function [39] and p, which represents a probability distribution function value, is a random number in the range 0-1. The LCPD on the grid (x, y) is estimated by searching all nearby grids with known value in the dependence distance. If there are less than 10 nearby grids, a value for the grid (x, y) is randomly chosen ( , ) ( , )  Figure 3 shows one realization for each variogram.

Assessment Index
To validate the representativeness of samples, the MAE (mean absolute error) is defined to represent the degree of bias: (15) where is the prediction and is the true value.
After obtaining the experimental variogram, a curve is fitted for spatial interpolation. To compare the fitted curve with the true one, the RE (relative error) is defined as follows: (16) where F and F * represent the true and fitted variograms with different parameters p, respectively.
For the estimated block size, the prediction accuracy is usually assessed by the block kriging variance (BKV) as follows: (17) where BKV is the estimated variance for the block A, is the area-to-area covariance over Area A, is the Lagrange multiplier and D is a vector based on the estimated point-to-area covariance. To compare with the estimated fields from the different variograms and block sizes, the mean normalized BKVs (MBKVnorm) is calculated by the minimum-maximum normalization method: (18) where N is the number of estimation grids. and are the maximum and minimum BKVs, respectively. The is the block kriging variance in the i-th grid.
Kriging variance only represents the estimation uncertainty, hence the similarity between the estimated field and the true field, and is defined as follows: (19) where G and S are histograms of two images. One image is simulated by stochastic simulation, and another is estimated by kriging. The image value is divided into N bins. The variables and are average values in the i-th bin.

Performance Test of Hybrid Criterion
As shown in Figure 4, there is a large contradiction between the spatial distributions of the WM and MDS criteria. WM is favorable for modeling spatial variation, but causes excessive aggregation, which negatively affects spatial estimation. MDS insufficiently captures the spatial variation due to the lack of information at short lag classes. Additionally, points near the boundaries do not have enough neighbors, leading to biases at the corners in MDS. This problem can be solved though initializing a few points in the corners.
The results of the hybrid optimal criterion are more or less similar to designs that are optimized by either WM or MDS. Though the hybrid model does not perfectly inherit all advantages of the sub-criteria, their defects are remedied. Compared to WM, the spatial distribution of samples generated by the hybrid criterion is superior for spatial statistical inferences, and compared to MDS, the distribution of samples in lag space is more complete. Because the initial values of both sub-criteria are different, the spatial distribution of samples will tend toward the criterion with the smaller initial value if equal weights are assigned to each sub-criterion. In this test, the final results will tend toward the WM criterion. Hence, variable weight coefficients are reasonable during the entire optimization process.

EHWSN Optimization15
We have proposed a hybrid model-based optimization method with two sub-criteria for parameter estimation and spatial statistical inference. This method is applied to the EHWSN sampling design in the middle reaches of HRB. The relevant parameters are as follows: in Equation (5) is equal to 200 and in Equation (8) is equal to 130 m in Region A and 380 m in Region B. The size of each field block is approximately 40 m × 40 m. To avoid having more than one node in a field block (only the variability between fields is considered), a pair of points separated by less than 40 m is forbidden. The lag separation distance should coincide with the field spacing; thus, the lag size is set equal to 40 m. Lag classes less than the dependence distance should be given priority. However, it is difficult to select a suitable distance. Therefore, we attempt to obtain the dependence distance using the TVDI (Temperature-Vegetation Dryness Index) derived from a Thematic Mapper (TM) image. This image substitutes for soil moisture [40], because there is no prior high-resolution information of the soil moisture distribution. The TVDI can approximately represent the spatial distribution of soil moisture. Ultimately, the dependence distance is set equal to 680 m ( Figure 5). The final result is shown in Figure 6.

Validation
The optimization results of EHWSN need to be verified using simulated fields from several perspectives, including the representativeness, the accuracies of parameter estimation and spatial prediction.

Representativeness
The inferred values of target variables from spatial predictions at unsampled locations are based on the hypothesis that samples are representative. If data are sampled in an unrepresentative manner, the biased data cannot represent the overall properties in the area of interest, and the spatial predictions based on geostatistical techniques might be worse than those based on simple methods, such as inverse distance interpolation, surface interpolation and splines. To verify the representativeness, we compare the means and standard deviations (SD) between the samples and the populations from 50 stochastic realizations of each of 15 variogram parameter combinations. MAE values for mean and SD closing to zero indicate spatially representative samples. As shown in Figure 7, the scatterplot represents the degree of biases for 15 types of simulated fields. The final optimization results in the middle reaches of HRB exhibits good representativeness, and the maximum MAE is approximately 0.12. The representativeness of samples gradually increases with increasing the nugget and decreasing range. Intuitively, samples from fields with smaller nuggets and larger ranges should be more representative, but the results in Figure 7 are the opposite. This is because there are intensive observation nodes in Region A. Table 1 lists the MAEs of samples in Regions A and B. Large differences are observed between both regions. Regardless of the variability, samples in Region B with relatively uniform spatial distributions have low MAE values for different simulated fields. The MAE index is sensitive to the variability in regional variables when the spatial distribution of samples is excessively concentrated. Although cluster observations may lead to a greater degree of deviation from the population, the local cluster points is necessary for estimating variogram parameters.

Figure 7. Scatter diagrams of the means (circles) and standard deviations (black dots).
The Y-and X-axes represent the samples and the population, respectively. Table 1. Biases of the means and standard deviations between the samples and the population.

Parameter Estimation
To capture the spatial variability, a specified number of paired points for 16 lag classes ( Figure 6) are optimized to estimate the variogram. According to previous experience, at least 50 paired points are required [41]. Pairs of points in lag classes of 60 m, 100 m, 140 m, 180 m and 220 m are generated in the intensive Region A.
The dispersion and mean of the experimental variogram at different scales is described in Figure 8. With variations in the nugget and range, differences are generated in the dispersion at different scales. Small scales (60 m, 100 m and 140 m) especially exhibit large changes in dispersion. When the range increases, the dispersion of small scales decreases, because of the increased structure property, while at larger scales (>140 m), the dispersion increases slightly. At all estimated scales, the dispersion changes due to nugget variations are the opposite of the changes due to variations in range. The trend of the average line of the experimental variogram is close to the theoretical function, and it excellently expresses the spatial variability. When the structural characteristic is dominant (the proportion of the partial sill to sill is larger than that of the nugget to sill), the biases between theoretical and experimental variograms is low at small scales and high at large scales. However, with the growth of randomness, the biases at estimated scales are opposite.
To test the accuracy of estimated variogram parameters, the RE between the true variogram and the fitted curve is calculated ( Table 2). For a dependence distance of 100 m, the estimated parameters are unavailable, because only one experimental variogram value in the distance of less than 100 m is used to fit the curve. With that exception, the relative error increases with increases in range when there is no non-spatial variance (Nugget = 0). When the major component of variance is non-spatial (Nugget = 0.6), the relative error of simulated fields decreases with increasing range. The maximum relative error is less than 6%, and the mean relative error is 3%.

Spatial Prediction
After estimating the variogram, the interpolation accuracy also needs to be evaluated. Generally, the interpolation accuracy is evaluated by the kriging variance, which expresses the estimated uncertainty. We apply block kriging to obtain the estimated fields with eight grid sizes ranging from 40 m to 320 m. If the variogram parameters and estimated block sizes have been determined, the kriging variance is only related to the spatial distribution of samples and is independent of the sample values. As shown in Figure 9a, for each type of estimation field, the mean normalized kriging variances (MBKVnorm) in Equation (18) become small with increasing estimation scale. This is the spatial variance within a larger block that is cancelled out, leaving less uncertainty. For each estimation scale, the MBKVnorm becomes large with decreasing dependence distance or increasing nugget value, which leads to the growth of uncertainty.
Kriging variance, however, cannot express the biases between true and estimated values. Therefore, we use another index to evaluate the estimation accuracy. The similarity between the simulated field and the estimated field with same block sizes is calculated by Equation (19). The similarity changes with estimation scales and parameters of variograms is opposite of the MBKVnorm changes. As Figure 9b shows, the estimated field with the lowest randomness and maximum estimation scale (320 m) shows a maximum similarity of about 70% to the true field. In contrast, the estimated field with the strongest randomness and minimum estimation scale (40 m) shows a minimum similarity of about 32%.

Conclusions and Discussions
A hybrid sampling method is proposed to optimize samples for both spatial estimation and spatial interpolation when there is lack of prior information on the target variable. This hybrid method is used to optimally design the EHWSN in the HRB, and its effectiveness has been verified in terms of representativeness, parameter estimation and prediction accuracy, using various simulation fields.
The samples collected by the hybrid sampling method show the excellent representativeness. The relatively even spatial sampling in Region B enhances the representativeness of samples for different types of simulation fields. Though the nested samples in Region A introduce a slight sampling bias, it can improve the estimation accuracy of variogram parameters at small scales.
For variogram modeling, the stronger a regional variable shows spatial randomness, the more paired points are needed to capture the variability at small scales. When the structural features of a regional variable are obvious, more paired points are needed at larger scales. In this research, reliable prior information about the target variable is unavailable; therefore, using the equal treatment of paired points in each lag to estimate the variogram parameters is reasonable.
One of our objectives is to estimate ground truth at different remote sensing pixel scales. Both accurate parameter estimation and high sample representativeness are helpful for achieving this goal. The sampling design for estimation at block sizes exceeding 240 m has higher similarities. The differences of fluctuations between similarity curve lines with different block sizes indicates the influences of variogram parameters variation on the estimation accuracy. With the growth of the estimation scale, the amplitudes of fluctuations become small, which means that variogram parameters have less impact on the estimation accuracy. This information is meaningful, and if we estimate a quite large block, it is not necessary to be overly concerned with the estimation of variogram parameters. Instead, the sampling design may make more contributions to enhance the representativeness of samples.
For meeting multi-scale estimation requirements, the nested structure is designed. However, the cluster points may lead to some problems, such as decreasing the representativeness of the samples, enhancing the bias in the estimated variability at small scales and bringing a negative effect on the spatial prediction. Ideally, the multi-cluster sampling, that points with a uniform spatial distribution are combined with multi-cluster points evenly distributed across the study area, can effectively eliminate these problems. However, such a sampling design needs to encompass a large number of samples. Due to budget limitations, only one point cluster is produced in our experiments. The validations prove that the nested sampling design is effective for both variogram modeling and spatial prediction based on limited samples.
In addition, unbiased sampling is important in the optimal design. In this work, the hybrid criterion considers both parameter estimation and spatial statistical inference. However, there is no quantitative expression to represent in the objective function. Therefore, future research should investigate how to quantify representation during the optimization process.