Abstract

The soil monitoring network plays an important role in detecting the spatial distribution of soil attributes and facilitates sustainable land-use decision making. Reduced costs, higher speed, greater scope, and a loss of accuracy are necessary to design a regional monitoring network effectively. In this paper, we present a stochastic optimization design method for regional soil carbon and water content monitoring networks with a minimum sample size based on a modified particle swarm optimization algorithm equipped with multiobjective optimization technique. Our effort is to reconcile the conflicts between various objectives, that is, kriging variance, survey budget, spatial accessibility, spatial interval, and the amount of monitoring sites. We applied the method to optimize the soil monitoring networks in a semiarid loess hilly area located in northwest China. The results reveal that the proposed method is both effective and robust and outperforms the standard binary particle swarm optimization and spatial simulated annealing algorithm.

1. Introduction

Soil provides essential support to ecosystems, for example, soil organic matter sink, water storage, and food security. It is important for soil protection and sustainable development to monitor the changes in soil quality and functions. So far, several soil monitoring networks (SMNs) have been designed in many countries from national/country to regional scale [14], proving to be efficient to monitor the measurable indicators of the threats to the soil; especially the ones at regional scale would be more promising in future due to its feature to finely describe the spatial pattern of soil variables [5].

A soil monitoring network is defined as “a set of sites/areas where changes in soil characteristics are documented through periodic assessment of an extended set of soil parameters” [1]. The design of an efficient and effective SMN must achieve a combination of different objectives and constraints related to the indicators of soil quality and quantity [6, 7], for example, the maximum accuracy, the minimum cost and effective sampling size, and so forth [810]. Examples also include the objectives of minimizing the estimation error and spatial interpolation error of the sampling design when a fixed number of samples are specified [11]. In the case of monitoring soil dynamics, the design must balance the spatial and temporal distributions of sampling points to simulate the spatiotemporal changes of soil attributes accurately [7]. In terms of multivariate soil sampling, the design needs to characterize the spatial pattern of various soil attributes simultaneously [12]. Moreover, some constraints, for example, poor infrastructures, buildings, and topography, are also taken into account in the optimization of networks [13].

In practice, these objectives/criteria usually generate conflicts in feature, geographic space, or time series [7, 11, 14, 15], which makes the design of sampling network more complex. Extensive methods capable of integrating multiple optimization objectives were employed to design the SMNs. These methods can be classified into three categories, that is, judgmental, design-based, and model-based sampling methods. The judgmental methods determine the sampling pattern and the sample size qualitatively, with the application of the statistical knowledge and prior information of soil variability, for example, the actual land-use condition, digital elevation model, and sampling barriers. This type of methods features direct and easy implementation but cannot measure the bias of the sample design [16]. Under this circumstance, two types of quantitative sampling methods, the design-based ones based on the classical sampling techniques and the model-based ones followed primarily in spatial coverage technique and geostatistics, have been widely applied [1719]. The former type is suitable for the estimation of global mean and variance and the descriptive analysis of large-scale surveys, whereas the latter one is more appropriate for soil mapping and the prediction at unsampled sites [20, 21]. The model-based methods, in particular, perform well on designing the representative sampling networks or optimizing an existing network by considering the monitoring objectives and constraints [7, 22]. Examples include the combinations of minimum kriging variance [11], minimum mean of the shortest distances [23], maximum entropy [24], and Warrick-Myers criterion [25] with the model-based sampling technique. As a heavy computational burden imposed by lots of monitoring objectives and constraints increases, stochastic search algorithms are employed to accelerate the search procedure of the optimal soil monitoring/sampling networks, for example, sequential searching approach, spatial simulated annealing algorithm (SSA), and genetic algorithm [2629].

The purpose of this paper is to propose a novel soil monitoring network design method on the basis of a modified particle swarm optimization algorithm (MPSO) and multiple objectives optimization technique. The method is capable of balancing the conflict between the minimum average kriging error (MKE) and the minimum survey budget (MSB) and subject to the constraints including spatial accessibility, sampling interval, and amount limitation of monitoring sites. We took a semiarid loess hilly area located in Hengshan County of China as an example and applied the method to redesign the soil monitoring network to validate its efficiency.

2. Materials and Methods

2.1. Study Area and Data

The study area, Hengshan County, is located in the transition region between loess and aeolian sandy areas in northern Shanxi Province (Figure 1). This area encompasses approximately 4,282 km2, with a typical semiarid continental monsoon climate. The annual precipitation has an average value of 390 mm, with most occurring in the summer. Loess hill is the main geomorphological type in the study area, accounting for 70% of the area. The main soil types, for example, loessial and aeolian sandy soils, are moderately to highly erodible, and only a small percentage, for example, bog soil and paddy soil, is suitable for cultivation.

Land covers change rapidly in the semiarid loess hilly areas. Some changes are caused by natural factors but most result from human-induced processes such as agricultural practices, urban growth, and long-term inappropriate land use. The rapid changes have caused serious land degradation and water loss; as a result, approximately 40 million tons of nutrients are washed away every year [3032]. This terrible situation threatens local food security and economic development and requires the soil monitoring network to assist the governors in making scientific land-use decisions. However, the varied topography, loose soil, and inhomogeneous distribution of soil variables hamper the sampling design and make the data difficult and expensive to collect, so it is necessary to consider multiple competing criteria/objectives and a set of constraints to guarantee the accuracy of the field survey.

251 observations were set up randomly to survey the area in 2004 (Figure 1). Eleven soil variables, including vegetation fractional coverage, soil bulk density, soil organic matter, soil water content, total phosphorus, total nitrogen, total potassium, pH value, NDVI index, soil types, and vegetation types, were measured at each sampling site. Two of these variables, soil water content (SWC) and soil organic matter (SOM), which are the major influencing factors on soil quality [33], were selected as the monitoring variables. The monitoring networks of these two variables were optimized separately by using the proposed method. Prior information, for example, the actual land-use map, digital elevation map, and watershed map at the scale of 1 : 50,000, was also included in the study. The actual land-use map and watershed map were interpreted from a Landsat TM image in 2004 and used to restrict the spatial arrangement of the monitoring sites. The digital elevation map was derived from the earth science data interface at the University of Maryland and employed to estimate the survey budget at each monitoring site.

2.2. Modeling Spatial Distributions of Soil Variables

Assume that the area of interest is discretized into a lattice grid of cells, denotes one cell in the grid, and represents a sampling scenario of and comprises a set of which are selected as the monitoring sites. Let represent the observation value of one soil variable at the location ; then the spatial variability of soil attributes can be expressed by the empirical covariance and the semivariogram if the random process is second-order stationary: where , , , denotes the total amount of monitoring sites in , and denotes a priori variance of the measured data.

In order to describe the spatial distributions of the soil variables more exhaustively, the empirical covariance and semivariogram need to be fitted by some theoretical models, for example, the exponential function, the Gaussian function, or the linear function. We estimated the semivariograms of two soil variables by using ArcGIS 10.2. The autocovariance of soil water content (SWC) under the ordinary kriging model was characterized by a Gaussian function with a 0.4331 nugget, a sill of 0.0669, and a range parameter of 2162.73 m. The autocovariance of soil organic matter (SOM) under the ordinary kriging model featured an exponential function with a 0.0293 nugget, a sill of 0.2788, and a range parameter of 2683.95 m (Table 1). Both variograms can be used to estimate the kriging prediction error of the monitoring networks and simulate the spatial pattern of two variables in combination with a sequential Gaussian simulation method [34].

2.3. Problem Formulation

The design of the SMNs aims to optimize the spatial pattern of the monitoring sites to balance the sampling accuracy and the survey cost with a minimum sample size [35]. Thus, minimum average kriging prediction error () and minimum survey budget () serve as two objectives to be optimized, and spatial accessibility, sampling interval, and amount of the monitoring sites are considered as the constraints. Two objectives are integrated by using a dynamic weighting aggregation method (DWA) as where is the weights’ change frequency and and are two weights that can be changed gradually, such that

2.3.1. Optimization Criteria

(1) Minimum Average Kriging Prediction Error. MKE minimizes the average kriging prediction error between the predicted value and the measured value, with respect to the set of monitoring sites over the area . The objective is capable of searching for the best linear unbiased solution when the regional variable is spatially dependent (4). With a known covariance function , the calculation of MKE only relies on the spatial pattern of monitoring sites. This strategy can create a new optimal monitoring network, delete/add sites to an existing network, or change the locations of the sites in an optimal way [36]. Consider where is a newly added monitoring site, is an existing monitoring site, is the a priori variance of , are kriging weighting coefficients, and is the Lagrange multiplier.

(2) Minimum Survey Budget. The goal of MSB is to set up as many sites as possible with a limited budget or to cost as little as possible with a fixed site amount. In practice, the survey budget contains the basic part, which is to organize the survey and employ surveyors, and the unit part spent on carrying out the field sampling. The unit cost of the sample will change if the sampling location makes a difference. The cost in the plains is lower than that in the hilly area or mountains. The survey budget is defined as where denotes the basic part of the survey budget, and are the unit sampling cost and sampling amount in a plains area (slope ranges from 0 to 6), and are the unit sampling cost and sampling amount in a hilly area (slope ranges from 6 to 15), and and are the unit cost and amount in a mountain area (slope is more than 15), respectively.

2.3.2. Constraints

Spatial accessibility of the measurement sites will be required in all kinds of soil surveys. Three categories of sampling barriers will hamper the design of the monitoring networks, including denial of access to plots by landowners, accommodation of replacement of plots to deal with the damage caused by the measurement process, and inaccessibility caused by the poor infrastructure or buildings in the target area. In the loess hilly areas, the fragmental loess landscape and its fragile ecosystem hamper the implementation of the soil monitoring, as well as buildings, watershed area, and steep slope over 60°. We define the spatial accessibility constraint with respect to the set as shown in (6), where equals 1 if the site is located within the accessible areas and 0 otherwise. Consider

The sampling interval constraint represents the spatial dependence between monitoring sites. Too great a distance will make the network unable to describe the spatial feature of the soil variable, while too small a distance will cause a higher monitoring outlay. According to Kerry and Oliver [37], a moderate distance between two plots varies from 1/10 to half of the range of the soil variables. Let denote the distance between any two sites in the set and represent the range of the soil variable; then satisfies

The amount constraint determines the total amounts of monitoring sites in the interval and the amounts of site in each type of slope areas, which can be expressed as where and with respect to SOM and SWC are set to 200 and 500 according to survey budget, respectively.

2.4. Optimizing the Monitoring Network

The particle swarm optimization algorithm (PSO) proposed by Kennedy and Eberhart was employed to search for the optimal monitoring network that satisfies the weighted objective [38]. Due to the simple structure and extendible framework of the PSO, it has been applied in many fields, for example, site selection, land-use allocation, parameter selection, and so forth. We modified the standard PSO algorithm by introducing a binary projection operator to keep the value of each dimension in a particle in 1 or 0. The improved algorithm is more convenient to perform the optimization in a continuous variable to expand the searching space and to project the continuous variable into a binary one to generate a monitoring network. Figure 2 illustrates the iterative optimization process of the improved PSO.

(1) Initializing a Particle. Assume that denotes a swarm with particles, a particle represents a sampling network scenario expressed as , , and all the selected units in particle denote the spatial pattern of a monitoring network (Figure 3). Then, the MPSO initializes the particles with the existing sampling pattern comprising 251 points.

(2) Calculate the Local and the Global Best Particles. Let be the velocity and the best visited location of the particle in time and represent the best visited location of the whole swarm. The weighted objective function serves as the fitness function of the algorithm to evaluate all the particles and find out and with the minimum fitness function values at each iteration.

(3) Update the Locations of the Particles. Let denote the binary location of the particle in time and and represent the continuous location and velocity of the particle in time , respectively. Accordingly, is the binary projection of . The algorithm updates the velocity, the continuous and binary locations of the particle in time by changing the value of the th dimension of as where denote a random number and and are social and individual acceleration coefficient, respectively. The sigmoid function is used to transform the continuous value of the th dimension into 0 or 1. After the transformation, the unit is selected as a monitoring site if the value of the th dimension of equals 1.

A swarm including 30 particles was employed to optimize the monitoring networks of SOM and SWC. The maximum iteration was set to 600. The acceleration coefficients and were set to 2.02 and 2, respectively [39]. The inertia weight was originally set to 0.9 and then gradually decreased to 0.1 as (10). All the simulations were conducted using a set of functions written in standard C++ language and the Geospatial Data Abstraction Library (GDAL/OGR). Consider where equal 0.9 and 0.1, respectively, and is the current iteration in the simulation.

3. Results and Discussions

3.1. Optimal Monitoring Networks

The optimal monitoring networks of soil organic matter (SOM) and soil water content (SWC) were obtained by the MPSO algorithm with the weighted and two single optimization objectives separately. For the MSB, the amounts of monitoring sites were fixed to 300. We repeated the MPSO 100 times and selected the solutions with the best fitness as the optimal networks (Figure 4). It is observed that monitoring sites of SOM and SWC with respect to the weighted criterion are evenly distributed over the study area, whereas the results of the single objectives have some empty spaces at the places where the arrangement of the monitoring sites is available, for example, the regions marked with red polygons in Figure 4. Certainly, the results of the weighted criterion for both SOM and SWC are not particularly even, which might be partially attributed to the limitation of the sampling barriers in the areas, for example, buildings, watershed, and steep slope.

In order to compare different monitoring networks, we have simulated multiple realizations of the spatial field of two soil variables. The average normalized fitness values, the average amount of monitoring sites, the average convergence rate, and the average running time were employed to evaluate the monitoring networks corresponding to three optimization criteria. Table 2 presents the mean values of all above merits during 100 simulations. The results illustrate that the weighted criterion outperforms in reconciliation of the conflicts but increases the amount of the monitoring sites and computational burden of the MPSO. To depict the spatial variation of SOM and SWC exhaustively, the weighted criterion needs to set up 341 and 313 points over the whole region, respectively, which are more than the number of points that the MKE and the MSB do. Consequently, the running time and convergence iteration of the MPSO combined with the weighted criterion increase significantly.

We further measured the performance of the networks by estimating the simulation error (SE) and total survey cost. The simulation error is the mean squared error incurred when predicting the true values by the measured values. Figure 5 shows that the weighted criterion gets smaller SE than other two single strategies do. The SE values of SOM decline by 1.93% and 0.85% if the weighted criterion is employed instead of MKE and MSB, while those of SWC decrease by 0.15% and 0.23%, respectively.

The proportion between three unit sampling costs was fixed to = 3 : 5 : 8 according to the prior surveys, and then sampling networks were evaluated with the objective of survey cost. As expected, the MSB performed better than the other two objectives, and the objective of kriging error obtained a moderate fitness values of MSB compared with MSB and the weighted one (Table 2). Figure 6 illustrates the quantitative proportion of the monitoring sites in different slope areas corresponding to the three criteria. It is observed that the weighted objective set up the monitoring sites in three types of slope areas equally for both SOM and SWC, whereas the MSB placed most of sampling points in the plain area, which reduced the total survey budget as shown in Table 2. The results imply the high performance of the weighted objective in obtaining the representative sampling designs and reducing the survey cost.

3.2. Performance of the MPSO

In this section, we compared the performance of the MPSO, the standard binary PSO (BPSO), and the spatial simulated annealing algorithm (SSA). The original and terminal temperature of the SSA was set to 2.0 and 0.1, respectively, and the disturbance coefficient was fixed to 0.9.

Table 3 illustrates the slight differences between the fitness values of the MPSO and the SSA for both SOM and SWC and the MPSO converged with slightly better fitness values. However, the MPSO only takes 782.19 s and 636.85 s to optimize the sampling networks for SOM and SWC, the running time of which comprises 39.34% and 36.1% of that of the SSA. As expected, the MPSO improves the searching ability compared with the standard particle swarm optimization algorithm. The comparisons of the modified and the standard BPSO show that the modification of the PSO increases the weighted fitness of the optimal networks by 1.89% and 0.69% for SOM and SWC at the expense of the convergent rate and running time.

Figure 7 explains the reason for this improvement of the MPSO. The convergence curves of the MPSO are smoother than those of the SSA, implying that the MPSO changes the locations of the particles only if the optimal particle in the swarm gets better. This mechanism accelerates the searching rate of the MPSO at the expense of the diversity of the swarm. The SSA, however, accepts a worse solution during the simulations in order to increase the possibility of avoiding the local minima and causes its curves to fluctuate remarkably. Meanwhile, the movements of the modified particles are continuous due to the continuous projection operator, which enlarge the search space of the optimal monitoring networks and have higher probability to obtain the optimal solution compared with the binary ones.

Certainly, several issues need to be noted when the proposed method is used to optimize a soil monitoring network. For instance, the variogram model of the soil variable must be known and able to describe the spatial variation precisely [17]. In addition, the MPSO must be repeated several times because its random and adaptive searching mechanism makes the optimal solution nonunique in different simulations [23].

4. Conclusions

Monitoring networks for soil organic matter (SOM) and soil water content (SWC) in the loess hilly area in China were designed by using a multiobjective particle swarm optimization model. The objective function was a combination of two dynamic weighted criteria, that is, minimum mean kriging prediction error (MKE) and minimum survey budget (MSB), and the constraints comprised sampling interval, spatial accessibility, and an amount of monitoring sites. A modified particle swarm optimization algorithm was employed to search for the optimal monitoring networks for the two soil variables.

The results show that the weighted criterion can obtain the networks with better weighted fitness values, lower simulation error, and more accurate values of the statistical merits compared with the single objectives, which implies that multiple objectives might be more efficient in practice [40]. For both SOM and SWC, the weighted criterion can obtain the representative monitoring networks and reduce the survey budget simultaneously. Compared with the standard BPSO and the SSA algorithm, the modified PSO with the weighted criterion can avoid local minima and obtain the best fitness values due to the combination of the binary projection operator and the continuous PSO, although more iterations are necessary. The results reveal the proposed method is both effective and robust.

Further work could focus on making efforts to improve its computational efficiency with the application of parallel models and taking into account uncertainty factors during the soil sampling process.

Conflict of Interests

The authors declare no conflict of interests.

Acknowledgments

The authors would like to thank the anonymous reviewers for their helpful suggestions and comments. This research was supported by the “863” National High Technology Program in China (ID: 2011AA120304), the China Postdoctoral Science Foundation (ID: 2012M511253), and the National Natural Science Foundation Project (ID: 41371429). They also appreciate land use laboratory of China Land Surveying and Planning Institute for its support of data collection.