A review of geostatistical simulation models applied to satellite remote sensing: Methods and applications Remote Sensing of Environment

Despite an ever-increasing number of spaceborne, airborne, and ground-based data acquisition platforms, remote sensing data are still often spatially incomplete or temporally irregular. While deterministic interpolation techniques are often used, they tend to create unrealistic spatial patterns and generally do not provide uncer- tainty quantification. Geostatistical simulation models are effective in generating an ensemble of realistic and equally probable realizations of an unmeasured phenomenon, allowing data uncertainty to be propagated. These models are commonly used in several fields of earth science, and in recent years, they have been applied widely to remotely sensed data. This study provides the first review of the applications of geostatistical simulation to remote sensing data. We review recent geostatistical simulation models relevant to satellite remote sensing data and discuss the characteristics and advantages of each approach. Finally, the applications of each geostatistical simulation model are categorized in different domains of natural sciences, including soil, vegetation, topography, and atmospheric science.


Introduction
The role of remotely sensed data has become ubiquitous in many fields of science in recent decades. Many computational methods have been developed to process, extract, and interpolate information based on satellite data, and among them, the role of geostatistical simulation has become increasingly prevalent. Geostatistics addresses the interpolation of missing or incomplete data and, as such, is typically applied before classification in a typical workflow. The uncertainties at an individual (gap) location in remotely sensed data are influenced by many factors, including the location, the model, the parameters selected for the model, and the spatial support, i.e., the area over which the variable is measured. There is another type of uncertainty that considers a number of locations jointly (multiple-pixel uncertainty) (Foody and Atkinson, 2003). Most of these uncertainties have been considered by many studies. One source of uncertainty that has not been considered very often in the remote sensing literature is uncertainty about spatial support or spatial uncertainty. However, in other fields where geostatistics is often applied, researchers face mostly missing data (some point measurements with many unknown in between), in remote sensing, the case is rather the opposite. However, as satellite images are affected by acquisition limitations, such as clouds and shadows, recently, attention to this kind of uncertainty has increased. For instance, geostatistical simulation methods have been used in different applications based on remotely sensed data, including downscaling of spatial data (Boucher andKyriakidis, 2006, 2007;Zhang et al., 2017), gap-filling (Mariethoz et al., 2012;Yin et al., 2017b;Yin et al., 2017a), mapping (Berterretche et al., 2005;Wang et al., 2002), increasing the accuracy of remotely sensed land cover maps (Wang et al., 2018;Zhang et al., 2016a), and designing the sampling of in situ data (Anderson et al., 2006;Chu et al., 2009;Lin et al., 2009).
While many review articles have been devoted to other computational methods-in particular, those related to classification, such as support vector machines (Mountrakis et al., 2011), random forest (Belgiu and Drȃguţ, 2016), fusion methods (Ghassemian, 2016), and deep learning (Ma et al., 2019)-to date, a review study on geostatistical simulation is lacking in the context of remote sensing data. Van der Meer (2012) provided a general review on the use of geostatistics in remote sensing, but with a focus on estimation rather than geostatistical simulation methods, which have become prevalent in recent years. Moreover, no review article includes the recent progress made in geostatistical simulation methods for remote sensing. In this review, we fill this gap by providing an overview of geostatistical simulation methods and their applications to satellite remotely sensed data.
The most widely known geostatistical approaches in remote sensing are deterministic interpolation techniques known as estimation methods. These estimation approaches can be illustrated with an example using a synthetic data infilling problem. Suppose there is a Sentinel-2 image (see Fig. 1 (a)) that contains some gaps caused by the presence of clouds (see Fig. 1 (b)). One must predict (interpolate) the values of the pixels within the gaps. Solutions to this problem include filling the gaps using estimation methods such as inverse distance weighting (IDW) (Zimmerman et al., 1999) or ordinary kriging (OK) (Goovaerts, 1997). Both approaches assume that the value of an unknown pixel is the weighted average of known pixels within the neighborhood (Lu and Wong, 2008). While IDW assigns weights based on the squared distance between data, OK determines these weights based on the spatial behavior of the data through a variogram model (in this case, an omnidirectional exponential model; see Fig. 1 (e)). OK is a geostatistical method that provides the variance of predicted values, whereas IDW is not a geostatistical method because it does not rely on any inference. However, both approaches provide a single estimation. In contrast, one could imagine several possible sets of values that would fit the gaps equally as well in these images; i.e., the boundaries between fields are not uniquely determined based on the available data.
Another feature of estimation is that it typically results in oversmoothed values. Visually, the estimated values in the gaps based on both IDW and OK approaches are excessively smooth compared to the textural information in the rest of the image (see Fig. 1 (c)-(d)). This phenomenon mainly occurs because such estimation methods do not consider the complex properties of spatial patterns, such as connectivity of high and low values. In real-world problems, smoothed predictions can often not be used to reach a conclusion or a decision. While IDW presents the smallest root-mean-squared error (RMSE), it is clear that the RMSE is not an appropriate measure of image texture, which in both cases is poorly reproduced.
In contrast to estimation methods, geostatistical simulation reproduces the spatial variability of the observed variable. This method generates realizations (multiple realistic scenarios), which are conditional, meaning that the realizations fit the available data. Available data can be categorized into two general groups: hard and soft data. Hard data are measurements with a low degree of uncertainty, which mostly come from direct local measurements. In contrast, soft data are measurements with a high degree of uncertainty, which mostly come from other dependent variables. The use of multiple realizations allows the production of a probability distribution for each unknown value, which can be either parametric (e.g., represented by a mean and variance) or nonparametric (Lantuéjoul, 2013). Notably, the realizations must have textural properties similar to the true image, allowing them to be used in remote sensing workflows where texture matters. For instance, one may be interested in determining the connectivity of the fields in Fig. 1 to determine the pathways for the migration of insects (McRae et al., 2008). In such an application, the RMSE (representing the point-to-point estimation error) is less important than the topological structure of the generated landscape elements. Geostatistical simulation addresses this issue by focusing on reproducing the textural characteristics.
In this review, we focus on the geostatistical simulation algorithms that are commonly used in satellite remote sensing applications. Other geostatistical simulation algorithms are only touched upon. This paper is organized as follows: after the Introduction section, geostatistical simulation methods are discussed in Section 2. Section 3 examines applications of geostatistical simulation methods in remote sensing for different fields of the natural sciences. Finally, after a discussion in Section 4, Section 5 provides a summary and recommendations.

General concepts
Geostatistical simulation methods are useful tools to generate several equally probable realizations of a spatial phenomenon. These multiple realizations can be used to quantify the uncertainty of the generated patterns.
Before deciding which geostatistical simulation approach should be used for a given problem, one important factor to be determined is the nature of the variable. Three main types of variables are as follows (Emery and Silva, 2009;Gustafson, 1998;Vann et al., 2002): i. Continuous variables that usually represent physical properties such as reflectance in a given waveband, biomass, or percent tree cover. ii. Categorical variables that are defined as a fixed number of states or categories, such as soil type or land cover. iii. Objects with different shapes, locations, and orientations, such as buildings, trees, or water bodies.
Important factors to consider in the selection of geostatistical simulation approaches include the relevant level of spatial complexity at the studies scale. Depending on the spatial complexity and the amount of data available, geostatistical simulation methods can be categorized into three principal groups: object-based simulation, two-point stochastic simulation, and multiple-point stochastic simulation, which are summarized below and in Fig. 2.
Object-based simulation methods consider distinct stochastic objects that are defined based on a specific statistical distribution (Deutsch and Wang, 1996;Pyrcz et al., 2009). This category of methods has been used in domains such as geological models, but it is not commonly used in remote sensing applications. Considering that defining objects' geometry is challenging in object-based simulations, these models' applications in analyzing satellite data are restricted. One of the main reasons is that considering objects' geometry is cumbersome in geostatistical simulation applications based on satellite images. Accordingly, this category is not further discussed in this review.
Two-point stochastic simulation methods are based on descriptions of the spatial structure using a variogram (or covariance function) of a set of pixels or points. Such descriptions are based on the relationship between pairs of pixels in a dataset (hence termed two-point simulation). The values produced by such models are generally Gaussian random fields (GRFs), and data that are not normally distributed must undergo a Gaussian transformation to be used in this framework. Commonly used methods in the remote sensing field can be roughly divided into continuous and categorical GRFs depending on the type of variable considered. While two-point stochastic simulation methods do not suffer from the smoothing effects of kriging (see Fig. 4), they are often limited in the complexity of the patterns they can generate.
Multiple-point-based stochastic simulation (MPS) methods generate realizations that incorporate information coming from a training image (TI) . TIs can be a set of images or spatiotemporal properties that are observed or generated using physical or statistical methods. Contrary to the two-point stochastic simulation framework, MPS approaches can generate complex structures based on the spatial dependencies observed in the TI. The downside to these methods, however, is that finding a suitable TI is not always straightforward. Indeed, the selection of an appropriate TI can strongly impact the results: the TI should be large enough to statistically represent the diversity of patterns and shapes found in the area of interest; otherwise, it may fail to reproduce the random field model statistics (Emery and Lantuéjoul, 2014). In remote sensing applications, TIs are often obtainable due to the profusion of data available globally. For a given area of interest, two strategies can be envisioned to obtain a representative TI: 1) one can use geographical regionalized TIs, i.e., imagery acquired on another region with similar geographical properties (Gravey et al., 2019;Oriani et al., 2020b;Zhang et al., 2016a), or 2) it is possible to use temporal regionalization whereby the TI is taken from the same area but on a different date (Yin et al., 2017b;Yin et al., 2017a). Importantly, most physical processes and satellite images are nonstationary, meaning that their statistical properties are spatially Fig. 1. Illustrative representation of estimation methods for filling artificial gaps in a Sentinel-2 (near infrared (NIR) band) image from a farmland area in Croatia (45 • 38 ′ N, 18 • 15 ′ E). Note the artifacts of kriging that are due to the use of a large but still limited neighborhood of 1000 pixels.
variable, such as transitioning between different vegetation and land cover types. Various approaches have been developed to simulate nonstationary variables with MPS, notably by using TIs that are themselves nonstationary in combination with auxiliary variables that describe the trends in features, dividing the target and the training data into a number of stationary zones, using geometrical operations such as rotation on the simulated data events, or considering invariant distances used in some simulation algorithms . One point about TIs that should also be checked is the consistency between the statistics of TIs and the data such as hard, soft, and auxiliary. There might be some nonobvious inconsistencies in the case of complex spatial dependence. Such validation can be carried out by validating TIs against point data, e.g., by comparing the higher-order moments or cumulants extracted from the data and the TI, validation of the stationary statistics by generating some realizations, and comparing statistical similarities between the data and model .
MPS methods can also be categorized into continuous and categorical approaches, and they can be further divided into pixel-based and pattern-based approaches depending on whether pixels are simulated one by one or simultaneously in groups. In general, pixel-based geostatistical simulation methods can handle conditioning data without problems. This is not the case with pattern-based methods that can produce undesirable artifacts when working with large amounts of conditioning data. Moreover, it has been shown that increasing the amount of conditioning data improves the statistical properties of the generated simulations using two-point models, while it may increase the computation time (Emery and Pelaez, 2011;Madani and Abulkhair, 2020). These various simulation approaches are illustrated in Fig. 2 and described in detail in the next section.
Another important concept when considering spatially distributed variables is the scale, also known as support in geostatistics. Scale defines the space on which measurements are defined and characteristics such as size, geometry, and orientation (Atkinson and Tate, 2000). A common example of support in remote sensing is the pixel size. Changes in the scale of measurements affect the information obtained. As an example of the scaling effect on satellite measurements, where the sensed quantity is aggregated over finite areal units, the NIR band of the sentinel-2 image in Fig. 1 is upscaled from 10 m to 15 m, using the nearest neighbor method. The change in support also induces a change in the spatial properties of the variable (Atkinson and Tate, 2000), as shown in Fig. 3, where the variograms of both original and upscaled images are represented. It is particularly notable that larger support sizes (i.e., lower resolutions) tend to reduce the variance of the variable considered, which is a direct consequence of the smoothing effect of upscaling. Similarly, downscaling should result in an increased variance, which poses the question of how to generate an added variance in a realistic way (as opposed to adding noise, which would increase the variance without adding information). Similarly, when using conditional simulation, one must be aware that for a given variable, pointbased measurements and pixel-based measurements have a different distribution, so the conditioning should not be exact due to infra-pixel variability.
While MPS cannot mathematically model scale effects, they can incorporate datasets having different scales as multivariate TIs. However, multivariate TIs are larger and, therefore, more computationally intensive to use (Ge et al., 2019). Scaling satellite images is further discussed in Sections 3.2.3, 3.3.2, 3.4.2, 3.6.1, and 3.6.2.

Continuous variable GRFs
The SGS method generates realizations of a Gaussian function defined by its mean and covariance matrix. In this process, kriging and kriging variance are used sequentially to estimate the mean and variance of the Gaussian distribution at each pixel of the target area. A summary of the SGS algorithm is given as follows: 1) Infer the variogram based on known portions of the studied variable. 2) Define a random path through all unmeasured pixels of the simulation grid. 3) For each pixel along the path: a. Use kriging to determine the conditional mean and variance based on the known and previously simulated data; b. Draw a random value from a Gaussian conditional cumulative distribution function (CCDF) whose mean and variance have been determined in the previous step, and add the drawn value to the simulation grid. 4) Generate another realization based on a different random path.
The distribution of the conditioning data should be Gaussian to use the SGS method. Otherwise, a transformation such as normal-score transforms or histogram anamorphosis through Hermite polynomials is needed to transform the data into a Gaussian distribution. If this is the case, a back-transformation of the obtained results is also required (Chiles and Delfiner, 1999). Fig. 4 shows the application of the SGS method to fill the gaps in the same image as used previously with estimation approaches. While the simulated patterns exhibit stochasticity and variability across multiple realizations, the generated patterns do not present the expected sharp boundaries between fields. Here, there is not a single RMSE value but one RMSE per realization, which allows a probability distribution of the RMSE to be drawn.

Categorical variable GRFs
The structure of the SIS algorithm is similar to that of the SGS algorithm in the sense that kriging is used sequentially for each pixel, but in this case, indicator kriging and indicator variograms are used (Deutsch, 2006;Emery, 2004a). This method has been used in remote sensing applications to simulate categorical variables that do not have an order relationship. The main steps of the SIS method are as follows: 1) For a categorical variable with k classes, create k indicator variables, i.e., binary variables taking a value of 1 where the corresponding category occurs, and 0 otherwise. 2) Infer the indicator variogram for each indicator variable.
3) Define a random path through all unmeasured pixels of the simulation grid. 4) For each pixel along the path: a. Use indicator kriging to determine the CCDF of the k categories. b. Draw a random value from the CCDF, and add the drawn value to the simulation grid. 5) Generate another realization based on a different random path.
TGS is another approach to obtain categorical GRFs. This method uses a threshold to truncate a continuous multi-Gaussian random function, such as that obtained by the SGS method (Galli et al., 1994). n thresholds result in n + 1 categories. Although few studies have used TGS in remote sensing applications, it is still presented in this study as an effective simulation method that could be used in remote sensing applications in the future.

TI-based simulation methods
As visible from the comparison of Figs. 4 and 1 (a), GRF methods do not reproduce some of the features and heterogeneous patterns present in the original image. The MPS framework aims to address the simulation of such complex patterns by replacing the variogram as a description of the spatial structure by using a nonparametric model based on a TI (Gómez-Hernández and Wen, 1995). As a result, this method considers entire patterns rather than statistics between pairs of pixels. In our example, the informed parts of the image contain information on the patterns present, which can be used to generate similar features inside the gaps. The resultant simulation obtained by an MPS method using the same dataset is illustrated in Fig. 5, with features that resemble those of the original image. These methods can generate a set of simulations using one or an ensemble of TIs.
TI-based methods were initially developed for categorical data and then further extended to address both categorical and continuous variables Tang et al., 2013). These methods are typically based on the sequential simulation principle, where pixels are visited in a given sequential order, called the simulation path. The most common type of simulation path is the random path. However, other paths have been used, such as the unilateral path (i.e., simulation rowby-row) and the multigrid path (Nussbaumer et al., 2018). The details of categorical and continuous TI-based algorithms are discussed in the following sections.

Markov random fields (MRFs)
MRFs are often grouped with MPS because they also allow the use of TIs to generate new patterns. MRFs use the Markov property in the spatial domain, meaning that a simulated value depends on only its local neighborhood. In MRF models, conditional realizations are generated based on a template of neighboring values around each pixel. The random function is defined through an exponential-type parametric model involving high-order interactions. The exponential-type parametric model is known up to a normalization constant, and the simulation is carried out iteratively using the Markov chain Monte Carlo (MCMC) approach Tjelmeland and Besag, 1998). The MCMC approach is a methodology for simulating a distribution that is difficult to obtain due to some practical complexities  (Daly, 2005;Mariethoz and Caers, 2014). Because MRFs are computationally impractical for large or high-dimensional simulations (3D or 4D), Markov mesh models (MMMs) have been introduced, which are computationally lighter than MRFs for geoscience applications Stien and Kolbjørnsen, 2011). As MMMs have not been used in remote sensing applications to the best of our knowledge, they are not covered in this review.
Markov chain random fields (MCRFs), which are a specific type of MRF, generate values conditionally to sample data by providing a theoretically sound framework for conditional simulation . The difference between MCRFs and MRFs is that MCRFs are built on a "directional chain", assuming that the conditional probability of a random function at any pixel depends on the local neighborhood in different directions. In practice, the nearest known neighbors in several cardinal directions are considered (Daly, 2005).
Recently, as an extension of the MCRF model, the Markov chain random field cosimulation method (co-MCRF) has been used in many studies to improve land use/land cover (LULC) classification accuracy. The role of cosimulation is to incorporate a preclassified image as a data layer (Li et al., 2015;Zhang et al., 2019).
The co-MCRF algorithm has 4 steps, as follows: 1) Define a circular search radius, and split the pixels into two sets of known and unknown positions. 2) Define a simulation path through all unmeasured pixels of the simulation grid. 3) For each unknown pixel: a. Search for the nearest known pixels in each of the four quadrants within the defined circle, and estimate the local cumulative conditional probability distribution function (CCPDF) of the unknown pixel based on the colocated datum of a covariate field (i.e., a preclassified image). b. Draw a random value from the obtained CCPDF, and add the drawn value to the known set. 4) Generate another realization based on a different simulation path.

Single Normal equation simulation (SNESIM)
SNESIM is an MPS simulation algorithm that is suited for categorical data. The principle of this method is first to compile a database of patterns (also called data events), which is thereafter used as a "model" to build a realization. To build this database, SNESIM first scans the TI based on a given shape called the template. For each data event (a given configuration of pixel values), a histogram of values for the central pixel is stored, which can then be queried for the generation of new patterns. In SNESIM, the database has a tree structure, which can be very expensive in terms of memory usage. Thus, Straubhaar et al. (2011) proposed replacing the tree with a list structure that requires less memory. The main steps of SNESIM are as follows (Strebelle, 2002): 1) Build a search tree from the TI and template. 2) Define a simulation path through all unmeasured pixels of the simulation grid. 3) For each pixel along the path: a. Retain the data event associated with the pixel, and retrieve the conditional probability distribution function (CPDF) in the search tree related to the data event. b. Draw a random value from the CPDF, and add the drawn value to the simulation grid. 4) Generate another realization based on a different random path.

Direct sampling (DS)
DS is a pixel-based simulation method that is suited for both categorical and continuous variables. As its name implies, it samples the TI directly instead of storing the available data events in a database (Mariethoz et al., 2010). The DS simulation steps are as follows: 1) Define a random path through all unmeasured pixels of the simulation grid. 2) For each pixel along the path: a. Select a pixel to be simulated and extract the corresponding data event. b. Search the TI for the data event. For each visited TI pixel, compute a similarity metric between the searched data event and the visited pixel in the TI.
3) Stop the search at the first matching data event in the TI for which the distance is under a predefined threshold, and paste the corresponding TI value at the center of the data event in the simulation. 4) Generate another realization based on a different random path. To find the first matching data event in step 3, a similarity metric is needed. Assume that Z(x i ), Z(y i ) and n are the data event found in the simulation, the one found in the TI, and the number of neighbors, respectively; the similarity metrics that have been used in the remote sensing applications are summarized in Table 1.
Quantile sampling (QS) has been recently introduced as an equivalent but more computationally efficient implementation of DS . The results obtained with DS and QS are equivalent.
A number of features have been added to DS, which has been exploited for incorporating different types of data, notably by modifying the similarity metric used to compare data events. This method can be used to impose the local proportion of a given category in the realizations , combine continuous and categorical data, incorporate data that are available at different scales (Oriani et al., 2020a), or quantify the misfit related to block data when scanning the TI .

Filter-based simulation (FILTERSIM)
FILTERSIM is a pattern-based simulation, meaning that it does not generate a realization one pixel at a time, like the pixel-based approaches described above, but works by patterns that are groups of pixels pasted together. This method is suited for both categorical and continuous variables, and its principle is to apply different filters to reduce the pattern complexity and computational cost (Zhang et al., 2006). Wu et al. (2008) proposed using filter score comparison to improve FILTERSIM in regard to the simulation process. The FILTERSIM simulation steps are as follows: 1) Compute filter scores based on the patterns in the TI, cluster patterns based on their filter score similarity, and compute an average pattern named the prototype pattern for each of the clusters. 2) Define a random path through all unmeasured pixels of the simulation grid. 3) For each pixel along the path: a. Extract the data event using a predefined template. b. Find the most similar prototype to the data event, and select a random pattern from the corresponding cluster. c. Paste the selected pattern to the realization being simulated.

4) Generate another realization based on a different random path.
There are other pattern-based MPS methods such as simulation of pattern (SIMPAT) (Arpat and Caers, 2007) and cross-correlation-based simulation (CCSIM) (Tahmasebi et al., 2012), as well as pixel-based methods, cumulants (Dimitrakopoulos et al., 2010), or a combination of pixel-based and pattern-based methods such as hybrid pixel-and pattern-based simulation (HYPPS) (Tahmasebi, 2017), which have introduced and obtained acceptable results in other applications. However, to the best of our knowledge, these methods have not been used in remote sensing applications and represent a potential for future applications to remotely sensed data. Table 2 summarizes the characteristics of the methods discussed in Section 2.

Applications
In recent decades, remote sensing technologies have developed rapidly as a tool for acquiring geospatial and atmospheric data with applications ranging from geosciences to economics. Depending on the application, different types of information have to be extracted from the imagery (Ge and Bai, 2011). In many cases, spectral information alone is not sufficient, and a combination of spectral and spatial information can be required.
Geostatistical simulation methods have been applied recently to remotely sensed data for different purposes, such as downscaling, sampling design, uncertainty quantification, and mapping (Chica-Olmo and Abarca-Hernandez, 2000; Fiorentino et al., 2011;Ge and Bai, 2011;Ge et al., 2008b;Liao, 2015;Zhang et al., 2014). As object-based simulation algorithms have not been commonly used in remote sensing applications, we focus on two-point-based stochastic simulation and MPS methods. The applications of these models in different geoscience domains are discussed, such as soil science, vegetation mapping, topography, climate, and atmospheric science.

Soil
Remotely sensed data have provided numerous spatial data for soil studies. As soil properties vary continuously over space, geostatistics has been used by soil scientists for digital soil mapping . These methods also include the use of geostatistical simulation algorithms based on remote sensing for mapping soil erosion factors, mapping the distribution of different soil elements, mapping contamination and analyzing human health risks. These studies show that geostatistical simulation can be used for mapping while incorporating auxiliary information and for sampling design. However, Malone et al. (2016) showed that MPS has the potential for quantitative extrapolation of soil information based on the use of ground-based gamma-ray detector data. Most studies in soil science have used twopoint-based stochastic simulation methods, and the use of MPS methods in digital soil mapping based on satellite data is an avenue to explore in future research.

Mapping
One of the first published applications of geostatistics simulation in digital soil mapping was the interpolation of a soil erosion factor by Wang et al. (2002). Soil erosion is a global environmental issue that requires monitoring. Remote sensing data provide cost-effective information that can be used to determine the rate of erosion. In addition, quantifying uncertainty is essential for decision-making, which is enabled by geostatistical simulation algorithms. The authors use the cover management variable, also called the C factor, which quantifies Table 1 A summary of similarity metrics between the TI and the simulation.   the rate of soil erosion due to agricultural practices and environmental factors. The C factor is mapped based on sequential Gaussian cosimulation (SGCS) with Landsat TM images. The simulation steps of SGCS are similar to those of SGS, except that the mean and variance of the Gaussian CCDF are determined by cokriging using auxiliary variables. The spatial uncertainty is determined by estimating the variances in the obtained realizations. The authors compared three traditional methods and three geostatistical methods. The nongeostatistical approaches compared were 1) assigning an average of the C factor values for each vegetation type, 2) linear regression, and 3) log-linear regression methods. Three geostatistical methods are as follows: 1) Colocated cokriging with a TM band ratio image as a covariate. A TM band ratio image is an image that is obtained from a combination of Landsat TM bands (e.g., normalized difference vegetation index (NDVI)). 2) SGCS without the TM band ratio image. 3) SGCS with the TM band ratio image. The obtained results demonstrate that the SGCS with a TM ratio resulted in the best results. The simulation without TM images resulted in much worse prediction even when compared to nongeostatistical methods. Moreover, as the SGCS method is conditioned on the previously simulated values and the image datum, it ensures that each simulated value is coherent with its neighbors. The mapping of a continuous soil variable based on different data sources was further explored by Grunwald et al. (2006). They applied geostatistical simulation to map soil components as a continuous variable. They aimed to estimate the spatial and temporal variability in soil nitrate-nitrogen using SIS to incorporate in situ soil measurements in the simulations as well as auxiliary information such as climate, landscape, and LULC obtained from Landsat ETM, topographic, geologic, and geographic data. The study area was across the Santa Fe River watershed in north-central Florida, USA. The proposed approach was three-fold, consisting of i) using logistic regression to drive prior probabilities of soil nitrate-nitrogen across the study area based on secondary information, ii) updating these probabilities using indicator kriging, and iii) generating a set of realizations of the spatial distribution of soil NO 3 -N using SIS (see Fig. 6). The results highlighted that incorporating the secondary information in the analysis and utilizing indicator kriging  instead of OK led to reduced prediction errors. Mapping the probability of the presence of contaminants in soils is important for preserving human health and designing remediation strategies. For example, Zhao et al. (2012) enabled such probabilistic mapping by using SIS to map the soil data and developed an approach to find a link between the soil pH and heavy metal contamination. They discussed the transfer of heavy metals within the soil-crop system near the Dabaoshan Mine in southern China. The proposed approach first generated spatial patterns of soil data using SIS and then used multiple linear regression models to predict the heavy metal concentration in the soil-crop system. Next, land use was determined from ALOS imagery (JAXA, Tokyo, Japan) imagery acquired from the study area and was linked to the SIS results. Finally, models and land use maps were integrated into a dose-response model to obtain human health risk patterns from heavy metals. The results demonstrated that high human health risk was correlated with high heavy metal contamination, agricultural or residential land cover as well as low pH.
The mapping of toxic soil metals was further investigated by Huang et al. (2016) in a case study of the Xiandao district in Changsha city in Hunan province of China. In the proposed method, a serious risk of human exposure to toxic metals was determined using the spatial distribution of metals and land cover maps, which were obtained using SIS and Landsat images, respectively. Finally, an average risk map of toxic metals was obtained by averaging 1000 realizations. They also used the ensemble of realizations to determine the locations where the contamination risk was greater than an acceptable limit to provide a risk management policy. The results demonstrated that spatial patterns of toxic soil metals were correlated with LULC.

Sampling design
The study of Wang et al. (2002) was followed by Anderson et al. (2006), who used SGCS for sampling design. It is important to plan a cost-efficient sampling design to ensure the success of in situ data acquisition campaigns. Failing to consider spatial variability can lead to suboptimal sampling designs, and geostatistical simulation methods can be used to produce a robust sampling design that captures uncertain features across realizations. Anderson et al. (2006) introduced a sampling method based on SGCS for mapping the C factor. The proposed approach was based on the spatial correlation between field data and Landsat images. This approach estimated the local variability in the desired parameter using SGCS to combine in situ and remotely sensed data. Then, it calculated the distance between samples based on local spatial variability within a given neighborhood. The proposed method was used to sample and map the C factor and to monitor the dynamics of soil erosion based on Landsat TM images and field data. The results confirmed that the proposed method resulted in a better sampling design than simple random sampling.

Vegetation
Physical and physiological processes in plants are controlled by observable attributes, such as the leaf area index (LAI), temperature, or evapotranspiration (Landsberg and Gower, 1997). These attributes are most needed as maps, which are often derived from satellite data. However, such data layers are often incomplete or uncertain, and geostatistical simulation methods can play an important role in incorporating the information they contain (Berterretche et al., 2005). Furthermore, uncertainty quantification is essential for ecological interpretation and decision making (Rossi et al., 1993). The applications of geostatistical simulation algorithms using remote sensing data in the vegetation domain can be classified into four main domains: 1) error assessments in land-cover patterns and assignments, 2) upscaling and Fig. 6. Two realizations of the spatial distribution of soil nitrate-nitrogen generated using SIS; ©2006 Wiley, Reprinted, with permission, from (Grunwald et al., 2006). downscaling, 3) mapping, and 4) sampling design. Two-point-based stochastic simulation methods are mostly used for vegetation studies. Further works are needed to assess the potential of MPS methods for applications to vegetation with remote sensing data.

Mapping
Some vegetation indices, such as LAI, are measured at only a small number of locations. The mapping of these indices requires interpolating these locations while simultaneously including remote sensing data that are spatially continuous but indirect. Berterretche et al. (2005) compared several regression and geostatistical methods, including kriging and SGS, to map the LAI using Landsat images and field data. For applications where local accuracy is important, the authors recommended using kriging with external drift. However, when the focus was on preserving global patterns and heterogeneity, SGS was recommended.
Geostatistical simulation methods in vegetation applications were further studied by Shen and Zhang (2016). The proposed approach aimed to map the regional forest carbon density and its distribution based on SGCS and a nonlinear regression of a unary cubic equation. They evaluated the proposed method on Landsat TM data over Xianju County in Zhejiang province of China. The results demonstrated that SGCS reflected the distribution of carbon density better than the regression.

Error assessment
A number of studies have used geostatistical simulations to model the spatial uncertainty of vegetation cover mapped from remote sensing measurements. For instance, De Bruin (2000) used SIS to generate a set of equally probable maps from which uncertainties regarding land-cover patterns could be inferred. An approach was proposed based on SIS with collocated indicator cokriging to model the spatial uncertainty in the estimates of the areal extent of land-cover types. The simulations were based on the soft indicator data obtained from posterior probability vectors of classified Landsat satellite images and a sample of hard reference data. The method was tested using a case study of olive tree plantations in southern Spain. The results demonstrated that conditioning on hard data in areas where the remotely sensed images were not very informative had a significant positive effect on the estimates and their uncertainties.
A similar application related to land cover uncertainty quantification was investigated by Magnussen and De Bruin (2003). The proposed approach used SIS to generate probabilistic maps of forest inventory cover type in New Brunswick using classified Landsat images as a covariate. Maximum posterior probability (MAP) maps were obtained by computing the mode of many SIS realizations. According to the results, the accuracy of the MAP forest cover type maps was higher than that obtained by the maximum likelihood classification.

Upscaling and downscaling
The spatial resolutions of remotely sensed datasets vary. Integrating such data, therefore, requires an upscaling or downscaling procedure. As mentioned, if the resolution of the generated data is coarser than that of the input dataset, the procedure is called upscaling. Geostatistical simulation methods have obtained acceptable results for upscaling. For example, Mahmud et al. (2014) showed that Image Quilting, an MPS method, obtained acceptable results in upscaling hydraulic conductivity measurements. They first upscaled the coarse TI to larger scales using a classical upscaling method. The obtained TI set is then used as the input of an MPS model to create multiscale models. The potential of using geostatistical simulation for upscaling remotely sensed images of an area with vegetation land cover was investigated by Wang et al. (2004) in a case study area at Fort Hood, Texas, USA. The authors compared four upscaling methods based on simple and ordinary cokriging estimators and an SGCS algorithm for pixels and blocks. These algorithms were used to upscale vegetation-tree cover from 30 m resolution to 90 m with the aid of Landsat TM images, and the results were compared with maps generated directly using the sample data at a support size of 90 m. The best results were obtained by the PsK_PSUP method whereby a block (i. e., a window consisting of smaller pixels) was divided into a fixed number of pixels. Next, the pixel estimates were obtained by using pixel SGCS and then upscaled to blocks.
If the resolution of the generated data is finer than that of the input dataset, the procedure is called downscaling. Kyriakidis (2004) provided a geostatistical estimation solution for downscaling called area-to-point kriging (ATPK). ATPK is based on the variogram defined on point support. To this end, punctual variograms are obtained from the variogram of a remote sensing image (areal measurements as pixels). This estimation method was further developed as a simulation method that can be used for directly downscaling satellite images (Kyriakidis and Yoo, 2005). More downscaling applications are discussed in Sections 3.6.1 and 3.6.2.

Sampling design
Similarly, for soil science applications, geostatistical simulation has been used to improve the design of in situ data sampling for vegetation studies. Lin et al. (2009) showed that the reconstructed NDVI images with kriging and SGS based on the selected samples using conditional Latin hypercube sampling (cLHS) could capture the statistics of the NDVI images. Lin et al. (2011) further investigated different sampling schemes and validated them based on fully known SPOT-derived NDVI data before and after the large ChiChi earthquake and large typhoons in Taiwan. They used cLHS and the variance quadtree technique to select spatial samples. The NDVI images were then reconstructed by using SGS based on the selected samples. The results indicated that the proposed method could simulate the NDVI and capture the spatial characteristics of disturbed landscapes. Following the study of Lin et al. (2009 studied the effects of cLHS sample selection on SGS simulation. They investigated the effects of large disturbance impacts on the spatial characteristics of landscape changes using an approach that integrated cLHS, SGS, and spatial analysis tools such as Moran's I in NDVI images obtained from SPOT imagery. The results confirmed that 8% of the NDVI samples selected by cLHS could be used by SGS to simulate the NDVI while preserving the key spatial patterns.

Topography
The topography is typically represented by digital elevation models (DEMs), which are raster maps. Up-to-date and precise information on Earth's elevation, which is essential in many applications, can be obtained using remotely sensed data. The main aspects of using geostatistics simulation methods in modeling topography are assessments of the errors in DEMs and downscaling.

Error assessment
Initial work in this field focused primarily on DEM uncertainty. Chu et al. (2014) studied the LiDAR sampling effect on DEM uncertainty. They first proposed using cLHS and simple random sampling to select LiDAR samples, and then SGS was applied to generate DEM realizations. Based on the DEM realization, they established topographical uncertainty. The approach was evaluated on airborne LiDAR data for a city in Taiwan. The resulting DEM uncertainty was most dependent on the sampling method for low sampling density values. The proposed method obtained acceptable results in replicating the DEM with a low data density. In addition, the proposed method provided further insight into identifying the uncertainty of spatial features.
Assessment of the errors in DEMs was further explored by Leon et al. (2014), who incorporated DEM errors in coastal inundation mapping based on a geostatistical simulation model. The correlation of the elevation errors with land cover and terrain variables was explored, and then SGS was used to simulate the elevation errors through a Monte Carlo approach. Next, the simulated errors were added to the original elevation, and the probability of future inundation was obtained by the frequency of realizations where a DEM cell was inundated. This approach resulted in a larger inundated area than that obtained by the deterministic approach, showing that the use of deterministic approaches can result in overly conservative estimates.

Downscaling
Terrain analysis often requires elevation information at scales finer than the available DEMs. Geostatistical simulation approaches provide downscaling frameworks with uncertainty associated with the subpixel predictions. Tang et al. (2015b) proposed using FILTERSIM as an approach for spatial downscaling of coarse-resolution global multiresolution terrain elevation data (GMTED2010) from China using fineresolution shuttle radar topography mission (SRTM) data. The proposed method developed new filters for FILTERSIM, as the conventional FILTERSIM does not consider the nonstationarity of spatial structures when gathering similar patterns in the prototyping process. They proposed a feature-based prototyping process in which the topographic indices were specific for topographic characterization, such as the DEM residual surface, the vector ruggedness measure, and the ridge valley class. The results were compared with the results of two data fusion methods, including the traditional geostatistical interpolation methods and the conventional FILTERSIM algorithm. The results showed that the proposed procedure better preserved spatial structures than the two other tested methods.
An application related to the simulation of topography is the fusion of different types of river bathymetry data. Jha et al. (2013b) used conditional DS to combine a punctual bathymetry dataset of the Mississippi River and low-resolution sonar surveys, using as additional information high-resolution TIs consisting of bathymetry surveys acquired several months earlier, when the riverbed was significantly different.
More recently, research on DEM downscaling by Rasera et al. (2020) used a multiresolution TI. The authors proposed a multiple-point statistical simulation approach for downscaling a coarse DEM. The multiresolution TI was used to import fine-scale structures to the coarse-scale image through a series of conditional downscaling iterations. The technique was tested on two DEM datasets of mountain ranges in Switzerland, resulting in improved performance compared to the results of area-to-point simulation, bicubic interpolation, or DS, even when the statistics of the TI differed from the conditioning data statistics.

Climate and atmospheric science
The availability of remotely sensed data for observing Earth's climate and atmosphere is growing rapidly. Geostatistical simulation methods can be used to characterize spatiotemporally varying fields such as evapotranspiration, rainfall, and temperature (Mariethoz et al., 2012). These applications can be grouped in the downscaling and mapping of climate parameters.

Mapping
An application of geostatistical simulation in atmospheric science based on remotely sensed data is weather generation. Wojcik et al. (2009) proposed an approach for the simulation of spatial rainfall occurrence patterns conditioned on remote sensing measurements. Their method first used Geostationary Operational Environmental Satellites cloud top temperatures to identify areas where rainfall may occur. The rainy areas were generated inside the obtained regions in the subsequent steps. To this end, the proposed algorithm generated the rainy areas for individual replicates using an MPS procedure based on a TI derived from ground-based weather radar, which used the patterns of precipitation within a specified template. Finally, rain rates were generated within each rainy area based on a multiplicative cascade model. In a case study involving summer 2004 data from the central U. S., the generated rainfall simulations were visually and statistically similar to ground-based weather radar data.
The application of MPS simulation to climate data was further explored by Oriani et al. (2017) to generate a high-resolution daily rainfall field using rainfall radar images, DEM, and indices characterizing weather patterns. This method first used DS to select radar images to be used as TI from the available dataset based on daily synoptic conditions and then used these TIs to simulate daily rainfall spatial fields using DS, which were conditioned by elevation. A case study was shown for an area in the eastern Mediterranean during the 2002-2003 wet season, where the simulated rain fields preserved the correct statistical relation with elevation, the temporal weather patterns, and the spatial features.

Upscaling and downscaling
Hydrology and agricultural models often require climate data such as rainfall at finer spatial scales than those available in climate models or reanalysis data. As climate data are spatially correlated variables, geostatistical simulation methods provide useful tools for analyzing them. Preliminary work in this field was carried out by Allcroft and Glasbey (2003). In their proposed approach, TGS is used to disaggregate rainfall at fine scales. The method was tested on 12 h of hourly data from the Arkansas-Red Basin River Forecast Center Web site, confirming the efficiency of the proposed method.
The downscaling of climate data using geostatistical simulation was further investigated by Jha et al. (2013a), who used DS to jointly downscale three different variables (temperature, soil moisture, and latent heat flux) derived from a regional climate model in the Murray-Darling basin in southeast Australia. To consider the joint relationship between all three variables, a multivariate distance criterion was used, which was a linear combination of univariate distances. The simulations preserved realistic patterns during the downscaling approach. While the algorithms required TIs at both coarse and fine scales, these TIs could be obtained from satellite images. The study of Jha et al. (2013a) was extended to consider the spatiotemporal relationships when downscaling daily precipitation and temperature data (Jha et al., 2015). To consider the spatial and temporal characteristics, information at different spatial resolutions and from previous days (time dimension) was represented as additional variables in this study. The approach was applied to an area in southeastern Australia, including twenty years of daily P and daily T at both 50 and 10 km. This study considered two different strategies: 1) not considering temporal dependence and 2) including temporal dependence. The parameters in the DS approach were selected mostly based on the guidelines provided by Meerschman et al. (2013). The weights of the distance criterion were selected based on sensitivity analysis. The results indicated that the downscaling procedure was able to produce complex patterns with the same features as the 20-year training data. However, the inclusion of temporal information only slightly improved the downscaling outputs.
A similar application related to downscaling rainfall maps based on field data and coarse resolution remote sensing-based products was proposed by Park and Kyriakidis (2019). They used point-level validation data to generate a fine resolution reference map using SGS. The reference map was upscaled to the resolution of the remote sensing products, and an error map was obtained using pixel-by-pixel comparisons of the reference map and the satellite-based product. This approach has two main applications: obtaining the uncertainty of the reference map generated from validation data and resolving the resolution differences between coarse resolution remote sensing products and validation data. An experiment involving the Tropical Rainfall Measuring Mission monthly precipitation products over Korea and monthly accumulated rainfall measurements obtained from automatic weather stations demonstrated the validity of the proposed method.

LULC mapping
LULC data based on remote sensing imagery plays an important role in monitoring the changes in Earth's surface, urban planning, and disaster management. Several factors affect the accuracy of land cover maps (Li and Zhang, 2011), such as limited spectral and spatial resolution or image complexity. Accordingly, it is essential to characterize the uncertainty of classified images while trying to improve their accuracy. Geostatistical simulations have been used for different applications in this domain, including classifying remotely sensed data, improving fraction images, and postprocessing classified images.
Initial work to generate land cover maps by geostatistical simulation was carried out by Kyriakidis and Dungan (2001) to generate thematic maps with high accuracy based on a combination of image-derived (soft data) and field-based class (hard data) labels. This method used Landsat images from Montana, USA. SIS was used to generate simulations of the reference classification, while hard and soft data integration through SIS was accomplished by indicator kriging. A location-dependent model of classification uncertainty was also obtained. In the case study, spatial classification accuracy and propagating classification uncertainty to ecological model predictions via stochastic simulation were also evaluated to confirm the effectiveness of the proposed method.
Other geostatistical simulation methods have been used to classify satellite images. Ge et al. (2008a) used SNESIM to combine spectral information and spatial information. In the proposed method, maximum likelihood classification was used to obtain a probability field. Then, sample pixels used in maximum likelihood were taken as hard data for SNESIM to obtain another probability field. Finally, the obtained probability fields were fused and used as category probability maps in the SNESIM simulations. The method was tested on a Landsat image from China, where the proposed method obtained better results than a maximum likelihood classifier. Ge and Bai (2011) further investigated their classification method by applying it using a multigrid approach to a high-resolution SPOT image. The multigrid approach was used to combine different scale templates.
A similar application related to land cover classification using MCRF was proposed by Li and Zhang (2011). After determining the number of class labels and obtaining a set of class samples using image interpretation and/or the gathering of field data, they estimated transition probability diagram (transiogram) models from the dataset. MCRF was then used to simulate land cover classes that were conditional to the dataset and the occurrence probability of each class at a pixel obtained using the simulation results. The classified map was then obtained from the maximum occurrence probabilities. The approach was tested on a satellite image from China with main classes, resulting in better results than two unsupervised classification methods and maximum likelihood classification, but at the expense of increased computation time.
More recently, a combination of a classification method and an MPS method was investigated by Tang et al. (2016). They proposed an MPSbased spatially weighted k-nearest neighbor (k− NN) method for classifying remotely sensed data. This method requires a TI (i.e., classified image) that has a similar class spatial distribution to that of the study area (it can be an initial classification or a simulation result) to classify the image. The TI is first projected into a spectral feature space to find the k nearest training neighbors of each pixel. Then, these k nearest neighbor pixels are used to construct the multigrid data template used to scan the TI. The multiple-point probabilities correspond to the frequency of replicates. The class label is then selected based on the obtained probabilities. The proposed approach was tested on WorldView-2 and IKONOS images, resulting in more accurate classification than other compared methods, including k-NN, IDW, Bayesian, and support vector machine.
Most spectral-based classification algorithms do not consider spatial information, although the integration of contextual information improves the accuracy of the obtained classified image . Indeed, most satellite imagery contains mixed-class pixels rather than pure-class pixels. This phenomenon leads to a reduction in the accuracy of classified images. Soft classifications provide classified outputs at the subpixel level (i.e., fraction images) to solve mixed pixel problems; however, the fraction images contain errors that propagate into hard-classification maps. Geostatistical methods are able to use the spatial structural information of surface objects to improve the outputs of soft classification algorithms. For example, Ge (2013) proposed an approach based on MPS to reduce the uncertainty in fraction images. In the proposed methodology, after obtaining fraction images using a soft classification, the spatial properties of classes were characterized using SNESIM, and improved fraction images were obtained. Accordingly, a posterior probability field was derived from multiple SNESIM realizations. Finally, the posterior probability field was fused with the soft classification output, and subpixel mapping was performed. The TI in this study was obtained by setting a threshold on the fraction image. The test case was a Landsat image of farmlands in China, and the results were validated with a SPOT image from the same area. The method increased the identification of the farmland boundaries; however, other studies may need to test the proposed method in a study area with more than one surface object/class.
Geostatistical methods also provide a useful tool for postprocessing classified images by incorporating spatial dependence into land cover classes. Based on the work of Li et al. (2015), who suggested that the co-MCRF approach could improve classified images, Zhang et al. (2016b) investigated the use of co-MCRF on expert-interpreted sample datasets and classified images. They tested the approach on a Landsat image of a large area with a complex landscape. This approach resulted in an improvement in the accuracy of the obtained LULC maps compared to those obtained from pixel-based classifiers. Following this study, Zhang et al. (2016a) proposed an approach to integrate two spectral similarity indices as constraining factors into co-MCRF, which was called SS-coMCRF. This method aimed to reduce the geometric feature loss in the land cover postclassification procedure (Zhang et al., 2016a). The role of the constraining factors in estimating the local probability distribution was to update the transition probability. The constraining factors were used to adjust the contribution of the nearest data according to their spectral similarity. The two spectral similarity indices used in this study were the Jaccard index and the spectral correlation measure. The algorithm was tested on two different satellite products, Landsat and QuickBird, which were acquired from Wuhan and Shenzhen cities, respectively. The results exhibited an improvement over co-MCRF. Furthermore, Wang et al. (2018) investigated SS-coMCRF to postprocess object-based LULC classification and found that it was effective in 4 different case studies.
In the same vein, Tang et al. (2013) proposed to increase the accuracy of a remotely sensed land cover classification using SNESIM. The method started with classifying a Landsat image using maximum likelihood classification based on 300 training pixels (hard data), and the result was used as the TI for the next step. SNESIM was then applied to the remotely sensed data based on the available TI, hard data, and probabilities of each class per pixel obtained from the maximum likelihood classification (soft data). The final outcome was the average of 100 SNESIM realizations, which resulted in spatial smoothing that reintroduced some spatial dependence. The proposed method increased the accuracy of the remote sensing classification compared to both spatial postprocessing filtering and a contextual MRF classifier.
One of the advantages of geostatistical simulation is the quantification of uncertainty. This is particularly relevant for LULC mapping, where the uncertainty of categorical data can be significant. For instance, De Bruin (2000) estimated a per-pixel probability distribution of classes at a given pixel based on an ensemble of realizations obtained by indicator cosimulation. A similar approach based on multiple realizations has been used by Li and Zhang (2011) in the context of Markov chain sequential simulation and by Tang et al. (2013) with MPS, additionally using a maximum likelihood classification as soft data. It should be noted that soft class labels are called compositional data, which are nonnegative values that represent the relative importance of the parts forming a whole. To ensure that the constant sum constraint is respected in the realizations, two-point geostatistical simulations can be performed on log-ratio scores. In this case, multivariate geostatistical models are applied to the transformed scores, and then back-transforms are used (Tolosana-Delgado et al., 2019). For instance, Park and Jang (2020) integrated the KOMPSAT-2 image obtained from the Baramarae tidal flat in Korea with the log-ratio transform of sediment composition data using simple kriging with locally varying means. Then, they obtained sediment compositions using inverse log-ratio transformation, which were classified to obtain sediment maps.
3.6. Generic applications to satellite-obtained image/map enhancement 3.6.1. Super-resolution mapping Spatial and spectral resolutions have important effects on extracting information from satellite data. Downscaling is a procedure in which the data resolution becomes finer compared to the original data. Recently, geostatistical simulation methods have been used to downscale remote sensing data. The early literature on downscaling based on geostatistical simulation methods focused on generating a fine-resolution map of class labels. Boucher and Kyriakidis (2006) proposed a geostatistical approach based on indicator cokriging and SIS to create subpixel land cover maps from coarse class fraction data. In the proposed method, indicator cokriging determined the probability that a pixel belongs to a particular class at the target resolution, given the coarse resolution fractions. These indicator-cokriging-derived probabilities were then used in SIS to generate subpixel land cover maps. The proposed method was tested in China using Landsat images. The results showed that the proposed method could be efficiently used for subpixel LULC mapping. The proposed method required that informed fine pixels be located at a distance smaller than the coarse pixel extent to compute reliable variogram values for the subpixel lag distance. In the case where fine resolution sample data were not available, it was necessary to synthesize a prior model using high-resolution images from a nearby area with similar classes or in the same area but from the past.
Following the study of Boucher and Kyriakidis (2006), Boucher and Kyriakidis (2007) employed SIS to generate a fine-resolution map of class labels based on a combination of coarse class fraction data as well as fine class label data. They investigated the integration of a set of fineresolution class labels obtained independently of remotely sensed measurements, which was achieved within the indicator kriging formalism. Similar to the study of Boucher and Kyriakidis (2006), indicator cokriging was used to determine the probability that a pixel belonged to a particular class at the target resolution, given the coarse resolution fractions as well as a sparse set of class labels at some informed fine pixels. However, contrary to the method proposed by Boucher and Kyriakidis (2006), instead of parametric indicator semivariogram models, a set of experimental indicator semivariogram maps were used to quantify the texture of class labels at the target resolution. The designed experiment utilized super-resolution mapping using a semisynthetic dataset of impervious surfaces from Guangzhou, China. The results demonstrated that the integration of fine spatial resolution information obtained more realistic subpixel class maps than those obtained based on only the coarse fraction data.
Moreover, Boucher (2009) suggested using an MPS method to downscale remote sensing data. He proposed a novel method based on SNESIM for subpixel mapping of coarse satellite images. In the proposed method, the block indicator kriging (BIK) method was first used to downscale the coarse fraction data to fine resolution probabilities. Then, TI-derived probabilities and the BIK-derived probabilities from the last step were integrated. To integrate these probabilities, the TI should upscale into proportion maps at the sensor resolution where each coarse pixel was linked to its underlying TI pattern. They also introduced a search tree partitioning approach to improve the SNESIM simulation results, which included two steps: 1) partitioning the upscaled TI using a clustering method and 2) setting up a search tree for each partition class. They tested the proposed method on Landsat images from China. They concluded, based on the obtained results, that combining the partition approach and the BIK-derived probability generated more accurate and faster realizations than those obtained when only one of the methods was used.

Downscaling continua
The use of MPS for downscaling was further investigated by Mariethoz et al. (2011). They proposed a downscaling method based on DS, which required only a coarse image as a TI. In the proposed method, the coarse pixel was first divided into four (or more) fine pixels, and the coarse pixel's value was assigned to one of them; then, it used the coarse image as the TI, and the remaining fine pixel values were simulated using DS. They tested the proposed method on two different datasets: 1) a Landsat classified image (i.e., categorical variable) and 2) an aerial photograph (i.e., continuous variable). The results demonstrated that the proposed method obtained acceptable results, as it performed superresolution mapping while maintaining the patterns present in the coarse image. Truong et al. (2014) started from the ascertainment that it is impossible to infer the nugget of the point-scale variogram based only on coarse observations. In consequence, the authors used a statistical expert elicitation procedure to derive prior probability density functions of the nugget parameter, which are used as priors in Bayesian inference. Then, they generated unconditional simulations using SGS and finally conditioned these simulations to the observations based on ATPK. They concluded that expert knowledge provides a viable option for a case study where MODIS atmospheric temperature profile data were downscaled based on combining ATPK and SGS. Following this study, Wang et al. (2020) showed the importance of using the point spread function (PSF), which affects the signal attributed to a remotely sensed pixel. They propose a method to perform downscaling based on ATPK when the PSF is available (e.g., provided by the manufacturer of the sensor) and to infer the PSF otherwise.
ATPK was further explored by Tang et al. (2015a), who developed a form of pansharpening that downscaled multispectral bands while retaining the spatial structure of the panchromatic band. The proposed method was a combination of a two-point geostatistical method (areato-point cokriging, ATPCK) and the FILTERSIM multiple-point technique: ATPCK-FILTERSIM. This method used ATPCK on the fine resolution image to provide conditioning data (soft data) and then downscaled images using FILTERSIM based on the soft data and the coarse resolution image (hard data), which was also used as the TI. They tested the proposed approach on two different datasets, including Landsat and QuickBird images. The results showed that the MPS method could restore the spatial structures of the panchromatic image.
In addition, Zhang et al. (2017) proposed using isometric mapping (ISOMAP), which reduces the dimensionality of nonlinear data, and multiple-point statistics to perform super-resolution reconstruction of remote sensing images. Their proposed method was similar to FILTER-SIM, which characterizes the patterns by some filters to reduce the pattern complexity, while their proposed method used ISOMAP. Their method first scanned a TI to construct a pattern dataset and then performed ISOMAP on the pattern dataset for dimension reduction. Finally, the dimension-reduced pattern dataset was clustered, and simulation using coarse fraction information existing in low-resolution images (soft data) and the already simulated pixels (hard data) was performed. They tested the proposed procedure on two categorical datasets and two Landsat images as continuous datasets. The results showed that the super-resolution outputs preserved the structural characteristics of the input data.
The enhanced spectral resolution of remote sensing images based on the geostatistical simulation method was introduced for the first time by Gravey et al. (2019). They proposed an approach for enhancing the spectral resolution of archived satellite data based on an MPS procedure. The proposed method was based on matching the spatio-spectral patterns, which characterize the spectral signature of pixels, as well as in their spatial neighborhood, between a target image (an old image) and a TI (a recent image), and spectral information of this correspondence was imported from the TI to the target image. They developed the narrow distribution selection (NDS) algorithm. In NDS, the selection of a pattern is conditioned to a match in its neighborhood and to narrow the CPDF of the value for a target pixel. Similar to DS, NDS is sequential. However, the simulation path is not random, and it follows an order defined by the narrowness criterion. The proposed algorithm was tested on a target image, whose spectral bands were previously known, to validate the results. This method was also tested on some of the oldest satellite acquisitions, the Corona images, for which no reference color data are available. They compared the proposed method with DS and the Welsh approach (Welsh et al., 2002), which searched for a similar location in the TI by pattern matching in a window of 5 by 5 pixels. The color at the location with the highest match was then transferred to the target image. The results demonstrated that DS provided acceptable results. However, due to its computational cost, the NDS approach was recommended only if very high-quality spectral enhancement was required. Oriani et al. (2020a) proposed a method to downscale highresolution satellite images (up to ~3-4 m) using limited training data of very-high-resolution (~1 m) images. Their method relied on highresolution information from a nearby area or an archived image from the area, and it was based on the DS algorithm. To determine the nonstationarity in the image and to distinguish different types of objects that can be detected in low-resolution images, in addition to the raw image bands, they added additional information, as the k-means classification computed on the low-resolution bands. The results confirmed good overall performance in generating realistic high-resolution structures related to both natural and human-made features (Fig. 7).

Gap-filling
Remotely sensed data mostly contain some gaps due to clouds, cloud shadows, or even systematic errors. Moreover, as the time interval of satellite revisiting is more than one day for most satellites, and during these days, changes might have occurred, some researchers have focused on satellite data gap filling. Geostatistical simulation methods, which are able to preserve texture information, have recently been introduced as effective methods for gap filling.
Gap-filling uncertainties are essential in data analysis. To the best of our knowledge, only the DS method has been used as a simulation approach to gap-fill remotely sensed data. Mariethoz et al. (2012) proposed a method that fills the gaps using DS based on known values at different locations/dates. It tries to find a known pixel whose neighbors have values similar to the neighbors of the unknown pixel. In the case of using several variables in the gap-filling procedure, the DS method defines neighborhoods across the different variables and uses a weighted average of the distances taken individually for each neighborhood. They tested the proposed method on the data obtained using a regional Fig. 7. Comparison of a realization of a downscaled image obtained from DS based on the TI derived from WorldView and the input low-resolution image by Planet labs; © 2020 IEEE. Reprinted, with permission, from (Oriani et al., 2020a). climate model of southeastern Australia (i.e., latent heat flux, surface temperature, and soil moisture). The obtained results confirmed the ability of the proposed procedure to produce texturally realistic gapfilled data.
The use of the DS method in gap-filling was further investigated by Yin et al. (2017b). The proposed method worked without using any additional TIs if a significant portion of the target image was already known. In this case, i.e., the univariate case, the non-gap locations in the target image function were used as the training data to fill the gaps. In addition to the univariate case, in the multivariate case, the reflectance value in the target image together with the reflectance value from the same area but at a different date was incorporated into the simulation. This method means that both temporal and spatial features were used to fill the gaps. In this study, 6 regions dominated by different land covers obtained from ETM+ were used. For each study region, nine individual cases, including both univariate and bivariate analyses, were selected. According to the obtained results, the DS method performed satisfactorily in filling Landsat 7 gaps. Moreover, there was often no need to introduce additional input images to fill the gaps in homogeneous areas, as the information within the target image may be good enough to produce acceptable results.
Moreover, Yin et al. (2017a) compared Landsat-7 gap-filling results obtained using the DS method and three other algorithms, including 1) kriging and cokriging; 2) geostatistical neighborhood similar pixel interpolator, and 3) weighted linear regression. A geostatistical neighborhood similar pixel interpolator needs at least one additional Landsat image acquired on another date in which the gaps have values. A geostatistical neighborhood similar pixel interpolator first removes the trend of the image, and then the residuals are estimated using kriging. On the other hand, weighted linear regression finds a linear relationship between similar pixels in coincident images. The gaps are filled using this linear relationship in this method. The results suggest that for a homogeneous area, all methods yield acceptable results. However, for a heterogeneous case study, a geostatistical neighborhood similar pixel interpolator performs better than the other algorithms. Moreover, for a case study with an abrupt change, DS yields the best results. Fig. 8 summarizes geostatistical simulation models used in remote sensing studies. In Fig. 8, the goals, outputs, and models used are outlined for each application.

Remarks on Geostatistical simulation workflows
Considering the reviewed studies, we formulate a conceptual workflow for analyzing satellite data using geostatistical simulations. This workflow is presented in Fig. 9 and consists of the following three main steps: 1) Exploring the nature of the variables: The first step is to determine the nature of the variables. Continuous and categorical variables require using different models. While many physical processes can be represented with either continuous or categorical variables, humans often prefer to work with objects (Goodchild et al., 2007). However, to the best of our knowledge, object-based geostatistical simulation methods have not been used to extract information from remotely sensed data. Object-based geostatistical simulation methods may therefore be further explored in the future.
2) Exploring the spatial complexity and the amount of available data: Representing and inferring the spatial parameters of a complex phenomenon require large amounts of data. Two-point geostatistical methods are based on variograms (or covariances), which are entirely defined by a small number of parameters. While they can be inferred from relatively small datasets, they are not appropriate to represent complex structures. Alternatively, methods based on training images can generate complex patterns, but the availability of the training images can sometimes be problematic. More details can be found in Section 2.1.

3) Exploring the number of realizations:
The main criterion for deciding the number of realizations is the balance between computing cost and the need to represent a detailed probability distribution of the modeled quantity. As remote sensing applications mostly have large, high-resolution domains, it is not always possible to have a vast number of realizations. Based on the reviewed studies, the minimum, the maximum, and the average number of realizations of each application are summarized in Table 3.
While geostatistical simulation methods have improved over the years, there remain challenges to consider when analyzing remotely sensed data: • Variability between realizations: uncertainty quantification is generally based on a statistical model representing the spatial phenomenon considered. Such a model generally exists with parametric approaches, so the variability between realizations quantifies uncertainty in a rigorous way, but often misses high-order, complex features that may be important. On the other hand, nonparametric approaches can capture more complexity but at the cost of a poorly defined model of uncertainty, often depending on tuned rather than statistically inferred parameters. • Validation: the realizations obtained in remote sensing applications should be visually realistic and reproduce the expected features of the region of interest. In some applications, such as downscaling, the resulting downscaled simulation can be compared to an available fine-scale satellite image. In gap-filling applications, the simulations can be compared with available gap-free images. However, finding relevant data to be compared is not always straightforward. There are also some general validation approaches regardless of application. For instance, in two-point geostatistical simulations, several approaches, such as testing the reproduction of histograms and semivariograms, are available, which are sufficient to define the model of spatial dependence. In MPS, different approaches should be defined for comparing statistics of the TIs with realizations and testing the achievement of conditioning. These methods can be categorized into five groups, including using summary statistics, avoiding verbatim copy, checking trend reproduction, comparing spatial uncertainty, and consistency checks for conditioning, which are discussed as follows : i) Using summary statistics: specific statistical properties such as histograms, connectivity functions, and semivariograms of a set of realizations are used to compare TI and realizations. ii) Avoiding verbatim copy: portions of the training images that are exactly reproduced in the realizations are defined as "verbatim copy", an undesirable phenomenon. A verbatim copy can be quantified using coherence maps generated by indexing each pixel's location in the TI and keeping track of them during the simulation. iii) Checking trend reproduction: Trend reproduction should be verified in cases of nonstationary modeling. iv) Comparing spatial uncertainty: One way to evaluate it is the analysis of distance. This method is based on using two distances: a first one that quantifies the dissimilarity between the realizations and the TI and a second one that evaluates the spread of realizations. Ideally, the first distance should be minimized, and the second distance should be maximized (Tan et al., 2014). v) Consistency checks for conditioning: It is desirable that conditioning data have some regional effect. In two-point models, the realizations can be verified against the known posterior mean Fig. 9. Schematic workflow for analyzing satellite data using a geostatistical simulation model. and posterior variance or covariance. One way of testing this consistency in MPS is by generating a large number of unconditional realizations and extracting a set of conditioning data from them. Then, a set of conditioning simulations is generated using each obtained conditioning data. Ideally, the conditioning simulations should be equivalent to the unconditional simulations.
For further details on validation approaches, one may refer to . We note that most of the reviewed studies involving MPS have not considered all five types of validation methods. It is, however, recommended to use as many of them as possible in model quality assessment.
• Scale: while parametric approaches can model the support effects due to their analytical formulation, this can be more challenging for training image-based approaches. With training images, the scale of the simulations is related to the scale of the TIs. This relates, once again, to the challenge of finding appropriate TIs (e.g., from the region of interest at another time or using images from an area with similar features).

Hybrid Geostatistical simulation models
In recent years, other hybrid simulation models have been introduced in geostatistics, often combining geostatistical simulation models with optimization algorithms. For instance, Yang et al. (2016) proposed combining MPS with an iterative optimization algorithm to improve realizations. In another work, Pourfard et al. (2017) proposed an optimization-based simulation inspired by computer graphics, also based on optimization to minimize the error between a set of overlapping image patches. One challenge with this type of algorithm is that the optimization parameters often require tuning.
Another new approach is combining neural networks with geostatistical simulation models. One very early study in this domain is the work by Caers and Journel (1998) that trains a neural network to learn multiple-point statistics. More recently, deep learning techniques such as generative adversarial networks (GANs) have received remarkable attention to be used for geostatistical simulations. One of the challenges in this domain, however, is to generate conditional simulations because, in practice, the networks are trained on unconditional images. This has received much attention in recent studies (Chan and Elsheikh, 2020;Laloy et al., 2018;Mosser et al., 2017).
These hybrid geostatistical simulation methods, to the best of our knowledge, have not been extensively used in remote sensing applications, with the notable exception of Zhu et al. (2020). They proposed to Table 4 An application-based overview of geostatistical simulation models using RS data. SGS: sequential Gaussian simulation; SGCS: sequential Gaussian cosimulation; SIS: sequential indicator simulation; MCRF: Markov chain random field; co-MCRF: Markov chain random field cosimulation; SNESIM: single normal equation simulation; DS: direct sampling; FILTERSIM: filter-based simulation; LULC: land use/land cover; * it has been combined with ATPK.
use a deep learning model based on conditional GANs to interpolate elevation data located in mainland China. The proposed method showed the ability to use deep learning molds in capturing deep spatial features. This highlights the potential of hybrid geostatistical simulation models in remote sensing as a possibility to improve future studies in terms of computational cost or representation of patterns that are not available in TIs.

Summary
This review presents geostatistical simulation models that are used for the processing of satellite remote sensing data and an overview of their different application domains. This demonstrates the broad range of remote sensing applications where the use of geostatistical simulations can be valuable. This synthesis will help guide future research by specifying each method's applications, advantages, and limitations. Table 4 summarizes these applications and shows which areas remain to be investigated in future research. We intentionally did not include the application of geostatistical estimation (e.g., kriging), which is outside the scope of this review. This table shows that parametric approaches (SGS and SIS) have been the most widely used. We explain this by the fact that these simulation methods have been available to the scientific community for a very long time (since the 1980s). In contrast, much remains to be explored with nonparametric and TI-based simulation approaches, which have the potential to fully use the rich spatial information content of remote sensing imagery.
For example, while TI-based approaches have shown good results in digital soil mapping using field data, our review shows that these methods have not yet been used to include information from remote sensing in soil maps. In contrast, TIs have been used in topography mapping, and specific methods have been developed for this application. Similarly, two-point-based simulation methods are the most popular in vegetation studies. However, similar to soil applications, there is untapped potential for TI-based methods to be investigated in future research, since they may reveal complex spatial interactions that are ignored when only considering interactions between pairs of pixels. Similarly, there is scope for applying TI-based approaches to climate and atmospheric data. These examples show that the cross-disciplinary character of this review allows identifying gaps and possible future promising research directions.
We also note that in many applications, a single geostatistical approach was used, highlighting the need for a more systematic comparison of approaches to identify those that are more appropriate for the purposes of each domain.
As mentioned in the gap-filling section, clear-sky observations are sparse due to acquisition limitations, clouds, or shadows. Accordingly, gap filling is one of the important application domains in remote sensing. However, Table 4 shows that only a handful of applications have used geostatistical simulation to address this problem, which is most often addressed using deterministic approaches. The widespread use of a deterministic framework is remarkable because it does not allow quantification of uncertainty, which is often significant in gap-filling problems. For this reason, we believe that more research and applications are required for routine use of geostatistical simulation and uncertainty quantification, with the goal of making available to the scientific community gap-filled datasets that include uncertainty bounds or alternate realizations of gap-filled values.
Various types of geostatistical simulations have been used in remote sensing applications, and this review notes that a large diversity of approaches have been used to address a given problem, e.g., downscaling, sampling design, gap-filling, or error assessment. Indeed, in each subdiscipline, researchers have developed ad hoc algorithms and practices that seem to work in the cases investigated but differ from those in other subdisciplines. We believe that the remote sensing community could benefit from interdisciplinary exchanges in order to develop common systematic guidelines and best practices to address a given problem.
Looking to the future, geostatistical simulations could be used to open new research avenues and address data science problems such as change detection or image fusion. For instance, change detection can be seen as a classification procedure where the goal is to find change and no-change areas, or in more advanced applications, "from-to" classes. For instance, change maps can be generated by classifying the entire time series together using geostatistical simulation algorithms. Accordingly, geostatistical simulation models used in the classification domain have the potential to be used in change detection applications. As another example, geostatistical simulation methods have shown acceptable results in both spectral and spatial downscaling (see Section 3.6.2). By extension, these models could be used as an image fusion tool to improve the spectral and spatial resolutions simultaneously, potentially resulting in stochastic pansharpening algorithms. Another application where geostatistical simulations could be used is analyzing time series data. For example, the TI-based methods can reconstruct missing data of hydrological flow rate time series (Oriani et al., 2016) or simulate daily rainfall time series (Oriani et al., 2014). There is real potential to extend such approaches to the analysis of remotely sensed image time series.