Application of spatial classification rules for remotely sensed images

This paper examines an example of remotely sensed image classification using the spatial classification rule with distance (SCRD). This supervised classification method was first presented in [11]. It is an improved version of the earlier PBDF method [4, 10, 9] and incorporates more spatial information during classification. Its advantage is the ability to classify data corrupted by a Gaussian random field, which is typical of the remotely sensed images classified in this letter, since they are corrupted by clouds. Classification accuracy is compared with that of the earlier method and of other commonly used supervised classification methods.


Introduction
The classification of remotely sensed images is widely used in different situations. Many methods perform very well on clear images, but remotely sensed images, or parts of them, are often covered with clouds. In this case the common methods make errors because they have no information about what lies behind the clouds. The methods the authors work with help in exactly this situation: the clouds are modeled from the training sample information and the spatial relation between the observation to be classified and the training sample, and this modeled information is used during classification for better decision making.
The earlier method [4,10] was tested on a real remotely sensed image covered with clouds in [9]. A more advanced method was later proposed in [11], and in this letter its performance is tested in the same situation. The remotely sensed image used in [9] came from the Landsat7 satellite. Now that Landsat8 satellite images are available, the same situation is also tested on a Landsat8 image, which is more corrupted by clouds and is larger than the first one.
To show the complexity of this situation, some other common methods are applied to the same classification task. For the commonly used supervised classification methods, the rasclass package of the R system is used. Numerical and visual analysis of the discriminant function is carried out for the case of isotropic exponential spatial correlation and the nearest neighbor neighborhood system with eight nearest neighbors, NN(8). All calculations are done in the R system.

Method description
Spatial information has been incorporated into image classification by several authors (see e.g., [1,2]). In [11], the statistical supervised classification method PBDF [4] was extended by giving spatial dependency more influence in the classification problem, which yielded higher accuracy; the resulting method is named SCRD.
The plug-in Bayes discriminant function (PBDF) for the classification problem is defined in [4] through the quantities $\hat\mu^0_{lt} = \hat\mu_l + \alpha_0'(z_n - X_y\hat\mu)$, $\hat\sigma^2_{0t} = \hat\sigma^2 R_{0n}$ and $\gamma(y) = \ln(\pi_1(y)/\pi_2(y))$. The PBDF classification rule is based on the posterior distribution of $Y(s_0)$ (see [4, 10]), in which $\rho$ is a non-negative constant called the clustering parameter and $n_1$ is the number of locations from $N_0$ with label equal to 1. Here NN(8) is the nearest neighbor scheme with eight nearest neighbors.
The new classification rule SCRD [11] is based on a posterior distribution in which $\delta(\cdot)$ is the 0-1 indicator function and $d(\cdot,\cdot)$ denotes the Euclidean distance between locations. For the case of two classes, $\pi_2 = 1 - \pi_1$.
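As a rough illustration of how a discriminant function of this kind combines conditional class means, a common variance and prior class probabilities into a label, the sketch below evaluates the standard two-class Gaussian linear discriminant for a single feature value. The functional form and the names `discriminant` and `classify` are assumptions made for illustration; they are not taken from [4] or [11], and the spatial estimation of the means, the variance and the priors is left out entirely.

```python
import math


def discriminant(z0, mu1, mu2, sigma2, pi1):
    """Standard two-class Gaussian linear discriminant (illustrative form only).

    z0     -- feature value at the location to be classified
    mu1/2  -- (conditional) class means, assumed already estimated
    sigma2 -- common (conditional) variance, assumed already estimated
    pi1    -- prior probability of class 1; class 2 gets 1 - pi1
    """
    pi2 = 1.0 - pi1
    log_prior_ratio = math.log(pi1 / pi2)  # plays the role of gamma(y)
    return (z0 - 0.5 * (mu1 + mu2)) * (mu1 - mu2) / sigma2 + log_prior_ratio


def classify(z0, mu1, mu2, sigma2, pi1):
    """Assign class 1 when the discriminant is non-negative, class 2 otherwise."""
    return 1 if discriminant(z0, mu1, mu2, sigma2, pi1) >= 0 else 2


# With equal priors the value 3.0 is closer to the class 2 mean ...
label_equal_priors = classify(3.0, mu1=6.0, mu2=2.0, sigma2=4.0, pi1=0.5)
# ... but a strong prior for class 1 (e.g. most neighbors labeled 1) flips the decision.
label_strong_prior = classify(3.0, mu1=6.0, mu2=2.0, sigma2=4.0, pi1=0.9)
```

The second call shows why the choice of prior matters: the same feature value receives a different label once the neighborhood information favors class 1 strongly enough.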

Methods for comparison
To verify the reliability of the method and its advantage in this particular situation, where the data is corrupted by spatial noise, it is compared with other commonly used methods: Support Vector Machines (SVM), Neural Networks (NNet), Random Forests (RF) and Multinomial Logistic Regression (Logit). All of these are supervised classification methods, and they are applied to the same data and the same training sample as the methods proposed in this letter. The advantage of SVM classifiers is their capability to learn from a small number of samples [5]. The class (label) of a new sample is determined by a linear combination of kernel functions evaluated on the input and on a certain subset of the examples, the support vectors. The coefficients of the combination are obtained as the solution of a convex optimization problem at the learning stage [8].
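The decision step described above for SVMs can be sketched as follows. This is a minimal illustration, assuming a Gaussian (RBF) kernel and coefficients that have already been learned; the names `rbf_kernel`, `svm_decision`, `svm_predict` and the parameter `width` are hypothetical and do not come from the rasclass package or from [5, 8].

```python
import math


def rbf_kernel(x, y, width=0.5):
    """Gaussian kernel exp(-width * ||x - y||^2); `width` is an assumed parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-width * squared_distance)


def svm_decision(x, support_vectors, coefficients, bias, kernel=rbf_kernel):
    """Linear combination of kernel values between the support vectors and the input."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coefficients)) + bias


def svm_predict(x, support_vectors, coefficients, bias):
    """Label +1 or -1 according to the sign of the decision value."""
    return 1 if svm_decision(x, support_vectors, coefficients, bias) >= 0 else -1


# Two hypothetical support vectors, one per class; the signed coefficients are
# assumed to come from the convex optimization solved during training.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
coefficients = [1.0, -1.0]
label_near_first = svm_predict((0.1, 0.0), support_vectors, coefficients, bias=0.0)
label_near_second = svm_predict((2.0, 1.9), support_vectors, coefficients, bias=0.0)
```

Only the support vectors enter the sum, which is why the trained model can stay compact even when the training sample is larger.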
Neural networks rely on the iterative derivation of weights which effectively define hyper-planes and hyper-regions in the pattern feature space [7]. Although artificial neural network methods are frequently found to give a higher total classification accuracy than other methods, they do not perform universally well [6]. The Random Forests method grows many classification trees, and every tree gives a class label for the observation to be classified. The class is then assigned by majority vote over all trees in the forest [3].
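The voting step of a random forest can be sketched in a few lines. The three "trees" below are hypothetical one-split rules written by hand purely for illustration; a real forest grows its trees from bootstrap samples of the training data, which is not shown here.

```python
from collections import Counter


def forest_predict(trees, x):
    """Collect one class label per tree and return the most frequent label."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained trees: each maps a pixel value to a label.
trees = [
    lambda x: "class 1" if x > 0.30 else "class 2",
    lambda x: "class 1" if x > 0.50 else "class 2",
    lambda x: "class 1" if x > 0.70 else "class 2",
]

label_bright = forest_predict(trees, 0.60)  # two of three trees vote "class 1"
label_dark = forest_predict(trees, 0.40)    # two of three trees vote "class 2"
```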

Description of experiment
In this letter an experiment with two different remotely sensed images is presented. The first image was used earlier in [9]; here the same image is classified with the new method. It was obtained by the Landsat7 satellite, covers a part of the territory of Lithuania, is 200 × 200 pixels in size and is shown in Fig. 1(a). The second image was obtained by the Landsat8 satellite and also covers a part of the territory of Lithuania. It is 4 times bigger, 400 × 400 pixels, and is shown in Fig. 1(b). The density of clouds is higher in the second image, so the classification accuracy is expected to be lower for it.
The original image is naturally corrupted by clouds, and this noise is modeled by a Gaussian random field (GRF) with zero mean and exponential spatial correlation function given by $r(h) = \exp\{-|h|^2/\alpha\}$. Here $\alpha$ is the spatial correlation range parameter, which must be estimated. It is evaluated with the function variofit from the geoR package in the R system.
Because the remotely sensed image used for classification is naturally corrupted, the exact correlation range parameter is unknown. It is estimated from the training sample points (pixels) using the geoR package of the statistical program R.
The training sample points of both classes are used together to identify the correlation range parameter. To minimize the influence of the different classes on the accuracy of the modeling, the mean of the corresponding class is subtracted from the feature value of each training sample point. The transformed points and their coordinates are then used to calculate the empirical semivariogram with the geoR command variog. A parametric model is then fitted to the empirical semivariogram points using the variofit command, which uses the least squares method for fitting. After several trials, in which different types of models were fitted according to the shape of the empirical semivariogram, it was determined that the exponential model fitted best for both images. The estimated values of the correlation range parameter were α = 13 for the first image and α = 68 for the second image. These values were used for classification with the SCRD and PBDF methods.
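The first two steps of this procedure, class-wise centering and the empirical semivariogram, can be sketched as follows. This is a plain-Python stand-in for illustration, not the geoR implementation: it assumes the classical (Matheron) estimator with equal-width distance bins, and the function names, the bin layout and the toy data are all hypothetical. The model fitting done by variofit is not reproduced.

```python
import math
from collections import defaultdict


def center_by_class(values, labels):
    """Subtract the mean of each class from the feature values of that class."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, label in zip(values, labels):
        sums[label] += value
        counts[label] += 1
    means = {label: sums[label] / counts[label] for label in sums}
    return [value - means[label] for value, label in zip(values, labels)]


def empirical_semivariogram(coords, residuals, bin_width, max_dist):
    """Return (bin center, semivariance, pair count) for each non-empty distance bin.

    Semivariance per bin is half the mean squared difference over all point
    pairs whose distance falls into the bin (k * bin_width, (k + 1) * bin_width].
    """
    n_bins = math.ceil(max_dist / bin_width)
    squared_sums = [0.0] * n_bins
    pair_counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            h = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            if h == 0.0 or h > max_dist:
                continue  # skip coincident points and pairs beyond the cutoff
            b = min(math.ceil(h / bin_width) - 1, n_bins - 1)
            squared_sums[b] += (residuals[i] - residuals[j]) ** 2
            pair_counts[b] += 1
    return [
        ((b + 0.5) * bin_width, squared_sums[b] / (2 * pair_counts[b]), pair_counts[b])
        for b in range(n_bins)
        if pair_counts[b] > 0
    ]


# Toy training sample: four pixels on a line, two per class.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 3.0, 10.0, 14.0]
labels = [1, 1, 2, 2]

residuals = center_by_class(values, labels)
variogram = empirical_semivariogram(coords, residuals, bin_width=1.0, max_dist=3.0)
```

Centering first keeps the large difference between the class means from being mistaken for spatial variability, which is the reason the text gives for subtracting the class means before computing the semivariogram.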

Results
Visual classification results are shown in Fig. 2 and in Fig. 3. Numerical evaluations of overall accuracy for all methods used in this experiment are provided in Table 1.

Conclusions
The SCRD method performed better than PBDF, so incorporating more spatial information into the classification rule helps to get better results.
The second image is more corrupted by clouds than the first one, and all methods performed worse on it.
If the clouds corrupt the image only slightly, as in the case of the first image, the other common methods still perform quite well, but when the corruption becomes heavier, these methods become useless. The SCRD and PBDF methods perform quite well in both situations, even when the image is heavily corrupted.