Neighborhood scheme selection for classification with the SCRD method

When the Spatial Correlation Rule with Distance (SCRD) is used, the choice of neighborhood scheme influences classification accuracy. Spatial dependency persists over different distances in different situations, so in applications it is important to choose a suitable neighborhood scheme. In the authors' earlier papers, the nearest neighbor scheme was used. In this paper, several neighborhood schemes are compared in a large experiment.


Introduction
Spatial information is important in image classification [2]. In earlier papers (see, e.g., [1]), the incorporation of geostatistical information about features into plug-in versions of classifiers is based on the marginal distribution of the observation to be classified. In [3], the statistical supervised classification method was extended by incorporating more influence from the spatial dependency into the classification problem; this method is named SCRD. The SCRD method is intended for situations in which the data to be classified contain spatially correlated information (noise). This is common in images covered with clouds or smoke, in pollution data, and in other situations.
The SCRD method showed better accuracy than other common methods in an artificial experiment [3] and in a real-life situation [4]. A closer analysis of the results of those experiments revealed some important cases. During classification, some parts of the image were misclassified because they contained no training samples of the appropriate class and were surrounded by training sample points of the other class. In this letter, this situation is examined by testing different neighborhood schemes, among them schemes that always include at least one training sample element from each class.
The definition of the SCRD method and the descriptions of the neighborhood schemes are presented in the first section. The second section describes the experiment. The third section presents the numerical and visual results of the experiment.

Method description
The method used in this letter is a spatial classification rule based on the Plug-in Bayes Discriminant Function (PBDF), with the posterior distribution of the class label depending on the distances between unclassified locations and training sample locations; it is called SCRD [3]. As is common in image analysis, the features are modeled by a stationary Gaussian Random Field (GRF) $\{Z(s): s \in D \subset \mathbb{R}^2\}$, and the class labels are modeled by a discrete Markov Random Field (MRF). Here $s$ is the location of a pixel.
The marginal model of the observation $Z(s)$ in class $\Omega_l$ is $Z(s) = \mu_l + \varepsilon(s)$, where $\mu_l$ is the class mean and the error term $\varepsilon(s)$ is generated by a zero-mean stationary GRF $\{\varepsilon(s): s \in D\}$ with covariance function defined by the model $\operatorname{cov}\{\varepsilon(s), \varepsilon(u)\} = \sigma^2 r(s - u)$ for all $s, u \in D$, where $\sigma^2$ is the variance, acting as a scale parameter. In this letter the exponential covariance function $C(h) = \sigma^2 \exp\{-|h|/\alpha\}$ is used. The spatial correlation function is $r(s - u) = r(h) = \exp\{-|h|/\alpha\}$, where $\alpha$ is the correlation range parameter, which shows how far the correlation persists, and $h$ is the Euclidean distance between the locations $s$ and $u$.
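As a small illustration of the exponential model above, the following sketch evaluates the correlation and covariance functions and builds a covariance matrix for a few locations. The function names and the example values of sigma2 and alpha are chosen here for illustration only and do not come from the experiment.

```python
import numpy as np

def exp_correlation(h, alpha):
    """Spatial correlation r(h) = exp(-|h| / alpha)."""
    return np.exp(-np.abs(h) / alpha)

def exp_covariance(h, sigma2, alpha):
    """Exponential covariance C(h) = sigma^2 * exp(-|h| / alpha)."""
    return sigma2 * exp_correlation(h, alpha)

def covariance_matrix(locs, sigma2, alpha):
    """Covariance matrix for an (n, 2) array of locations in the plane."""
    diff = locs[:, None, :] - locs[None, :, :]
    h = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
    return exp_covariance(h, sigma2, alpha)

# Illustrative values only: three locations, sigma^2 = 2, alpha = 5.
locs = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 10.0]])
C = covariance_matrix(locs, sigma2=2.0, alpha=5.0)
```

The diagonal of C equals sigma2, and the entry for two locations at distance 5 equals sigma2 * exp(-1), so a larger alpha keeps distant pixels more strongly correlated.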
The PBDF for the classification problem and the posterior distribution of $Y(s_0)$ on which the SCRD classification rule is based are specified in [3]. In the posterior distribution, $\delta(\cdot)$ is the 0-1 indicator function and $d(\cdot, \cdot)$ denotes the Euclidean distance between locations. For the case of two classes, $\pi_2 = 1 - \pi_1$. The index set is $I_0 = \{i: s_i \in N_0,\ i = 1, \dots, n\}$, and $n_1$ is the number of locations from $N_0$ with label equal to 1. Here $N_0$ is the set of training locations neighboring $s_0$, selected by a neighborhood scheme.
We test eight neighborhood schemes, which are based on four different methods of selecting neighbors from the training sample. The differences between the neighborhood schemes are presented in Fig. 1.
The NN(n) method selects the n nearest neighbors from the training sample. With this method, the number of training sample elements selected from each class can vary. It can happen that all selected neighbors are from the same class, in which case the classified pixel is also assigned to this class.
The $NN_C(n, m)$ method selects the n nearest neighbors from the training sample, as the NN(n) method does. If $n_i$, the number of selected neighboring points from the i-th class, is smaller than m, then an additional $m - n_i$ neighbors are selected from the training sample of the i-th class; this is done for all classes. In this case there are always at least m elements from each class, which means that information from all classes is used for every pixel to be classified.
The $NN_{R1}(rad, n)$ method selects those neighbors from the training sample whose Euclidean distance to the pixel to be classified is smaller than or equal to the radius rad. It can happen that no training sample point is selected. Therefore, if the number of selected neighbors is smaller than n, the radius rad is increased until at least n neighbors are selected. As with the NN(n) method, all selected neighbors can be from the same class, so another modification is needed.
The $NN_{R2}(rad, n, m)$ method selects neighbors from the training sample in the same way as the $NN_{R1}(rad, n)$ method. Once at least n neighbors are selected, the radius rad is increased further until at least m training sample elements are selected from each class.
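The two nearest-neighbor selections described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names nn and nn_c are hypothetical, and the sketch assumes that the additional neighbors of an under-represented class are its nearest not-yet-selected training points, which the text does not state explicitly.

```python
import numpy as np

def nn(train_xy, s0, n):
    """NN(n): indices of the n training points nearest to location s0."""
    d = np.linalg.norm(train_xy - s0, axis=1)
    return np.argsort(d, kind="stable")[:n]

def nn_c(train_xy, train_labels, s0, n, m):
    """NN_C(n, m): NN(n), then top up every class that has fewer than m points."""
    d = np.linalg.norm(train_xy - s0, axis=1)
    order = np.argsort(d, kind="stable")
    selected = list(order[:n])
    for c in np.unique(train_labels):
        n_c = sum(train_labels[i] == c for i in selected)
        if n_c < m:
            # Assumption: take the nearest remaining points of class c.
            extra = [i for i in order[n:] if train_labels[i] == c][: m - n_c]
            selected.extend(extra)
    return np.array(selected)

# Illustrative data: three points of class 1 near s0, one distant point of class 2.
train_xy = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
train_labels = np.array([1, 1, 1, 2])
s0 = np.array([0.0, 0.0])

plain = nn(train_xy, s0, n=2)
with_all_classes = nn_c(train_xy, train_labels, s0, n=2, m=1)
```

In this example NN(2) picks only class 1 points, while the class-constrained variant also pulls in the distant class 2 point, so both classes contribute to the decision at s0.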
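The two radius-based selections can be sketched in the same spirit. The function names nn_r1 and nn_r2 and the fixed radius increment step are assumptions made for this illustration; the text does not say by how much the radius is increased. The sketch also caps the required counts at the number of available training points so that the loops always terminate.

```python
import numpy as np

def nn_r1(train_xy, s0, rad, n, step=1.0):
    """NN_R1(rad, n): points within rad of s0; grow rad until at least n are found."""
    d = np.linalg.norm(train_xy - s0, axis=1)
    n = min(n, len(d))  # cannot select more points than exist
    while np.count_nonzero(d <= rad) < n:
        rad += step  # assumed fixed increment
    return np.flatnonzero(d <= rad), rad

def nn_r2(train_xy, train_labels, s0, rad, n, m, step=1.0):
    """NN_R2(rad, n, m): as NN_R1, then grow rad until every class has m points."""
    d = np.linalg.norm(train_xy - s0, axis=1)
    _, rad = nn_r1(train_xy, s0, rad, n, step)
    classes, counts = np.unique(train_labels, return_counts=True)
    need = np.minimum(counts, m)  # a class cannot supply more points than it has

    def enough(r):
        inside = d <= r
        return all(np.count_nonzero(inside & (train_labels == c)) >= k
                   for c, k in zip(classes, need))

    while not enough(rad):
        rad += step
    return np.flatnonzero(d <= rad), rad

# Illustrative data: class 1 points close to s0, one class 2 point far away.
train_xy = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [10.0, 0.0]])
train_labels = np.array([1, 1, 1, 2])
s0 = np.array([0.0, 0.0])

idx1, rad1 = nn_r1(train_xy, s0, rad=2.5, n=2)
idx2, rad2 = nn_r2(train_xy, train_labels, s0, rad=2.5, n=2, m=1)
```

Because the radius grows until the class 2 point is reached, the second selection also collects every class 1 point inside the enlarged circle, which is one way the class-constrained radius scheme can end up with many more neighbors than requested.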

Description of experiment
To determine the influence of the neighborhood schemes, a large experiment was performed. First, 100 different initial black-and-white images of 200 × 200 px were created. White corresponds to the first class and black to the second class. Training sample points were taken from the initial images by randomly selecting about 0.8% of the points. Then two-dimensional GRF images were generated using the exponential covariance function with the correlation range parameter α varying from 5 to 60 in steps of 5, so 12 different GRF images were generated for each initial image. These GRF images were multiplied by 3 and added to the initial images, and the resulting images were normalized. In this way, 1200 different images were generated for classification. The preparation scheme of the experiment is presented in Fig. 2.

Fig. 2.
Preparation of the experiment. The 100 initial images were processed according to this scheme, and 1200 different images were generated for classification.
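The preparation steps can be sketched on a small grid as follows. This is an illustration under several assumptions, not the code used in the experiment: the GRF is simulated by Cholesky factorization of the exponential covariance matrix, which is only feasible for small images (a 200 × 200 image would need a different simulation method); the classes are encoded as pixel values 0 and 1; and "normalized" is taken to mean min-max scaling to [0, 1]. All function names are hypothetical.

```python
import numpy as np

def simulate_grf(side, alpha, rng):
    """Zero-mean, unit-variance GRF with exponential covariance on a side x side grid."""
    ys, xs = np.mgrid[0:side, 0:side]
    locs = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
    diff = locs[:, None, :] - locs[None, :, :]
    h = np.sqrt((diff ** 2).sum(axis=-1))
    cov = np.exp(-h / alpha)
    # A small jitter keeps the factorization numerically stable.
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(side * side))
    return (chol @ rng.standard_normal(side * side)).reshape(side, side)

def prepare_image(initial, alpha, rng, noise_scale=3.0):
    """Add the GRF multiplied by 3 to the initial image, then min-max normalize (assumed)."""
    noisy = initial + noise_scale * simulate_grf(initial.shape[0], alpha, rng)
    return (noisy - noisy.min()) / (noisy.max() - noisy.min())

def sample_training(initial, fraction, rng):
    """Randomly pick about `fraction` of the pixels as training points with known labels."""
    n_train = max(1, round(fraction * initial.size))
    flat = rng.choice(initial.size, size=n_train, replace=False)
    rows, cols = np.unravel_index(flat, initial.shape)
    return np.column_stack([rows, cols]), initial[rows, cols]

rng = np.random.default_rng(0)
side = 24  # small stand-in for the 200 x 200 images
initial = np.zeros((side, side))
initial[:, side // 2:] = 1.0  # two-class image: left half 0, right half 1

train_xy, train_labels = sample_training(initial, 0.008, rng)
images = {alpha: prepare_image(initial, alpha, rng) for alpha in range(5, 65, 5)}
```

Each initial image yields 12 noisy images, one per value of α from 5 to 60, which matches the 100 × 12 = 1200 images of the experiment.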

Results
Table 1 shows the average classification accuracy of the SCRD method for all neighborhood schemes used in this experiment. According to these results, all neighborhood schemes were almost equally good. The influence of the correlation range on the classification accuracy is presented in Fig. 3. As seen in the graph, all neighborhood schemes perform better as the correlation range parameter α increases, and the accuracy almost stops increasing at α = 35. Table 1 and Fig. 3 show that the schemes NN(8), $NN_{R1}(30, 4)$ and $NN_{R2}(30, 4, 1)$ perform best, but the difference in accuracy is very small.
Visual results for the most interesting and important images are presented in Fig. 4. They show that the schemes that are best by average classification accuracy in some cases do not cope with harder situations in which there are no training samples in specific places. The image in Fig. 4 showing the number "87" is the best example. Only the methods that use at least one training sample element from each class could correctly classify the bottom of the digit "8".

Conclusions
Neighborhood schemes that require some training samples from every class, $NN_C(n, m)$ and $NN_{R2}(rad, n, m)$, deal better with harder situations in which training samples are missing in some important place. This may become more important when a smaller training sample is given. Such schemes can help when some class areas are very thin and training samples cannot be obtained from these thin areas.
These neighborhood schemes can be tested in real situations, especially in classifying roads or rivers in remotely sensed images.
During classification, the $NN_C(n, m)$ and $NN_{R2}(rad, n, m)$ neighborhood schemes produce noisier results than the other methods, but this noise can be removed using common morphological image processing methods.
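As an example of such post-processing, the sketch below implements binary opening and closing with a 3 × 3 square structuring element in plain NumPy. The choice of operation and structuring element is an assumption made for illustration; the text does not specify which morphological methods are meant.

```python
import numpy as np

def _neighborhoods(img, pad_value):
    """Stack the nine 3 x 3 shifted copies of a boolean image."""
    p = np.pad(img, 1, constant_values=pad_value)
    rows, cols = img.shape
    return np.stack([p[i:i + rows, j:j + cols] for i in range(3) for j in range(3)])

def erode(img):
    """A pixel stays True only if its whole 3 x 3 neighborhood is True."""
    return _neighborhoods(img, True).all(axis=0)

def dilate(img):
    """A pixel becomes True if any pixel in its 3 x 3 neighborhood is True."""
    return _neighborhoods(img, False).any(axis=0)

def opening(img):
    """Erosion followed by dilation: removes small isolated True specks."""
    return dilate(erode(img))

def closing(img):
    """Dilation followed by erosion: fills small False holes."""
    return erode(dilate(img))

# Illustrative classification map: one solid region plus a single noisy pixel.
clean = np.zeros((9, 9), dtype=bool)
clean[2:7, 2:7] = True
noisy = clean.copy()
noisy[0, 8] = True

cleaned = opening(noisy)
```

Opening removes the isolated pixel while leaving the solid region unchanged. The same operation would also erase class areas thinner than the structuring element, such as narrow roads or rivers, so the element size has to be chosen with the expected object width in mind.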