Segmentation of High Resolution Worldview-2 Satellite Images

This paper presents a technique for segmenting high-resolution Worldview-2 multispectral (MS) satellite images. First, spectral features are extracted from the MS image using the Simple Ratio (SR), Normalized Difference Vegetation Index (NDVI), Soil Adjusted Vegetation Index (SAVI) and Modified Soil Adjusted Vegetation Index (MSAVI). Next, the MS image is segmented using the over segmented k-means algorithm with novel initialization (OSKNI). The proposed method outperforms the existing k-means algorithm in terms of user's accuracy (UA), producer's accuracy (PA) and overall segmentation accuracy (OVA).


Introduction
Image segmentation is the underlying process in the majority of image processing applications, such as remote sensing and computer vision. In remote sensing (RS), segmentation belongs to the region-based techniques, in which the image is partitioned into segments before the segments are classified (Banerjee et al., 2014; Paclíka et al., 2003). Segmenting RS images involves extracting texture or spectral features and then clustering the extracted features with a data clustering method. In remote sensing terms, segmentation is the process of outlining individual areas of homogeneous land cover, while classification is the subsequent step of assigning each outlined region to a particular land-cover class (Johnson and Xie, 2011). This paper proposes an unsupervised clustering method that segments RS images into homogeneous regions using spectral features (Kumar et al., 2018). The spectral features considered in this paper are SR (Birth and McVey, 1968), NDVI (Rouse et al., 1973), SAVI (Huete, 1998) and MSAVI (Qi et al., 1994).
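As an illustration of the four spectral features named above, the following Python sketch computes them from near-infrared (NIR) and red reflectance arrays. The formulas are the commonly cited ones and are an assumption here, since the paper does not state them. In particular, the MSAVI expression is the closed form often called MSAVI2, L = 0.5 is the usual default soil factor for SAVI, and the function name, the choice of bands and the small eps guard are hypothetical.

```python
import numpy as np

def spectral_indices(nir, red, L=0.5, eps=1e-12):
    """Return (SR, NDVI, SAVI, MSAVI) for reflectance arrays of equal shape.

    Assumes reflectance values in [0, 1]; eps only guards against division by zero.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)

    # Simple Ratio: NIR / Red
    sr = nir / (red + eps)
    # Normalized Difference Vegetation Index
    ndvi = (nir - red) / (nir + red + eps)
    # Soil Adjusted Vegetation Index with soil factor L
    savi = (1.0 + L) * (nir - red) / (nir + red + L)
    # Modified SAVI in its closed form (assumed variant, often called MSAVI2)
    root = np.sqrt((2.0 * nir + 1.0) ** 2 - 8.0 * (nir - red))
    msavi = (2.0 * nir + 1.0 - root) / 2.0
    return sr, ndvi, savi, msavi
```

The four returned arrays could be stacked per pixel, for example with np.stack, to form the feature vectors that the clustering stage operates on.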

Materials and Methods
Worldview-2 satellite images of Kalaburagi city, Karnataka, India (at geographical coordinates 17° 20′ 09″ N, 076° 50′ 15″ E) are used to test the algorithm. A Worldview-2 image contains 9 bands in total: the first is the panchromatic band, covering 450 nm to 850 nm, and the remaining 8 are multispectral bands that together cover 400 nm to 1040 nm. Three subset images are considered in this paper to test k-means and the proposed algorithm. Subset-1 consists of 3 clusters: cropland1 (non-cultivated), tree, and soil dug & field band, represented by aquamarine, green and light yellow respectively in the ground truth image. Subset-2 consists of 3 clusters: cropland1 (non-cultivated), cropland2 (cultivated), and soil dug & field band, represented by aquamarine, dark green and light yellow respectively in the ground truth image. Subset-3 consists of 3 clusters: cropland, trees, and soil dug & field band, represented by aquamarine, green and light yellow respectively.

Over Segmented k-means Algorithm with Novel Cluster Centre Initialization (OSKNI)
First, the data to be grouped into k clusters is over-segmented into O and P clusters using the k-means algorithm, where O = k + 1 and P = k + 2. The initial cluster centres for the O and P clusters are obtained using the following steps.
1. Let $A_1, A_2, A_3, \ldots, A_m$ be the set of data elements to be grouped into the O and P clusters, where m is the number of data points.
2. Run the k-means algorithm to produce the O and P clusters.
3. Find the initial cluster centres for the O and P clusters using the Kaufman approach, steps 4 through 10 (Kaufman and Rousseeuw, 2005).
4. Select the most centrally located sample as the first centre.
5. For every sample $s_i$ not yet selected, carry out steps 6 and 7.
6. For every sample $s_j$ not yet selected, find $c_{ji} = \max(D_j - d_{ji}, 0)$, where $d_{ji} = \|s_i - s_j\|$ and $D_j = \min_p d_{pj}$, with p ranging over the samples already chosen.
7. Calculate the gain of selecting $s_i$ using $\sum_j c_{ji}$.
8. Choose the new sample $s_i$ that maximizes $\sum_j c_{ji}$.
9. If O centres have been selected, stop; otherwise repeat steps 5 through 8.
10. After forming the clusters, assign each sample to the cluster represented by the nearest centre.
11. Apply the same procedure (steps 4 through 10) to find the initial cluster centres for the P clusters.
12. Run the iterative k-means algorithm on the O and P clusters.
13. Stop the k-means iterations when more than 99% of the pixels keep the same cluster assignment between the current and previous iterations.
14. Fuse the O and P over-segmented clusters using the procedure explained in the following subsection.
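The Kaufman seeding in steps 4 through 9 can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function name is hypothetical, "most centrally located" is assumed to mean the sample with the smallest total distance to all others, and the gain of a candidate is assumed to be the sum of the c_ji terms over the other samples, as in the standard description of the Kaufman approach.

```python
import numpy as np

def kaufman_init(X, n_centres):
    """Pick n_centres rows of X as initial centres (Kaufman-style seeding)."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances: d[j, i] = ||s_i - s_j||
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Step 4: the most centrally located sample
    # (assumed: smallest total distance to all other samples)
    selected = [int(np.argmin(d.sum(axis=1)))]

    while len(selected) < n_centres:  # step 9: stop once enough centres exist
        # D_j: distance from sample j to its nearest already selected centre
        D = d[:, selected].min(axis=1)
        # Step 6: c[j, i] = max(D_j - d_ji, 0)
        c = np.maximum(D[:, None] - d, 0.0)
        np.fill_diagonal(c, 0.0)      # a candidate does not count itself (j != i)
        # Step 7: gain of each candidate i is the sum over j of c[j, i]
        gain = c.sum(axis=0)
        gain[selected] = -np.inf      # never reselect an existing centre
        # Step 8: take the candidate with the largest gain
        selected.append(int(np.argmax(gain)))

    return X[selected]
```

The full distance matrix needs memory quadratic in the number of samples, so for a complete 256 x 256 subset the seeding would have to run on a subsample of the pixels or use a blockwise distance computation. The returned array could then be passed as the initial centres of a k-means run with O (or P) clusters.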

Fusion of O and P Clusters to Find Optimal Cluster Centre
The fusion determines O x P sets of occurrences, each made up of occurrences that the two over-segmentations do not separate from one another (the undistinguished occurrences). The k new initial cluster centres for grouping the data set into k partitions are the means of the largest sets of undistinguished occurrences. The smallest sets are therefore dropped, and only the k largest among the O x P sets are kept. In this way, the initial cluster centres for running iterative k-means on the data set are generated. The following equation describes the fusion process.

… (1)
Here O = k + 1, P = O + 1, and I is the set of all occurrences in the data. UO is the set of O clusters from the first over-segmentation, and VP is the set of P clusters from the second over-segmentation (Ursani et al., 2007).
Further, ||Z|| = O x P. The MS images under test are segmented using the OSKNI clustering method. Figure 1 shows the process of spectral feature extraction from the MS images and their segmentation. In total, the k-means algorithm is applied to the data set three times.
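The fusion step can be illustrated with a short Python sketch. It is an interpretation of the description above rather than the authors' code: each sample is assumed to carry one integer label from the O-cluster run and one from the P-cluster run (both starting at 0), the sets of undistinguished occurrences are taken to be the non-empty intersections of an O cluster with a P cluster, and the function name is hypothetical.

```python
import numpy as np

def fuse_oversegmentations(X, labels_o, labels_p, k):
    """Return k initial centres from the k largest O-by-P cluster intersections."""
    X = np.asarray(X, dtype=float)
    labels_o = np.asarray(labels_o)
    labels_p = np.asarray(labels_p)

    # Encode every (O cluster, P cluster) pair as a single integer id
    n_p = int(labels_p.max()) + 1
    pair_id = labels_o * n_p + labels_p

    # Size of each non-empty intersection
    ids, counts = np.unique(pair_id, return_counts=True)

    # Keep only the k most populated intersections, drop the smaller ones
    largest = ids[np.argsort(counts)[::-1][:k]]

    # Each new initial centre is the mean of the samples in one kept set
    return np.array([X[pair_id == g].mean(axis=0) for g in largest])
```

The returned centres would serve as the initialization of the third and final k-means run with k clusters.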

Results and Discussion
In this paper, we considered Worldview-2 images of Gulbarga district. Fifty subset images of size 256 x 256 are used to test the proposed algorithm, and the results for three of them are presented here. The segmentation results of k-means and the proposed method for subset-1, subset-2 and subset-3 (each containing 3 classes) are shown in Figures 2, 3 and 4 respectively. Figure 2(c) shows that the k-means algorithm is unable to differentiate between pixels belonging to the classes tree and field band & soil dug. Figure 2(d) shows the result of the proposed method; here the misclustering errors of k-means are reduced because the common members of the over-segmented O and P clusters are considered. Figure 3(c) shows that k-means misclusters a large number of pixels belonging to the class cropland-2 into the class field band & soil dug. The result of the proposed method in Figure 3(d) shows that this misclustering between cropland-2 and field band & soil dug is avoided. Figure 4(c) shows the k-means result for subset-3, in which samples belonging to trees are misclustered as field band & soil dug. Figure 4(d) shows the result of the proposed method for subset-3, in which the pixels of all three classes are segmented properly.
Quantitative analysis of the existing k-means and the proposed method is carried out using three parameters of the Kappa statistics, namely UA, PA and OVA. Figures 5 and 6 present the user's accuracy, producer's accuracy and overall accuracy for subsets 1, 2 and 3. The quantitative analysis graphs show that the k-means algorithm produces an average PA of 51.07%, 80.83% and 53.92%, and an average UA of 51.65%, 82.63% and 67.80%, for subset-1, 2 and 3 respectively. By contrast, the proposed method segments the spectral subset images with higher accuracy than the k-means algorithm.
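For reference, UA, PA and OVA are conventionally derived from a confusion matrix, as in the Python sketch below. The layout (rows as ground truth classes, columns as predicted classes) and the function name are assumptions, and the cluster labels are assumed to have been matched to the ground truth classes beforehand. The matrix in the usage note is made-up illustrative data, not a result from this paper.

```python
import numpy as np

def accuracy_metrics(cm):
    """Return (UA per class, PA per class, OVA) from a confusion matrix.

    Assumed layout: cm[r, c] counts pixels of ground truth class r
    that were assigned to class c.
    """
    cm = np.asarray(cm, dtype=float)
    correct = np.diag(cm)
    # Producer's accuracy: correct pixels / ground truth total of the class
    pa = correct / cm.sum(axis=1)
    # User's accuracy: correct pixels / total pixels assigned to the class
    ua = correct / cm.sum(axis=0)
    # Overall accuracy: all correct pixels / all pixels
    ova = correct.sum() / cm.sum()
    return ua, pa, ova
```

For example, calling accuracy_metrics([[40, 10], [20, 30]]) on this hypothetical two-class matrix gives a PA of 0.8 and 0.6, a UA of about 0.667 and 0.75, and an OVA of 0.7.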

Conclusion
The performance of the k-means algorithm depends on the initialization process. Unfortunately, the k-means algorithm and many of its variants proposed in the literature initialize the cluster centres by random sampling, which can reduce the segmentation accuracy. The proposed over segmented k-means algorithm with novel cluster centre initialization (OSKNI) uses the Kaufman initialization process to provide a good initialization for the over-segmented clusters O and P.
The iterative k-means process is then applied to the data set three times. The proposed method performs well compared with k-means in terms of PA, UA and OVA, as presented in the previous section. In future work, suitable post-processing can be applied to the results of the proposed method to improve the segmentation further, and different spectral features can also be considered. The proposed method can also be compared with other segmentation algorithms.