SAR image segmentation using MSER and improved spectral clustering

A novel approach is presented for synthetic aperture radar (SAR) image segmentation. By combining the advantages of the maximally stable extremal regions (MSER) algorithm and the spectral clustering (SC) method, the proposed approach provides effective and robust segmentation. First, the input image is transformed from a pixel-based to a region-based model by using the MSER algorithm, after which the image is composed of disjoint regions. Then the regions are treated as nodes in the image plane, and a graph structure is applied to represent them. Finally, the improved SC is used to perform globally optimal clustering, from which the segmentation result is generated. To avoid incorrect partitioning when each region is considered as a single graph node, we assign different numbers of nodes to the regions according to the area ratios among the regions. In addition, K-harmonic means instead of K-means is applied in the improved SC procedure in order to raise its stability and performance. Experimental results show that the proposed approach is effective for SAR image segmentation and has a low computational cost.


Introduction
Image segmentation is the process of dividing an image into different regions based on certain attributes such as intensity, texture, and color. This process is fundamental in computer vision, and many applications, such as object recognition, image compression, image retrieval, and visual summary, can benefit from it. It is also challenging, because segmentation results are often unsatisfactory and the computation is costly. Synthetic aperture radar (SAR) image segmentation plays a special role in automatic target recognition and has attracted increasing attention recently.
Various approaches to SAR image segmentation have been proposed, and recent work includes a variety of techniques, for example, clustering algorithms [1], threshold methods [2,3], morphological methods [4], graph-based approaches [5,6], and statistical model-based methods [7,8]. The graph-based approaches have become popular over the last decade. In such approaches, an image is seen as a weighted graph in which each node corresponds to an image pixel or a region, and the weight of each edge connecting two pixels or two regions represents the likelihood that they belong to the same segment. So far, several graph-based methods have been proposed for image segmentation. For example, Shi and Malik [9] proposed a general image segmentation approach based on the normalized cut (Ncut), and Ng et al. [10] came up with a simple and effective multi-way spectral clustering (SC) method named NJW. The improved SC in this article is mainly based on the NJW method.
The maximally stable extremal region (MSER) algorithm is a region detector originally used in wide-baseline stereo matching [11]. The MSERs are connected components of an image where local intensity is stable over a large range of thresholds. MSERs have properties that account for their superior performance as a stable local detector. First, the set of MSERs is closed under continuous geometric transformations. Second, MSERs are invariant to affine intensity changes. Finally, MSERs are detected at different scales. The performance evaluation by Mikolajczyk and Schmid [12] showed that the MSER detector performed better on a wide range of test sequences and had a much lower computational complexity than other local detectors. MSERs have been used successfully in applications such as automatic 3D reconstruction from a set of images, object and scene retrieval in videos, and object recognition [11,13,14]. In this article, we demonstrate how MSERs can be used for SAR image segmentation.
The SC method is based on graph theory and is insensitive to the structure of the data. Many traditional clustering problems have been solved by it. Recently, the SC method has been applied successfully in many fields, such as information searching [15], bioinformatics [16], and image segmentation [5,17-20]. When the SC method is used for image segmentation, the pairwise similarities of all pixels in the image need to be computed, and the resulting computational cost is huge, which restricts the method's application. To solve this problem, Fowlkes et al. [21] proposed an approach based on a classical method for the integral eigenvalue problem known as the Nystrom method. The approach works by first solving the grouping problem for a small random subset of pixels and then extrapolating this solution to the full set of pixels in the image. The method in [5] first used the watershed algorithm to over-segment the input image and then used the SC method for clustering. This method performed well in some cases, but the watershed algorithm was sensitive to noise and close texture and produced a large number of small but quasi-homogeneous regions, which might degrade the subsequent region grouping. The method in [22] first used the mean shift algorithm to segment the input color image and then applied Ncut to perform the final segmentation. The SC method makes good use of spectral graph theory, and its result is very close to the global optimum. It also has advantages over another graph-based method named graph cuts. Boykov and Kolmogorov [23] borrowed algorithms for network flows to search for the minimum cut in the graph-cuts problem. Their approach performed well on two-class segmentation but did not work on multi-class segmentation. Although they had proposed the alpha-expansion and alpha-beta-swap methods [24] for multi-class problems based on the graph-cuts framework, these two methods could approach a good result but were inefficient when the number of classes was large, which led to much iterative computation.
In this article, a novel SAR image segmentation algorithm is proposed based on MSER and improved SC. The Frost filtering algorithm and the morphological closing algorithm are applied to remove noise and enhance the input image, which makes it more suitable for later processing. After the MSER procedure, the input image is composed of multiple disjoint regions; the regions are then treated as nodes and a graph structure is applied to represent them. The number of nodes representing one region depends on the area ratio of the region to the smallest region. In this way, more information about the image after the MSER procedure is preserved than when only one node is assigned to each region, and the segmentation performance can be enhanced. In addition, K-harmonic means (KHM) [25-27] is insensitive to the initialization of the centers and performs better than K-means, so KHM instead of K-means is applied in the improved SC procedure, which helps to enhance the stability and performance of the method.
This article is organized as follows. Section 2 introduces principles of MSER algorithm, SC method, and KHM algorithm. Section 3 describes the proposed approach for the effective SAR image segmentation. Section 4 shows the experimental results and Section 5 concludes the article.

MSER algorithm
We begin with a lattice grid, on which the pixels are functions defined on the grid points. We reinsert the pixels in intensity order, i.e., first we place all black (intensity = 0) pixels at their correct locations, then we place all pixels with an intensity value of 1, and so on until the complete image is restored. This process produces regions of pixels that grow and connect to other regions as more and more pixels of higher intensity are placed. The rate of growth as a function of intensity, q(i), is measured for all these regions, and a region is detected as an MSER when the growth rate has a local minimum. The sensitivity of the detection is controlled with a parameter Δ [28].
The MSER algorithm can be divided into four major parts:
(1) Preprocessing: Pixels are sorted in the intensity order and the number of pixels for each intensity is determined.
(2) Clustering: A representation of all regions at each intensity level is created.
(3) MSER detection: The sizes, |Q|, of all regions are tracked and the growth rates, q, are monitored for local minima.
(4) Display result: All pixels belonging to a detected MSER are identified and presented as an output.
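The detection step (3) can be illustrated with a small sketch. It assumes the growth-rate definition commonly used in the MSER literature, q(i) = (|Q_{i+Δ}| - |Q_{i-Δ}|) / |Q_i|, which this text does not spell out; the function names and the toy size sequence are hypothetical and serve only to show how a local minimum of q marks a stable region.

```python
def growth_rates(sizes, delta):
    """Growth rate q(i) for one nested sequence of regions.

    sizes[i] is the region size |Q_i| at intensity threshold i.
    Assumed definition: q(i) = (|Q_{i+delta}| - |Q_{i-delta}|) / |Q_i|.
    Thresholds closer than delta to either end have no defined rate.
    """
    return {
        i: (sizes[i + delta] - sizes[i - delta]) / sizes[i]
        for i in range(delta, len(sizes) - delta)
    }


def stable_levels(sizes, delta):
    """Thresholds at which q(i) has a strict local minimum."""
    q = growth_rates(sizes, delta)
    levels = sorted(q)  # consecutive thresholds
    return [i for i in levels[1:-1] if q[i] < q[i - 1] and q[i] < q[i + 1]]


# Toy sequence: fast growth, a plateau, then a merge with fast growth again.
sizes = [10, 30, 60, 100, 104, 106, 107, 108, 110, 150, 220, 300]
print(stable_levels(sizes, delta=2))  # -> [6]
```

A larger delta averages the growth over a wider intensity range, so fewer and more pronounced plateaus are reported.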
The standard MSER algorithm makes use of a union-find data structure and takes quasi-linear time in the number of pixels. Nister and Stewenius [29] proposed a new algorithm for computing MSERs. This algorithm is based on a different computational ordering of the pixels, suggested by an immersion analogy different from the one corresponding to the standard connected-component algorithm. With the new computational ordering, the pixels considered or visited at any point during the computation form a single connected component of pixels in the image, resembling a flood-fill that adapts to the grey-level landscape. The computation only needs a priority queue of candidate pixels, a single-bit image masking the visited pixels, and information for as many components as there are grey levels in the image. This is substantially more compact in practice than the standard algorithm, in which a large number of connected components must be considered in parallel. The new algorithm provides exactly identical results in true worst-case linear time. Moreover, it uses significantly less memory and has better cache locality, resulting in faster execution. For these reasons, the linear-time MSER algorithm is used in this article.

SC method
An image can be seen as a weighted undirected graph G(V, E, W), with nodes V representing pixels and edges E whose weights W capture the pairwise similarities between pixels. In this article, the nodes V represent the regions produced by the MSER procedure, and the weights of the edges E represent the pairwise similarities between the regions. The multi-way SC method NJW is used in this article. It has been shown that using more eigenvectors and directly computing a multi-way partitioning is better than recursively computing a two-way partitioning [30].
Let N be the number of nodes and $d(i, j)$ ($i, j = 1, 2, \ldots, N$) be the feature distance between nodes $V_i$ and $V_j$. The elements of the similarity matrix $A \in \mathbb{R}^{N \times N}$ are defined by
$$A(i, j) = \exp\left(-\frac{d^2(i, j)}{2 s^2}\right), \quad i \neq j,$$
where s is the scaling parameter that controls how rapidly the similarity $A(i, j)$ falls off with $d(i, j)$. The similarity matrix is the key element of the SC method, which is based on the eigenvectors of this matrix [31]. The eigenvectors induce an embedding of the nodes in a low-dimensional subspace, in which a simple central clustering method can then be used to obtain the final partition.
Let $X = \{x_1, x_2, \ldots, x_h\}$ be the set of points that we want to cluster into K subsets. The specific procedure is as follows.
(1) Form the similarity matrix $A \in \mathbb{R}^{h \times h}$ defined by $A(i, j) = \exp(-d^2(i, j)/2s^2)$ for $i \neq j$ and $A(i, i) = 0$.
(2) Define D to be the diagonal matrix whose $(i, i)$ element is the sum of the i-th row of A, and construct the Laplacian matrix $L = D^{-1/2} A D^{-1/2}$.
(3) Compute the K largest eigenvalues of L and normalize the corresponding eigenvectors $V_i$, $i = 1, 2, \ldots, K$; then form the matrix $S = [V_1, V_2, \ldots, V_K] \in \mathbb{R}^{h \times K}$, where $V_i$ is the normalized eigenvector associated with the i-th largest eigenvalue.
(4) Form the matrix Y from S by renormalizing each row of S to have unit length, such that $Y_{ij} = S_{ij} / \left(\sum_{j} S_{ij}^2\right)^{1/2}$. It has been shown that the normalized SC performs better [31].
(5) Treat each row of Y as a point in $\mathbb{R}^K$ and cluster the rows into K clusters via KHM.
(6) Finally, assign the original point $x_i$ to cluster j if and only if row i of the matrix Y was assigned to cluster j.
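Steps (1) to (4) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the diagonal of A is set to zero as in the NJW method [10], the eigenvectors are obtained with a simple orthogonal (subspace) iteration on L + I as a stand-in for a proper eigensolver, and all function names and the toy data are hypothetical. The clustering of the rows of Y (step (5)) is left out.

```python
import math
import random


def gaussian_affinity(points, s):
    """Step (1): A(i, j) = exp(-d^2 / (2 s^2)) for i != j; A(i, i) = 0 (assumed)."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                A[i][j] = math.exp(-d2 / (2.0 * s * s))
    return A


def normalized_matrix(A):
    """Step (2): L = D^{-1/2} A D^{-1/2}, D being the diagonal of row sums."""
    n = len(A)
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in A]
    return [[inv_sqrt[i] * A[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def top_eigenvectors(L, K, iters=300, seed=0):
    """Step (3): orthonormal basis of the top-K eigenspace of L.

    Subspace iteration on L + I; the shift keeps all eigenvalues
    non-negative so that the largest eigenvalues of L dominate.
    """
    n = len(L)
    rng = random.Random(seed)
    V = [[rng.random() for _ in range(n)] for _ in range(K)]
    for _ in range(iters):
        W = [[v[i] + sum(L[i][j] * v[j] for j in range(n)) for i in range(n)] for v in V]
        V = []
        for w in W:  # Gram-Schmidt orthonormalization
            for u in V:
                dot = sum(a * b for a, b in zip(w, u))
                w = [a - dot * b for a, b in zip(w, u)]
            norm = math.sqrt(sum(a * a for a in w))
            V.append([a / norm for a in w])
    return V


def spectral_embedding(points, K, s):
    """Steps (1)-(4): returns Y, whose rows have unit length."""
    V = top_eigenvectors(normalized_matrix(gaussian_affinity(points, s)), K)
    S = [[V[k][i] for k in range(K)] for i in range(len(points))]
    return [[x / math.sqrt(sum(y * y for y in row)) for x in row] for row in S]


points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
Y = spectral_embedding(points, K=2, s=0.5)
```

For two well-separated groups such as the toy data above, rows of Y belonging to the same group nearly coincide, while rows from different groups are nearly orthogonal, which is what makes the final central clustering step easy.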

KHM algorithm
KHM is a center-based clustering algorithm that uses the harmonic averages of the distances from each data point to the centers as the components of its performance function. The KHM algorithm is essentially insensitive to the initialization of the centers, and in certain cases it significantly improves the quality of the clustering result compared to the K-means algorithm. Let $X = \{X_1, X_2, \ldots, X_n\}$ be the n given data points and k be the number of centers. The KHM objective function is the sum, over all data points $X_i$, of the harmonic average of the distances $d_{i,l}$ ($l = 1, \ldots, k$), where $d_{i,l}$ is the distance between $X_i$ and the clustering center $C_l$. At each iteration, the new positions of the centers are recalculated from the data points and the current distances $d_{i,l}$.
The recursion continues until the objective value stabilizes.
The KHM algorithm addresses the intrinsic problem of K-means, namely its sensitivity to the initialization of the centers, by replacing the minimum distance from a data point to the centers, used in K-means, with the harmonic average of the distances from the data point to all centers. During the iterations, the dynamic weighting of the data helps the algorithm escape certain local optima and converge to a better one, which contributes to the insensitivity of the KHM algorithm to the initialization of the centers.
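The KHM recursion can be sketched as follows. The sketch assumes the generalized form found in the KHM literature, with objective sum_i k / sum_l d_{i,l}^(-p) and a center update obtained by setting the gradient of that objective to zero; the exponent p, its default value, the stopping tolerance, and the function names are assumptions and not values taken from this article.

```python
import math


def khm_objective(points, centers, p, eps=1e-8):
    """Assumed objective: sum over points of k / sum_l d_il^(-p)."""
    k = len(centers)
    total = 0.0
    for x in points:
        d = [max(math.dist(x, c), eps) for c in centers]  # floor avoids 1/0
        total += k / sum(di ** -p for di in d)
    return total


def khm(points, centers, p=3.5, max_iter=200, tol=1e-12, eps=1e-8):
    """Iterate the KHM center update until the objective value stabilizes."""
    k, dim = len(centers), len(centers[0])
    centers = [tuple(c) for c in centers]
    previous = khm_objective(points, centers, p, eps)
    for _ in range(max_iter):
        num = [[0.0] * dim for _ in range(k)]
        den = [0.0] * k
        for x in points:
            d = [max(math.dist(x, c), eps) for c in centers]
            inv_sum = sum(di ** -p for di in d)
            for l in range(k):
                # dynamic weight of point x for center l
                w = d[l] ** -(p + 2) / inv_sum ** 2
                den[l] += w
                for a in range(dim):
                    num[l][a] += w * x[a]
        # each new center is a weighted mean of all data points
        centers = [tuple(v / den[l] for v in num[l]) for l in range(k)]
        current = khm_objective(points, centers, p, eps)
        if abs(previous - current) <= tol * max(previous, 1.0):
            break
        previous = current
    return centers


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers = khm(points, [(2.0, 2.0), (9.0, 9.0)])
```

Unlike K-means, every point contributes to every center, with a weight that depends on its distances to all centers; this is the dynamic weighting referred to above.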

Description of the algorithm scheme
The outline of the proposed method can be characterized as follows.
First, the Frost filtering algorithm and the morphological closing algorithm are used to remove noise and enhance the input image. The main effect of the Frost filtering algorithm is to remove speckle, and the main effect of the morphological closing algorithm is to eliminate details that are smaller than the structuring element, connect adjacent regions, and smooth boundaries. The image after this step has fewer details and is smoother than the original image, and pixels belonging to the same region are connected more closely with each other, so the image appears more coherent. Since we mainly aim to segment the image correctly into several large classes rather than to preserve details, the more coherent image is better suited for later processing.
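The effect of the closing operation can be illustrated with a minimal sketch of grayscale closing (dilation followed by erosion) with a square structuring element. The border handling (windows clipped at the image edges) and the function names are assumptions made for this illustration; the Frost filter is not reproduced here.

```python
def _window_filter(image, size, pick):
    """Apply pick (max or min) over a size x size window, clipped at the borders."""
    rows, cols, r = len(image), len(image[0]), size // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            out[y][x] = pick(
                image[yy][xx]
                for yy in range(max(0, y - r), min(rows, y + r + 1))
                for xx in range(max(0, x - r), min(cols, x + r + 1))
            )
    return out


def closing(image, size=3):
    """Grayscale closing: dilation (local max) followed by erosion (local min)."""
    return _window_filter(_window_filter(image, size, max), size, min)


# A dark one-pixel detail inside a bright area is smaller than the 3 x 3 element.
image = [[9] * 5 for _ in range(5)]
image[2][2] = 0
closed = closing(image, size=3)
```

In this toy case the isolated dark pixel is filled in, which corresponds to the removal of details smaller than the structuring element described above.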
Second, the input image is segmented into multiple disjoint regions using MSER algorithm. The MSER algorithm often returns a lot of MSERs and most pixels in the image can be assigned to the MSERs, but there still exist some regions which are not considered to be stable regions. Thus, the input image is composed of multiple disjoint regions including the MSERs and the regions which are not MSERs. The average intensity of every region is computed and assigned to every pixel in it.
Third, all the regions produced by the MSER procedure are treated as nodes and a graph structure is constructed to represent them. In this article, an improved strategy is developed for the graph construction. Instead of considering each region as only one graph node, we assign different numbers of nodes to the regions according to the area ratio between each region and the smallest region. Let $m_1$ be the area of the smallest region $r_1$ and $m_2$ be the area of another region $r_2$; we assign one node to represent $r_1$ and n nodes to represent $r_2$, where $n = \sqrt{m_2/m_1}$, rounded to an integer. The advantage of determining the numbers of nodes from the area ratio is that it takes account of the area differences among the regions and keeps more information about the image after the MSER procedure, so that a better segmentation result can be obtained.
Once the graph has been constructed, the similarity matrix A can be computed. The weight $A(i, j)$ between nodes i and j that represent two different regions is given by (4), where $g(i)$ is the intensity value of region i and $\|\cdot\|_2$ denotes the Euclidean distance. s is a scaling factor that determines the sensitivity of $A(i, j)$ to the intensity difference between regions i and j. $\mathrm{dist}(i, j)$ denotes the spatial distance between regions i and j, defined as the minimal pixel distance between the two regions, and $\max(\mathrm{dist}(i, j))$ denotes the maximal spatial distance among all the regions. The smaller the spatial distance between two regions, the greater the possibility that the two regions are clustered into one class. h is an adjusting constant that determines the sensitivity of $A(i, j)$ to the spatial distance between regions i and j.
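The ingredients of the region weight can be sketched as follows. Since the exact expression of the weight is not reproduced in this text, the functional form used below, a product of an intensity term exp(-(g(i) - g(j))^2 / s^2) and a spatial term exp(-dist(i, j) / (h max(dist))), is purely an assumption made for illustration; only the quantities g, dist, max(dist), s, and h come from the description above. Intensities are assumed to be scaled to [0, 1], and all names are hypothetical.

```python
import math


def region_distance(pixels_a, pixels_b):
    """Minimal pixel-to-pixel Euclidean distance between two regions."""
    return min(math.dist(p, q) for p in pixels_a for q in pixels_b)


def region_affinity(intensity, pixels, s, h):
    """Pairwise region weights under an ASSUMED functional form.

    Assumed: A(i, j) = exp(-(g_i - g_j)^2 / s^2) * exp(-dist(i, j) / (h * max_dist)).
    The diagonal is left at 0.
    """
    n = len(intensity)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = region_distance(pixels[i], pixels[j])
    max_dist = max(max(row) for row in dist)  # maximal distance among all regions
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                similar = math.exp(-((intensity[i] - intensity[j]) ** 2) / s ** 2)
                near = math.exp(-dist[i][j] / (h * max_dist))
                A[i][j] = similar * near
    return A


# Three toy regions: two adjacent ones with similar intensity, one distant and bright.
intensity = [0.20, 0.25, 0.90]
pixels = [[(0, 0), (0, 1)], [(0, 2), (0, 3)], [(5, 5)]]
A = region_affinity(intensity, pixels, s=0.5, h=0.6)
```

Whatever the exact form, the intended behaviour is the one visible in the toy data: adjacent regions with similar intensity receive a large weight, and distant regions with different intensity receive a small one.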
Finally, once the similarity matrix has been computed, the SC method is applied to solve the region partitioning problem. In the fifth step of the original SC method, the K-means algorithm is used for clustering. Because the K-means algorithm is sensitive to the initialization of the centers, the clustering result is not stable. It has been shown that the KHM algorithm is much more stable and performs better than the K-means algorithm [26], so in this article the KHM algorithm is applied in this step instead of K-means to enhance the stability and performance of the SC method.
In the proposed method, the node number h depends on the number of regions produced by the MSER procedure rather than on the size N of the input image. Since h is always much smaller than N, the computational cost of the proposed method is reduced dramatically. In addition, partitioning a graph based on regions is more robust and less sensitive to noise than partitioning a graph based on pixels.

Implementation procedure
To illustrate the implementation of the proposed method, a natural scene image is used as an example, as depicted in Figure 1a. The image can be clustered into three classes: river, trees, and grass. The image size is 300 × 200. Figure 1b is the resultant image after the Frost filtering procedure, and Figure 1c is the resultant image after the morphological closing procedure. The windows of the Frost filter and the morphological closing are set to 3 × 3 and 7 × 7, respectively. In this step, noise is removed and the image becomes more coherent than the original one, which makes it more suitable for later processing. Figure 1d is the resultant image after the MSER procedure, in which connected pixels with the same color depict one region. The parameter Δ is set to 7. As a result, 30 regions are produced by the MSER procedure.

Improved weighted graph construction strategy
In this step, the regions produced by the MSER procedure are treated as nodes and a weighted graph is constructed. In this article, an improved strategy is developed for the graph construction. Instead of considering each region as only one graph node, we assign different numbers of nodes to the regions according to the area ratio between each region and the smallest region. Let $m_1$ be the area of the smallest region $r_1$ and $m_2$ be the area of another region $r_2$; we assign one node to represent $r_1$ and n nodes to represent $r_2$, where $n = \sqrt{m_2/m_1}$, rounded to an integer. For example, the area of the smallest region in Figure 1d is 42 and the area of one of the other regions is 2151; thus, we assign one node to represent the smallest region and $n = \sqrt{2151/42} \approx 7$ nodes to represent the other one.
The nodes representing the same region have the same feature value and the weights among them are 1. Every two nodes in the graph have one weighted edge. Figure 2 is a sketch map of the weighted graph structure of regions which are represented by two nodes, three nodes, and four nodes in the graph, respectively, and all the nodes have the weighted edges between each other.
The advantage of determining the numbers of nodes from the area ratio is that it takes account of the area differences among the regions and keeps more information about the image after the MSER procedure, so that a better segmentation result can be obtained. In addition, according to our experiments, the computational cost increases only slightly as the number of nodes increases.
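The node allocation can be sketched as follows. The sketch takes n as the square root of the area ratio rounded to the nearest integer, an assumption chosen because it reproduces the worked example above (areas 42 and 2151 giving 7 nodes); the function names are hypothetical.

```python
import math


def node_counts(areas):
    """Nodes per region: 1 for the smallest, round(sqrt(area ratio)) otherwise (assumed)."""
    smallest = min(areas)
    return [max(1, round(math.sqrt(a / smallest))) for a in areas]


def expand_nodes(features, counts):
    """Repeat each region's feature once per node and record the owning region."""
    node_features, node_region = [], []
    for region, (feature, count) in enumerate(zip(features, counts)):
        node_features.extend([feature] * count)
        node_region.extend([region] * count)
    return node_features, node_region


counts = node_counts([42, 2151])
print(counts)  # -> [1, 7]
node_features, node_region = expand_nodes([0.3, 0.8], counts)
```

The list node_region maps every node back to its region, so that the cluster label found for the nodes can be assigned to all pixels of the corresponding region after clustering.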

Similarity matrix computation
As the weighted graph has been constructed, similarity matrix can be computed. The elements of the similarity matrix are calculated according to (4).

Final segmentation using SC method
Once the similarity matrix has been computed, the SC method is applied to perform the final segmentation. In this procedure, the KHM algorithm is used instead of the K-means algorithm to improve the stability and performance of the segmentation. The regions produced by the MSER algorithm are clustered into several classes with the improved SC method. The final segmentation result of Figure 1a is depicted in Figure 1e. The segmentation result is satisfactory and the image is correctly clustered into three classes, namely, river, trees, and grass, which supports the validity of the proposed approach. Figure 1f is the segmentation result obtained when each region is considered as one graph node. The result in Figure 1f is not satisfactory, which illustrates the disadvantage of considering each region as only one graph node and shows that better segmentation results can be achieved with the proposed graph construction strategy.

Experiments
The proposed method has been applied to the segmentation of a set of SAR images of natural scenes and compared with the Nystrom and Ncut methods. In this section, the experimental results are presented, including the intermediate stages of the method. The size of the test images is 300 × 200. The windows of the Frost filtering algorithm and the morphological closing algorithm are set to 3 × 3 and 7 × 7, respectively. Since the morphological closing algorithm is employed after the Frost filtering algorithm, the window size of the Frost filter is set smaller than usual. The parameter of the MSER algorithm is set to Δ = 7. The adjusting parameters s and h should be set according to the image content to get the best segmentation result.
The test examples include five typical SAR images, which are shown in Figure 3a-e. In each group of results, the six images show, from left to right, the resultant image after the Frost filtering procedure, the resultant image after the morphological closing procedure, the resultant image after the MSER procedure, the final segmentation result of the proposed method, the segmentation result using the Nystrom method, and the segmentation result using the Ncut method implemented by Cour et al. [33]. When using the Ncut method in [33], we have to resize the images to 160 × 107; otherwise, the computer memory cannot afford the computational cost. To guarantee a fair comparison, the same pre-processing is applied before the Nystrom and Ncut methods.
Figure 4. The number of partitioning classes k is two in (a), three in (b-e), and five in (f). The adjusting parameters s and h are set in each group (from (a) to (f)) to s = 0.7, 0.6, 0.5, 0.4, 0.5, 0.5 and h = 0.6, 0.9, 0.4, 0.6, 0.8, 0.8.
It can be seen from all the experimental results shown in Figure 4 that the proposed method effectively segments the natural scenes into several meaningful regions and provides better performance than Nystrom and Ncut methods. For example, in the first row of Figure 4, the buildings are not separated from the background with Nystrom and Ncut methods, while the proposed method can separate the buildings from the background effectively.
Manual segmentation results of Figure 3a-e, which are taken as ground truths, are shown in Figure 5a-f. Figure 5e,f are two manual segmentation results for the same image, Figure 3e, with three and five partition classes, respectively. The misclassification rates of the different methods are computed and the results are shown in Table 1. The misclassification rate is calculated as MC_Rate = MC_PixelNum/ImageSize, in which MC_PixelNum denotes the number of pixels that are segmented into wrong classes and ImageSize denotes the size of the input image in pixels.
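The rate defined above can be computed with a few lines of code. The sketch assumes that the class labels of the segmentation result have already been matched to those of the ground truth, since cluster indices produced by a clustering method are arbitrary; the function name and the toy label images are hypothetical.

```python
def misclassification_rate(predicted, truth):
    """MC_Rate = MC_PixelNum / ImageSize for two label images of equal shape."""
    wrong = sum(
        p != t
        for row_p, row_t in zip(predicted, truth)
        for p, t in zip(row_p, row_t)
    )
    size = sum(len(row) for row in truth)
    return wrong / size


truth = [[0, 0, 1],
         [0, 1, 1]]
predicted = [[0, 0, 1],
             [1, 1, 1]]
print(misclassification_rate(predicted, truth))  # one wrong pixel out of six
```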
It can be seen from Table 1 that the misclassification rates of the proposed method are much lower than those of the Nystrom and Ncut methods, which demonstrates the good accuracy of the proposed method. The misclassification rates of the three methods tend to increase when the number of classes increases from three to five. However, both the misclassification rate of the proposed method and its increase remain lower than those of the other two methods, as depicted in Figure 4e,f and the last two rows of Table 1.
The computational cost of the proposed method is compared to that of the Nystrom and Ncut methods. A PC with a 2.67-GHz Core2 CPU and 2.0 GB of memory is used. The runtimes of the different methods are shown in Table 2.
It can be seen from Table 2 that the runtime of the proposed method is much lower than that of the Nystrom and Ncut methods. The runtime of the proposed method consists of three parts: the runtime of the Frost filtering and morphological closing algorithms, the runtime of the MSER algorithm, and the runtime of partitioning the regions produced by the MSER algorithm with the improved SC method. The total runtimes of the proposed method are between 0.6 and 0.9 s, as can be seen from the last three columns of Table 2. From the second and third columns of the same table, we can see that the runtimes of the Nystrom method are between 11 and 13.8 s, and the runtimes of the Ncut method, for which the input images have been resized to 160 × 107, are between 4 and 6.5 s. The reduction in computational cost achieved by the proposed approach is evident. In addition, it can be seen from the last two rows of Table 2 that the runtimes of the three methods increase when the number of classes increases from three to five, while the runtime of the proposed method remains much lower than that of the Nystrom and Ncut methods.

Table 1. Misclassification rates of the different methods
Image number | Nystrom method | Ncut method [33] | Proposed method

Conclusion
A novel approach to SAR image segmentation has been developed in this article. By combining the advantages of the MSER algorithm and the SC method, the proposed approach provides effective segmentation. The Frost filtering algorithm and the morphological closing algorithm make the input image more suitable for later processing. The MSER algorithm transforms the image from a pixel-based to a region-based model while preserving the discontinuity characteristics of the image. An improved graph construction strategy is developed, in which the number of nodes representing each region is assigned according to the area ratio between the region and the smallest region. The improved SC is applied to perform the final region partition, and a satisfactory segmentation performance is obtained. Experimental results show that the proposed method not only enhances the SAR image segmentation performance but also reduces the computational cost. How to choose the values of the adjusting constants s and h according to the properties of the input image remains an open question and needs further research.