A MODIFIED FUZZY CLUSTERING APPROACH IN UNSUPERVISED CLASSIFICATION FOR DETECTING THE MIXED PIXELS OF SATELLITE IMAGES

A major problem with remote sensing images is the presence of mixed pixels in the data, which degrades the quality and accuracy of image classification and object recognition. To overcome the problem of mixed pixels in real satellite data, a modified fuzzy K-means clustering algorithm and a modified fuzzy C-means clustering algorithm are discussed. The algorithms are developed by modifying the membership function of the standard fuzzy K-means clustering algorithm (FKM) and the standard fuzzy C-means algorithm (FCM). The performance of the proposed algorithms is discussed and compared with that of the traditional FKM and FCM algorithms. Results on classification and segmentation of satellite images reveal that the proposed algorithms are robust and effective.


INTRODUCTION
Image segmentation is a significant and difficult problem; it is the first step in image analysis and in high-level image interpretation and understanding tasks such as robot vision, object recognition, and satellite imaging. It is the process of partitioning the original image into homogeneous groups, and it plays a vital role in image processing and classification.
Many segmentation tools have been introduced, and a detailed survey is available in Fu and Mui [1]. According to Pal and Pal [2], image segmentation approaches are classified into four types: edge detection, clustering, thresholding, and region extraction.
Clustering is a procedure for grouping objects or elements so that objects in the same group are substantially more similar to one another than to objects in other groups. Many clustering procedures, both fuzzy and hard, have been used; each has its own set of properties. The traditional hard clustering approach assigns each point in the data set to exactly one cluster. As a result, the classification results from this approach are crisp: every pixel of the image belongs to exactly one class.
Nevertheless, in several real-world circumstances, such as those for satellite images, issues such as the presence of mixed pixels in the data, limited spatial resolution, weak contrast, and intensity inhomogeneities make this hard (crisp) segmentation a challenging task.
Zadeh [3] introduced the concept of uncertainty (ambiguity) expressed by a membership function. Fuzzy clustering is a soft classification technique that has been extensively studied and effectively used in image classification and image segmentation. Among fuzzy clustering approaches, the FCM algorithm of Bezdek [4] is the most often employed in image classification, as it handles ambiguity robustly and can preserve more information than hard segmentation methods (Bezdek [5]). Fuzzy clustering has now been extensively studied and applied in a variety of substantive areas (Bezdek [4]; Pal and Majumdar [6]), and these approaches have become significant tools in cluster analysis.
A major problem with remote sensing images is the presence of mixed pixels in the image data, which degrades the quality and accuracy of image classification and object recognition. The fuzzy approach has been used to process and analyze images in different ways. Bibiloni et al. [7] used fuzzy mathematical morphology to process digital images. Dwivedi et al. [8] used a fuzzy approach for learning unknown patterns, combined with a neural network, to analyze satellite images and estimate crop area. Thapa and Murayama [9] compared four mapping approaches, including fuzzy supervised classification and GIS post-processing. In this paper, to overcome the problem of mixed pixels in real satellite data, a modified fuzzy K-means (FKM) clustering algorithm and a modified fuzzy C-means (FCM) clustering algorithm are proposed. The remaining sections of this study are organised as follows. Section 2 explains the problem of interest. Image classification techniques and their types are briefly described in Section 3. The framework for the study is described in Section 4. Finally, the results and conclusions are presented in Sections 5 and 6, respectively.

PROBLEM TO BE ENTERTAINED
Large pixel sizes in land cover satellite images raise the likelihood of mixed pixels. In digital image analysis, the mixed pixel problem is created by pixels that are not completely occupied by a single, homogeneous class, that is, when the area of a pixel is occupied by two or more classes that differ in brightness. For instance, a pixel in Landsat imagery has a size of 10 m × 10 m, whereas the pixel size of QuickBird imagery is 0.60 m × 0.60 m. This difference in resolution increases the chances of encountering mixed pixels in Landsat imagery as compared with QuickBird imagery. Thus, mixed pixels can be one of the sources of error in per-pixel classification and should be treated accordingly.
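To make the resolution argument concrete, the following sketch compares the ground area of the two pixel sizes quoted above and computes the brightness of a pixel shared by two classes. The linear (area-weighted) mixing rule and the brightness values are illustrative assumptions, not taken from the study.

```python
# Illustrative only: linear mixing and the brightness values are assumptions.

def pixel_area(side_m):
    """Ground area (in square metres) covered by a square pixel."""
    return side_m * side_m

def mixed_brightness(fractions, brightness):
    """Area-weighted brightness of a pixel shared by several classes."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("class fractions must sum to 1")
    return sum(f * b for f, b in zip(fractions, brightness))

coarse = pixel_area(10.0)   # coarser pixel size quoted in the text
fine = pixel_area(0.60)     # finer pixel size quoted in the text
ratio = coarse / fine       # number of fine pixels covering one coarse pixel

# A pixel covered 70% by a dark class (brightness 40)
# and 30% by a bright class (brightness 200).
value = mixed_brightness([0.7, 0.3], [40.0, 200.0])
print(round(ratio), value)
```

Under these assumptions, one coarse pixel covers the ground area of roughly 278 fine pixels, and the mixed pixel takes a brightness between those of its two classes, so it matches neither of them.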

IMAGE CLASSIFICATION TECHNIQUES
Image classification is the basic task in the processing of remotely sensed images. Classification techniques provide frameworks for organizing and categorizing information that can be extracted from image data. The spatial unit to be classified may be a pixel or an object in the imaged scene.
Each pixel of an image is represented by a vector consisting of a set of measurements (e.g. spectral bands, textural features, etc.). General image classification methods in remote sensing, such as pixel-wise classification methods, assume that each pixel is pure and assign it to a single land use/land cover type (Fisher [10]). Generally, pixel-wise classification algorithms are divided into two groups: unsupervised classification and supervised classification.

UNSUPERVISED CLASSIFICATION
Unsupervised image classification is a method in which the large number of unknown pixels in an image are separated by image interpretation software, based on their reflectance values, into classes or clusters with no training from the analyst (Tou and Gonzalez [11]). The most common unsupervised clustering methods are fuzzy K-means clustering and fuzzy C-means clustering. These methods are based purely on pixel-based spectral statistics and use no prior information about the features of the themes being studied. Supervised classification, on the other hand, is a method in which the user defines small areas on the image, called training sites, which contain the predictor variables measured in each sampling unit, and assigns prior classes to the sampling units.

CLUSTER ANALYSIS
Cluster analysis is one of the major techniques used in pattern recognition and image classification. It is the unsupervised technique most readily used by analysts in areas such as remote sensing, taxonomy, medical science, engineering systems, robotics, and image processing. Clustering is a technique for identifying the number c of sub-classes (clusters) in a data set X consisting of n data samples, and for partitioning the data set into these c clusters. Note that c = 1 denotes the rejection of the presence of clusters in the data, whereas c = n represents the trivial case in which each sample is in a cluster by itself. There are two types of c-partitions of the data set: hard (crisp) and soft (fuzzy). In numerical data interpretation, one supposes that the elements of each cluster bear more mathematical similarity to each other than to elements of other clusters. Two significant issues to consider in this regard are how to measure the similarity between pairs of observations and how to evaluate the partitions once they are formed. If one can determine a suitable distance measure and compute the distance between all pairs of observations, then one may expect the distance between points in the same cluster to be considerably less than the distance between points in different clusters.
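The expectation stated above, that distances within a cluster are smaller than distances between clusters, can be checked directly on a small example. The sketch below uses the Euclidean distance as the distance measure; the two-band pixel values and the cluster labels are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two measurement vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_within_between(points, labels):
    """Mean pairwise distance within clusters and between clusters."""
    within, between = [], []
    for i, j in combinations(range(len(points)), 2):
        d = euclidean(points[i], points[j])
        (within if labels[i] == labels[j] else between).append(d)
    return sum(within) / len(within), sum(between) / len(between)

# Hypothetical two-band pixel values forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels = [0, 0, 0, 1, 1, 1]

w, b = mean_within_between(points, labels)
print(w < b)
```

A partition for which the mean within-cluster distance is not clearly smaller than the mean between-cluster distance is a sign that the chosen distance measure or the number of clusters does not suit the data.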
Traditional (hard) clustering techniques require that every pixel of the data belong to exactly one class or cluster. Fuzzy set theory, proposed by Zadeh [3], introduced the idea of belongingness described by a membership function, which provides imprecise class membership information. Bellman et al. [12] and Ruspini [13] were among the first to propose applications of fuzzy set theory in cluster analysis. The two most commonly used unsupervised classifiers are those based on fuzzy K-means clustering and fuzzy C-means clustering.
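The difference between a hard and a fuzzy partition can be expressed as constraints on a membership matrix. The sketch below assumes the usual convention that row i holds the memberships of sample i in each cluster, that every membership lies in [0, 1], and that each row sums to 1; the function names and the example matrices are hypothetical.

```python
def is_fuzzy_partition(u, tol=1e-9):
    """True if every membership lies in [0, 1] and each row sums to 1."""
    return all(
        all(-tol <= x <= 1.0 + tol for x in row) and abs(sum(row) - 1.0) <= tol
        for row in u
    )

def is_hard_partition(u, tol=1e-9):
    """True if u is a valid partition whose memberships are all 0 or 1."""
    return is_fuzzy_partition(u, tol) and all(
        min(abs(x), abs(x - 1.0)) <= tol for row in u for x in row
    )

hard = [[1, 0], [0, 1], [1, 0]]                # each pixel in exactly one cluster
fuzzy = [[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]]   # pixels shared between clusters

print(is_hard_partition(hard), is_hard_partition(fuzzy), is_fuzzy_partition(fuzzy))
```

Every hard partition is also a fuzzy partition, but not the reverse; a row such as [0.5, 0.5] is how a fuzzy partition represents a pixel split evenly between two classes.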

FUZZY K-MEANS CLUSTERING ALGORITHM
Fuzzy K-means (FKM) follows the same procedure as K-means, the most common clustering technique. The only difference is that, instead of being assigned exclusively to one cluster, a point can have some fuzziness, or overlap, between two or more clusters.
Suppose that a data set of n data points is given. The FKM clustering algorithm partitions the data points into k clusters $C_j$ ($j = 1, 2, \ldots, k$), and each cluster $C_j$ is associated with a centroid whose components are $v_{jp}$.
The FKM algorithm is as follows:
Step 1: Choose initial estimates of the centroids of the k clusters. These estimates are iteratively modified until they converge to the centroids returned by the procedure.
Step 2: $\hat{v}_{jp}$ saves the value of $v_{jp}$ before the iteration. During the iteration, the value of $v_{jp}$ will change. At the end of the iteration, $v_{jp}$ will be compared with $\hat{v}_{jp}$.
Step 3: The relationship between the centre of a cluster and a data point is fuzzy, i.e., each data point has a membership value in every cluster, computed using a fuzziness parameter m.
Step 4: New coordinates of the centroids $V_1$ to $V_k$ of the clusters are calculated. The value of m is as chosen in the previous step.
Step 5: Once the procedure has converged, the membership value of every pattern in each cluster is obtained. The variable $\varepsilon$ is the convergence criterion of the procedure; it is a small value, say 0.01, chosen by the user.
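Steps 3 and 4 above can be sketched as a single update pass. The membership formula used here is the common textbook form based on relative distances to the centroids; it is an assumption, since the exact expression used in this study is not reproduced above. The function names, the default m = 2, and the sample data are hypothetical.

```python
import math

def memberships(points, centroids, m=2.0):
    """Membership of each point in each cluster (assumed textbook form)."""
    power = 2.0 / (m - 1.0)
    u = []
    for x in points:
        d = [math.dist(x, v) for v in centroids]
        if any(dj == 0.0 for dj in d):
            # Point coincides with a centroid: share full membership
            # among the coinciding centroids to avoid dividing by zero.
            zeros = [dj == 0.0 for dj in d]
            u.append([1.0 / sum(zeros) if z else 0.0 for z in zeros])
            continue
        u.append([1.0 / sum((dj / dl) ** power for dl in d) for dj in d])
    return u

def update_centroids(points, u, m=2.0):
    """New centroids as membership-weighted means of the data points."""
    k, dim = len(u[0]), len(points[0])
    new = []
    for j in range(k):
        w = [row[j] ** m for row in u]
        total = sum(w)
        new.append(tuple(
            sum(wi * x[p] for wi, x in zip(w, points)) / total
            for p in range(dim)
        ))
    return new

# Hypothetical two-band pixel values and rough initial centroids (Step 1).
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
start = [(0.0, 0.0), (6.0, 6.0)]

u = memberships(points, start)        # Step 3
new = update_centroids(points, u)     # Step 4
print(new)
```

After one pass, each centroid moves from its rough starting position towards the group of points that holds most of its membership. Repeating the pass until the centroids stop moving gives the iterative procedure described in the steps.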

FUZZY C-MEANS CLUSTERING ALGORITHM
Dunn [14] proposed the Fuzzy C-Means (FCM) clustering algorithm, which was later extended by Bezdek [15]. The algorithm is an iterative clustering technique that gives an optimal c-partition by minimizing the weighted within-group sum of squared errors objective function, repeating its update steps until a convergence criterion is met.
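For reference, the weighted within-group sum of squared errors that FCM minimizes is commonly written as below, together with the alternating update equations that follow from it. This is the standard textbook form. The symbols ($x_i$ for the data samples, $v_j$ for the cluster centres, $u_{ij}$ for the memberships, and $m > 1$ for the fuzziness exponent) are assumed here and may differ from the notation of the original equations.

```latex
J_m(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} \, \lVert x_i - v_j \rVert^{2},
\qquad \sum_{j=1}^{c} u_{ij} = 1, \quad u_{ij} \in [0, 1], \quad m > 1

u_{ij} = \left[ \sum_{l=1}^{c}
  \left( \frac{\lVert x_i - v_j \rVert}{\lVert x_i - v_l \rVert} \right)^{\frac{2}{m-1}}
\right]^{-1},
\qquad
v_j = \frac{\sum_{i=1}^{n} u_{ij}^{m} \, x_i}{\sum_{i=1}^{n} u_{ij}^{m}}
```

Alternating between the two update equations never increases $J_m$, which is why the iteration is usually stopped once the memberships or the centres change by less than a small tolerance.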

MODIFIED K-MEANS CLUSTERING ALGORITHM
The proposed modified FKM is as follows:
Step 1: For …
Step 4: For $j = 1, 2, \ldots, k$, calculate the Euclidean distance between the current centroid $V_j$ and the centroid $\hat{V}_j$ at the beginning of the iteration, as recorded in Step 2. Let the largest of these distances be $D$.
Step 5: If $D > \varepsilon$, go to Step 2. Otherwise, return from the procedure.
Thus, the membership value of every pattern in each cluster is obtained. The variable $\varepsilon$ is the convergence criterion of the procedure; it is a small value, say 0.01, chosen by the user.
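The stopping rule in Steps 4 and 5 can be sketched independently of how the centroids are updated. In the sketch below, `update` stands for whatever membership and centroid update rule is used. The function names, the `max_iter` safeguard, and the toy update rule are hypothetical and are not part of the proposed algorithm.

```python
import math

def max_centroid_shift(old, new):
    """Largest Euclidean distance D between saved and updated centroids."""
    return max(math.dist(a, b) for a, b in zip(old, new))

def iterate_until_converged(centroids, update, eps=0.01, max_iter=100):
    """Repeat `update` until the largest centroid shift D is at most eps."""
    for iteration in range(1, max_iter + 1):
        saved = list(centroids)                   # Step 2: save centroids
        centroids = update(saved)                 # membership/centroid update
        d = max_centroid_shift(saved, centroids)  # Step 4: largest shift D
        if d <= eps:                              # Step 5: stop if D <= eps
            return centroids, iteration
    return centroids, max_iter

# Toy update rule (assumption): each centroid moves halfway to a fixed target.
target = [(1.0, 1.0)]

def halfway(current):
    return [tuple((c + t) / 2.0 for c, t in zip(cen, tgt))
            for cen, tgt in zip(current, target)]

final, iterations = iterate_until_converged([(0.0, 0.0)], halfway)
print(iterations, final)
```

Using the largest shift rather than the average means that the procedure stops only when every centroid has settled, so a single slowly moving cluster is enough to keep it iterating.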

MODIFIED FUZZY C-MEANS CLUSTERING ALGORITHM
3. Set the loop counter as …; then proceed to the next k.
6. If the value of … and move to step 4.

DESCRIPTION OF FRAMEWORK FOR THE STUDY
To validate the effectiveness of the proposed fuzzy unsupervised classification methods, we conducted classification on high-resolution remotely sensed images using both the proposed modified fuzzy clustering methods and the traditional fuzzy clustering methods. In this paper, the modified fuzzy K-means clustering algorithm and the modified fuzzy C-means clustering algorithm are suggested as the classifiers of choice when dealing with mixed pixels in imagery. Their performance, in terms of the confusion matrix and misclassification error, is compared with that of the traditional fuzzy K-means and fuzzy C-means clustering algorithms.
A.R. SHERWANI, Q.M. ALI, IRFAN ALI
From the LISS-III digital image, a total of 1049 pixels from different classes were taken for study. Shape files of the surface covers were created using the software package ERDAS Imagine, and the computational work of this study was carried out using the software R.
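The comparison in terms of the confusion matrix and misclassification error can be sketched as follows. The sketch assumes that fuzzy memberships are hardened by assigning each pixel to its highest-membership cluster, and that cluster indices have already been matched to the reference classes. Both are assumptions, and the labels and memberships shown are hypothetical rather than taken from the 1049-pixel data set. Python is used here for illustration only; the study itself reports using R.

```python
def harden(u):
    """Assign each pixel to the cluster with the largest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in u]

def confusion_matrix(actual, predicted, n_classes):
    """cm[a][p] counts pixels of reference class a assigned to class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        cm[a][p] += 1
    return cm

def misclassification_error(cm):
    """Proportion of pixels lying off the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return (total - correct) / total

# Hypothetical reference classes and fuzzy memberships for five pixels.
actual = [0, 0, 1, 1, 1]
u = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]]

predicted = harden(u)
cm = confusion_matrix(actual, predicted, n_classes=2)
error = misclassification_error(cm)
print(cm, error)
```

Computing the same two quantities for each clustering method on the same reference pixels gives a like-for-like basis for comparing the traditional and the modified algorithms.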

RESULTS
It is very obvious that any classification method results in some misclassification probabilities and these misclassification probabilities play a vital role in assessing the performance of the