Ameliorate Fuzzy C-Means: An Ameliorate Fuzzy C-Means Clustering Algorithm for CT-Lung Image Segmentation

Effective and efficient image segmentation is a preliminary stage in the computer-aided diagnosis of medical images. Many FCM-based clustering techniques have been proposed for image segmentation. However, the existing FCM technique does not generate accurate and standardized segmentation results, owing to the noise present in the image and to the random initialization of the membership values of pixels. To address this issue, this study enhances the existing FCM technique and proposes a technique named Ameliorate FCM (AFCM). Initially, the given image is preprocessed to remove noise using the Contrast Limited Adaptive Histogram Equalization (CLAHE) technique. The preprocessed image is given as input to a Bayesian classifier, which classifies the images into two sets, namely normal and abnormal, using a hybrid feature selection method. The classified images are given as input to the proposed segmentation technique, which overcomes the drawbacks of the existing FCM technique. Here, the membership values of the pixels of an image are standardized and clustered to segment the regions. Experiments are carried out using lung images to determine the efficiency of the proposed technique. The results show that the proposed technique outperforms the existing FCM technique.


INTRODUCTION
The lungs are among the most important organs of the human body; they contract and relax thousands of times daily to expel carbon dioxide and inhale oxygen. A breathing disorder usually indicates lung disease. With the emergence of CT digital imaging, the diagnosis of lung disease has become easier. CT images of the lung are widely used in clinical practice for diagnostic imaging and post-treatment evaluation. Owing to the cost-effectiveness of CT lung imaging relative to other diagnostic imaging, a great deal of research has been carried out on CT medical image segmentation, and numerous segmentation methods are available.
Segmentation subdivides a given image into different objects or regions based on the information about the objects present in the imaging data. It helps separate the components of interest from their background. However, CT medical image segmentation is often difficult and challenging owing to noise and artifacts, weak contrast and poor resolution.
Earlier, segmentation was carried out manually by human experts, which is a time-consuming and tedious task. This limitation of manual segmentation created the need for automated lung image segmentation.

Initially, automatic segmentation techniques used artificial intelligence. Further investigations on medical image segmentation have shown that Markov model functions perform better than the artificial intelligence techniques. In the last decade, the Fuzzy C-Means (FCM) algorithm has been widely used in image segmentation. FCM uses clustering to retain more information about the original image than hard or crisp image segmentation methods.
Science Publications AJAS
Clustering groups a set of image pixels into clusters such that the members of the same cluster are more similar to one another than to the members of other clusters. Generally, the number of clusters is predefined and depends on the requirements of the application in which segmentation is carried out. A notable advantage of FCM is that it allows a pixel to belong to multiple clusters, each with a degree of membership, and can therefore retain more information from the original image than other techniques. Nevertheless, there is no standardized assignment of the initial membership values, which causes inaccurate segmentation; this is the major drawback of FCM.
Here, the images in the database are preprocessed using Contrast Limited Adaptive Histogram Equalization (CLAHE) to remove noise and are classified into two groups, namely normal and abnormal, by a Bayesian classifier. The classification step minimizes the time required for segmentation, whereas the preprocessing step improves the accuracy of the segmentation process. For effective classification, texture features are used, which play a major role in image processing. The effectiveness of classification depends on finding a set of texture features with good discriminating power. Tamura and Laws features are selected using a hybrid feature selection technique. After classification, the abnormal images are segmented depending on the requirement. In the proposed AFCM algorithm, the aforementioned drawbacks of the classical FCM technique are overcome by standardizing the membership values. Along with the standardization, a weight associated with each pixel is computed to improve segmentation performance. The process of the proposed technique is shown in Fig. 1.
The rest of this study is structured as follows. The next section reviews related work on medical image classification and segmentation. The proposed AFCM method for segmentation is discussed in detail in Section 3. Section 4 reports the effectiveness of the proposed algorithm through experiments. Finally, Section 5 concludes the study with a summary of the proposed technique and a discussion of future work.

Related Works
Classification and segmentation are two of the most important operations in computer vision. This section reviews classification and segmentation articles proposed in the last six years.

Classification
Texture plays an important role in classifying medical images, and numerous techniques based on texture features have been developed. Liu et al. (2012) proposed Adaptive Local Binary Patterns (ALBP) for image classification, based on the texture features of local binary patterns; neighborhood pixel values were considered to classify the image. Similarly, Rehman et al. (2012) built a software system that classifies images using dominant run length and co-occurrence texture features. That study incorporated both Support Vector Machine (SVM) and Probabilistic Neural Network (PNN) classifiers. The accuracy of these classification techniques depends on texture extraction. Therefore, it is essential to extract the best set of texture features for classification.
There are also other methods for classifying images. One such technique (Kannan et al., 2011) uses a Multiple Rank Regression (MRR) model for classification; the multiple ranks were used to access the information present in matrix data and to classify the image. MaZda, a software package that automatically classifies medical images, was described in (Strzelecki et al., 2013). The tool has procedures for the extraction, selection and evaluation of textures, and the images were classified based on the extracted textures. The Intersection Coordinate Descent (ICD) method was proposed in (Yang et al., 2012); it replaced texture-based image classification with histogram-based classification. Nanthagopal and Sukanesh (2013) proposed an SVM to classify breast tissues. A similarity measure named Complex Wavelet Structural Similarity (CW-SSIM), used for image segmentation, was proposed in (Shasidhar et al., 2011). The authors demonstrated that incorporating the proposed technique into an SVM improved the accuracy of image classification.

Segmentation
A robust algorithm was proposed in (Tosun and Gunduz-Demir, 2011) for the segmentation of histopathological tissue images. The algorithm incorporated background knowledge of tissue organization into segmentation. To acquire this knowledge, the spatial relationships between cytological tissue components were quantified by constructing graphs. The constructed graphs provided the necessary details about texture features, which help to segment the given image. Another robust model for medical image segmentation, the Student's-t Mixture Model (SMM), was proposed in (Wu, 2012). This technique incorporated local spatial constraints by exploiting the Dirichlet law and the Dirichlet distribution. The model parameters were estimated directly through the Student's-t distribution, and a gradient method was adopted to optimize them. The main advantage of this technique was that it reduced the complexity of the existing Student's-t model. A novel image segmentation technique based on a non-parametric clustering procedure in discretized color space was proposed in (Li et al., 2011). Here, segmentation was performed by mapping the revealed range-domain clusters to the spatial image domain. Color image segmentation based on pixel classification with the Least Squares Support Vector Machine (LS-SVM) was proposed in (Yang et al., 2012). A new algorithm named Graph-Cut Active Contour Model (GC-ACM) was proposed in (Chen et al., 2013) for medical image segmentation. This technique combined graph cuts with the model-based Active Shape Model (ASM) method.
Many of the image segmentation techniques proposed in the last six years have used FCM. Some of these proposals are described here. Dermoscopy images were segmented using an intelligent fuzzy clustering technique in (Joshi et al., 2012). The images were clustered around specified cluster centers, and the subdivision of clusters continued until no erroneous clusters were found. A new fuzzy level set algorithm was proposed in (Lin et al., 2012) for segmenting medical images; it enhances the existing fuzzy level set algorithm with locally regularized evolution. Usually, FCM uses the Euclidean distance in feature space. However, Krstinic et al. (2011) derived a new hyper tangent function, which clusters the image robustly. In Lo and Wang (2012), medical images were segmented using FCM clustering and bilateral filtering.
The above discussion reveals that the FCM technique has been widely used to segment images. Although it is widely used, it suffers from inaccurate segmentation, which is due to the problem of initializing the membership values. This problem is addressed in this study. A detailed explanation of the proposed segmentation technique is presented in the subsequent section.

Proposed Method
This section describes the proposed image segmentation technique, AFCM. The following subsections describe the techniques used in this study to enhance the accuracy of the segmentation process.

Preprocessing
The raw images obtained from medical acquisition machines may be of comparatively low quality. It is hard to locate and extract useful information from such images. The CLAHE steps are applied to remove the noise and thereby improve the quality of the image. CLAHE has the following advantages: (1) ease of use, (2) modest computational requirements and (3) a marked improvement in image quality.

Feature Extraction
The preprocessed image is given as input to the feature extraction algorithm. Here, spatial features are extracted using the Laws and Tamura feature extraction algorithms. This study uses Laws' masks, obtained by convolving together just three basic 1×5 masks, which in their standard form are given in Equations (1)-(3):

$$L5 = [\,1 \;\; 4 \;\; 6 \;\; 4 \;\; 1\,] \tag{1}$$
$$E5 = [\,-1 \;\; -2 \;\; 0 \;\; 2 \;\; 1\,] \tag{2}$$
$$S5 = [\,-1 \;\; 0 \;\; 2 \;\; 0 \;\; -1\,] \tag{3}$$

The initial letters of these masks stand for local averaging (level), edge detection and spot detection. This study uses matrix multiplication to combine the 1×5 masks with the corresponding set of 5×1 masks to obtain nine 5×5 masks. The nine 5×5 matrices are computed as $L5^T L5$, $L5^T E5$, $L5^T S5$, $E5^T L5$, $E5^T E5$, $E5^T S5$, $S5^T L5$, $S5^T E5$ and $S5^T S5$, where $L5^T$, $E5^T$ and $S5^T$ denote the transposes of the respective masks. For example, $L5^T E5$ is the product of the 5×1 column $L5^T$ and the 1×5 row $E5$, which gives a 5×5 matrix.
Local magnitudes of the quantities are determined at this stage. These magnitudes are then smoothed over a fair-sized region. In this way the image is converted into a vector image. Absolute and squared magnitudes are used to estimate the texture energy; the former requires less computation, while the latter corresponds to true energy. The texture energy is computed by summing the absolute values of the filtering results in a local neighborhood around each pixel. Finally, the features are combined to achieve rotational invariance.
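The construction of the nine 5×5 masks can be sketched as follows. This is a minimal illustration in plain Python, assuming the standard Laws vectors L5, E5 and S5; the helper names `outer` and `laws_masks` are hypothetical and are not taken from the study.

```python
# Standard Laws 1x5 vectors: Level (local averaging), Edge, Spot.
L5 = [1, 4, 6, 4, 1]
E5 = [-1, -2, 0, 2, 1]
S5 = [-1, 0, 2, 0, -1]


def outer(col, row):
    """Return the 5x5 mask col^T * row as a list of rows."""
    return [[c * r for r in row] for c in col]


def laws_masks():
    """Build the nine 5x5 masks from every ordered pair of L5, E5 and S5."""
    vectors = {"L5": L5, "E5": E5, "S5": S5}
    return {
        name_a + name_b: outer(vec_a, vec_b)
        for name_a, vec_a in vectors.items()
        for name_b, vec_b in vectors.items()
    }


masks = laws_masks()
# masks["L5E5"] holds L5^T E5: its first row is 1 * E5.
```

Each mask would then be convolved with the preprocessed image before the texture energy is computed; that filtering step is not shown here.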
In addition to Laws features, this study also considers three Tamura features, namely coarseness, contrast and directionality. The most fundamental feature is coarseness, which is directly related to repetition rate and scale. The objective of coarseness is to determine the prevalent size at which the texture exists. For computational purposes, the average over neighborhoods is taken at every point using Equation (4). For each point, the difference between pairs of averages is then computed over the non-overlapping neighborhoods on opposite sides of the point, in both the horizontal and vertical orientations. For the horizontal case it is computed using Equation (5). From this, the highest output value is picked, where k maximizes P in either direction. The coarseness measure is then the average of the resulting best sizes, 2^k, over the image.
The second most important feature of a medical image is contrast, which is extracted using the Tamura method. Contrast captures the dynamic range of the gray levels present in an image and can be measured using Equation (6). Similarly, the directionality feature represents a global property over a region and measures the total degree of directionality. The edges in the image are determined using two simple masks, and the magnitude and angle are computed at each pixel. A histogram of edge probabilities is built by counting all the points whose magnitude is greater than a threshold and quantizing the edge angle. The degree of directionality is reflected by the histogram. The sharpness of the peaks is computed from their second moments to extract a measure from the histogram.
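For reference, the coarseness and contrast steps described above are usually written as follows in the Tamura texture literature. This is a sketch in conventional notation, not a recovery of the study's own Equations (4)-(6): the symbols f (gray level), A_k, S_best, F_crs, F_con, the image size m × n and the standard deviation σ are assumptions, and only P is taken from the text.

```latex
\begin{align*}
% Average over a 2^k x 2^k neighbourhood centred on (x, y)
A_k(x, y) &= \frac{1}{2^{2k}} \sum_{i = x - 2^{k-1}}^{x + 2^{k-1} - 1} \; \sum_{j = y - 2^{k-1}}^{y + 2^{k-1} - 1} f(i, j) \\
% Horizontal difference between non-overlapping neighbourhoods on opposite sides
P_{k,h}(x, y) &= \left| A_k(x + 2^{k-1}, y) - A_k(x - 2^{k-1}, y) \right| \\
% Best size at each point: the k that maximizes P in either direction
S_{best}(x, y) &= 2^{k^*}, \qquad k^* = \arg\max_k \, \max\{ P_{k,h}(x, y),\; P_{k,v}(x, y) \} \\
% Coarseness: average best size over an m x n image
F_{crs} &= \frac{1}{m n} \sum_{x = 1}^{m} \sum_{y = 1}^{n} S_{best}(x, y) \\
% Contrast: standard deviation normalized by the fourth root of the kurtosis
F_{con} &= \frac{\sigma}{\alpha_4^{1/4}}, \qquad \alpha_4 = \frac{\mu_4}{\sigma^4}
\end{align*}
```

Here μ_4 is the fourth moment about the mean and σ² is the variance of the gray levels; the vertical difference P_{k,v} is defined analogously to the horizontal one.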

Feature Selection
From the above section, a feature set FS = {f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13} is obtained. These feature values are computed for a training dataset. The computed feature values of the training samples are converted into a feature matrix, which is given as input to the genetic algorithm (GA) to generate subsets of features. The number of features in a subset depends on the user requirement; in this study it is set to three. Therefore, the GA initially subdivides the feature set sequentially as follows: F1 = {f1, f2, f3}, F2 = {f4, f5, f6}, F3 = {f7, f8, f9}, F4 = {f10, f11, f12} and F5 = {f13}. The Bayesian classifier classifies the samples using each feature subset, and the classification accuracy and error rate are computed. Here, X is an error threshold, i.e., the error rate that the classification system can tolerate while still generating good classification results for the given input images. The error rate is compared with X; if the error rate is lower than X, that feature subset is selected as the best feature set. If none of the above feature subsets satisfies the threshold, single-point crossover is carried out. The process continues until a feature subset is formed that satisfies the threshold. Mutation is carried out when none of the feature subsets fulfills the criterion during crossover. As a result, a best feature subset {f_x, f_y, f_z} is selected.
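The initial partitioning and the threshold test described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the error rates in `toy_errors` are invented stand-ins for the output of the Bayesian classifier, and the crossover and mutation steps are not shown because the text does not specify them in detail.

```python
def partition(features, size):
    """Split the feature list sequentially into subsets of at most `size` items."""
    return [features[i:i + size] for i in range(0, len(features), size)]


def first_acceptable(subsets, error_rate, threshold):
    """Return the first subset whose error rate is below `threshold`, else None."""
    for subset in subsets:
        if error_rate(subset) < threshold:
            return subset
    return None


features = ["f%d" % i for i in range(1, 14)]  # f1 .. f13
subsets = partition(features, 3)              # F1 .. F5, the last one is {f13}

# Hypothetical error rates; in the study these would come from the classifier.
toy_errors = {
    ("f1", "f2", "f3"): 0.30,
    ("f4", "f5", "f6"): 0.12,
    ("f7", "f8", "f9"): 0.25,
}


def toy_error_rate(subset):
    """Look up the invented error rate of a subset (0.5 if unknown)."""
    return toy_errors.get(tuple(subset), 0.5)


chosen = first_acceptable(subsets, toy_error_rate, threshold=0.15)
```

When `first_acceptable` returns `None`, the procedure in the text would move on to crossover and, if needed, mutation to form new candidate subsets.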

Image Classification
This step decides whether a given input image is normal or abnormal on the basis of its feature values. Each image instance is treated as a 3-dimensional vector of feature values, $I = (i_1, i_2, i_3)$, where $i_1$, $i_2$ and $i_3$ denote the values of the features $f_x$, $f_y$ and $f_z$, respectively. The classifier assigns each image instance to the normal class ($C_1$) or the abnormal class ($C_2$), whichever has the higher posterior probability given the image $I$. That is, $I$ is assigned to class $C_i$ if and only if it satisfies Equation (7):

$$P(C_i \mid I) > P(C_j \mid I), \quad j \neq i \tag{7}$$

The prior probabilities of the classes can be estimated from their frequencies in the training dataset using Equations (8) and (9):

$$P(C_1) = \frac{TN_{Normal}}{TN} \tag{8}$$
$$P(C_2) = \frac{TN_{Abnormal}}{TN} \tag{9}$$

In Equations (8) and (9), $TN_{Normal}$ and $TN_{Abnormal}$ represent the total numbers of normal and abnormal images in the training dataset, and $TN$ denotes the total number of samples in the training dataset. To classify a given image, $P(I \mid C_x)$ must be calculated, but this is computationally expensive. Therefore, to simplify the calculation, the attributes are assumed to be independent. With this assumption, Equation (10) is derived, which is used to classify the samples into the respective class:

$$P(I \mid C_x) = \prod_{k=1}^{3} P(i_k \mid C_x) \tag{10}$$
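A small sketch of such a classifier is given below. The text does not state how the per-feature likelihoods are modelled, so this example assumes a Gaussian likelihood for each feature; that choice, the function names and the toy feature vectors are all assumptions made for illustration only.

```python
import math


def fit(samples, labels):
    """Estimate the prior and the per-feature mean and variance of each class."""
    model = {}
    for c in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[c] = (n / len(samples), means, variances)
    return model


def log_posterior(model, c, x):
    """log P(C) + sum_k log P(i_k | C), using the independence assumption."""
    prior, means, variances = model[c]
    total = math.log(prior)
    for v, m, var in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
    return total


def classify(model, x):
    """Assign x to the class with the highest (log) posterior."""
    return max(model, key=lambda c: log_posterior(model, c, x))


# Toy 3-feature vectors (invented values, not data from the study).
samples = [(0.10, 0.20, 0.10), (0.20, 0.10, 0.20), (0.15, 0.15, 0.10),
           (0.90, 0.80, 0.90), (0.80, 0.90, 0.85)]
labels = ["normal", "normal", "normal", "abnormal", "abnormal"]
model = fit(samples, labels)
```

Working with log probabilities avoids numerical underflow when several small likelihoods are multiplied; it does not change which class wins the comparison.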

Image Segmentation
For image segmentation, the proposed AFCM improves the classical FCM method by standardizing the initial membership values. Initially, AFCM executes the classical FCM on a sample dataset and uses the final membership of FCM as the initial membership $U_F$. The computed $U_F$ and the cluster centroids $V$ are given as input to the AFCM algorithm, which is based on minimizing the objective function given in Equation (11). In that equation, $P = \{p_1, p_2, \ldots, p_z, \ldots, p_{N_p}\}$ is an $l \times N_p$ data matrix, where $l$ denotes the dimension of each feature vector $p_z$ and $N_p$ represents the number of pixels in the image, i.e., the number of feature vectors. $N_C$ represents the number of clusters. The membership of vector $p_z$ in the $y$-th cluster is given by the membership function in Equation (12). In Equation (12), $v = \{v_1, v_2, \ldots, v_j, \ldots, v_{N_C}\}$ is the matrix of cluster feature centers, which are computed using Equation (13). The weighting exponent $x \in (1, \infty)$ on each fuzzy membership controls the degree of fuzziness. $d^2(p_z, v_y)$ measures the similarity between $p_z$ and $v_y$, as given in Equation (14).
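The idea of starting the iteration from a fixed membership matrix rather than a random one can be sketched with the classical FCM updates. The code below is a simplified illustration, not the AFCM algorithm itself: it works on one-dimensional pixel intensities, uses the textbook FCM update rules, and omits the per-pixel weight mentioned in the introduction. The parameter `fuzziness` plays the role of the weighting exponent x, and all names and toy values are assumptions.

```python
def update_centres(pixels, u, fuzziness):
    """Classical FCM centre update: membership-weighted mean of the pixels."""
    centres = []
    for row in u:
        weights = [w ** fuzziness for w in row]
        centres.append(sum(w * p for w, p in zip(weights, pixels)) / sum(weights))
    return centres


def update_membership(pixels, centres, fuzziness):
    """Classical FCM membership update from distances to every centre."""
    eps = 1e-12  # avoids division by zero when a pixel equals a centre
    power = 2.0 / (fuzziness - 1.0)
    u = [[0.0] * len(pixels) for _ in centres]
    for z, p in enumerate(pixels):
        d = [abs(p - v) + eps for v in centres]
        for y in range(len(centres)):
            u[y][z] = 1.0 / sum((d[y] / dk) ** power for dk in d)
    return u


def fcm(pixels, u_init, fuzziness=2.0, tol=1e-6, max_iter=100):
    """Iterate the two updates, starting from the given membership matrix."""
    u = u_init
    for _ in range(max_iter):
        centres = update_centres(pixels, u, fuzziness)
        new_u = update_membership(pixels, centres, fuzziness)
        change = max(abs(a - b)
                     for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break
    return centres, u


# Toy intensities and a fixed initial membership (one row per cluster).
pixels = [10, 12, 11, 200, 205, 198]
u_init = [[0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
          [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]]
centres, u = fcm(pixels, u_init)
```

Because `u_init` is fixed, repeated runs on the same input return the same centres and memberships, which is the property the text refers to as static segmentation results.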

EXPERIMENTAL RESULTS
The dataset used for the experiment is drawn from the Early Lung Cancer Action Program (ELCAP) public lung image database, which contains CT lung images of 50 subjects. The number of lung images varies between subjects; on average there are 260 images per subject. The results shown in this study are for the case W0001-20090909, which has 176 low-dose CT images. The results of the proposed method are compared with the technique presented in (Amer et al., 2011), which uses an artificial neural network for image classification. The results are evaluated using five major parameters, computed from Equations (15), (16), (17), (18) and (19), where $CS_{crt}$, $CS_{Icrt}$, $CS$, $CPS_{crt}$, $TP_{Samples}$, $CNS_{crt}$ and $TN_{Samples}$ represent the correctly classified samples, incorrectly classified samples, classified samples, correctly classified positive samples, true positive samples, correctly classified negative samples and true negative samples, respectively. In Equation (19), TP, TN, FP and FN denote the true positives, true negatives, false positives and false negatives, respectively. The computed values of these parameters for the dataset images, for both the proposed and the existing technique, are shown in Table 1. In addition to the major parameters, Table 1 also gives additional parameters that are useful in determining the efficiency of the proposed technique.
Table 1 shows that the proposed method performs better than the existing technique. The efficiency of the proposed method relative to the existing classification technique is illustrated by the graphs in Fig. 2 and 3. These graphs give the details of the predicted and actual classifications made by each classification technique.
As with classification, the performance of the proposed segmentation technique is compared with the benchmark FCM technique and with the modified FCM technique proposed in (Devi et al., 2011), in terms of the time taken to segment the image.
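The quantities named above map naturally onto the usual confusion-matrix measures. The sketch below computes four of them under that reading (accuracy, error rate, sensitivity and specificity); the exact forms of Equations (15)-(19) are not reproduced here, so this mapping is an assumption, and the counts in the example are invented rather than taken from Table 1.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute common measures from confusion-matrix counts.

    Assumed mapping to the symbols in the text:
      correctly classified samples            = tp + tn
      incorrectly classified samples          = fp + fn
      classified samples                      = tp + tn + fp + fn
      correctly classified positive samples   = tp, true positive samples = tp + fn
      correctly classified negative samples   = tn, true negative samples = tn + fp
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented counts for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Accuracy and error rate always sum to one under this definition, which gives a quick sanity check on any reported pair of values.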

CONCLUSION
This study has addressed the problem present in the conventional FCM technique. Initially, the images are preprocessed using the CLAHE technique to remove the noise present in the image and thereby enhance the accuracy of segmentation. The preprocessed images are then given as input to the classifier, which classifies the images into either the normal or the abnormal class. The classified images are then fed into the AFCM technique, in which the initial membership values are standardized so that static segmentation results are generated. The results generated by the AFCM technique are highly useful for diagnosis from an image. The experimental results show that the classifier classifies the images accurately and that the proposed AFCM technique outperforms the conventional FCM and the modified FCM. In future, this work will be adapted to Content Based Image Retrieval (CBIR) in order to improve the efficiency of the CBIR process.
Fig. 1. Overall flow of proposed segmentation technique
In Equation (6), µ4 and γ4 are the fourth moment about the mean and the variance, respectively.
Unlike the ordinary HE technique, the CLAHE technique operates on tiles, i.e., small areas of the given image. CLAHE enhances the contrast of each tile, and adjacent tiles are then combined through bilinear interpolation. The contrast in homogeneous areas is limited in order to avoid amplifying the noise. The technique limits the slope of the gray level assignment scheme, which prevents saturation. CLAHE carries out the following steps to enhance image quality:
• The CT-Lung image is divided into contextual regions that are non-overlapping and continuous
• The histogram of each contextual region is computed
• The histogram of every contextual region is clipped
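The clipping step listed above can be illustrated on the histogram of a single contextual region. The sketch below is a simplified, single-pass version written for illustration: the function names, the 8-bin toy histogram and the clip limit are assumptions, and the division of the image into tiles and the bilinear interpolation between tiles are not shown.

```python
def clip_histogram(hist, clip_limit):
    """Clip each bin at `clip_limit` and spread the excess evenly over all bins.

    This is a single redistribution pass, so a bin may end slightly above
    the limit; the total pixel count is preserved.
    """
    excess = sum(max(h - clip_limit, 0) for h in hist)
    clipped = [min(h, clip_limit) for h in hist]
    share, remainder = divmod(excess, len(hist))
    return [c + share + (1 if i < remainder else 0)
            for i, c in enumerate(clipped)]


def equalization_map(hist, levels):
    """Map each bin to an output gray level through the cumulative histogram."""
    total = sum(hist)
    mapping, running = [], 0
    for h in hist:
        running += h
        mapping.append(round((levels - 1) * running / total))
    return mapping


# Toy 8-bin histogram of one tile with a dominant gray level in bin 2.
tile_hist = [0, 2, 50, 8, 4, 0, 0, 0]
clipped = clip_histogram(tile_hist, clip_limit=16)
mapping = equalization_map(clipped, levels=8)
```

Clipping the dominant bin flattens the cumulative histogram around that gray level, which is what limits the slope of the gray level mapping and keeps noise in homogeneous regions from being amplified.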

Table 2 reports the time taken by the FCM, modified FCM and AFCM techniques to segment the given image. It shows that the proposed AFCM technique requires much less time than the conventional FCM technique and the modified FCM. In addition, the AFCM technique generates static segmentation results because its initial membership values are static, whereas the conventional FCM technique generates dynamic segmentation results because its membership values are dynamic. The static segmentation results are highly useful for diagnosis.

Table 1. Accuracy of classification for the proposed and existing techniques
Fig. 2. Confusion matrix for existing technique
Fig. 3. Confusion matrix for proposed technique

Table 2. Timing results for CT-Lung image segmentation