Lung Cancer Detection Using Image Segmentation by means of Various Evolutionary Algorithms

The objective of this paper is to explore an effective image segmentation algorithm for medical images that reduces the effort physicians spend interpreting computed tomography (CT) scan images. Modern medical imaging modalities generate large images that are extremely difficult to analyze manually. The usefulness of a segmentation algorithm depends on its accuracy and convergence time. There is therefore a compelling need to explore and implement new evolutionary algorithms to solve the problems associated with medical image segmentation. Lung cancer is the most frequently diagnosed cancer among men worldwide. Early detection of lung cancer enables appropriate treatment and saves lives. CT is one of the conventional medical imaging methods used to diagnose lung cancer. In the present study, five algorithms, namely, k-means clustering, k-median clustering, particle swarm optimization, inertia-weighted particle swarm optimization, and guaranteed convergence particle swarm optimization (GCPSO), were implemented to extract the tumor from the lung image, and their performance was analyzed. The performance of median, adaptive median, and average filters in the preprocessing stage was compared, and the adaptive median filter was found to be the most suitable for medical CT images. Furthermore, the image contrast is enhanced by using adaptive histogram equalization. The preprocessed image with improved quality is then passed to the segmentation algorithms. The results were verified for 20 sample lung images using MATLAB, and GCPSO was observed to have the highest accuracy of 95.89%.


Introduction
Lung cancer, also known as lung carcinoma, is a malignant tumor characterized by uncontrolled growth of cells in the tissues of the lung. It must be treated to prevent its spread by metastasis to other parts of the body. Most cancers that start in the lung are carcinomas. The two main types are small-cell lung carcinoma and non-small-cell lung carcinoma [1]. Long-term tobacco smoking is the primary factor in 85% of lung cancers [2]. About 10-15% of cases occur in people who have never smoked; these are attributed to air pollution, secondhand smoke, asbestos, and radon gas. Computed tomography (CT) and radiographs are the conventional methods to detect the presence of lung cancer. The diagnosis is confirmed by biopsy, which is usually performed by bronchoscopy or under CT guidance. Lung cancer is the leading cause of cancer-related death among men. Hence, it is essential to find a new robust method to diagnose lung cancer at an earlier stage [3]. For the present study, 20 lung image samples and five segmentation algorithms have been taken for analysis. The results show that the combination of the adaptive median filter, adaptive histogram equalization, and the guaranteed convergence particle swarm optimization (GCPSO)-based algorithm gives the most accurate results.

(1) Assume the input matrix "A" has M rows and N columns.
(2) Construct a matrix with M + 2 rows and N + 2 columns by appending zeros to the sides of the input matrix.
(3) Take a mask of size 3 × 3.
(4) Place the mask on the first element, i.e., the element in the first row and first column of matrix "A".
(5) Select all the elements covered by the mask and sort them in ascending order.
(6) Take the median value (center element) from the sorted array and replace the element A(1, 1) with the median value.
(7) Slide the mask to the next element.
(8) Repeat steps 4 to 7 until all the elements of matrix "A" are replaced by their corresponding median value.
ALGORITHM 1: Median filter.

(1) Assume the input matrix "A" has M rows and N columns.
(2) Construct a matrix with M + 2 rows and N + 2 columns by appending zeros to the sides of the input matrix.
(3) Take a mask of size 3 × 3.
(4) Place the mask on the first element, i.e., the element in the first row and first column of matrix "A".
(5) Select all the elements covered by the mask and find their average.
(6) Replace the element A(1, 1) with the mean value.
(7) Slide the mask to the next element.
(8) Repeat steps 4 to 7 until all the elements of matrix "A" are replaced by their corresponding mean value.
ALGORITHM 2: Average filter.

(1) Obtain the histogram for the input image and find the probability mass function.
(2) Find the cumulative distribution function (CDF) according to gray levels.
(3) Find the new gray levels by using the following equation: $\mathrm{CDF}_{\mathrm{new}} = \mathrm{CDF} \times (\text{number of gray levels} - 1)$.
(4) Map the new gray levels onto the pixels and plot the modified histogram.
ALGORITHM 3: Histogram equalization.

(1) Select the cluster centers. Let the number of cluster centers be "C."
(2) Calculate the Euclidean distance between each pixel and each cluster center.
(3) Assign each pixel to the cluster whose center is at the minimum Euclidean distance from the pixel.
(4) Once all the pixels have been assigned, recalculate each cluster center using the following formula: $v_i = \frac{1}{C_i} \sum_{j=1}^{C_i} x_j$, where $x_j$ are the pixels assigned to the ith cluster and $C_i$ is the number of pixels in that cluster.
(5) Repeat steps 2 to 4 for some number of iterations or until a certain condition is encountered.
ALGORITHM 4: k-means clustering.

(1) Select random cluster centers. Let the number of cluster centers be "C."
(2) Calculate the Euclidean distance between each pixel and each cluster center.
(3) Assign each pixel to the cluster whose center is at the minimum Euclidean distance from the pixel.
(4) Once all the pixels have been assigned, recalculate each cluster center using the median value instead of the squared (mean) formula.
(5) Repeat steps 2 to 4 for some number of iterations or until a certain condition is encountered.
ALGORITHM 5: k-median clustering.

(1) Initialize the velocity and position of all the particles with random values.
(3) Find the fitness value for each particle.
(4) Compare the fitness value with the best fitness. If the fitness value is better, then set the current value as the new pbest.
(5) Repeat steps from 3 to 5 for each particle.
ALGORITHM 6: Particle swarm optimization [11,13].

Initialization
(1) Initialize the number of clusters and number of iterations.
(3) Define a fitness function.
Clustering
(4) Find the fitness value for each particle.
(5) Update the local best solution obtained so far.
(6) Repeat steps 4 and 5 for the predefined number of iterations.
(7) Update velocity and position of each particle for the current global best particle.
ALGORITHM 7: Guaranteed convergence particle swarm optimization.

Figure 11: Resultant images after preprocessing, panels (a) and (b), for sample images 1 to 20.

It is necessary to eradicate the incidence of noise content and to improve the image quality before an examination [4]. This part of the work is known as preprocessing. In the preprocessing stage, noise removal and contrast enhancement are the two primary steps. In the present study, the performance of median, adaptive median, and average filters in suppressing speckle noise has been compared. The filters were implemented in MATLAB. Furthermore, the image quality and visual appearance are improved by adaptive histogram equalization. The second stage of the work is segmentation. This stage consists of applying five methods, namely, k-means, k-median, particle swarm optimization (PSO), inertia-weighted particle swarm optimization (IWPSO), and GCPSO. The tumor portion was extracted from the segmented results of these five methods and compared with manual extraction. The results show that GCPSO-based segmentation is more accurate than the others. Figure 1 depicts the workflow of the present study.

Median and Adaptive Median Filters.
The median filter removes noise while retaining the sharpness of the image. As the name suggests, each pixel is replaced by the median value of its neighborhood pixels. A 3 × 3 window is used in this filter [5]. Among conventional filters, it is one of the best at removing speckle noise. The steps followed to construct the median filter are given in Algorithm 1.
The adaptive median filter plays a vital role in spatial processing by preserving edge detail and eliminating nonimpulsive noise. Small structures in the image and edges are retained by the adaptive median filter. In the adaptive median filter, the window size varies with respect to each pixel.
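
The fixed-window median filter described above (zero padding, a 3 × 3 window, replacement by the window median) can be sketched in plain Python as follows. This is only an illustration, not the MATLAB code used in the study: the function name `median_filter_3x3` is hypothetical, the image is assumed to be a rectangular list of lists, and the output is written to a separate array so that already-filtered pixels do not influence later windows.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3 x 3 median filter with zero padding to a 2-D list of numbers."""
    rows, cols = len(image), len(image[0])
    # Zero-pad to (M + 2) x (N + 2) so every pixel has a full 3 x 3 neighborhood.
    padded = [[0] * (cols + 2) for _ in range(rows + 2)]
    for r in range(rows):
        for c in range(cols):
            padded[r + 1][c + 1] = image[r][c]
    # Assumption: results go to a new array instead of overwriting the input.
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [padded[r + dr][c + dc] for dr in range(3) for dc in range(3)]
            result[r][c] = median(window)  # middle of the 9 sorted values
    return result


noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
filtered = median_filter_3x3(noisy)
```

In this small example the isolated bright pixel at the center is replaced by the value of its neighbors, while the corner pixels are pulled toward zero by the padding, which is a known side effect of zero padding at image borders.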

Average Filter.
This is a simple filter which removes spatial noise from a digital image. The presence of spatial noise is mainly due to the data acquisition process. The neighborhood mean value is computed for each pixel, and the pixel is replaced by that mean value [5]. All the pixels in the digital image are modified by sliding the operator over the entire image. The steps followed for the average filter are given in Algorithm 2.
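
A minimal sketch of this neighborhood averaging is given below. It assumes a 3 × 3 window with zero padding, mirroring the median filter, and the function name `average_filter_3x3` is hypothetical; it is not the implementation used in the study.

```python
def average_filter_3x3(image):
    """Replace each pixel with the mean of its zero-padded 3 x 3 neighborhood."""
    rows, cols = len(image), len(image[0])
    # Zero-pad the image by one pixel on every side.
    padded = [[0] * (cols + 2) for _ in range(rows + 2)]
    for r in range(rows):
        for c in range(cols):
            padded[r + 1][c + 1] = image[r][c]
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window_sum = sum(padded[r + dr][c + dc]
                             for dr in range(3) for dc in range(3))
            result[r][c] = window_sum / 9  # mean over the 9 window elements
    return result


flat = [[9, 9, 9],
        [9, 9, 9],
        [9, 9, 9]]
smoothed = average_filter_3x3(flat)
```

Unlike the median, the mean is sensitive to every value in the window, so the padded zeros lower the border pixels of even a perfectly uniform image.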

Histogram Equalization.
Image enhancement is a technique used to improve image quality. For better understanding and analysis, it is important to enhance the contrast of medical images. The conventional method used for this operation is histogram equalization. This method adjusts the intensity of the image pixels: each pixel is mapped to an intensity proportional to its rank among the surrounding pixels. The steps followed for histogram equalization are given in Algorithm 3 [6].
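
The mapping from the cumulative distribution to new gray levels can be illustrated with the short sketch below. It implements plain (global) histogram equalization, not the adaptive variant, and assumes integer gray levels in the range 0 to `levels - 1`; the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization for a 2-D list of integer gray levels."""
    pixels = [p for row in image for p in row]
    total = len(pixels)
    # Histogram: number of pixels at each gray level.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution function over gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running / total)
    # New gray level = CDF * (number of gray levels - 1), rounded to an integer.
    mapping = [round(value * (levels - 1)) for value in cdf]
    return [[mapping[p] for p in row] for row in image]


dark = [[10, 10],
        [10, 200]]
equalized = equalize_histogram(dark)
```

In this example three of the four pixels share gray level 10, so that level is mapped to 0.75 × 255 and the brightest pixel is stretched to 255.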

k-Means Clustering Algorithm.
The simplest conventional method in cluster analysis is the k-means clustering algorithm. This algorithm partitions the given dataset into two or more clusters [7]. The accuracy of this method depends heavily on the selection of the cluster centers, so it is important to select good cluster centers to get a better result. The Euclidean distance is the general measure used to partition the dataset [8]. Pixels are assigned to an individual cluster based on the Euclidean distance. The objective function used in this algorithm is
$$J(V) = \sum_{i=1}^{C} \sum_{j=1}^{C_i} \left( \lVert x_i - v_j \rVert \right)^2,$$
where $x_i$ are the pixels, $v_j$ are the cluster centers, $\lVert x_i - v_j \rVert$ is the Euclidean distance between $x_i$ and $v_j$, $C_i$ is the number of data points for the ith cluster, and $C$ is the number of cluster centers [9]. The steps followed for k-means clustering are given in Algorithm 4.
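
As an illustration of the assignment and center-update loop, the following sketch clusters one-dimensional gray-level values, for which the Euclidean distance reduces to an absolute difference. The function name `kmeans_1d`, the stopping rule (centers unchanged), and the sample values are assumptions made for this example only.

```python
def kmeans_1d(pixels, centers, iterations=10):
    """Cluster gray-level values around the given initial centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: each pixel joins the cluster with the nearest center.
        clusters = [[] for _ in centers]
        for p in pixels:
            nearest = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Update step: each center becomes the mean of its assigned pixels.
        # An empty cluster keeps its previous center.
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:  # assumed stopping condition
            break
        centers = new_centers
    labels = [min(range(len(centers)), key=lambda j: abs(p - centers[j]))
              for p in pixels]
    return centers, labels


values = [10, 12, 14, 200, 210, 220]
centers, labels = kmeans_1d(values, centers=[0, 255])
```

Here the dark and bright pixels separate into two clusters after a single update, which shows why the result depends on where the initial centers are placed.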

k-Median Clustering Algorithm.
This clustering algorithm is a slight modification of the k-means algorithm. In the centroid calculation, the median value is used instead of the mean value. This algorithm significantly reduces the error since there is no squared operation as in the calculation of the Euclidean distance. The clusters formed by this method are more compact. Like k-means, this approach uses a Lloyd-style iteration. The steps followed for k-median clustering are given in Algorithm 5 [10].
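
The only change from the k-means loop is the center update, as the following sketch on one-dimensional gray levels suggests. The function name `kmedian_1d` and the sample data are hypothetical, and the absolute difference is used as the distance, which coincides with the Euclidean distance in one dimension.

```python
from statistics import median


def kmedian_1d(pixels, centers, iterations=10):
    """Cluster gray-level values, updating each center to the cluster median."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in pixels:
            nearest = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Median update instead of the mean; empty clusters keep their center.
        new_centers = [median(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:  # assumed stopping condition
            break
        centers = new_centers
    return centers


# The value 100 is an outlier inside the dark cluster.
values = [10, 12, 14, 100, 200, 210, 220]
median_centers = kmedian_1d(values, centers=[0, 255])
```

In this example the dark cluster center settles at 13, whereas the mean of the same four values would be 34, so the median update is less affected by the outlier.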

Particle Swarm Optimization.
PSO is a metaheuristic algorithm used efficiently in medical image analysis [11]. It mimics the social behavior of birds searching for food [12]. The fundamental idea of PSO is sharing and communicating information. In this approach, each particle has an initial position and velocity. Based on the fitness value, the velocity and position are updated. The two equations used in PSO to update the velocity and position are as follows [11,12]:
$$v_i(t+1) = v_i(t) + c_1 r_1 \left( \mathrm{pbest}_i(t) - x_i(t) \right) + c_2 r_2 \left( \mathrm{gbest}(t) - x_i(t) \right),$$
$$x_i(t+1) = x_i(t) + v_i(t+1),$$
where $x_i$ and $v_i$ are the position and velocity of the ith particle, $\mathrm{pbest}_i$ is the best position found so far by that particle, $\mathrm{gbest}$ is the best position found so far by the swarm, $r_1$ and $r_2$ are random numbers, and the acceleration coefficients $c_1$ and $c_2$ are two positive constants. The success of PSO relies on the fitness function. The fitness function used for the present study depends on n, the number of clusters. The steps followed for the particle swarm optimization are shown in Algorithm 6.
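
A compact sketch of the basic PSO loop is given below. Because the fitness function of the study is not reproduced here, the example minimizes a stand-in sphere function; the function name `pso_minimize`, the values c1 = c2 = 2.0, the swarm size, and the velocity clamp `vmax` are all assumptions made for illustration.

```python
import random


def pso_minimize(fitness, dim, lower, upper, particles=20, iterations=100,
                 c1=2.0, c2=2.0, seed=0):
    """Basic PSO (no inertia weight) that minimizes `fitness`."""
    rng = random.Random(seed)
    vmax = (upper - lower) / 2  # assumption: clamp velocities to limit divergence
    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(particles)]
    vel = [[rng.uniform(-vmax, vmax) for _ in range(dim)] for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: previous velocity + cognitive + social terms.
                v = (vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-vmax, min(vmax, v))
                pos[i][d] += vel[i][d]  # position update
            value = fitness(pos[i])
            if value < pbest_val[i]:  # new personal best
                pbest[i], pbest_val[i] = pos[i][:], value
                if value < gbest_val:  # new global best
                    gbest, gbest_val = pos[i][:], value
    return gbest, gbest_val


def sphere(p):
    """Stand-in fitness function: sum of squares, minimized at the origin."""
    return sum(x * x for x in p)


_, initial_value = pso_minimize(sphere, 2, -10, 10, iterations=0)
best, best_value = pso_minimize(sphere, 2, -10, 10, iterations=50)
```

Since the global best is replaced only when a strictly better value is found, the best fitness can never get worse from one iteration to the next.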

Inertia-Weighted Particle Swarm Optimization.
The balance between exploration and exploitation in PSO is controlled by the inertia weight. The basic PSO, presented by Eberhart and Kennedy in 1995, has no inertia weight. In 1998, Shi and Eberhart introduced the concept of inertia weight by adding a constant inertia weight. They stated that a large inertia weight facilitates a global search, while a small inertia weight facilitates a local search [14]. An inertia weight less than 1, in general, improves the results: it enhances the convergence rate and reduces the number of iterations and the time taken. The resulting velocity update equation becomes
$$v_i(t+1) = w\, v_i(t) + c_1 r_1 \left( \mathrm{pbest}_i(t) - x_i(t) \right) + c_2 r_2 \left( \mathrm{gbest}(t) - x_i(t) \right),$$
where $w$ is the inertia weight, with constant inertia weight $w = 0.7$ and random inertia weight $w = 0.5 + \mathrm{rand}()/2$.
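
The two inertia weight choices and the weighted velocity update can be sketched as follows. The function names are hypothetical, the random numbers r1 and r2 are passed in explicitly so that the arithmetic can be checked by hand, and c1 = c2 = 2.0 is an assumed setting rather than a value taken from the study.

```python
import random


def constant_inertia():
    """Constant inertia weight, w = 0.7."""
    return 0.7


def random_inertia(rng):
    """Random inertia weight, w = 0.5 + rand()/2, which lies in [0.5, 1.0)."""
    return 0.5 + rng.random() / 2


def iwpso_velocity(v, x, pbest, gbest, w, r1, r2, c1=2.0, c2=2.0):
    """Inertia-weighted velocity update for one particle (lists of floats)."""
    return [w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
            for vi, xi, pb, gb in zip(v, x, pbest, gbest)]


rng = random.Random(1)
w_random = random_inertia(rng)
# Worked example: 0.5 * 1.0 + 2 * 0.5 * (1.0 - 0.0) + 2 * 0.25 * (2.0 - 0.0) = 2.5
new_velocity = iwpso_velocity([1.0], [0.0], [1.0], [2.0], w=0.5, r1=0.5, r2=0.25)
```

With w below 1 the contribution of the previous velocity shrinks at every step, which is what shifts the swarm from global exploration toward local refinement.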

Guaranteed Convergence Particle Swarm Optimization.
The GCPSO introduces a new particle that searches the region around the current best position. This particle is treated as a member of the swarm, and the velocity update equation for this new particle is given as follows [15]:
$$v_{\tau}(t+1) = -x_{\tau}(t) + \mathrm{gbest}(t) + w\, v_{\tau}(t) + \rho(t)\,(1 - 2r),$$
where $\tau$ denotes the index of the global best particle. The search ability is increased by the social part. This improves the random search in the area around the gbest position. The random vector and the diameter of the search area are $r$ and $\rho(t)$, respectively. The components of the random vector lie between 0 and 1. The diameter of the search area can be updated using the following equation:
$$\rho(t+1) = \begin{cases} 2\rho(t), & \text{if } \#\text{successes} > s_c, \\ 0.5\,\rho(t), & \text{if } \#\text{failures} > f_c, \\ \rho(t), & \text{otherwise,} \end{cases}$$
where the terms #successes and #failures are defined as the number of consecutive successes and failures, respectively. The threshold parameters $s_c$ and $f_c$ are determined empirically. Since it is hard to obtain a better value in only a few iterations in a high-dimensional search space, the recommended values are $s_c = 15$ and $f_c = 5$.
On some benchmark tests, the GCPSO has shown excellent performance in locating the minimum of a unimodal search space with only a small number of particles. The steps to be followed for the GCPSO are shown in Algorithm 7.
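
The two GCPSO-specific pieces, the search diameter update and the velocity of the global best particle, can be sketched as below. The sketch follows the commonly published formulation of GCPSO, which is assumed here to match the equations of [15]; the function names are hypothetical, and the doubling and halving factors are part of that assumed formulation.

```python
def update_rho(rho, successes, failures, sc=15, fc=5):
    """Adapt the search diameter from counts of consecutive successes/failures."""
    if successes > sc:
        return 2.0 * rho  # repeated successes: widen the search area
    if failures > fc:
        return 0.5 * rho  # repeated failures: narrow the search area
    return rho


def gcpso_best_velocity(v, x, gbest, w, rho, r):
    """Velocity of the global best particle; r holds random numbers in [0, 1]."""
    # -x + gbest resets the particle onto the global best position, and
    # rho * (1 - 2r) adds a random step of at most rho in either direction.
    return [-xi + gb + w * vi + rho * (1 - 2 * ri)
            for vi, xi, gb, ri in zip(v, x, gbest, r)]


widened = update_rho(1.0, successes=16, failures=0)
narrowed = update_rho(1.0, successes=0, failures=6)
unchanged = update_rho(1.0, successes=3, failures=2)
# Worked example: -3.0 + 3.0 + 0.5 * 1.0 + 2.0 * (1 - 2 * 0.25) = 1.5
step = gcpso_best_velocity([1.0], [3.0], [3.0], w=0.5, rho=2.0, r=[0.25])
```

Because the random term does not vanish when the particle sits on the global best position, the best particle keeps sampling its neighborhood instead of stagnating.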

Performance Measures
Certain performance measures are used to evaluate the results obtained from medical image segmentation. The list of performance measures used to assess the filter operation is shown in Figure 2 [16]. Let $I_f$ be the image after noise reduction and $I_0$ be the noisy image.
Performance measures used for the evaluation of the results of the segmentation algorithm are given in Figure 3 [17].
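
For orientation, the pixel-wise rates commonly used to compare a segmentation with a manual reference can be computed as in the sketch below. The standard confusion-matrix definitions are assumed here, since the exact formulas of Figure 3 are not reproduced in the text; the function name `segmentation_rates` and the binary-mask representation are hypothetical.

```python
def segmentation_rates(predicted, reference):
    """Compare two binary masks (2-D lists of 0/1) pixel by pixel."""
    tp = tn = fp = fn = 0
    for p_row, r_row in zip(predicted, reference):
        for p, r in zip(p_row, r_row):
            if p and r:
                tp += 1  # tumor pixel correctly detected
            elif not p and not r:
                tn += 1  # background pixel correctly rejected
            elif p and not r:
                fp += 1  # background pixel marked as tumor
            else:
                fn += 1  # tumor pixel missed

    def ratio(numerator, denominator):
        # Guard against masks that contain no positives or no negatives.
        return numerator / denominator if denominator else 0.0

    return {
        "tpr": ratio(tp, tp + fn),
        "tnr": ratio(tn, tn + fp),
        "fpr": ratio(fp, fp + tn),
        "fnr": ratio(fn, fn + tp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


algorithm_mask = [[1, 1],
                  [0, 0]]
manual_mask = [[1, 0],
               [0, 0]]
rates = segmentation_rates(algorithm_mask, manual_mask)
```

In this example one of the four pixels is a false positive, so the accuracy is 0.75 while the true positive rate remains 1.0.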

Results and Discussion
The methods used were implemented in MATLAB, and the results were verified.
In the preprocessing stage, the performance of the median, adaptive median, and mean filters was compared. The speckle suppression index (SSI) and speckle and mean preservation index (SMPI) values are shown in Table 1 and Figures 4 and 5. From the results, it is evident that the adaptive median filter performs better than the mean and median filters for medical image segmentation. The segmentation accuracy was measured using the true positive rate, true negative rate, false positive rate, and false negative rate by comparing the results from the algorithm with manual segmentation results. The practical results of the k-means clustering segmentation algorithm are shown in Table 2.
The practical results of the k-median clustering segmentation algorithm are shown in Table 3.
The practical results of the PSO-based segmentation algorithm are shown in Table 4. The practical results of the IWPSO segmentation algorithm are shown in Table 5.
The practical results of the GCPSO segmentation algorithm are shown in Table 6. A graphical comparison of the true positive rate, true negative rate, false positive rate, and false negative rate for the algorithms used is shown in Figures 6-9. The results show that the true positive and true negative rates are high and the false positive and false negative rates are low for the GCPSO algorithm. The comparative evaluation based on the accuracy of the segmentation is shown in Table 7 and Figure 10. The results indicate that the GCPSO-based technique has the highest average accuracy among the methods. The resultant images after preprocessing are shown in Figures 11(a) and 11(b). The resultant images after segmentation using k-means clustering are shown in Figure 12. The resultant images after segmentation using k-median clustering are shown in Figure 13. The resultant images after segmentation using the PSO algorithm are shown in Figure 14.
The resultant images after segmentation using the IWPSO algorithm are shown in Figure 15. The resultant images after segmentation using the GCPSO algorithm are shown in Figure 16.
In earlier research, lung cancer detection was done using PSO, genetic optimization, and an SVM algorithm with the Gabor filter and produced an accuracy of 89.5% [18]. A method to detect lung cancer by means of K-NN classification using the genetic algorithm produced a maximum accuracy of 90% [19]. The comparative results with respect to these methods are shown in Table 8.
The graphical comparative analysis between the used and existing methods is shown in Figure 17.

Conclusion
In this study, various optimization algorithms have been evaluated to detect the tumor. Medical images often need preprocessing before being subjected to statistical analysis. The adaptive median filter gives better results than the median and mean filters because its speckle suppression index and speckle and mean preservation index values are lower. Among the five algorithms compared, GCPSO improves the accuracy of the tumor extraction the most, with the highest accuracy of 95.8079%, and it obtained above 90% precision in all the 20 images. It is more accurate than the previous method, which had an accuracy of 90% in only 4 out of 10 datasets. Future studies will include more optimization algorithms to improve the accuracy.

Table 8: Comparison between existing and projected methods.
Various methods | Accuracy (%)
PSO, GA, and SVM algorithm [18] | 89.50
K-NN classification using GA [19] | 90.00
Projected GCPSO method | 95.81

Data
Computational and Mathematical Methods in Medicine 15