Water Cycle Bat Algorithm and Dictionary-Based Deformable Model for Lung Tumor Segmentation

Among the different types of cancers, lung cancer is one of the widespread diseases which causes the highest number of deaths every year. The early detection of lung cancer is very essential for increasing the survival rate in patients. Although computed tomography (CT) is the preferred choice for lungs imaging, sometimes CT images may produce less tumor visibility regions and unconstructive rates in tumor portions. Hence, the development of an efficient segmentation technique is necessary. In this paper, water cycle bat algorithm- (WCBA-) based deformable model approach is proposed for lung tumor segmentation. In the preprocessing stage, a median filter is used to remove the noise from the input image and to segment the lung lobe regions, and Bayesian fuzzy clustering is applied. In the proposed method, deformable model is modified by the dictionary-based algorithm to segment the lung tumor accurately. In the dictionary-based algorithm, the update equation is modified by the proposed WCBA and is designed by integrating water cycle algorithm (WCA) and bat algorithm (BA).


Introduction
Lung cancer is considered as the second most common kind of cancer for both male and females worldwide. As per the World Health Organization (WHO) statistics, 1.3 million deaths are happening because of lung cancer. Besides, it is calculated in the United States (US) that every year, approximately 228,820 people are newly affected by lung cancer in which 112,520 are women and 116,300 are men. Also, nearly 135,720 deaths are caused by lung cancer disease. The computer vision system has various tools, and these tools are used in different medical applications, especially for medical image analysis to diagnose various diseases [1,2]. Computed tomography (CT) is a basic imaging modality which effectively helps for the detection of lung cancers. As per the statistics, lung cancer is the fourth major cause of death globally. The initial process of lung cancer detection is manual detection of lung regions in CT images by specialists, which is a more challenging and tedious process for computer vision models. The number of deaths due to lung cancer can be considerably decreased, when the lung CT screening is effective. However, it is a challenging process for radiologists to make effective and precise detection for large scale of CT images. Hence, automated segmentation techniques are introduced to overcome these difficulties. In addition, end-to-end probabilistic detection model was developed based on deep three dimensional convolutional neural networks (CNNs) to overcome uncertainty complexities [3]. Lung cancer is a malignant tumor, which is developed due to the abnormal development of cells in lung regions. The early diagnosis of lung cancer can drastically reduce the death rate, and survival rate of patients can be improved. Therefore, the early prediction is essential for enhancing the clinical situations of patients; thereby, it is more crucial for developing an effective technique for early prediction of lung cancer. In fact, low dose CT is considered as a secure and effective tool for preventive detection of high risk populations [4]. Segmentation of CT images plays a very significant role in lung cancer detection [5]. However, there are few issues such as same image densities in scanning protocols and variations of pulmonary structures due to scanner. Most of the existing semiautomatic segmentation techniques depends on human factors, thereby affecting the segmentation accuracy. Recently, various lung segmentation techniques have been developed using deep learning architectures, which are presently applied for clinical applications. Among these, U-Net architecture offers better performance in deep neural networks.
Generally, radiologists frequently use computer-aided design (CAD) system in order to offer secondary consideration for more precise detection. Moreover, this system is more useful to enhance the efficiency of detection rate. In the literature, various approaches are available to perform medical image segmentation. But, they are unable to segment lung nodules which are connected to the lung walls. Therefore, deep learning techniques are more effective for such applications. Deep learning models are capable of identifying the significant features of medical images; thereby, major drawbacks of handcrafted features are resolved [6]. Semiautomated segmentation methods are also available for the lung tumor segmentation and lung cancer classification, which includes marker-controlled watershed technique [7] and single click ensemble model [8]. But automatic segmentation and classification approaches are more effective and accurate as compared semiautomatic methods [9,10]. Furthermore, every image classification model needs a suitable object segmentation approach. Several researches are focused for enhancing effective classification of lung cancers using multiscale Gaussian filter based on active contour and CNN method [11] and also convolutional transfer neural network with modified U-Net structures.
The main intention of this research is to design the development and performance evaluation of lung cancer segmentation algorithm using WCBA based deformable model approach. Lung tumor segmentation is a challenging issue due to in-homogeneities in the lung regions. This reason motivated us to develop a new methodology for the effective segmentation of lung tumor using CT images.

Literature Review
The early detection of lung cancer is a challenging issue due to complex structure of cancer cells, where most of the cancer cells are overlapped each other. For early diagnosis and treatment are very important to reduce the death rate. Researchers have been working towards the development of system which can detect cancer in its early stage and also tried to improve the accuracy of the system by incorporating preprocessing, segmentation, feature extraction, and classification techniques. The significant contributions of the existing research work and their limitations are presented below.
Hu et al. [12] developed a mask region-based CNN approach for lung tumor segmentation. In this method, Region Proposal Network (RPN) was employed for detecting object cases. This approach achieved better classification accuracy, but still failed to reduce the segmentation time. Ozdemir et al. [13] designed a deep learning approach for lung cancer segmentation using CT images. Here, data augmentation process was performed for reducing the overfitting issues. This technique obtained effectual diagnostic interventions. However, this model failed to visually evaluate learned feature representations for better performance. Shakeel et al. [14] designed improved profuse clustering technique (IPCT) and deep learning scheme for lung cancer detection. In addition, a weighted mean histogram equalization model was applied for removing the noises from the input CT image. In this method, IPCT-based spectral super pixel clustering was employed for the segmentation process. This approach has minimum misclassification error, although computational difficulty is high. Suresh and Mohan [15] presented CNN architecture for detecting lung cancer. In this technique, preprocessing was carried out to eliminate noise and desired nodule region is extracted. In addition, generative adversarial network (GAN) was applied for generating similar character images. This model enhanced the prediction speed, even though it is failed to explore optimal size of input patch in order to enhance the performance.
Jiang et al. [16] modeled multiple resolution residually connected network (MRRN) for lung tumor segmentation using CT images. This method resulted better segmentation accuracy independent of tumor location and size, but still this approach failed to reduce the computational complexity. Yu et al. [17] devised adaptive hierarchical heuristic mathematical model (AHHMM) for lung cancer detection using CT images. Here, weighted mean histogram approach was applied for removing of noise from input and enhancing the image quality. Then, K-means clustering was applied in order to obtain segmented tumor regions. Finally, a deep learning is introduced for predicting lung cancer. Singadkar et al. [18] developed deep deconvolutional residual network (DDRN) for lung nodule segmentation using CT images. Also, summation-based long skip connection from the network was applied for conserving spatial information. This technique obtained more precise classification of nodules in lung cancer. However, it is failed to enhance pulmonary nodule segmentation accuracy for other kinds of nodules. Jalali et al. [19] introduced deep neural network (DNN) for lung cancer segmentation. Here, morphological function was applied for extracting ground truth images. Additionally, bidirectional convolutional long short term memory (CLSTM) and ResNet-34 network, termed Res BCDU-Net, were introduced for effective training. The execution time of this approach was less; however, this model failed to integrate deep learning-based methods for obtaining better segmentation results.
The mask region-based CNN approach was developed for lung tumor segmentation using CT images. But this model was failed to provide accurate classification. The deep learning architecture was devised for the classification of lung cancer. But, this technique was failed to integrate decline choice and patient referral during the training process for better performance. The CNN architecture was used for detecting lung cancer, but this approach failed to produce high-quality realistic fake lung nodule samples to enhance detection performance in cancer images. The DNN approach was applied for the lung cancer segmentation process, but is also failed to perform the testing process for obtaining effectual segmentation output.

2
International Journal of Biomedical Imaging

Proposed Methodology
The proposed WCBA-based deformable approach for lung cancer segmentation model mainly includes three stages; preprocessing, lung lobe segmentation, and lung cancer segmentation. In the first stage, preprocessing of the input image is carried out using median filter to remove the noise.
In the second stage, the lung lobe regions are segmented using Bayesian fuzzy clustering [20]. Finally, segmentation of lung cancer is performed using deformable model which was modified by dictionary-based algorithm [21]. Here, the equation is updated by the proposed WCBA optimization algorithm. This algorithm is designed by combining BA [22] and WCA [23]. The block diagram of the WCBAbased deformable model for lung cancer segmentation is depicted in Figure 1. Let us assume a database H with r number of images, which is expressed as where T n denotes n th image in database, and r is a total amount of image in dataset. Here, the input image T n is subjected to preprocessing to remove noise. The expression for the filtered image is as follows.
where u and v are coordinates of pixel positions. The median filter output is represented as Z 1 :

Lung Lobe Segmentation Using Bayesian Fuzzy
Clustering (BFC). The preprocessed output Z 1 is taken as input for lung lobe segmentation. The estimation quantity is effectively increased with less computational complexity in Bayesian fuzzy clustering (BFC) approach. Thus, the BFC technique is employed for segmenting lung lobe regions. This model combines joint likelihood function for segmenting lone regions. The core and edema tumor sections are identified by preserving edge and texture. Here, the BFC technique segments the input by prototype clustering process. Moreover, BFC uses membership function F and even symmetric Dirichlet proposal q for finding cluster prototypes, which is specified as where q + w indicates regular symmetric Dirichlet proposal, and O refers total clusters in segmentation process. In addition, BFC uses conditional distribution with membership function F in order to estimate cluster prototype. The conditional distribution employed in BFC is given as where Y indicates membership function, and t w is maximum-a posteriori (MAP) sample. The membership function utilized for estimating cluster prototypes is varied due to Gaussian distribution properties, thus, Markov chain state rule is used to get cluster prototype, which is specified as The likelihood value of cluster prototype is identified, and cluster prototype with enhanced likelihood value is taken as final segmented output. Hence, BFC technique predicts O segments for image slice X m s is specified as where P s,m w indicates w th image segment X m s : The output of lung lobe segmentation process is represented as R x , and it is given to lung cancer segmentation.

Lung Cancer Segmentation Using Water Cycle Bat
Algorithm-(WCBA-) Based Deformable Model. The segmented lung lobe regions R s from preprocessed image are considered as input for lung cancer segmentation. Here, the deformable model is developed for segmenting the lung cancer. Generally, deformable model is a type of surfaces or curves defined in image, which can move under the pressure of internal forces. In addition, the deformable model is a geometric pattern, and its shape varies depending on time. Thus, in this technique, deformable approach is introduced by the modification dictionary-based image segmentation. Meanwhile, in dictionary-based image segmentation, the equation is updated based on the proposed WCBA optimization technique and is designed by integrating WCA and BA models.
Bat algorithm is devised by the inspiration of echolocation feature of microbats. This model is very effective to create improved features for solving multiobjective optimization problems. In addition, it has better capacity for solving high nonlinear problems with complex restrictions. On the other hand, WCA is motivated by nature, and it is devised by water cycle observation, river over sea, and stream flow in real-time situation. Moreover, WCA has the ability to solve various engineering design and managed optimization problems. WCA is effective to offer qualitative solutions and successful computational effectiveness. This method is used to address real-time optimization complexities with sufficient accuracy. In addition, it is effective for controlling dissimilar combinatorial optimization problems and affords optimal solution with less computational difficulty. Thus, the BA is integrated with WCA in order to obtain effective performance for lung cancer segmentation. The algorithmic steps for the proposed method are as follows.
Initially, populations of raindrops are randomly created, which is expressed as 3 International Journal of Biomedical Imaging where f represents total number of raindrops population, and L k specifies v th population.
The fitness measure is estimated in order to predict the optimum solution for effectual lung cancer segmentation process. The fitness function with the least value is considered as the best solution, and it is estimated by equation (26), which is specified as minimization function.
After the computation of fitness value, the cost of all raindrops is computed by below equation where B y denotes the cost of raindrop, f pop is the amount of raindrops, and f var is the number of design variables.
Step 4. Calculation of flow intensity for sea and rivers.
The flow intensity of river and sea is estimated in order to allocate raindrops, which is calculated as where f H i is number of streams, which flow to particular sea or rivers.
Step 5. Stream flow to river. After the computation of flow intensity for river and sea, then streams flow to rivers is estimated by Step 6. River flow to sea.
Here, stream flow to river is estimated in which solution update is performed by developed WSBA. In this stage, rivers flow to sea is estimated by   International Journal of Biomedical Imaging Moreover, the standard position update equation of BA is Moreover, it is expressed in terms of WCA, Substituting equation (17) in (12), Thus, the above expression defines the final updated equation for the developed WSBA.
Where rand specifies uniformly distributed random integer, which ranges between ½0, 1, l b = l min + ðl max − l min Þk, l min = 0, l max = 100, and k = ½0, 1, z ranges from 1 to 1, Q denotes velocity, and L b river signifies the position of river at b th iteration.
Here, river position is replaced with stream, and it affords better solution. Moreover, if a river identifies the best solution than sea, then, location of river is replaced with sea.
Evaporation is a most important factor, which avoids rapid convergence, and evaporation circumstance, which is given by If above equation is satisfied, then, raining procedure is executed where B max is a small integer, which is near to zero.
Once the evaporation condition is satisfied, then, raining process is done. The new position of newly generated streams is located by where LB and UB indicate lower bound and upper bound. Moreover, computational performance and convergence rate are improved by the following equation, and it is only utilized for stream, which is directly flow to sea.
where δ denotes a coefficient, and it shows the range of searching area near sea, rand a is randomly distributed integer.
Step 10. Decrease value of user define parameter. The large amount of B max decreases a search, but minimum value supports search intensity near a sea. The value of B max is decreased by Step 11. Check the feasibility of solution.
The optimal solution is estimated by fitness measure using equation (26) and if a new solution is better than previous solution then updates a previous value with new one.

Optimized Dictionary-Based Algorithm.
The optimized dictionary-based image segmentation process is as follows.

Region-Based Curve Evolution.
Let us assume an image J with background and object, which is characterized by various two intensities. The curve is initialized in image, and it is specified by zero level set of function ϕ, which directs to pixel labeling as outside or inside. Moreover, mean label intensities t out and t in are computed in which outside includes more backgrounds, whereas inside involves more objects. Meanwhile, the curve is introduced for segmenting the image. Furthermore, curves are reduced, while pixel intensities are near to t out , and curves are enlarged, when pixel intensities are near to t in . The threshold value among enlargement and reduction is referred as ðt in + t out Þðt in + t out Þ/2, and this two-stage process is continued. The mean label intensities are recomputed in order to differentiate intensities of background and object, since curve is developed, and at last curve segments out the object. The zero 5 International Journal of Biomedical Imaging level set evolution is specified by where p specified minimizing curve length, and h denotes curvature of level set curve and term weighted, which is expressed as 3.3.2. Texture Dictionary. Here, particular amount of TxT patches is extracted from image and collects pixels intensities in patch vector of t = T 2 length and cluster are patches in d th clusters based on k-means approach and Euclidean distance for creating dictionary. All image patches are allocated to single dictionary component and all image pixels related to t = T 2 dictionary pixels in which the image patches are overlapping. Moreover, allocation of particular image patch to certain dictionary component produces t pixels from image patch relate to t pixels from dictionary part. In addition, binary relation among dictionary pixels and image pixels is specified based on sparse binary matrix G with jφj rows and dt columns; here, jφj defines the whole amount of image pixel, and dt refers to the whole quantity of dictionary pixels. Here, matrix G captures texture details of image by concurrently encoding two things, namely, spatial relationship among patches and dictionary task of all image patches.

Label to Probability
Transformation. Every patch from image J has equivalent label patch from C in : All dictionary units have the amount of allocated image patches, and it permits to estimate dictionary component label. Moreover, dictionary unit label is estimated as pixel-wise average of label patch related to image patches, which are allocated to dictionary unit. Dictionary label is effectively estimated by order-ing label image pixels in binary vector C in and multiplying with G, and it is normalized. The resulting vector includes pixel-wise frequency of dictionary unit belonging to the inside, which is expressed as The elements of g in is necessary to rearrange depending on dictionary dimension for obtaining dictionary label.
The labels in g in and g out = 1 − g in are biased, because of the ratio of inside area jφ in j and outside area jψ out = jψj − j ψ min jj. Moreover, pixel normalization function works on every element of g in , which is given bỹ The next transformation includes the estimation of pixel-wise image probabilities from dictionary labels, which is carried out by averaging. Every dictionary label is located in image space at image patch location, which is allocated to dictionary unit. In order to estimate pixel probability, t values are required to average, because of patch overlap. Here, the effective estimated is done by Moreover, E in is different from C in , because image patches from outside and inside is allocated to similar dictionary unit. The binary values from C in is diffused depending on the texture information encoded in G: 3.3.4. Multiple Labels. The layered image label with C 1 to C x layers is generated for managing multiple layers. In addition, every layer is a binary indicator of label, and the layers are Input: Rain drop population and initial parameters Output: Optimal solution Begin: Initialize a population from rain drop, sea and river Estimate fitness function Estimate a cost of each rain drop based on equation (8) Establish intensity of flow for rivers and sea based on equation (9) Stream flow to river is carried out by equation (10) River flow to sea is done by equation (21) Replace position of river and sea Check evaporation condition If jL b sea − L b river j < B max ; b = 1, 2, 3, ⋯f M − 1 Perform training process using equation (23) and (24) End if Replace user defined parameter Check feasibility of solution End.
Algorithm 1: Pseudocode of the proposed algorithm. 6 International Journal of Biomedical Imaging sum to one in all pixels. The transformation is applied to every layers; thereby, dictionary labels are g 1 to g x :. Meanwhile, normalization of area is carried out pixel-wise for every layer, Once area normalization is completed, then, transformation is applied to everyg Χ , thus, in X probability images E 1 to E x , which is sum to one in every pixel.

Curve Evolution.
The closed curve is denoted as zero level set of function ϕ, which defines label image C in attains 1, while ϕ is negative or else it is zero. The label image is converted to probability image E in as in equation (30). The curve point at positions with large E in should be move outwards, whereas curve point with large E out = 1 − E in should move inside, and curve must be convergence in band, while E in = E out . The curve evolution is specified as Every Χ labels with single level set function is represented as ϕ Χ , x = 1, : ⋯ , Χ for segmenting multiple labels. The pixel-wise transformation of probabilities for every labels are expressed as Here, the level set evolution for a multilabel segmentation is expressed by the following equation, Moreover, the curve update is performed by Based on WCBA, curve update is equation (21) of equation (35), Substitute equation (36) in equation (35), the solution Therefore, the above expression is utilized for updating the curve; thereby, the proposed WCBA-based deformable model achieved better performance for lung cancer segmentation with less complexities.

Results and Discussion
The implementation of the proposed work is carried out in Matlab environment. The experiment is conducted using Lung Image Database Consortium image collection (LIDC-IDRI) [24]. This dataset was started by the Foundation for the National Institutes of Health (FNIH). This dataset includes three modalities; lung cancer screening CT scans with annotated lesions, computed radiography, and digital radiography. Moreover, eight medical imaging companies and seven centers are incorporated to produce this database including 1018 cases.

Performance Metrics.
The performance metrics which are used to evaluate the proposed WCBA based deformable model approach is as follows.
4.1.1. Average Segmentation Accuracy. Segmentation accuracy is utilized in order to predict the correctness of segmentation and is defined as where A, B, K, and I denote true positive, true negative, false positive, and false negative, respectively.

Average Jaccard Coefficient.
Jaccard coefficient is used to calculate the similarities of two samples. The Jaccard coefficient is expressed as where M and N are samples.
where Y is a target output and Z signifies predicted output.

Segmentation
Results. The experiment is conducted on three sample CT images. Figure 2 illustrates the segmentation results obtained by the developed WCBA based deformable model. Figure 2(a) represents three sample images of CT, Figure 2(b) shows preprocessed images, Figure 2(c) depicts the lung lobes regions segmentation, and Figure 2(d) represents lung cancer segmentation. The performance evaluation and comparative analysis of the proposed method are discussed in the following sections.

Performance Evaluation.
The performance of the proposed method is evaluated using three metrics, namely, average segmentation accuracy, average dice coefficient, and average Jaccard coefficient for the cluster size of 7 with different iteration values as depicted in Figure 3. For the cluster size of 7, the average segmentation accuracy of the proposed

Comparative Analysis.
The comparative analysis of the proposed WCBA based deformable model is performed with the existing techniques, namely, CNN [3], IPCT + NN [14], and dictionary-based segmentation [21]. Three evaluation metrics, average segmentation accuracy, average Dice coefficient, and average Jaccard coefficient, are used for the performance comparison with different cluster size as well as image size as depicted in Figure 4. For cluster size 4 and image size 256 × 256, the proposed method achieved an average segmentation accuracy of 0.9245 and correspondingly the other methods achieved 0.8745, 0.8788, and 0.8823 as shown in Figure 4(a).
Here, the proposed method obtained a better percentage improvement of 5.40%, 4.94%, and 4.56% with respect to the existing techniques. Likewise, in Figure 4(b), the average Dice coefficient obtained by the proposed method is 08208 and for other techniques 0.7278, 0.7696, and 0.7863. Hence, the percentage of improvement of the proposed method is 11.31%, 6.23%, and 4.20%. In Figure 4(c), the average Jaccard coefficient of the proposed method is 0.8155 and for the other techniques 0.7657, 0.7753, and 0.7953, respectively. In this case, the percentage of improvement is 6.10%, 4.92%, and 2.47%.
Similarly, the performance comparison o7f the proposed method with the existing techniques is shown in Figure 5   International Journal of Biomedical Imaging Figure 5(a), the average segmentation accuracy achieved by the proposed method is 0.9024, while other methods achieved 0.8224, 0.8591, and 0.8766, respectively. Here, the percentage improvement of the proposed model is 8.86%, 4.79%, and 2.85% as against to the existing techniques. Furthermore, the average Dice coefficient obtained by the proposed method is 0.8261 and for other methods 0.7282, 0.7832, and 0.7999, respectively, as shown in Figure 5(b). In this case, the improvement obtained by the proposed method is 11.84%, 5.20%, and 3.17%. Likewise, in Figure 5(c), the average Jaccard coefficient obtained by the proposed method is 0.8228 and for other methods 0.7437, 0.7702, and 0.7977, respectively. Here, the proposed method attained the performance improvement of 9.61%, 6.39%, and 3.05%, respectively. Table 1 summarizes the performance of the proposed method with the existing techniques in terms of average segmentation accuracy, Dice coefficient, and Jaccard coefficient for different cluster and image size. Tables 2 and 3 illustrate the comparison of computational time and memory utilization of the proposed method and existing techniques. From Table 2, it is observed that the processing speed of the proposed method is high and hence

10
International Journal of Biomedical Imaging computationally efficient. Similarly, from Table 3, it is noticed that the memory requirements of the proposed method are less as compared to other techniques.

Conclusion
In this paper, a WCBA-based deformable model for lung cancer segmentation is presented. This approach consists of preprocessing, lung lobe regions segmentation, and lung cancer segmentation. After preprocessing, accurate lung lobe regions are extracted using Bayesian fuzzy clustering models. Once the lung lobe regions are extracted, then, lung cancer segmentation is performed using the WCBA-based deformable model. The proposed WCBA is devised by the incorporation of BA and WCA. The experiment was carried out using standard database. Furthermore, the performance is evaluated using average segmentation accuracy, average Dice coefficient, and average Jaccard coefficient for different cluster sizes. The results obtained by the proposed method are very encouraging as compared to the existing lung cancer segmentation techniques. The future dimension of this work can be extended by developing other novel optimization algorithms for obtaining better segmentation performance.

Data Availability
The codes along with dataset images are available on github repository. Link https://github.com/mamthashetty234/ Water-Cycle-Bat-Algorithm.git.