A Hybrid Approach for Lung Nodule Detection Using the Deformable Model and Distance Transform

Computer-aided diagnosis (CAD) systems are gaining recognition and are increasingly used by clinicians as an aid for detecting and interpreting diseases, owing to their growing accuracy and reliability. Lung nodule detection is a crucial and difficult step for CAD systems. In this paper, a hybrid approach for lung nodule detection using a deformable model and the distance transform is proposed. The proposed method automatically detects all major kinds of nodules, namely juxta-pleural, isolated, and juxta-vascular nodules, along with non-solid nodules. Results show 95.2% accuracy with 4.85 false positives per scan. One significant achievement of the proposed work is its ability to detect nodules of various sizes, from 3 mm to 30 mm. The proposed technique has been tested on the publicly available Lung Image Database Consortium (LIDC) dataset. The results show the effectiveness of the proposed technique for early detection.


Introduction
Research shows that lung cancer is one of the most prevalent types of cancer [1][2] and the most frequent cause of death among all types of cancer [3][4]. In the five-year relative survival data, lung cancer has the second-lowest survival rate, behind only pancreatic cancer. The survival rates are less than 10% for both males and females [5]. Recent reports suggest that the survival rate of lung cancer over the past five years varies between 13% and 21% [6]. This rate shows an increase of up to 50% when lung nodules are diagnosed in early stages. As per statistics from 2008, lung cancer is a prevalent disease that affects a very large population throughout the world. According to the last global estimates, a total of 1,200,000 new cases are reported annually [7], and the numbers are on the rise. Unfortunately, despite the latest technologies available, diagnoses are still regularly made late, which affects treatment results. Diagnosing lung cancer using low-dose computed tomography offers hope of changing this situation, since it allows lung tumors to be assessed more proactively and effectively.
Research shows two major reasons for the tendency toward late diagnosis. First, with the available methods, it is difficult to diagnose lung cancer at early stages because of the lack of symptoms. Second, poor prognostics also contribute to the problem. It is challenging to determine early whether a pulmonary nodule exists and whether it is malignant. During diagnosis, radiologists need to analyze hundreds of computed tomography (CT) images relying solely on human judgment, which makes the process error-prone. The need for automated intelligent support for radiologists is spurring an increased interest in CAD systems [8][9].
A CAD-based diagnosis system supports the findings made by radiologists with a quantitative analysis of the radiological image. The basic steps used in CAD schemes are: first, image processing for the extraction and detection of nodule candidates; second, quantification of the image features of the candidate abnormalities; third, classification of the data into normal and abnormal features of the lung images (benign and malignant); and last, quantitative evaluation and retrieval of images similar to those of unknown lesions. In lung tumor detection, CT imaging is widely considered the most significant and most sensitive method of detecting lung nodules. A CT scan is an imaging technique that uses X-rays to produce images of a cross-section of the body. Automated nodule detection schemes have been shown to significantly increase diagnostic accuracy in radiological imaging [10][11]. Fig. 1 shows a description of the different structures of the lungs [12]. An important aspect to consider is that a radiologist's analysis is mainly based on the morphology of the structures under investigation, which can only be checked in 3D space. A CT examination, however, is performed on two-dimensional images, so there is a gap between what the radiologist needs to perceive and what is provided. The radiologist must mentally reconstruct the three-dimensional structure of the tissue under investigation. This is an intricate undertaking that leaves the door open to many mistakes, which is the major reason for the great demand for computational frameworks that help with the detection and analysis of lung nodules [13]. A chest X-ray and a CT image are shown in Fig. 2.
The paper is organized as follows: after the introduction, we present a brief literature review outlining previous work and the current state of research in this domain. In Section 3, the proposed methodology is explained, followed by a discussion of the experimental results in Section 4. Section 5 presents the conclusion and future work.

Literature Review
Dehmeshki et al. [14] presented a CAD scheme based on genetic algorithm template matching for the automatic detection of lung nodules. The fitness function is computed from the geometric shape of the voxels together with the global distribution of the nodule intensity. Sousa et al. [8] presented a CAD scheme that performs segmentation in multiple stages, with each stage responsible for segments of the CT image volume. Ye et al. [9] proposed a CAD scheme that uses five features comprising intensity information, shape index, and 3D spatial location. Opfer et al. [15] reported CAD-based lung nodule detection with 89% sensitivity for nodules larger than 4 mm and 60% sensitivity for nodules smaller than 4 mm. Messay et al. [16] applied multiple gray-level thresholds to the volumetric lung regions to identify nodule candidates. Likewise, various template-matching-based methods have been developed. Pei et al. [17] introduced a three-stage technique that uses a 2D multiscale filter, separates blob-shaped nodules from non-nodules, and finally extracts and classifies the shape features. They tested on 30 exams and reported a sensitivity of 100% and a false positive rate of 8.4 per exam. The CAD scheme proposed by Tan et al. [18] was based on neural networks. Their genetic algorithm additionally incorporates innovations such as a new feature classifier described as feature-deselective neuro-evolution of augmenting topologies. Two other classifiers were also used, namely fixed-topology artificial neural networks and the SVM. In testing, the model showed a sensitivity of 87.5%. Lee et al. [19] proposed a two-stage classifier method based on random forests. In the first stage, the lung nodule regions were selected, and in the second stage, false positives were reduced. Their results showed 100% for the true positives and 1.4% for the false positives per image. Camarlinghi et al. [20] proposed combining several CAD systems to improve lung nodule identification. Their analysis was compared with the results of the individual systems by means of the ROC curve. The results showed that the more CAD systems used in the detection, the higher the number of true positives (65) and the lower the number of false positives (139), on an LIDC set of 69 images containing 114 lung nodules. Chama et al. [21] presented a system that uses mean shift followed by methods based on geometric properties, such as the region of interest (ROI), derived from a symmetric centered map of two normal subjects. They experimented on 429 images (133 normal and 296 abnormal) from the LIDC-IDRI and Interstitial Lung Disease (ILD) databases. The technique achieved sensitivities and specificities of 97% and 99% (normal images) and 83% and 99% (abnormal images), respectively.
Choi et al. [22] introduced genetic programming (GP)-based feature transformation and classification for the automatic detection of pulmonary nodules on computed tomography images. This CAD approach works in three main steps. In the first step, lung segmentation is performed using thresholding and 3D connected-component labeling. In the second step, optimal thresholding and rule-based pruning are used to detect the nodule candidates, and in the last step, a GP-based classifier separates nodules from non-nodules.
Recently, with the development of the convolutional neural network (CNN), more and more researchers [23][24][25][26][27][28] have used CNNs for the detection of lung nodules. Juang et al. [29] used automatic multi-thresholding for tumor classification. Similarly, Xu et al. [30] proposed a novel technique for medical image segmentation using a self-adaptive PCNN model.
The major contributions of this paper are the following: (1) The proposed technique does not focus on a single type of nodule. Instead, it detects all types of lung nodules, including isolated, juxta-pleural, and juxta-vascular nodules. (2) It can detect large juxta-vascular nodules, which earlier techniques did not detect because of their complex structure. Juxta-vascular nodules are difficult to work with because they are attached to other structures such as vessels and the bronchial tree.
The proposed scheme is based on a four-step approach. Initially, lung segmentation is performed using linear interpolation and lung parenchymal identification. In the second stage, we detect the Region of Interest (ROI) using multiple thresholds, contour correction, and a seed point choice. In the third stage, nodule detection is carried out using both the deformable model (for juxta-pleural and isolated nodules) and the distance transform (for juxta-vascular nodules). In the final stage, false-positive reduction is achieved using fuzzy rule-based pruning.
This brief overview of the literature shows that, despite impressive developments in this field, significant shortcomings still hamper the effectiveness of CAD-based systems. The limitations we focus on in this paper include the lack of a comprehensive mechanism that can work with various kinds of nodules simultaneously. Another area of interest, on which no previous work has focused so far, is large juxta-vascular nodules. In Section 3, we explain the structure of the proposed methodology.

The Proposed Methodology
Lung segmentation is the process of extracting the lung region from the whole CT scan image. A complete CT scan contains many different organs, so it is important to segment the lungs in order to detect only the nodules located in the lungs. Lung segmentation and nodule detection are crucial steps in extracting the nodules. There are four types of lung nodules to be detected. The proposed methodology consists of four steps:
• Lung segmentation is performed using the lung parenchymal identification and linear interpolation techniques.
• The Region of Interest (ROI) is detected using multiple thresholds.
• Nodule detection is carried out using the deformable model (for juxta-pleural and isolated nodules) and the distance transform (for juxta-vascular nodules).
• False-positive reduction is achieved using fuzzy rule-based pruning.
A graphical representation of the proposed methodology is given in Fig. 3. In the following subsections, we explain each stage in detail.

Linear Interpolation
The performance of any CAD system depends greatly on the analysis of the volume of computed tomography images. As a first step, we transform all the voxels into a 3D coordinate grid with uniform 3D spatial resolution. This helps to avoid errors caused by anisotropic voxel grids. Despite its simplicity, this approach is widely used for image reconstruction. Note that the spatial resolution along the axial direction of a CT examination is not the same as the spatial resolution within each slice. That is why we use linear interpolation along the axial direction, as shown in the image.
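The axial resampling described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `resample_axial`, the `(z, y, x)` array layout, and the spacing parameters are assumptions introduced here.

```python
import numpy as np

def resample_axial(volume, slice_spacing, target_spacing):
    """Linearly interpolate a (z, y, x) volume along z to a new slice spacing.

    slice_spacing and target_spacing are in the same physical unit (e.g. mm).
    The array layout and parameter names are assumptions for this sketch.
    """
    volume = np.asarray(volume, dtype=float)
    n_slices = volume.shape[0]
    # Physical z positions of the original slices and of the resampled slices.
    z_old = np.arange(n_slices) * slice_spacing
    z_new = np.arange(0.0, z_old[-1] + 1e-9, target_spacing)
    # Index of the original slice below each new position, plus the fraction
    # of the way towards the next slice.
    pos = z_new / slice_spacing
    lower = np.clip(np.floor(pos).astype(int), 0, n_slices - 1)
    upper = np.clip(lower + 1, 0, n_slices - 1)
    frac = (pos - lower)[:, None, None]
    # Weighted average of the two neighbouring slices.
    return (1.0 - frac) * volume[lower] + frac * volume[upper]
```

Choosing `target_spacing` equal to the in-plane pixel size would give approximately isotropic voxels, which is the goal stated in the text.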

Lung Parenchymal Volume Identification
Lung segmentation is necessary in any CAD scheme, because detecting nodules directly from the input image is a difficult task. In our work, the input image is segmented into several slices to minimize the computational complexity and the number of false positives (FPs). We calculate the initial volume of interest (VOI) to separate the lung portions. The objective is to better identify the juxta-vascular, isolated, and juxta-pleural nodules. The segmentation makes it easier to find the different types of nodules in the image. The technique most often used for lung segmentation in the literature is threshold-based region filling. Its limitation is that it erroneously removes important regions, including the juxta-pleural nodules, because these nodules are attached to the pleura. A simple threshold-and-fill algorithm cannot detect the juxta-pleural nodules separately. To overcome this limitation, the proposed technique works in the following two steps:
• The Inclusion Process: a 3D region growing algorithm is used for lung parenchyma identification.
• The Connectivity Analysis: to incorporate the interior structures with high intensity values (nodules and vessels) and the pleura layer adjacent to the inner lung volume, a dilation procedure is applied, which is called the connectivity analysis.

The Inclusion Process
When we examine the structure of the lungs, we find that the bronchial tree and air lie in the interior lung volume. In CT images, these appear as low-intensity voxels surrounded by high-intensity voxels, which are connected to the pleura surface. We use this information to segment the interior lung volume with a 3D region growing (RG) algorithm. The technique starts from a seed point and grows incrementally, merging adjacent points that have properties similar to the seed. This process iterates until no more points satisfy the predefined criteria. The most challenging task in the RG algorithm is to choose a good inclusion rule together with a proper seed point. For this purpose, we use the Simple Bottom Threshold (SBT): if the intensity of a generic voxel is lower than the threshold value, the voxel is added to the region. The threshold is selected automatically using the method proposed by Ridler et al. [31]. This procedure relies on the gray-level distribution of the CT voxels, which consists of two parts: one contains air, the bronchial tree, the trachea, and the lung parenchyma; the other contains muscles, bones, the vascular tree, and fat. The optimal threshold is set between these two regions. The RG algorithm uses two seed points, whose selection criteria are defined as described by Cascio et al. [32] (Eq. (1)).
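A minimal sketch of threshold-based 3D region growing, as described above, is given here. It assumes 6-connectivity and a precomputed threshold, neither of which is specified in the text; the function name `region_grow_3d` is hypothetical, and the seed criteria of Cascio et al. [32] are not reproduced.

```python
from collections import deque
import numpy as np

def region_grow_3d(volume, seeds, threshold):
    """Grow a region from seed voxels using the Simple Bottom Threshold rule.

    A voxel joins the region if its intensity is below `threshold` and it is
    6-connected to a voxel already in the region (connectivity is an
    assumption of this sketch). Returns a boolean mask.
    """
    volume = np.asarray(volume)
    mask = np.zeros(volume.shape, dtype=bool)
    queue = deque()
    for seed in seeds:
        seed = tuple(seed)
        if volume[seed] < threshold and not mask[seed]:
            mask[seed] = True
            queue.append(seed)
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    # Breadth-first growth: stops when no neighbouring voxel satisfies the rule.
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in neighbours:
            p = (z + dz, y + dy, x + dx)
            inside = all(0 <= c < s for c, s in zip(p, volume.shape))
            if inside and not mask[p] and volume[p] < threshold:
                mask[p] = True
                queue.append(p)
    return mask
```

Growth stops at high-intensity voxels, so low-intensity voxels on the far side of a bright barrier are not reached unless they contain a seed of their own.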

The Connectivity Analysis
After the inclusion process, we obtain the segmented image. However, the internal nodules, vessels, and airway walls are not included in this image. To include them, a dilation process called the connectivity analysis is applied. After the connectivity analysis, the juxta-pleural nodules are also included in the segmented image. Fig. 4 represents the proposed methodology diagram.
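The dilation step can be illustrated with a small NumPy-only sketch. The structuring element (6-neighbourhood) and the number of iterations are assumptions, since the text does not specify them, and `dilate_6` is a hypothetical name.

```python
import numpy as np

def dilate_6(mask, iterations=1):
    """Binary dilation of a 3D mask with a 6-connected structuring element.

    Both the structuring element and the iteration count are assumptions
    made for this sketch.
    """
    out = np.asarray(mask, dtype=bool).copy()
    for _ in range(iterations):
        # Pad with background so that shifted views stay inside the array.
        p = np.pad(out, 1, mode="constant", constant_values=False)
        grown = p[1:-1, 1:-1, 1:-1].copy()
        grown |= p[:-2, 1:-1, 1:-1] | p[2:, 1:-1, 1:-1]   # neighbours along z
        grown |= p[1:-1, :-2, 1:-1] | p[1:-1, 2:, 1:-1]   # neighbours along y
        grown |= p[1:-1, 1:-1, :-2] | p[1:-1, 1:-1, 2:]   # neighbours along x
        out = grown
    return out
```

Applied to the region-growing mask, each iteration extends the lung volume by one voxel layer, which is how structures attached to the lung boundary can be brought inside the mask.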

The Multiple Optimal Threshold for ROI Extraction
The process of selecting multiple optimal thresholds is shown in Fig. 5. Nodules have different intensities and varying degrees of vessel attachment. To extract the Region of Interest (ROI) in such a situation, we use multiple thresholds, each obtained by optimal thresholding. A thoracic CT image consists of two main groups of pixels: high-intensity pixels located in the body (body pixels) and low-intensity pixels located in the lungs and the surrounding air (non-body pixels). The large intensity difference between these two groups makes thresholding well suited to separating them. The method proposed by Choi et al. [22] used five static threshold values, which is not an optimal threshold. In our technique, we use the optimal thresholding method defined by Hu et al. [33]. This technique iteratively computes a threshold value so that the two groups of pixels are well separated.

Figure 5: The multiple optimal threshold selection process

Let $T^{w}$ be the threshold value at step $w$, and let $\mu_b^{w}$ and $\mu_n^{w}$ be the mean intensity values of the body pixels (i.e., with intensity higher than $T^{w}$) and of the non-body pixels (intensity lower than $T^{w}$), respectively. The threshold for step $w + 1$ is:

$T^{w+1} = \dfrac{\mu_b^{w} + \mu_n^{w}}{2}$

This procedure is repeated until convergence, i.e., until step $e$, where $T^{e} = T^{e-1}$. The initial threshold $T^{0}$ is set to 128, which is the middle gray level. When convergence is reached, the image is thresholded at the value $T^{e}$. The body pixels are set to 0 and the non-body pixels are set to 1.
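The iterative threshold update can be sketched in a few lines. This is an illustration under stated assumptions: the convergence tolerance `tol` and the iteration cap `max_iter` are introduced here for numerical robustness and are not part of the description above, which stops when two successive thresholds are equal.

```python
import numpy as np

def optimal_threshold(image, initial=128.0, tol=0.5, max_iter=100):
    """Iteratively place the threshold midway between the two group means."""
    image = np.asarray(image, dtype=float)
    t = float(initial)
    for _ in range(max_iter):
        body = image[image > t]        # high-intensity (body) pixels
        non_body = image[image <= t]   # low-intensity (non-body) pixels
        if body.size == 0 or non_body.size == 0:
            break                      # one group is empty; keep current t
        t_next = 0.5 * (body.mean() + non_body.mean())
        converged = abs(t_next - t) < tol
        t = t_next
        if converged:
            break
    return t

def binarize(image, t):
    """Set non-body pixels to 1 and body pixels to 0, as in the text."""
    return (np.asarray(image) <= t).astype(np.uint8)
```

For a toy image with values 10, 20, 200, and 220, the first update moves the threshold from 128 to (210 + 15) / 2 = 112.5, and the second update leaves it unchanged, so the iteration stops there.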

The Seed Points Choice
The chosen seeds serve as the input to the step called the "Deformable Model". Nodules have a higher intensity than the pulmonary parenchyma, so they can be found easily by searching for the local maxima in the volume of interest. A voxel-level operation is performed as follows: $O(a,b,c) = (M(a,b,c) - N(a,b,c)) \cdot I(a,b,c)$ (2) where M(a,b,c) is the matrix after the connectivity analysis process, N(a,b,c) is the matrix obtained after the RG process, I(a,b,c) is the input image, and O(a,b,c) is the resultant matrix. Fig. 6 represents the result of Eq. (2). After obtaining the matrix O(a,b,c), we apply a peak detector algorithm to it to find the local maxima and select the seed points, which are used in the next step. The process of the seed point choice is shown in Fig. 6.
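The voxel-level operation and the search for local maxima can be sketched as below. The peak detector shown is a plain 26-neighbourhood comparison, which is an assumption: the text does not say which peak detector is used. The sketch also assumes non-negative intensities (for example, shifted Hounsfield units) and uses a hypothetical `min_value` cut-off.

```python
import itertools
import numpy as np

def seed_points(M, N, I, min_value=0.0):
    """Compute O = (M - N) * I and return O with its local-maximum coordinates.

    M: mask after connectivity analysis, N: mask after region growing,
    I: input intensities (assumed non-negative in this sketch).
    """
    O = (np.asarray(M, float) - np.asarray(N, float)) * np.asarray(I, float)
    # Pad with -inf so that border voxels can still be compared.
    padded = np.pad(O, 1, mode="constant", constant_values=-np.inf)
    is_peak = O > min_value
    for shift in itertools.product((-1, 0, 1), repeat=3):
        if shift == (0, 0, 0):
            continue
        window = tuple(slice(1 + s, O.shape[a] + 1 + s)
                       for a, s in enumerate(shift))
        # ">=" keeps every voxel of a flat plateau as a candidate seed.
        is_peak &= O >= padded[window]
    return O, np.argwhere(is_peak)
```

Because M − N is non-zero only for voxels added by the dilation, the maxima are searched only among the structures that the region growing step left out.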

The Nodule Detection
In this stage, we separate the various types of nodule candidates from other structures. To detect these nodules, we use two techniques. The first is the deformable model, which detects the isolated and juxta-pleural nodules, and the second is the distance transform, which detects the juxta-vascular nodules. A detailed description of these methods is given below.

The Deformable Model
This model is used to detect the isolated and juxta-pleural nodules in the lung parenchyma. It is also called the mass-spring model. The basic aim of the deformable model is to identify the shape of an object. We use it to identify the nodule's shape among other structures. The deformable model uses a priori knowledge through its definition and parameterization. It finds the balance between the internal forces (describing the model shape) and the external forces (describing the image data). In the physics-based modeling paradigm, the external forces applied to the deformable model drive it to converge to the shape of the object, while the internal forces keep the model smooth during deformation. The deformable model is adjusted according to the energy term until the energy is minimal.

The Design of the Deformable Model
We need to find a suitable energy function whose minimization yields the nodule contour. This model was introduced by Cascio et al. [32]. In the first stage of the 3D deformable model, we take a spherical mesh of masses. The model starts from an expanded position in which the mass mesh encloses the potential nodule. The seed points chosen in the previous section are taken as the center of each spherical mesh. We denote by N the mass points that make up the active model and by t the number of mass points in a single slice, which gives the relation L = t·u, where u is the total number of slices. Each mass N(j) (j = 1…t) in the sphere is connected with two other masses, N(j − 1) and N(j + 1), which belong to the same slice, and with two masses, N(j − n) and N(j + n), which belong to the neighboring slices. We have written it in a more generalized way as: As shown in Eq. (2), every element of the proposed model contributes internal and external energies to the energy functional. We now describe the internal and external forces in detail.

Internal Energy
The internal energy of the deformable model is obtained by computing and summing the following energies.

Elastic Energy
The first contribution used to obtain the shape of the segmented object is the elastic energy. Each vertex is connected to its four neighbors, within its own slice and along the z direction, and those four points contribute to the elastic energy. The elastic energy term is associated with the spring that joins two masses i and j, ∆ denotes the distance between the two mass points, and the spring constant enters as a parameter.

Bending Energy
The second contribution is the bending energy. The model must adapt to the different curvatures that the object might show, which is why the bending energy is applied. The bending energy associated with the generic i-th vertex is defined in Eq. (6), where the position term indicates the position of the i-th vertex.

Attraction Energy
For the vertices that belong to the same slice, the geometric center of their positions is taken as the reference point for computing the mean ($\bar{d}$) and standard deviation ($\sigma$) of the vertex distances. The attraction energy applies to the i-th vertex when its distance $d_i$ from the center satisfies: $d_i > \bar{d} + \sigma$ (7) The attraction energy associated with the i-th vertex is then given as: The farther a vertex lies from the center relative to the average distance, the larger its energy contribution. This contribution makes the model surface nearly spherical around the points that are most likely nodules.

External Energy
The internal energy accounts for the contour points as well as for contraction and bending, but it does not provide any pulling force with which the desired object can be separated from the rest of the image [32]. Therefore, we apply external forces for this task. The external energy is the combination of the gradient and potential energies. We now describe the contribution of each energy in turn.

Gradient Energy
The gradient energy drives the model toward locations with a high gradient. A very small amount of energy is associated with the i-th vertex if it is located on an edge point, as represented by Eq. (9) [32], where the intensity term is the image intensity at the position of the i-th vertex.

Potential Energy
Every point of the model is associated with a potential energy. The higher the intensity at a point, the more energy is associated with it. If a point moves from a low-intensity value to a high-intensity value, such as the pleura surface, it must pay an additional energy cost, such as:

Energy Functional
After calculating the internal and external forces, we sum all the energies to obtain the total energy function, which is represented as follows: Dimensional parameters such as ε, β, α, γ, and δ are determined empirically. Artificial objects of various sizes are introduced into the image to identify these parameters. At the start of the methodology, we stated that our aim is to detect the isolated and juxta-pleural nodules, and both types of nodules were detected successfully. For the detection of the juxta-vascular nodules, we use the distance transform technique, which is described next.

The Distance Transform
After applying the deformable model, we obtain the isolated, juxta-pleural, and some small juxta-vascular nodules. The deformable model excludes the larger juxta-vascular nodules because of their complex structure. To include them, we apply the distance transform together with the 3D region growing algorithm, using the seed points identified earlier. First, the structures are regrouped using the 3D region growing algorithm, which joins connected structures with one another. At this point, the resulting structures are not yet divided into the actual structures (nodules, vessels, and bronchi).
The shape of a nodule is the most important feature for distinguishing it from other structures, and we use this shape information for the analysis. The difficulty lies in distinguishing round shapes (more likely nodules) from cylindrical shapes (more likely vessels). The distance transform is computed [8] for every structure, starting from the voxels on its edge. Its value increases as we move toward the inner voxels, until only the one or two voxels farthest from the edge remain. Structures that contain a large nodule have a high distance transform value. This information is used to separate the larger juxta-vascular nodules from the other structures, such as vessels and bronchi.
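To illustrate why the distance transform separates blob-like from tube-like structures, the sketch below computes a city-block distance to the background by repeated erosion. The choice of the city-block metric and the 6-neighbourhood are assumptions made for simplicity; the text does not state which metric is used, and the function names are hypothetical.

```python
import numpy as np

def erode_6(mask):
    """Keep a voxel only if it and its six face neighbours are foreground."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return (p[1:-1, 1:-1, 1:-1]
            & p[:-2, 1:-1, 1:-1] & p[2:, 1:-1, 1:-1]
            & p[1:-1, :-2, 1:-1] & p[1:-1, 2:, 1:-1]
            & p[1:-1, 1:-1, :-2] & p[1:-1, 1:-1, 2:])

def distance_transform(mask):
    """City-block distance of each foreground voxel to the background.

    Edge voxels get 1, and the value grows by 1 per layer towards the
    interior. Voxels outside the array are treated as background.
    """
    current = np.asarray(mask, dtype=bool).copy()
    dist = np.zeros(current.shape, dtype=int)
    while current.any():
        dist += current          # every surviving voxel is one layer deeper
        current = erode_6(current)
    return dist
```

In this sketch, a 5×5×5 cube reaches a maximum value of 3, whereas a 3×3×9 tube reaches only 2 however long it is, so the maximum distance value per structure can serve as a simple indicator of a compact, nodule-like shape.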

The Nodules Fusion
After all the types of nodule candidates have been detected, we merge them before performing the rule-based pruning. We take the union of all the detected nodules to avoid duplication and to make the rule-based pruning easier.
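If each detector returns its candidates as a boolean voxel mask (an assumption of this sketch, as the text does not describe the data structure), the union can be taken with a logical OR:

```python
import numpy as np

def fuse_candidates(*masks):
    """Union of candidate masks; a voxel found by several detectors counts once."""
    fused = np.zeros(np.asarray(masks[0]).shape, dtype=bool)
    for m in masks:
        fused |= np.asarray(m, dtype=bool)
    return fused
```

Here the hypothetical inputs would be the mask produced by the deformable model and the mask produced by the distance transform.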

The False-Positive Reduction Using the Fuzzy Rule-Based Pruning
Non-nodule elements may be present among the candidates produced by the earlier detection stage. These non-nodule elements are unnecessary and reduce the accuracy of the system. Rule-based pruning is performed, and four rules are defined to remove these unwanted non-nodules. Nodule candidates are characterized by their volume (v), area (a), elongation (e), diameter (d), and circularity (c). Maximum and minimum thresholds are defined and used to distinguish nodules from non-nodule elements.
Minimum and maximum thresholds are represented symbolically for volume, area, and diameter. Together, these four rules help to separate the non-nodule candidates from the nodule candidates. The fuzzy rule-based pruning algorithm is presented here:

The Results/Comparisons with Other Works
Nodules are mainly categorized into three types, namely juxta-vascular, juxta-pleural, and isolated nodules. The technique developed in [22] addressed the detection of all types of nodules, but it was unable to detect some of the large and non-solid nodules, and its accuracy deteriorates for juxta-vascular nodules because of the larger size and complicated structure of these nodules. Our proposed technique successfully detects larger juxta-vascular nodules along with some non-solid nodules. To detect the larger juxta-vascular nodules, which were not detected by earlier techniques, we use the distance transform technique. As shown in Tab. 1, three different training-to-testing distributions (30-70%, 50-50%, and 70-30%) are used. Accuracy, specificity, and sensitivity are used to measure the system performance for each training-to-testing ratio. Tab. 1 illustrates that the accuracy and sensitivity gradually increase as the training sample grows. However, the specificity, which expresses the relationship between the false positives and the true negatives, remains almost constant as the training sample grows. The best results are obtained with the 70-30% training-to-testing ratio.
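For reference, the three measures are computed from the confusion counts using their standard definitions, as in the sketch below. The counts in the usage note are made-up illustrative numbers and are not results from this paper.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

def classification_metrics(tp, fp, tn, fn):
    """Standard sensitivity, specificity, and accuracy from confusion counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),          # true positive rate
        "specificity": _ratio(tn, tn + fp),          # true negative rate
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }
```

With hypothetical counts of 8 true positives, 2 false negatives, 18 true negatives, and 2 false positives, the sensitivity is 8/10 = 0.8, the specificity is 18/20 = 0.9, and the accuracy is 26/30.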
It is important to compare the results of the proposed method with existing methods in the same domain. This is challenging because of differences in evaluation parameters such as result metrics, nodule size, and dataset. For a fair comparison, we considered methods that use the same experimental protocols, as shown in Tab. 2. The false-positive rate is the most important evaluation parameter, since it greatly affects the accuracy of any diagnosis system, and reducing it is a key requirement of any such system. Tab. 2 shows that the results of the proposed method are better than those of the existing methods with respect to the average false-positive rate and sensitivity. Suzuki et al. [34] achieved 80.3% sensitivity with a 16.1% false-positive rate. The methods reported in [16] and [22] used the same nodule sizes as ours; however, their sensitivity is lower than that of our proposed method. Fig. 7 shows the performance of the proposed method as a scatter plot of correctly and incorrectly classified nodules. The figure shows that the proposed method reduces the false-positive rate and increases the true-positive rate.
The results show the performance improvement of our proposed technique in false-positive detection over most of the standard techniques. At the same time, the efficiency of the proposed technique is better than that of most existing methodologies, as shown in Fig. 8. Fig. 8 illustrates the ROC curve, the AUC, and the results of the current classifier. The results in Fig. 8 also show the different features for each group, described in terms of false positives and true positives. Our proposed method improves the performance in terms of true positives, sensitivity, accuracy, and specificity.

Conclusion and Future Work
There is a significant rise in the use of CAD systems to help radiologists detect lung nodules and increase the survival rate of lung cancer patients. These systems are very helpful to radiologists, and patient survival rates have increased drastically. However, these systems have a few limitations when applied to CT scan images. A novel framework for CAD systems is developed and presented in this paper. The lungs are segmented using the linear interpolation and lung parenchymal volume identification methods. We use the multiple optimal thresholding method for region of interest extraction. Candidate nodules are detected by applying the deformable model and the distance transform. The proposed technique performs better than the existing methods and is capable of detecting all types of nodules. The system detects both malignant and benign lesions; however, the categorization of lesions as benign or malignant is left for future work. Our experiments were carried out on the LIDC dataset. We observed a significant improvement in sensitivity, i.e., 95.2%, and a reduction in the false-positive rate, i.e., 4.85%.
Funding Statement: The author(s) received no specific funding for this study.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.