Development of CAD System for Automatic Lung Nodule Detection: A Review

Lung cancer spreads rapidly and is the leading cause of cancer mortality globally. Computer-aided detection (CAD) systems for automatic lung cancer detection have a significant influence on patient survival. In this article, we summarize the relevant literature on CAD systems for lung cancer detection. A CAD system comprises preprocessing, segmentation, lung nodule detection, and false-positive reduction with feature extraction. We evaluated the selected literature in terms of the dataset used for method validation, the number of cases, the image size, the nodule detection technique, the feature extraction approach, the sensitivity, and the false-positive rate. In our analysis, the best-performing CAD systems achieve high sensitivity with few false positives. They also use large datasets, which improves detection accuracy and precision. The CNN is the best-performing lung nodule detection method and merits further development, because it has advanced considerably in recent years and has yielded impressive outcomes. We hope this article will help researchers and radiologists in developing CAD systems for lung cancer detection.


Introduction
Lung cancer is the leading cause of cancer death worldwide and the most dangerous type of cancer [1]. According to the American Cancer Society, lung cancer killed 158,040 people in the United States in 2015 [2]. In Indonesia, based on 2018 Globocan data, a total of 26,095 people died from lung cancer with 30,023 new cases, the highest in Southeast Asia [3]. Based on the latest statistical data, the 5-year survival rate is around 16%, and it was estimated that by 2020 the number of cancer deaths would reach 1 million every year, with lung cancer accounting for the largest percentage. Nevertheless, if nodules are detected early, the survival rate can be increased. A lung nodule is an abnormality of tissue in the lungs that can quickly progress to lung cancer [4]. Lung nodule analysis, carried out through detection and classification, is one of the effective steps in preventing lung cancer. Generally, a characteristic lung nodule is about 3 mm to 30 mm in diameter [5]. Circumscribed, juxtavascular, juxtapleural, and pleural-tail nodules are depicted in Fig. 1. Circumscribed nodules are freely scattered and unrelated to other tissue structures, whereas juxtavascular nodules are firmly attached to blood vessels and juxtapleural nodules lie in the area around the pleura [4]. In recent years, computer-aided detection (CAD) systems have been developed. Detecting lung cancer at an early stage aims to reduce mortality rates and improve survival. Small lung nodules may be non-cancerous (benign) or cancerous (malignant). Benign lung tissue does not grow much, while malignant tissue grows rapidly and attacks the body, so it is very dangerous for health [6].
* Corresponding author: sekarsari103@mail.ugm.ac.id
The top priorities of CAD are to recognize images accurately and to extract Regions of Interest (ROI) from various imaging modalities, including computed tomography (CT), positron emission tomography (PET), X-ray, and magnetic resonance imaging (MRI). CAD systems are further categorized into computer-aided detection (CADe) and computer-aided diagnosis (CADx). A CADe system is limited to analyzing images and identifying abnormal tissue areas, whereas a CADx system can be used to diagnose a disease by determining the type and malignancy of the anomalies [4]. In most cases, a CADe system for lung nodule detection is divided into four stages: (a) preprocessing, (b) segmentation, (c) nodule detection to determine candidate nodules, and (d) false-positive reduction, which involves feature extraction and nodule classification using feature-based classifiers [7]. The feature extraction process is based on shape and texture. For shape features, the geometric values of each structure are measured (such as shape proportion, density, roundness, elongation, weighted radial distance, and the Boyce-Clark radial shape index) to find the most spherical shapes, because nodules are rounder than other tissue structures [15]. After feature extraction, supervised or unsupervised classifiers are applied to the candidates to reduce the number of false positives (FPs) [4]. This FPs reduction step focuses on identifying the true lung nodules among all suspicious candidates and removing false nodules, which lowers the false-positive rate of the CADe system. The classification process has four possible outcomes. If a candidate is correctly classified, it is a true positive (TP, a nodule labeled as a nodule) or a true negative (TN, a non-nodule labeled as a non-nodule). If a candidate is incorrectly classified, it is a false positive (FP, a non-nodule labeled as a nodule) or a false negative (FN, a nodule labeled as a non-nodule) [16].
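The four classification outcomes above are the basis of the performance figures quoted throughout this review. The following minimal sketch shows how sensitivity, specificity, accuracy, and FPs/scan can be derived from them. The function name, the argument names, and the example counts are illustrative assumptions and are not taken from any of the reviewed studies.

```python
def detection_metrics(tp, tn, fp, fn, n_scans):
    """Summary metrics from confusion-matrix counts (hypothetical helper)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # share of true nodules found
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # share of non-nodules rejected
    accuracy = (tp + tn) / (tp + tn + fp + fn)           # share of all correct decisions
    fps_per_scan = fp / n_scans                          # average false alarms per scan
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "fps_per_scan": fps_per_scan,
    }


# Made-up counts for illustration only.
metrics = detection_metrics(tp=90, tn=880, fp=20, fn=10, n_scans=10)
print(metrics)
```

A system is usually judged on sensitivity and FPs/scan together, since either value can be improved at the expense of the other.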
Medical images are acquired using numerous imaging modalities, such as CT scans, X-rays, and MRI [17]. Medical images can also be obtained from publicly available image datasets. Several public databases of lung images that anyone can access, including LIDC-IDRI, LUNA16, ANODE09, ELCAP, Kaggle, NELSON, and others, are shown in Table 1. The differences between imaging modalities are not explained in this article, which only lists several publicly available lung datasets. The primary goal of public lung image datasets is to provide the research community with data for CADe system research, development, evaluation, and benchmarking. CT images are the most widely available image type in public datasets because they can be used effectively for the detection of lung disease [23]. Although it is difficult to interpret and identify tumors from CT scans, this can be done successfully with the aid of a computer approach that uses image processing and artificial intelligence [17]. The advantage of CT images in lung nodule detection is that the resulting images are more detailed and clear, can specifically depict parts of the lung [22], and could help in detecting temporal and spatial heterogeneities in an unobtrusive manner [24].
Much research has addressed lung nodule detection using CADe systems. These systems have usually been built on conventional machine learning, because computing resources and large datasets were insufficient, using techniques such as multiple gray-level thresholding, linear discriminant analysis, distance transformation, and the support vector machine (SVM). In recent years, however, many researchers have developed lung detection systems using deep learning. One such method is the CNN, which performs well in computer vision and increases the accuracy and sensitivity of CADe systems [5].
This article aims to provide researchers with an overview of CAD systems for lung nodule detection and classification that can be used as learning material. The review covers several public datasets that are widely used in lung nodule detection along with their details, the general stages of a CAD system, the segmentation and detection methods that have been widely used from earlier work until now, and the lung nodule detection methods currently in use. The trend from 2019 to 2021 is deep learning in the form of CNNs, and we report the performance of each method.

Review of Lung Nodule Detection
The lung nodule detection system is divided into four stages: (1) preprocessing, (2) lung segmentation, (3) nodule detection, and (4) false-positive (FPs) reduction, which includes feature extraction and determines the sensitivity level and the FPs/scan value. Because segmentation and feature extraction are critical to detection accuracy, we summarize some of the widely used segmentation and feature extraction techniques. In addition, we summarize several variations of nodule detection and classification using both conventional and deep learning methods.

Preprocessing
Preprocessing is a necessary early step in lung nodule detection on CT images, because raw CT images contain a lot of irrelevant information or noise that interferes with the CAD process [5]. Rule-based approaches to preprocessing medical images generally include thresholding [7], the Gaussian filter [8][9][15], the median filter [9][10][15], brightness, shape and position of the lung [1], grayscale conversion [10][25], histogram equalization [8], and morphological transformation [19]. Anitha et al. proposed a preprocessing method with two subprocesses carried out on the lung CT image: enhancing the lung tissue structure and removing noise. The lung tissue structures (fissures, vascular and bronchial trees) are enhanced using morphological transformation, while noise is removed with a Wiener filter to reduce the amount of irrelevant information [19].
Widodo et al. proposed preprocessing with histogram equalization, which increases contrast by stretching the image intensity distribution or by transforming the values of the color map used. The following step is filtering, which applies a Gaussian low-pass filter to remove noise and smooth image features [8]. Kuruvilla et al. used grayscale preprocessing, in which the grayscale image is converted into a binary image: each pixel whose value is higher than the threshold is replaced by 1, and each pixel whose value is lower than the threshold is replaced by 0 [25]. Another technique combines a Gaussian filter and a median filter. The Gaussian filter removes noise, and the median filter removes the small noise in CT images known as salt-and-pepper noise. This smooths the image and reduces the speckle noise in the lung CT image [8][9]. In summary, the preprocessing techniques most widely used before lung segmentation are the Gaussian and median filters, which remove salt-and-pepper noise and refine images from the public LIDC-IDRI database. The preprocessing techniques reported are detailed in Table 2.
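As an illustration of two of the preprocessing steps described above, the following minimal sketch applies a 3 x 3 median filter to remove salt-and-pepper noise and then binarizes the result with a fixed threshold. It is not the implementation of any reviewed study: the toy image, the window size, the threshold of 100, and the function names are assumptions, and a practical pipeline would normally rely on an image processing library rather than nested lists.

```python
from statistics import median


def median_filter3(img):
    """3x3 median filter; border pixels are copied unchanged for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


def binarize(img, threshold):
    """Pixels above the threshold become 1, all others become 0."""
    return [[1 if p > threshold else 0 for p in row] for row in img]


# Toy grayscale image: dark region (20) on the left, bright region (180) on the right,
# with one "pepper" pixel (0) and one "salt" pixel (255).
image = [
    [20, 20, 180, 180, 180],
    [20, 20, 180, 0, 180],
    [20, 255, 180, 180, 180],
    [20, 20, 180, 180, 180],
    [20, 20, 180, 180, 180],
]

smoothed = median_filter3(image)   # isolated noise pixels are replaced by local medians
binary = binarize(smoothed, 100)   # assumed threshold, chosen for this toy image only
for row in binary:
    print(row)
```

Filtering before thresholding matters here: binarizing the unfiltered image would turn both noise pixels into spurious foreground or background points.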

Lung Segmentation
Lung segmentation is the process of separating lung nodules from other parts of the lung CT scan image and further enhancing the resulting image to obtain detail [10]. Lung segmentation techniques are divided into three kinds: (1) deformable boundary-based techniques, (2) edge-based techniques, and (3) threshold-based techniques [4]. Each has its own advantages and disadvantages. The quality threshold (QT) segmentation technique was applied to human chromosome gene segmentation, where it offered speed and ease of application. It groups the points whose Euclidean distance to a starting point is smaller than or equal to the quality threshold value, which limits the parameters needed to establish clusters. This process is called candidate cluster formation. Filho et al. used a threshold value of 90, selected because it gave good results for lung nodule detection [15]. Threshold techniques in lung segmentation reduce the non-nodule area and increase the nodule detection rate [13].
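The candidate cluster formation step described above can be sketched as follows. This is a simplified reading of quality threshold clustering based only on the description in the text (distance to a starting point), not the exact procedure of Filho et al. [15]. The point coordinates, the threshold of 2.0, and the rule of keeping the largest candidate cluster at each round are assumptions made for illustration.

```python
import math


def candidate_cluster(points, seed, remaining, quality_threshold):
    """Indices of remaining points within quality_threshold of the seed point."""
    return [i for i in remaining if math.dist(points[seed], points[i]) <= quality_threshold]


def qt_clustering(points, quality_threshold):
    """Repeatedly form one candidate cluster per seed and keep the largest one."""
    remaining = list(range(len(points)))
    clusters = []
    while remaining:
        candidates = [
            candidate_cluster(points, seed, remaining, quality_threshold)
            for seed in remaining
        ]
        best = max(candidates, key=len)  # first largest candidate wins ties
        clusters.append(best)
        remaining = [i for i in remaining if i not in best]
    return clusters


# Hypothetical 2-D points (for example, pixel coordinates) forming two groups.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
clusters = qt_clustering(points, quality_threshold=2.0)
print(clusters)
```

The only tuning parameter is the quality threshold, which is consistent with the ease of application noted above.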
In addition to thresholding, lung segmentation techniques include the active appearance model (AAM) [8], active shape model (ASM) [18], active contour model (ACM) [12], watershed techniques [9], morphological operations [25], and fissure region segmentation [19]. Widodo et al. proposed the AAM technique, a statistical learning method that models parameters describing shape differences between classes and texture variations. The core of this model is principal component analysis (PCA), which performs an eigen-analysis of the covariance matrix of the training vectors that match the shape and texture in the training images. As a whole, the AAM model has three parts: (1) shape data alignment, (2) generation of a parameter model from statistical and actual data, and (3) template matching. The resulting segmentation separates the lung image from other nearby tissues [8]. The ACM technique proposed by Kasinathan et al. for lung tumor segmentation integrates a local bias-field image formulation with an active contour model. The mean square error was used to fit homogeneous CT image regions carefully and to segment tumor regions with inhomogeneous intensity efficiently. The proposed ACM technique was evaluated on 850 lesion images from the LIDC-IDRI database and produced accurate detection of lung tumors in CT images [12]. Kuruvilla et al. proposed a lung segmentation technique that applies morphological operations to CT images after converting grayscale images to binary images. The advantages of the morphological operation technique are its speed and the simplicity of its application [25]. Jayaraj et al. proposed watershed segmentation, whose main characteristic is the ability to extract and separate objects that touch one another in the image. It is a region-based model of mathematical morphology.
It performs an image decomposition and allocates each pixel to a region, or watershed [9]. Huang et al. showed that lung CT images can also be segmented with deep learning, in particular a fully convolutional network (FCN). It uses five convolution layers with a lower rate to obtain higher resolution and increased segmentation precision. The proposed FCN technique was evaluated using the LIDC-IDRI database and obtained an accuracy of 94.6% [11]. Chunran et al. used an FCN to segment CT images from the LIDC-IDRI database in 2018 [26]. Messay et al. proposed the Regression Neural Network (RNN) segmentation method. They created a fully automated (FA) system, a semi-automated (SA) system, and a hybrid system. The hybrid system is derived from the FA and SA systems and generates several parameters, which are then determined adaptively for each nodule using the RNN [27]. Sankar et al. also proposed an RNN technique in 2020, using RNNs to improve the detection of juxtapleural and juxtavascular lesions. When recognizing lesions at the same intensity level, their RNN method outperforms the skeleton graph cut and level set methods [24]. Shaziya et al. proposed the UNet and CNN methods for segmenting lung CT images. The Dice similarity coefficient (DSC) results show that UNet is 1.27% better than CNN at image segmentation [28]. Arora et al. also performed segmentation using UNet with 662 chest X-ray (CXR) images and obtained a DSC of 0.9680 for lung segmentation in the tuberculosis category [29]. In terms of segmentation performance, a rule-based approach is comparable to a data-based approach. Nevertheless, a data-based approach requires a lot of time to train the learning model and is computationally more expensive than a rule-based approach for CAD system optimization.
A rule-based approach can be applied by adjusting its parameters manually, so researchers are often more comfortable applying a rule-based approach than a data-based approach when processing lung CT images [5]. Overall, thresholding, FCN, RNN, and UNet are among the most promising segmentation methods for image detection tasks. Table 2 summarizes selected reports on lung nodule segmentation techniques.
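Several of the segmentation results above are reported as a Dice similarity coefficient (DSC). As a reminder of what this value measures, the following minimal sketch computes the DSC between a predicted and a reference binary mask as twice the overlap divided by the total foreground size of both masks. The masks and the function name are made up for illustration and do not reproduce any reported result.

```python
def dice_coefficient(pred, truth):
    """DSC = 2 * |A and B| / (|A| + |B|) for two binary masks given as nested lists."""
    flat_pred = [v for row in pred for v in row]
    flat_truth = [v for row in truth for v in row]
    intersection = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_truth if t)
    # Two empty masks are treated as a perfect match (a convention assumed here).
    return 1.0 if total == 0 else 2.0 * intersection / total


predicted_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
reference_mask = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]

dsc = dice_coefficient(predicted_mask, reference_mask)
print(round(dsc, 4))
```

A DSC of 1 means the two masks coincide exactly, and a DSC of 0 means they do not overlap at all.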

Lung Nodule Candidate Detection
Candidate nodule detection identifies lung tissue structures that are suspected to be nodules. This stage is carried out after lung segmentation, which reduces the workload of detection on the input CT images because the background and unwanted areas have already been removed. Random forest, support vector machine (SVM), naive Bayes, K-nearest neighbor (KNN), and CNN are some of the methods used to detect lung nodules. CNN-based detectors come in several architectures, including the plain CNN, R-CNN, Faster R-CNN, and 3D CNN. Gong et al. detected lung nodules with a random forest using 10-fold cross-validation. Their CAD system was validated with two datasets, LUNA16 and ANODE09 [7]. De Carvalho Filho et al. performed lung nodule detection using the SVM method. SVM is a state-of-the-art algorithm based on Vapnik-Chervonenkis theory whose accuracy depends on the choice of kernel parameters such as C. Their experiments used the LIDC-IDRI database with 140 new exams [15]. Nóbrega et al. proposed the naive Bayes and KNN methods for lung nodule detection, with a Gaussian distribution used as the probability density function on which the classifier is built [14].
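To make the role of the Gaussian probability density function in a naive Bayes classifier concrete, the following minimal sketch fits a Gaussian naive Bayes model to a handful of candidate feature vectors. It illustrates the general technique only and is not the method of Nóbrega et al. [14]. The two features (diameter in mm and a roundness score), the training values, and all function names are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(X, y):
    """Estimate a prior plus a per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        columns = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(X),
            "mean": [mean(c) for c in columns],
            # A small floor keeps the variance strictly positive.
            "var": [pvariance(c) + 1e-9 for c in columns],
        }
    return model


def log_gaussian(x, mu, var):
    """Log of the Gaussian probability density function."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def predict(model, x):
    """Return the class with the highest log posterior (features assumed independent)."""
    def score(params):
        likelihood = sum(
            log_gaussian(xi, mu, var)
            for xi, mu, var in zip(x, params["mean"], params["var"])
        )
        return math.log(params["prior"]) + likelihood
    return max(model, key=lambda label: score(model[label]))


# Hypothetical (diameter_mm, roundness) features for six candidates.
X = [(4.0, 0.90), (5.0, 0.85), (6.0, 0.95), (12.0, 0.30), (15.0, 0.40), (14.0, 0.35)]
y = ["nodule", "nodule", "nodule", "non-nodule", "non-nodule", "non-nodule"]

model = fit_gaussian_nb(X, y)
print(predict(model, (5.5, 0.90)))
print(predict(model, (13.0, 0.33)))
```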
However, detecting lung nodules with conventional machine learning methods is hampered by the task of defining and selecting the features of a specific image. As the number of images in each category increases, the feature extraction process becomes more time-consuming and exhausting [5]. Deep learning techniques have therefore been developed in recent years for lung nodule detection, and several researchers have developed CNN methods for this task. In general, a CNN consists of convolutional (CONV) layers, pooling layers, and fully connected (FC) layers. Jiang et al. studied lung nodule detection using a CNN whose convolutional layers use rectified linear units (ReLU) as the activation function. The advantage of ReLU is that it speeds up training compared with other activation functions. At the pooling layer, max-pooling and average-pooling operations are performed, while the FC layer consists of four channels. The proposed CNN method was evaluated with the LIDC-IDRI database [13]. Wang et al. performed lung nodule detection using a CNN that consists of several layers. The convolution layers use Leaky ReLU as the activation function, the pooling layers use mean (average) pooling, and the FC layer uses global average pooling. The FC layer uses a 4 x 4 kernel for feature mapping, which reduces the large number of connected parameters. Furthermore, a batch normalization layer is used to reduce overfitting and accelerate network convergence [30]. Other studies using CNNs for lung nodule detection were conducted by Kasinathan et al. [12] and Li et al. [18]. The CNN method is still being developed, and it includes architectures such as the region-based fully convolutional network (RFCN) [31], regional CNN (R-CNN) [32], faster regional CNN (Faster R-CNN) [11], ResNet50 [14], and 3D-CNN [33]. Table 3 summarizes the lung nodule detection techniques.
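The CNN building blocks named above (ReLU and Leaky ReLU activations, max and average pooling, and global average pooling) can be illustrated on a single small feature map. The sketch below is only meant to show what each operation computes. The feature map values, the Leaky ReLU slope of 0.01, and the 2 x 2 pooling window are assumptions and are not taken from the cited architectures.

```python
def relu(x):
    """Rectified linear unit: negative activations are set to zero."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Leaky ReLU keeps a small slope for negative inputs (slope is an assumed value)."""
    return x if x > 0 else slope * x


def pool2x2(fmap, reducer):
    """Non-overlapping 2x2 pooling over a 2-D feature map with even dimensions."""
    return [
        [
            reducer([fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1]])
            for x in range(0, len(fmap[0]), 2)
        ]
        for y in range(0, len(fmap), 2)
    ]


def average(values):
    return sum(values) / len(values)


def global_average_pool(fmap):
    """Collapse a whole feature map into a single value."""
    flat = [v for row in fmap for v in row]
    return sum(flat) / len(flat)


# Hypothetical 4x4 output of a convolution layer.
feature_map = [
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, 1.0, 1.5],
    [0.0, 0.0, -1.0, -1.0],
    [3.0, 1.0, 2.0, -4.0],
]

activated = [[relu(v) for v in row] for row in feature_map]
max_pooled = pool2x2(activated, max)
avg_pooled = pool2x2(activated, average)
gap = global_average_pool(activated)
print(max_pooled, avg_pooled, gap)
```

Global average pooling produces one number per feature map, which is why it reduces the number of connected parameters compared with a dense layer over all positions.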
Deep learning has made it possible to apply the CNN method to medical image processing: as technology advances, detection becomes computationally fast and the amount of available data grows. The CNN method has several advantages, including improved image detection performance, high flexibility and adaptability to a wide range of datasets, and the ability to be designed automatically and efficiently using black-box operations [5]. Previous research also shows that deep learning techniques can improve the performance of CAD systems in detecting lung cancer nodules. Such a CAD system not only detects the presence of lung nodules but also provides information on their location and can classify detected nodules as benign or malignant [31].

False Positive Reduction
After the candidate nodule detection step, the candidates are classified into nodules and non-nodules. This stage is called false-positive (FPs) reduction. FPs reduction is divided into two types: conventional feature-based classification and classification using CNN. Conventional feature-based classification relies on feature extraction and nodule candidate detection, for which several methods have been proposed. Below we review some of the publications on these two classification approaches for CAD systems in lung images. Filho et al. proposed the SVM method for classifying lung images from the LIDC-IDRI database, with feature extraction based on shape and texture. They obtained a sensitivity of 85.91%, a specificity of 97.70%, and an accuracy of 97.55% with 0.008 FPs/scan and a free-response operating characteristic (FROC) value of 0.8062 [15]. Other studies using SVM were conducted by Han et al. [35] and Boroczky et al. [36]. Feature-based SVM classification relies on categorical rules in the form of geometric or shape features, intensity features, gradient features, and eigenvalue-based features. The system of Han et al. was validated on 205 patient cases from the publicly available LIDC database and obtained a sensitivity of 89.2% at 4.14 FPs/scan [35]. Boroczky et al. proposed a genetic algorithm for feature extraction with private data (52 true nodules and 443 false ones) from lung CT scans and obtained a sensitivity of 100% and a specificity of 56.4% [36].
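Shape features of the kind mentioned above can be computed directly from a candidate's outline. The following minimal sketch computes one common roundness measure, the circularity 4*pi*A/P^2, from a polygonal contour. The choice of this particular definition, the example contours, and the function names are assumptions made for illustration. The reviewed studies may define their shape features differently.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def polygon_perimeter(vertices):
    """Sum of the edge lengths of a closed polygon."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))


def circularity(vertices):
    """4*pi*A / P^2: equals 1 for a perfect circle and decreases for elongated shapes."""
    return 4 * math.pi * polygon_area(vertices) / polygon_perimeter(vertices) ** 2


# Hypothetical candidate outlines (coordinates in pixels).
round_candidate = [
    (5 * math.cos(2 * math.pi * k / 64), 5 * math.sin(2 * math.pi * k / 64))
    for k in range(64)
]
elongated_candidate = [(0, 0), (8, 0), (8, 1), (0, 1)]  # vessel-like strip

print(round(circularity(round_candidate), 3))
print(round(circularity(elongated_candidate), 3))
```

Because nodules tend to be rounder than vessels and other structures, a feature like this gives a classifier a simple cue for rejecting elongated candidates.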
Jia et al. proposed three-dimensional techniques for lung detection, including maximum intensity projection (MIP), minimum intensity projection (MinIP), surface shaded display (SSD), and volume rendering (VR). Feature extraction identifies the ROI more accurately by considering the shape, gray value, and position of suspicious regions, as well as area, diameter, circularity, mean gray level, and smoothness. The proposed method detected lung nodules with a sensitivity of 95% at 0.91 FPs/scan [37]. Tariq et al. studied lung nodule detection using a neuro-fuzzy method. The neuro-fuzzy classifier is divided into two sub-networks: a fuzzy self-organizing layer and a multilayer perceptron (MLP). The feature vector is used as input to the fuzzy layer to generate a pre-classification vector, which is passed to the MLP for classification. The fuzzy self-organizing layer is responsible for detecting nodule pixels and grouping them according to their similarity (nodule or non-nodule) with different membership values. The MLP network then classifies the input vectors to assign candidates to the appropriate class. Testing was performed with 100 lung CT image datasets from different patients, and the proposed method yielded an accuracy of 95% [10]. To separate nodules from non-nodule objects, Talebpour et al. used a back-propagation neural network composed of three layers. The first layer consists of 22 input neurons, the second layer consists of five hidden neurons, and the third layer is an output layer consisting of one neuron. Each neuron uses a tan-sigmoid activation function. The proposed method was tested with the LIDC-IDRI database and obtained 90% sensitivity at 10 FPs/scan [38]. Another study using the back-propagation neural network was conducted by Kuruvilla et al., who obtained a sensitivity of 91.4% at 30 FPs/scan [25].
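The layer layout described above for the back-propagation network (22 inputs, five hidden neurons, one output neuron, tan-sigmoid activations) can be sketched as a single forward pass. The weights below are random and untrained, the input features are random placeholders, and the decision threshold of 0 is an assumption, so the output has no diagnostic meaning. The sketch only shows how a feature vector flows through such a network.

```python
import math
import random


def make_layer(n_inputs, n_neurons, rng):
    """One weight list per neuron; the last entry of each list is the bias."""
    return [
        [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
        for _ in range(n_neurons)
    ]


def forward_layer(layer, inputs):
    """Weighted sum plus bias, followed by the tan-sigmoid (tanh) activation."""
    return [
        math.tanh(sum(w * x for w, x in zip(neuron[:-1], inputs)) + neuron[-1])
        for neuron in layer
    ]


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
hidden_layer = make_layer(22, 5, rng)  # 22 input features -> 5 hidden neurons
output_layer = make_layer(5, 1, rng)   # 5 hidden neurons -> 1 output neuron

features = [rng.random() for _ in range(22)]  # placeholder feature vector
hidden = forward_layer(hidden_layer, features)
score = forward_layer(output_layer, hidden)[0]

is_nodule = score > 0.0  # assumed decision threshold for a tanh output in (-1, 1)
print(round(score, 4), is_nodule)
```

During training, back-propagation would adjust these weights so that the output moves toward the target label for each candidate.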
Gong et al. proposed a random forest method for classifying lung nodules with the LUNA16 and ANODE09 databases. Testing on the two databases yielded a sensitivity of 79.3% at 4 FPs/scan and a sensitivity of 84.62% at 2.8 FPs/scan [7]. Jayaraj et al. also used the random forest method to detect lung cancer on CT images from the LIDC dataset. Decision making in the random forest classifier is based on the index and entropy. The proposed method obtained an accuracy of 89.90%, a sensitivity of 90.85%, and a specificity of 88.32% [9]. Nóbrega et al. explored the performance of deep transfer learning on lung nodule malignancy classification in order to improve such systems, using the LIDC database for testing. The proposed method compares deep transfer learning and deep features, and obtained an area under the curve (AUC) of 93.1%, a true positive rate (TPR) of 85.38%, an accuracy (ACC) of 88.41%, a precision of 73.48%, and an F1-score of 78.83%. Their research found that deep transfer learning is a relevant strategy for extracting representative features from CT images of lung nodules [14].
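The index and entropy criteria mentioned above for random forest decisions measure how mixed the class labels are at a tree node. The following minimal sketch computes the entropy, the Gini index, and the information gain of a candidate split. Reading "the index" as the Gini index is an assumption, and the labels and split are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the class labels at a node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini_index(labels):
    """Gini impurity: probability of mislabeling a randomly drawn sample."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting the parent node into two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted_children


# Hypothetical node with an even mix of classes and a split that separates them.
parent = ["nodule"] * 4 + ["non-nodule"] * 4
left = ["nodule"] * 4
right = ["non-nodule"] * 4

print(entropy(parent), gini_index(parent), information_gain(parent, left, right))
```

Each tree in the forest chooses, at every node, the feature and threshold whose split scores best under the selected criterion.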
The next method used at the FPs reduction stage is CNN classification. Shin et al. proposed a CNN method for lung nodule detection with an architecture built on GoogLeNet, which consists of a convolution layer, three pooling layers, and nine inception layers. Each inception layer in GoogLeNet consists of six convolution layers and one pooling layer. The system was evaluated using the ILD dataset and obtained a relatively low accuracy of 79% [22]. Golan et al. proposed a deep CNN method trained with the back-propagation algorithm to detect nodules in lung CT images. The CNN is built in two parts. The first part consists of multiple volumetric convolution, rectified linear unit (ReLU), and max-pooling layers. The second part is a classifier consisting of multiple fully connected, threshold, and softmax layers. The system was evaluated with the LIDC dataset and obtained a low sensitivity of 78.9% at 20 FPs/scan [2]. Anthimopoulos et al. proposed and evaluated a CNN designed for the classification of ILD patterns. The proposed network consists of five convolutional layers, LeakyReLU activations, a pooling layer, and three dense layers. The classification performance for lung patterns using this CNN is 85.5% [34]. Dou et al. proposed a three-dimensional (3-D) CNN for false-positive reduction in automated lung nodule detection from volumetric CT scans. The method was extensively validated in the LUNA16 challenge and obtained a sensitivity of 90% at 8 FPs/scan [33]. Tekade et al. proposed lung cancer detection and classification using deep learning with a 3D multipath VGG-like network, which was evaluated on the LIDC, LUNA16, and Kaggle datasets. The method detected and classified lung nodules with an accuracy of 95.60% and a log loss of 0.387732 [6]. Kido et al. proposed a CAD algorithm for lung abnormalities using CNN and regions with CNN features (R-CNN).
R-CNN is an object detection framework that uses a CNN to classify regions within an image. The R-CNN was trained with marked abnormal lesions, and it marked bounding boxes around abnormal lesions on the test images. Their proposed method achieved an accuracy of 84.7% [32]. Jiang et al. proposed an automatic lung nodule detection system using multigroup patches based on a deep learning network. The CAD system obtained a sensitivity of 94% at 15.1 FPs/scan [1]. Huang et al. proposed fast and fully automated detection and segmentation of lung nodules in thoracic CT scans using deep convolutional neural networks. The false-positive (FP) reduction stage is performed by a CNN, and the method obtained an accuracy of 94.6% at 4 FPs/scan. The average Dice coefficient of nodule segmentation compared to the ground truth is 0.793 [11].
Li et al. proposed lung nodule detection using multiresolution convolutional networks for chest X-ray radiographs. They employed patch-based multiresolution convolutional networks to extract features and four different fusion methods for classification. Evaluated on the JSRT database, their method demonstrated an accuracy of 99% at 0.2 FPs/scan [18]. Kasinathan et al. proposed automated 3-D lung tumor detection and classification using CNN. The model was evaluated on the LIDC-IDRI dataset, which consisted of 850 lung nodule-lesion images, and achieved an accuracy of 97% [12]. Shi et al. proposed a CNN multiscale feature fusion method for lung nodule detection. The detection framework consists of two parts: region proposal generation and false-positive reduction. The CNN model architecture is VGG16, and experiments on the LUNA16 dataset show an average sensitivity of 82.62% [39]. Masood et al. proposed automated lung cancer detection using the enhanced multidimensional region-based fully convolutional network (mRFCN) method. Their system was trained and evaluated using the LIDC dataset and achieved a sensitivity of 98.1% and an accuracy of 97.91% [31]. Wang et al. proposed lung nodule detection in CT images using a raw patch-based CNN. They compared the performance of ResNet with different CNN architectures on CT images from the LIDC-IDRI dataset and obtained a high detection sensitivity of 92.8% at 8 FPs/scan [30]. The CAD systems described above are summarized in Table 3.
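Most results in this section are reported as a sensitivity at a given number of FPs/scan, which corresponds to one operating point on a free-response operating characteristic (FROC) curve. The following minimal sketch shows how such an operating point can be read off a list of scored candidates by lowering the decision threshold until the false-positive budget is exceeded. The candidate scores, the counts, and the function name are made up for illustration, and tied scores are not treated specially.

```python
def sensitivity_at_fp_rate(candidates, n_true_nodules, n_scans, max_fps_per_scan):
    """Best sensitivity reachable while keeping FPs/scan at or below the given limit.

    candidates: list of (score, is_true_nodule) pairs produced by a detector.
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    tp = fp = 0
    best = 0.0
    for _score, is_nodule in ranked:  # lowering the threshold admits one candidate at a time
        if is_nodule:
            tp += 1
        else:
            fp += 1
        if fp / n_scans > max_fps_per_scan:
            break
        best = tp / n_true_nodules
    return best


# Hypothetical detector output over 2 scans containing 5 true nodules in total;
# one nodule was never proposed as a candidate, so sensitivity cannot reach 100%.
candidates = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.50, False), (0.40, True), (0.30, False),
]

sens_at_1 = sensitivity_at_fp_rate(candidates, n_true_nodules=5, n_scans=2, max_fps_per_scan=1.0)
sens_at_2 = sensitivity_at_fp_rate(candidates, n_true_nodules=5, n_scans=2, max_fps_per_scan=2.0)
print(sens_at_1, sens_at_2)
```

Because sensitivity rises as more false positives are tolerated, two systems should be compared at the same FPs/scan level rather than on sensitivity alone.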

Discussion
Based on the studies and analyses summarized above, it is clear that automatic CAD detection systems are evolving each year to make nodule detection more effective and of higher quality. For detecting and classifying lung nodules, the best CAD system is one that achieves high accuracy and sensitivity. To identify research directions and future challenges, we summarized several literature studies on lung nodule detection reported from 2006 to 2021 in the Science Direct, Springer Link, IEEE Xplore, and Web of Science databases. In reviewing the current studies, direct comparison of results is not the main concern, because the framework and evaluation process of each proposed method must be taken into account. However, we evaluated the lung nodule detection methods in CAD systems based on the data used, the number of nodules, the image size, and the best performance, including sensitivity and false-positive reduction. We also compared and reviewed several lung nodule detection methods at each step: preprocessing, segmentation, nodule detection, and classification of nodules versus non-nodules with feature extraction and FPs reduction. According to our analysis of the relevant literature, several lung nodule detection studies used large datasets. CT images are the most commonly used dataset type. Jiang [31], and others also used the same dataset. In addition, the LUNA16 dataset is widely used. Gong et al. used the 1186-image LUNA16 dataset [7], which was also used by Dou et al. [33], Tekade et al. [6], Shi et al. [39], and others. Table 1 summarizes a number of datasets from other publicly available databases.
Apart from the dataset used, we also analyzed the CAD systems in terms of their preprocessing, segmentation, detection, and FPs reduction techniques. Some of the preprocessing and segmentation techniques are summarized in Table 2. The most widely used preprocessing techniques are the Gaussian filter and the median filter. Jayaraj et al. used Gaussian and median filters on 1018 images acquired from the LIDC-IDRI database. The Gaussian filter removes noise, and the median filter removes the small noise in CT images known as salt-and-pepper noise. This smooths the image and reduces the speckle noise in the lung CT image [9]. The same technique was used by Filho et al. [15], Widodo et al. [8], and Sankar et al. [24].
Image segmentation is the next process in a CAD system. Lung segmentation separates lung nodules from the other parts of a lung CT scan image and then enhances the resulting image to reveal more detail. Thresholding is a widely used segmentation technique [1][10][15]. Deep learning has evolved as a segmentation technique over time, and the approach varies depending on the type of lung image nodule. Huang [28]. Arora et al. used UNet to perform segmentation in the tuberculosis category and obtained a Dice similarity coefficient (DSC) of 0.9680 [29]. As a result, deep learning segmentation techniques such as FCN, RNN, CNN, and UNet have considerable potential.
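As a small illustration of the two ideas mentioned above, the sketch below applies a fixed global threshold to a toy image and scores the result against a reference mask with the DSC, defined as 2|A ∩ B| / (|A| + |B|). The function names, the threshold value of 128, and the 3x3 data are hypothetical; real CT pipelines typically work with Hounsfield units and adaptive thresholds.

```python
def threshold_mask(image, threshold):
    """Return a binary mask: 1 where intensity >= threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


def dice_coefficient(pred, truth):
    """Dice similarity coefficient of two equally shaped binary masks."""
    intersection = sum(
        p & t
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy image and reference ("ground truth") mask.
image = [
    [10, 200, 210],
    [20, 220, 30],
    [15, 25, 35],
]
truth = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

mask = threshold_mask(image, threshold=128)
dsc = dice_coefficient(mask, truth)
print(round(dsc, 3))
```

In this toy case the threshold recovers three of the four reference pixels and adds no false ones, so the DSC is 2·3 / (3 + 4) = 6/7.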
Nodule detection and feature extraction follow the segmentation process. Their goal is to determine whether a candidate region in an image is a nodule. Table 3 summarizes the steps for nodule detection and feature extraction, including the best performance of each method. Filho et al. developed a nodule detection technique with high accuracy and sensitivity, employing an SVM with shape and texture features [15]. Tariq et al. detected nodules with a neuro-fuzzy classifier, using vector and intensity features [10]. Talebpour et al. used a BPNN to detect nodules, with geometric and texture features [38]. However, machine learning methods for nodule detection struggle with defining and selecting the features of a specific image by hand, which makes them more time-consuming. Deep learning CNNs are now used extensively in nodule detection. Several works, including Dou [30] and Masood et al., who used an RFCN [31], achieved accuracy and sensitivity above 90%. For the best results in lung nodule detection, feature extraction and false-positive reduction are critical processes.
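The two figures most often reported for these detectors, sensitivity and false positives per scan, can be computed as in the following sketch. The function name `detection_metrics` and the counts in the example are invented for illustration and do not come from any of the reviewed studies.

```python
def detection_metrics(true_positives, false_negatives, false_positives, num_scans):
    """Return (sensitivity, false positives per scan) for a nodule detector.

    sensitivity  = TP / (TP + FN)   fraction of real nodules that were found
    FPs per scan = FP / num_scans   average number of false alarms per scan
    """
    actual_nodules = true_positives + false_negatives
    if actual_nodules == 0 or num_scans == 0:
        raise ValueError("need at least one annotated nodule and one scan")
    sensitivity = true_positives / actual_nodules
    fps_per_scan = false_positives / num_scans
    return sensitivity, fps_per_scan


# Hypothetical counts: 100 annotated nodules across 50 scans.
sensitivity, fps_per_scan = detection_metrics(
    true_positives=90, false_negatives=10, false_positives=200, num_scans=50
)
print(f"sensitivity = {sensitivity:.2f}, FPs/scan = {fps_per_scan:.1f}")
```

Reporting both values together matters because sensitivity can usually be raised by accepting more candidates, at the cost of more false positives per scan.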
Going forward, further research is needed on the development of CAD systems for lung nodule detection. The goal is to obtain more accurate detection results, with high sensitivity and a reduced number of FPs. The following are critical topics for the future development of CAD systems for lung nodule detection: (1) Developing deep learning techniques, such as CNNs, with a focus on improving lung nodule detection performance. In addition, a batch normalization layer can be added to reduce overfitting and accelerate network convergence [30]. (2) Building a CAD system that can detect all categories of lung nodules with high precision and sensitivity and a low false-positive rate. (3) Training and evaluating the proposed method on large datasets, such as the LIDC-IDRI and LUNA16 public databases, to provide a more extensive assessment of the general and clinical performance of the detection system.
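For readers unfamiliar with the batch normalization layer mentioned in topic (1), the sketch below shows its training-time forward computation for a single feature over one mini-batch: each value is shifted by the batch mean, scaled by the batch standard deviation, and then rescaled by the learnable parameters gamma and beta. This is a simplified illustration in pure Python with hypothetical names and values; it omits the running statistics used at inference time and is not the layer configuration of any cited network.

```python
from math import sqrt


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift.

    y_i = gamma * (x_i - mean) / sqrt(var + eps) + beta
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased (population) variance
    return [gamma * (v - mean) / sqrt(var + eps) + beta for v in values]


# Hypothetical activations of one feature for a mini-batch of four samples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)

out_mean = sum(normalized) / len(normalized)
out_var = sum((v - out_mean) ** 2 for v in normalized) / len(normalized)
print(round(out_mean, 6), round(out_var, 4))
```

After normalization the feature has approximately zero mean and unit variance regardless of its original scale, which is the property usually credited with faster and more stable training.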

Conclusion
In this article, we have provided a critical description, based on literature studies, of research work on CAD systems for lung nodule detection. Most of the research has used CT scan images to evaluate the proposed method. We have given a brief general explanation of the lung nodule detection CAD system, which consists of several steps: preparing or acquiring image data, preprocessing, segmentation, lung nodule detection, and FPs reduction with feature extraction. We found that many of the popular lung nodule detection works used data from the LIDC-IDRI database to evaluate the proposed method. Furthermore, we found that some works achieved better results in terms of sensitivity, specificity, accuracy, FPs/scan, and other parameters. We have also summarized several methods for each of the steps in lung nodule detection. The trend in recent years for determining candidate nodules and extracting features has been to use deep learning techniques such as CNNs. Although we found that some CAD systems achieved high sensitivity with low false-positive rates, many challenges remain in optimizing CAD systems for lung cancer detection. The most important goal of a top-performing CAD system is to assist radiologists in lung cancer detection.