Journal of Molecular Biomarkers & Diagnosis

Description
Lung cancer is the leading cause of cancer-related deaths in the general population [1]. Early diagnosis of a malignant pulmonary nodule can improve the 5-year survival rate of lung cancer by up to 80% [2]. Incidentally detected pulmonary nodules are increasingly common because of the growing use of diagnostic imaging, especially computed tomography (CT) of the chest. Physicians and trainee doctors most often depend on experienced radiologists to label these nodules confidently as benign or malignant. This creates a need for a method that supports self-learning and also assists radiologists in ruling out malignancy with confidence.
To meet this need, there has been extensive research on Computer-Aided Diagnosis (CAD) for the characterization of lung nodules. Content-Based Image Retrieval (CBIR) is a type of CAD tool that involves two main steps: feature extraction and image retrieval. A radiologist characterizes a nodule using several features, namely texture, shape, size, density and margins. CBIR uses these image features to build a search index. Given a query image, it applies similarity metrics to retrieve similar objects from a database. Hence, CBIR can act as a learning tool and can also assist radiologists in the diagnosis of lung cancer by showing them examples of similar nodules from a pre-stored database of proven cases.
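The two steps described above can be illustrated with a minimal sketch. It assumes that feature extraction has already produced one numeric vector per nodule; the case identifiers, labels and feature values are hypothetical and serve only to show how a query is ranked against a database of proven cases using the Euclidean distance.

```python
import math

# Hypothetical database of proven cases: (case id, diagnosis, feature vector).
# The three feature values per nodule are made up for illustration.
DATABASE = [
    ("case_a", "benign", [0.20, 0.10, 0.30]),
    ("case_b", "malignant", [0.80, 0.90, 0.70]),
    ("case_c", "malignant", [0.75, 0.80, 0.90]),
    ("case_d", "benign", [0.10, 0.20, 0.20]),
]


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retrieve(query, database, k=3):
    """Return the k database entries closest to the query vector."""
    ranked = sorted(database, key=lambda entry: euclidean(query, entry[2]))
    return ranked[:k]


# Query nodule described by the same (hypothetical) three features.
results = retrieve([0.78, 0.85, 0.75], DATABASE, k=2)
for case_id, label, _ in results:
    print(case_id, label)
```

A real system would replace the linear scan with a search index and would normalize the features so that no single feature dominates the distance.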
The clinical relevance of CBIR was first elaborated by Muller et al. [3], who emphasized its usefulness in clinical decision-making, medical research and medical education. This has motivated various researchers to work on CBIR, with the principal objective of developing algorithms that retrieve similar images to support diagnosis. Several CBIR research projects are underway in the medical field, a few of them focusing exclusively on lung nodules. A CBIR system named BRISC (acronym for BRISC Really IS Cool) was developed by Lam et al. [4], in which nodules were segmented using boundary information and different texture features were extracted from each CT image. For a query nodule, other nodules were retrieved from a database. Retrieved nodules were considered relevant if they were different slices of the same query nodule, or the same slice of the query nodule evaluated by a different radiologist. This system did not help in differential diagnosis or self-learning. However, it paved the way for further work in this context.
A CBIR system was developed by Seitz et al. [5], in which 64 visual features were extracted, covering texture, size, shape and intensity. The Euclidean distance was used to measure similarity. However, the nodules were segmented manually, which was time-consuming.
Kuruvilla et al. [6] also used a CBIR system, in which CT examinations with similar nodules were retrieved according to the parameters that calculate the accuracy of the neural network algorithm. The similarity metrics used were the Euclidean, Manhattan, Chebyshev, Tversky, Bray-Curtis, Canberra, city block, squared chord and chi-squared distances. Lucena et al. [7] used the weighted Euclidean distance (WED), with weight adjustments, to improve retrieval precision over systems using the unweighted Euclidean distance. Using WED, precision increased by 17.3% on average.
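Several of the metrics named above have simple closed forms, and a short sketch may help to compare them. The definitions below are the standard textbook ones; the example vectors and the weight vector are hypothetical, and the weight-adjustment procedure of Lucena et al. is not reproduced here.

```python
import math


def euclidean(u, v):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """Sum of absolute differences (also called city block distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def chebyshev(u, v):
    """Largest absolute difference over all features."""
    return max(abs(a - b) for a, b in zip(u, v))


def canberra(u, v):
    """Sum of |a - b| / (|a| + |b|); terms with a zero denominator are skipped."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def bray_curtis(u, v):
    """Sum of |a - b| divided by sum of |a + b| (undefined if that sum is zero)."""
    denom = sum(abs(a + b) for a, b in zip(u, v))
    return sum(abs(a - b) for a, b in zip(u, v)) / denom


def weighted_euclidean(u, v, w):
    """Euclidean distance with a non-negative weight per feature."""
    return math.sqrt(sum(wi * (a - b) ** 2 for a, b, wi in zip(u, v, w)))


# Hypothetical feature vectors and weights, for illustration only.
u = [1.0, 2.0, 3.0]
v = [4.0, 6.0, 3.0]
weights = [1.0, 0.0, 1.0]  # a zero weight ignores the second feature

print(euclidean(u, v), manhattan(u, v), chebyshev(u, v))
print(canberra(u, v), bray_curtis(u, v), weighted_euclidean(u, v, weights))
```

With all weights equal to 1, the weighted Euclidean distance reduces to the ordinary Euclidean distance, so the weights can be read as a way of emphasizing the features that best separate relevant from irrelevant nodules.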
Very recently, Dhara et al. [8] developed a CBIR-based CAD system in which lung nodules were segmented using a semi-automatic technique, followed by annotation and ground truth delineation of nodule features such as size, shape, margins and texture. They proposed a malignancy rank on a scale of 1-5, which was correlated with biopsy results, and created a benchmark ground truth database. After the database was created, the CBIR-based CAD system was developed and later validated on the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) data. The radiologist only had to provide a seed point on the query nodule. The system then automatically retrieved the top 5 similar nodules by comparing the features of the query nodule with those of the nodules in the database. The retrieval system used 2D shape-based, 3D shape-based, 2D texture-based, 3D texture-based and margin-based features of the nodules. The similarity metrics used to retrieve and rank nodules were the Euclidean, Manhattan and Chebyshev distances. The retrieved nodules were ranked in descending order of similarity to the query nodule and displayed along with their class labels. The class labels determined the Decision Index (DI). A higher DI meant a higher likelihood that the query nodule was malignant.
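The exact formula for the DI is not given here, so the following sketch rests on an explicit assumption: the DI is taken to be the fraction of the retrieved nodules whose class label is malignant. This is an illustrative stand-in, not the definition used by Dhara et al., and the function name and labels are hypothetical.

```python
def decision_index(retrieved_labels):
    """Fraction of retrieved nodules labelled malignant.

    ASSUMPTION: this simple proportion is an illustrative stand-in for the
    Decision Index; the published definition may differ.
    """
    if not retrieved_labels:
        raise ValueError("at least one retrieved nodule is required")
    malignant = sum(1 for label in retrieved_labels if label == "malignant")
    return malignant / len(retrieved_labels)


# Hypothetical class labels of the top 5 retrieved nodules, most similar first.
top5_labels = ["malignant", "malignant", "benign", "malignant", "benign"]
di = decision_index(top5_labels)
print(di)
```

Under this assumption the DI ranges from 0 (all retrieved nodules benign) to 1 (all malignant). A rank-weighted variant, in which the most similar nodules count more, would be a natural refinement.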

Conclusion
In conclusion, a CBIR-based CAD system is a good self-learning tool, can assist trainee radiologists in determining the malignancy status of a nodule, and can serve as a second opinion even for experienced radiologists. However, more collaborative research is required to improve CBIR-based CAD systems, particularly in automated segmentation of lung nodules, in the feature set and in retrieval strategies.