ABSTRACT
Early detection of lung nodules decreases the risk of progression to advanced stages of lung cancer. Random forest (RF), a machine learning classifier, is used to detect lung nodules and classify soft tissues into nodules and non-nodules. A lung nodule classification approach is proposed to improve early detection of nodules. A five-stage model has been built and tested using 165 cases from the LIDC database. Stage 1 is image acquisition and preprocessing. Stage 2 is extracting 119 features from the CT image. Stage 3 is refining the feature vectors by removing all duplicate instances and undersampling the non-nodule class. Stage 4 is tuning the RF parameters. Stage 5 is examining different collections of the extracted feature sets to select those that score best for classification. RF achieved the highest accuracy among the machine learning classifiers compared, such as KNN, SVM, and DT. The proposed method aims to analyze and select the features that maximize classification results. The pixel-based and wavelet-based feature sets scored best for accuracy. RF was tuned to 170 trees and an in-bag fraction of 0.007. The best results achieved by the proposed model are 90.67%, 90.8%, and 90.73% for sensitivity, specificity, and accuracy, respectively.
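As an illustration only, the Stage 3 refinement and the three reported metrics can be sketched in plain Python. This is not the authors' implementation: the function names `refine` and `evaluate`, the label convention (1 = nodule, 0 = non-nodule), the 1:1 undersampling target, and the fixed random seed are all assumptions, since the abstract does not specify them.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical sample layout: (feature vector, label), 1 = nodule, 0 = non-nodule.
Sample = Tuple[Tuple[float, ...], int]


def refine(samples: Sequence[Sample], seed: int = 0) -> List[Sample]:
    """Remove exact duplicate instances, then undersample the non-nodule class."""
    # dict.fromkeys keeps the first occurrence of each (features, label) pair.
    unique = list(dict.fromkeys(samples))
    nodules = [s for s in unique if s[1] == 1]
    non_nodules = [s for s in unique if s[1] == 0]
    # Assumption: balance the classes 1:1; the abstract gives no target ratio.
    keep = min(len(nodules), len(non_nodules))
    rng = random.Random(seed)
    return nodules + rng.sample(non_nodules, keep)


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy; assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate on nodules
        "specificity": tn / (tn + fp),  # true-negative rate on non-nodules
        "accuracy": (tp + tn) / len(pairs),
    }
```

Here sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and accuracy is (TP + TN) divided by the number of samples. The sketch does not reproduce the RF tuning itself (170 trees, in-bag fraction 0.007).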
Index Terms
- Feature Extraction and Analysis for Lung Nodule Classification using Random Forest