DOI: 10.1145/3328833.3328872
research-article

Feature Extraction and Analysis for Lung Nodule Classification using Random Forest

Published: 09 April 2019

ABSTRACT

Early detection of lung nodules decreases the risk of progression to advanced stages of lung cancer. Random forest (RF), a machine learning classifier, is used to detect lung nodules by classifying soft tissues into nodules and non-nodules. A lung nodule classification approach is proposed to improve early detection of nodules. A five-stage model has been built and tested using 165 cases from the LIDC database. Stage 1 is image acquisition and preprocessing. Stage 2 extracts 119 features from the CT image. Stage 3 refines the feature vectors by removing all duplicate instances and undersampling the non-nodule class. Stage 4 tunes the RF parameters. Stage 5 examines different collections of the extracted feature sets to select those that score best for classification. The accuracy achieved by RF is the highest compared with other machine learning classifiers such as KNN, SVM, and DT. The proposed method aims to analyze and select the features that maximize classification results. The pixel-based and wavelet-based feature sets yielded the highest accuracy. RF was tuned to 170 trees and an in-bag fraction of 0.007. The best results achieved by the proposed model are 90.67%, 90.8%, and 90.73% for sensitivity, specificity, and accuracy, respectively.
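As an illustration only, the refinement step (Stage 3) and the three reported metrics could be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names, the `(features, label)` row layout, the choice to balance the classes 1:1, and the random seed are all assumptions, since the abstract does not state them.

```python
import random


def remove_duplicates(rows):
    """Drop repeated (features, label) instances, keeping the first occurrence."""
    seen = set()
    unique = []
    for features, label in rows:
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            unique.append((features, label))
    return unique


def undersample(rows, majority_label, seed=0):
    """Randomly keep at most as many majority-class rows as there are other rows.

    The 1:1 target ratio is an assumption made for this sketch.
    """
    minority = [r for r in rows if r[1] != majority_label]
    majority = [r for r in rows if r[1] == majority_label]
    rng = random.Random(seed)
    kept = rng.sample(majority, min(len(majority), len(minority)))
    return minority + kept


def evaluate(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for a binary problem.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)   # share of true nodules that were detected
    specificity = tn / (tn + fp)   # share of non-nodules correctly rejected
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical toy data: label 1 = nodule, label 0 = non-nodule.
rows = [([1.0, 2.0], 1), ([1.0, 2.0], 1), ([3.0, 4.0], 0),
        ([5.0, 6.0], 0), ([7.0, 8.0], 0)]
balanced = undersample(remove_duplicates(rows), majority_label=0)
```

Here sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and accuracy is (TP + TN) divided by the number of instances, which are the standard definitions of the three figures quoted in the abstract.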


    • Published in

      ICSIE '19: Proceedings of the 8th International Conference on Software and Information Engineering
      April 2019
      276 pages
      ISBN: 9781450361057
      DOI: 10.1145/3328833

      Copyright © 2019 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited
