Machine learning to predict lung nodule biopsy method using CT image features: A pilot study

https://doi.org/10.1016/j.compmedimag.2018.10.006

Highlights

  • Semantic and image features predicted success of minimally invasive lung biopsies.

  • Informative features include nodule volume, spiculation, and depth within the body.

  • Developed a method to calculate distance to a nodule through the bronchial system using CT images.

Abstract

The demonstrated benefit of computed tomography (CT)-based screening on lung cancer mortality is poised to make lung nodule management a growing public health problem. Biopsy and pathologic analysis of suspicious nodules is necessary to ensure accurate diagnosis and appropriate intervention. Biopsy techniques vary, as do the specialists who perform them and the ways lung nodule patients are referred and triaged. The largest dichotomy is between minimally invasive biopsy (MIB) and surgical biopsy (SB). Cases of unsuccessful MIB preceding a SB can result in considerable delay in definitive care, with a potentially adverse impact on prognosis as well as potentially avoidable healthcare expenditures. An automated method that predicts the optimal biopsy method for a given lung nodule could save time and healthcare costs by facilitating referral and triage patterns. To our knowledge, no such method has been published. Here, we used CT image features and radiologist-annotated semantic features to predict successful MIB in a way that has not been described before. Using data from the Lung Image Database Consortium image collection (LIDC-IDRI), we trained a logistic regression model to determine whether a MIB or SB procedure was used to diagnose lung cancer in a patient presenting with lung nodules. We found that in successful MIB cases, the nodules were significantly larger and more spiculated. Our model illustrates that using robust machine learning tools on easily accessible semantic and image data can predict whether a patient’s nodule is best biopsied by MIB or SB. Pending further validation and optimization, clinicians could use our publicly accessible model to aid clinical decision-making.

Introduction

Following the National Lung Screening Trial (NLST), which demonstrated the mortality benefit of screening computed tomography (CT) among patients at high risk for lung cancer, management of incidental lung nodules is poised to become a rapidly growing public health problem (Aberle et al., 2011; Marcus et al., 2016; National Lung Screening Trial Research Team, 2011). Biopsy and pathologic analysis of suspicious nodules is frequently necessary to ensure accurate diagnosis and appropriate intervention. Biopsy techniques vary, as do the specialists who perform them and the ways lung nodule patients are referred and triaged. The largest dichotomy is between minimally invasive biopsy (MIB) and surgical biopsy (SB). SB is a more definitive but riskier diagnostic modality that is characterized by surgical resection followed by histopathological analysis of the resected tissue (Reck et al., 2014; Rivera et al., 2013). Cases of unsuccessful MIB preceding a SB can potentially delay definitive care, adversely impact prognoses, and incur avoidable healthcare expenditures. An automated model that objectively predicts the optimal biopsy method for a given lung nodule could improve patient referral patterns, thereby saving time, improving outcomes, and reducing healthcare costs.

MIB is often the initial diagnostic modality of choice, despite its comparatively lower diagnostic yield, because of its lower cost, the lack of any need for hospitalization, and its lower risk of procedural complications such as bleeding, infection, pneumothorax, and death (Belanger and Akulian, 2017; Rivera et al., 2013). This becomes all the more relevant considering that 96% of lung nodules detected in the NLST were benign (Aberle et al., 2011; Marcus et al., 2016; National Lung Screening Trial Research Team, 2011). MIB can be performed percutaneously under image guidance (Conces et al., 1987), or via fiber-optic bronchoscopy (Herth, 2011).

Even though clinicians carefully review both radiologic and patient characteristics to determine the feasibility of MIB, a significant proportion of biopsies still fail to lead to a diagnosis (Belanger and Akulian, 2017). Plausible reasons include the inability to reach the target site with biopsy instruments and the inability to collect a sufficient amount of tissue for pathological analyses. This can often lead to either a second MIB procedure or a move to surgical resection (SB). In addition to accruing healthcare costs, this can lead to loss of valuable time before the institution of potentially life-saving therapy (Taleb et al., 2017).

The ability to prospectively predict biopsy success would be useful, as it would help identify when a MIB procedure would likely be non-diagnostic and enable patients and providers to proceed straight to SB. Previous research has demonstrated that two-dimensional size on CT imaging can help predict biopsy success. Two studies showed that manual identification of a bronchus sign on a thoracic CT—the presence of a lung bronchus in close proximity directly leading to a pulmonary lesion—can predict the success of a bronchoscopic biopsy (Evison et al., 2014; Gaeta et al., 1991). No prior studies, to our knowledge, have used machine learning of radiomic features to predict the biopsy method.

Considerable work has been done on creating machine learning methods to assist doctors in performing percutaneous biopsies. All these tools show the position of biopsy instruments relative to the suspicious nodule and thereby improve the accuracy of needle placement. Yaniv et al. used a registration algorithm to directly map a biopsy point drawn on a diagnostic CT to a perioperative CT scan (Yaniv et al., 2010). He et al. segmented the nodule and pulmonary vessels from the diagnostic radiologic imaging modality and overlaid them onto the intra-procedural interventional radiology CT using electromagnetic guidance (He et al., 2010). Hagmann et al. generated a type of virtual reality display on operating CT images, which can be paired with haptic feedback to further improve biopsy needle placement (Hagmann et al., 2004). However, this research solves a problem (percutaneous biopsy needle placement) inherently different from the task considered here: predicting whether a minimally invasive biopsy (percutaneous or bronchoscopic) or a surgical biopsy should be performed on a patient, based on CT scans.

In this study, we have developed a novel algorithm to objectively predict the optimal biopsy method for a given lung nodule through an automated prediction model incorporating computational and semantic image features. We demonstrate proof of concept utilizing the largest publicly available dataset with biopsy information available. We theorized that such a model could save time and healthcare costs by facilitating referral and triage patterns. To facilitate additional research in this field, all our image feature extraction methods are available online in an open-source Python Package on BitBucket (https://bitbucket.org/connorbrinton/lcat_lung_biopsy).

Data source

We harnessed public domain imaging data from the Lung Image Database Consortium image collection (LIDC-IDRI), which is available through the Cancer Imaging Archive (Armato et al., 2015; Clark et al., 2013). LIDC-IDRI contains thoracic computed tomography (CT) images from 1018 patients with lung nodules in DICOM format. For every patient, four expert radiologists identified any and all lung nodules. The seed pixels corresponding to each identified nodule, and additional semantic notes reported

Tracheal distance computation

An expert pulmonologist (MS) visually inspected the tracheal and bronchiole mappings produced by our algorithm and found it performed well on all anatomically normal CT scans tested. An example of each stage of the algorithm for an example patient from our dataset is given in Fig. 2. In the case of one patient, the model was unable to identify the trachea. On manual review of images, the trachea was noted to communicate with the body surface near the thoracic inlet. In the medical expertise of
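The idea of measuring distance to a nodule through the bronchial system can be illustrated with a shortest-path search over a segmented airway mask. The following is a minimal sketch, not the authors' algorithm: it assumes a binary 3D mask `airway[z][y][x]` has already been produced by some segmentation step, that the trachea seed and the airway voxel nearest the nodule are known, and that voxel spacing is given in millimetres. The function name `airway_path_length` and all parameters are hypothetical.

```python
import heapq


def airway_path_length(airway, start, target, spacing=(1.0, 1.0, 1.0)):
    """Shortest path length (in mm) from start to target, moving only
    through truthy voxels of airway[z][y][x] with 6-connectivity.

    start and target are (z, y, x) tuples; spacing is (z, y, x) in mm.
    Returns None if the target cannot be reached through the airway.
    """
    depth, height, width = len(airway), len(airway[0]), len(airway[0][0])
    # Each neighbour offset is paired with its physical step length.
    steps = [
        ((1, 0, 0), spacing[0]), ((-1, 0, 0), spacing[0]),
        ((0, 1, 0), spacing[1]), ((0, -1, 0), spacing[1]),
        ((0, 0, 1), spacing[2]), ((0, 0, -1), spacing[2]),
    ]
    if not airway[start[0]][start[1]][start[2]]:
        return None
    if not airway[target[0]][target[1]][target[2]]:
        return None

    # Dijkstra's algorithm, so anisotropic voxel spacing is handled correctly.
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, voxel = heapq.heappop(queue)
        if voxel == target:
            return dist
        if dist > best.get(voxel, float("inf")):
            continue  # stale queue entry
        z, y, x = voxel
        for (dz, dy, dx), step in steps:
            nz, ny, nx = z + dz, y + dy, x + dx
            inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
            if inside and airway[nz][ny][nx]:
                candidate = dist + step
                if candidate < best.get((nz, ny, nx), float("inf")):
                    best[(nz, ny, nx)] = candidate
                    heapq.heappush(queue, (candidate, (nz, ny, nx)))
    return None


# Toy single-slice mask (illustrative only): a C-shaped "airway".
toy_mask = [[[1, 1, 1],
             [0, 0, 1],
             [1, 1, 1]]]
print(airway_path_length(toy_mask, (0, 0, 0), (0, 2, 0)))  # 6.0
```

A path-based distance of this kind differs from straight-line distance because it follows the airway lumen, which is closer to what a bronchoscope would have to traverse. A real pipeline would also need robust airway segmentation, which this sketch does not attempt.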

Discussion

In this study, we built a model incorporating both computational and semantic features from the CT scans of 30 patients to predict the optimal biopsy modality for a given lung nodule (i.e. MIB versus SB). We calculated CT image features: 3D volume, tracheal distance, and distance to outer body. We extracted semantic features from radiologists’ notes: sphericity, lobulation, spiculation, calcification, internal structure, and texture. These features were used to train both logistic regression
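To make the modelling step concrete, a logistic regression over per-nodule features can be sketched in a few lines. This is an illustrative sketch only and is not taken from the authors' package: the helper names (`fit_logistic`, `predict_proba`), the plain gradient-descent training loop, the learning rate, and the toy numbers are all assumptions. The toy rows are made-up standardized values for two of the features named above (volume and spiculation), not LIDC-IDRI data.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    """Probability of the positive class (here: MIB) for one feature row."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Made-up standardized features: [volume, spiculation]; label 1 = MIB, 0 = SB.
toy_rows = [[1.2, 0.8], [0.9, 1.1], [0.4, 0.6],
            [-0.5, -0.7], [-1.0, -0.4], [-1.1, -1.2]]
toy_labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(toy_rows, toy_labels)
print(predict_proba(weights, bias, [1.0, 1.0]))    # above 0.5 on this toy data
print(predict_proba(weights, bias, [-1.0, -1.0]))  # below 0.5 on this toy data
```

With only 30 patients, as in the study, any such model would need regularization and cross-validation to give trustworthy estimates; those are omitted here for brevity.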

Conflicts of interest

None of the authors of this manuscript have any financial or personal relationships with other people or organizations that could inappropriately influence and/or bias this work.

References (31)

  • P.P.R. Filho et al. 3D segmentation and visualization of lung and its structures using CT images of the thorax. J. Biomed. Sci. Eng. (2013)
  • M. Gaeta et al. Bronchus sign on CT in peripheral carcinoma of the lung: value in predicting results of transbronchial biopsy. Am. J. Roentgenol. (1991)
  • M.D. Greer et al. Computer-aided diagnosis prior to conventional interpretation of prostate mpMRI: an international multi-reader study. Eur. Radiol. (2018)
  • E. Hagmann et al. A haptic guidance tool for CT-directed percutaneous interventions. The 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (2004)
  • T. Hastie et al. The elements of statistical learning: data mining, inference, and prediction. Springer Series in Statistics (2009)
1 Authors contributed equally to this research.