A knowledge-based artificial neural network classifier for pulmonary embolism diagnosis
Introduction
Diagnosis of pulmonary embolism (PE) is a challenging task that often requires highly specialized expertise of the kind typically possessed by nuclear radiologists. Because such expertise is scarce worldwide, there is continuing interest in automating the diagnostic procedure, particularly in light of advances in computing technology over the past decade or so. Accordingly, numerous efforts reported in the literature aim to design a classifier algorithm that analyzes a feature set extracted from an imaging modality, typically ventilation–perfusion scans together with correlated chest X-rays (cxr). The most commonly used and most promising classifier algorithms are based on artificial neural networks (ANNs), in nearly all cases multilayer perceptron (MLP) neural networks, configured as “purely” inductive or empirical learners [1], [2], [3], [4], [5], [6], [7]. These classifiers have good performance profiles and, more importantly, further potential for improvement. One constraint shared by all of these studies is that the very nature of the empirical learning process precludes, to an extent, the incorporation of pre-existing domain knowledge. In the case of PE, the domain knowledge is significant because it represents the collective experiential understanding, held by the domain experts, of the diagnostic process.
The development of an empirical, or purely inductive, machine learner typically starts with a designer-specified structure or topology, which reflects the designer's best intuition at the time, and with parameter values that are chosen randomly or, in many cases, heuristically. These parameter values are then adapted to domain-specific input–output data through a procedure called training or learning. Such algorithms often fail to take advantage of any domain-specific collective experiential knowledge beyond the heuristics supplied by the designer. In fact, they assume that this knowledge can be acquired through the training process. To a limited extent, this conjecture may hold for the PE domain as well: a number of purely inductive machine learners may be able to capture a portion of the experiential knowledge through empirical learning. Doing so, however, requires a very high-quality training dataset, which is often hard to come by, especially in the PE domain. Clearly, this is a risky and nondeterministic process, and a better route to improved performance exists. Much of the risk and nondeterminism may be eliminated by a knowledge-based hybrid, trainable classifier, given that substantial experiential knowledge exists and is succinctly stated in the so-called prospective investigation of pulmonary embolism diagnosis (PIOPED) criteria for the PE domain [29]. A knowledge-based hybrid classifier is well positioned to capture the knowledge embedded in the PIOPED criteria and to incorporate further knowledge refinements originating from the domain experts, i.e. nuclear radiologists.
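The purely inductive workflow described above (fixed topology, random initial parameters, adaptation on input–output data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the single logistic unit, the toy dataset, the learning rate, and the function names are hypothetical and are not taken from any of the cited PE classifiers.

```python
import math
import random


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def train_logistic_unit(samples, epochs=2000, lr=0.5, seed=0):
    """Train one logistic unit by online gradient descent.

    The unit starts from small random parameter values, i.e. with no
    domain knowledge built in, and is adapted only from the data.
    """
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for inputs, target in samples:
            out = sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))
            error = target - out  # descent direction for cross-entropy loss
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


def predict(model, inputs):
    """Return the unit's output in (0, 1) for one input vector."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))


# Hypothetical toy data (logical AND of two binary features).
TOY_DATA = [
    ((0.0, 0.0), 0.0),
    ((0.0, 1.0), 0.0),
    ((1.0, 0.0), 0.0),
    ((1.0, 1.0), 1.0),
]

model = train_logistic_unit(TOY_DATA)
```

In this sketch the learner recovers the target concept only because the toy data describe it completely; with sparse or noisy data the outcome would depend on the initial values and on the quality of the data.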
Instantiating a knowledge-based hybrid classifier is premised on the following assumption: a task for which domain knowledge exists both in symbolic form, i.e. a set of if–then rules, and as input–output data, i.e. training data for a machine learning algorithm, might be addressed much more effectively if both forms of knowledge are fused within a common classifier framework. The goal of instantiating a (learning) classifier from domain-specific knowledge and then improving its initial classification accuracy and specificity through follow-up training has been pursued vigorously in the literature [8], [9], [10], [11]. The adaptive neuro-fuzzy inference system (ANFIS), Bayesian belief nets (BBN), and knowledge-based artificial neural networks (KBANN) are among the leading paradigms in this area. The ANFIS maps a knowledge base of fuzzy if–then rules to a neural network topology that closely approximates that of a radial basis function network. The BBN leverages a knowledge base whose rules represent conditional probabilities modeled after Bayes’ formulation. The KBANN typically employs an MLP neural network to capture and represent the knowledge embedded in a rule base. All three paradigms are trainable once the initial computational architecture has been instantiated from the knowledge base.
PE diagnosis can be cast as a classification task that is well suited to a knowledge-based hybrid algorithm. Substantial pre-existing domain knowledge is available in the form of the modified PIOPED criteria [12], [13] and the expertise of nuclear radiologists, together with a modest amount of data appropriate for training a classifier. The modified PIOPED criteria offer a very good framework for the domain knowledge and can easily be translated into if–then rules. Nuclear radiologists can also formulate new rules or help refine those extracted directly from the modified PIOPED criteria. The resulting set of if–then rules, representing the modified PIOPED criteria and the domain-specific knowledge of radiologists, can then be mapped to a specific knowledge-based architecture.
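One way to picture a rule base of this kind is as a list of if–then rules over named findings. The following is a minimal sketch only: the `Rule` class, the finding names, and the class labels are hypothetical placeholders and do not reproduce the actual modified PIOPED criteria.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Rule:
    """IF all `required` findings hold AND no `excluded` finding holds
    THEN conclude `conclusion`."""
    conclusion: str
    required: Tuple[str, ...]
    excluded: Tuple[str, ...] = ()

    def fires(self, findings: Dict[str, bool]) -> bool:
        # Findings that are not listed are treated as absent.
        has_required = all(findings.get(f, False) for f in self.required)
        has_excluded = any(findings.get(f, False) for f in self.excluded)
        return has_required and not has_excluded


# Hypothetical placeholder rules; real rules would be derived from the
# criteria and refined by domain experts.
RULE_BASE = (
    Rule("class_A", required=("finding_1", "finding_2")),
    Rule("class_B", required=("finding_1",), excluded=("finding_2",)),
)


def classify(findings, rules=RULE_BASE, default="unclassified"):
    """Return the conclusion of the first rule that fires."""
    for rule in rules:
        if rule.fires(findings):
            return rule.conclusion
    return default
```

A symbolic rule base like this is easy for domain experts to read and extend, but it cannot be adapted to data, which is the limitation a knowledge-based hybrid classifier is meant to remove.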
A fuzzy inference system (FIS) and a BBN were evaluated separately for the PE diagnosis task, with mixed results, in two earlier studies [14], [15]. The experience with the FIS revealed that the empirical determination of a large number of fuzzy membership functions was a significant hindrance to leveraging the full potential of this algorithm. The BBN implementation posed significant challenges as well, such as the specification of entries in the conditional probability tables, which proved difficult even with the domain expertise provided by nuclear radiologists. Although good parameter learning algorithms exist that could determine these table entries inductively through an automated process, the amount of data available in this domain is not substantial because of the logistical challenges of acquiring such data. Even if these challenges are overcome and the knowledge in the rule base is adequately captured, potential difficulties remain with either algorithm once the knowledge-based effort progresses to the training phase. For the ANFIS, these include the well-known and well-documented limited ability of radial basis function networks to generalize; for the BBN, they include the need for large amounts of quality training data to approximate the posterior probability distribution of the class attributes well. Consequently, we decided against further pursuing these two knowledge-based hybrid learning algorithms, i.e. the ANFIS and the BBN, and opted instead for the KBANN, since a KBANN that leverages the MLP neural network appears to be a more appropriate choice for the PE diagnosis task.
The rationale for this choice rests on the following: (a) a knowledge base in the form of if–then rules can easily be incorporated into an MLP topology; (b) the MLP neural network offers superior ability to generalize; (c) most researchers who attempted to develop a purely inductive or empirical classification algorithm for PE diagnosis employed the MLP neural network, indicating a strong preference for this algorithm in the research community at large; and (d) the performance of a knowledge-based MLP embedded with the pre-existing knowledge, i.e. the PIOPED criteria, can be further improved through training as and when data become available.
The KBANN proposed by Towell and Shavlik [8] is a “hybrid” algorithm that utilizes both explanation-based and empirical learning. More specifically, the KBANN captures problem-specific domain knowledge, which may be represented in propositional logic, as a neural network instantiation (i.e. an MLP neural network). On its own, this step is equivalent to creating a classical expert system, a kind of system known for its lack of trainability. Because it is an ANN, however, the KBANN also has the computational framework to further improve the embedded knowledge by applying an appropriate learning algorithm to the training data [32].
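The mapping from propositional rules to network units can be sketched as follows. In the translation commonly described for KBANN, each antecedent of a conjunctive rule receives a link weight of +ω (or −ω if negated), and the bias is set so that the unit is active only when every antecedent is satisfied. The value ω = 4, the binary inputs, and all names below are assumptions for illustration, not the configuration used in this paper.

```python
import math

OMEGA = 4.0  # assumed link weight for rule antecedents


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def conjunctive_unit(positive, negated=(), omega=OMEGA):
    """Weights and bias for 'head IF all positive AND no negated'.

    With P positive antecedents the bias is -(P - 0.5) * omega, so the net
    input is positive only when all P positive antecedents are on and
    every negated antecedent is off.
    """
    weights = {name: omega for name in positive}
    weights.update({name: -omega for name in negated})
    bias = -(len(positive) - 0.5) * omega
    return weights, bias


def disjunctive_unit(alternatives, omega=OMEGA):
    """Weights and bias for 'head IF any alternative'.

    A bias of -0.5 * omega makes one active alternative sufficient.
    """
    weights = {name: omega for name in alternatives}
    return weights, -0.5 * omega


def activate(unit, inputs):
    """Sigmoid output of a unit for inputs given as {name: 0.0 or 1.0}."""
    weights, bias = unit
    net = bias + sum(w * inputs.get(name, 0.0) for name, w in weights.items())
    return sigmoid(net)


# Hypothetical rule: class_A IF finding_1 AND finding_2 AND NOT finding_3.
rule_unit = conjunctive_unit(("finding_1", "finding_2"), negated=("finding_3",))
# Hypothetical rule set: class_B IF rule_x OR rule_y.
either_unit = disjunctive_unit(("rule_x", "rule_y"))
```

Because the resulting units are ordinary sigmoid neurons, their weights and biases remain adjustable by a gradient-based learning algorithm, which is what separates this construction from a fixed expert system.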
The main contribution of this paper is to demonstrate that the KBANN is a better choice of classifier for the PE diagnosis task than purely inductive machine learners, including the MLP neural network, which is the classifier most commonly employed by researchers in the field [28], [30], [31], [33], [34], [35]. The performance of ANN classifiers for the PE diagnosis task can be enhanced if the neural classifier is instantiated from the pre-existing collective experiential knowledge, i.e. the modified PIOPED criteria, and from further domain expertise within a knowledge-based hybrid learner framework. Accordingly, this paper presents the design, which entails instantiating an ANN classifier from domain-specific knowledge and from refined domain knowledge acquired from nuclear radiologists. It also discusses the comparative performance evaluation of the KBANN for the diagnosis of PE based on features extracted from ventilation–perfusion scans and correlated cxr in compliance with the modified PIOPED criteria. The following sections present the definition of the medical problem, i.e. PE, along with the diagnosis criteria, i.e. the PIOPED criteria; the design of the KBANN; the comparative performance evaluation of the KBANN; and future work and conclusions.
PE and the modified PIOPED criteria
Pulmonary emboli occur as a result of venous thrombi (blood clots within veins) which dislodge from their original source and eventually create an obstruction once in the lung. Such emboli usually originate in the major veins of the lower extremities but can also originate from pelvic veins, veins of the upper extremities, and from the right heart itself. Upon dislodging from the source, the emboli progressively travel through the venous circulation, entering the vena cava to the right side of
Artificial neural networks
ANNs are computational paradigms with initial inspirations rooted in biology and physics [20]. The computational framework for an ANN entails a large number of inter-connected processing elements or nodes, called neurons, which, for the most part, implement similar and relatively simplistic computational tasks. The true computational power on the other hand derives from a combination of adaptable interconnections, called the weights, a layered topology, and nonlinearities associated with the
KBANN design
This section will introduce the process of instantiating a feedforward neural network using the rule base derived from the PIOPED criteria. A close look at the PIOPED criteria indicates that each diagnosis class has its own set of rules. Specifically, the modified PIOPED criteria suggest a dedicated set of rules to accomplish classification for each probability class associated with the PE diagnosis. The design methodology leverages decomposition of the original classification task into a set
Simulation and testing
This section presents a simulation study that assesses the degree to which the main objectives of the proposed design are satisfied. Specifically, the classification accuracy of the KBANN is tested on a dataset and compared with the classifications formulated by a nuclear radiologist who was guided by the PIOPED criteria as well as by domain expertise. This was done to validate that the KBANN was able to acquire the knowledge base associated with the PIOPED criteria and further
Conclusions
This paper proposed a knowledge-based MLP neural network as a classifier for the diagnosis task associated with PE. It was observed that the availability of substantial collective experiential knowledge, in the form of the modified PIOPED criteria and further domain know-how, enabled the KBANN to exhibit significant expertise in PE diagnosis. The KBANN performance was evaluated using a PE dataset developed with the help of a nuclear radiologist. Results validated that the KBANN was
Acknowledgment
This manuscript benefited greatly from the comments and feedback of the anonymous referees.
References (35)
An automated method for the detection of pulmonary embolism in V/Q-scans. Med. Image Anal. (2003).
Knowledge-based artificial neural networks. Artificial Intelligence (1994).
Predicting the presence of acute pulmonary embolism: a comparative analysis of the artificial neural network, logistic regression, and threshold models. Am. J. Roentgenol. (2002).
Role of ventilation scintigraphy in diagnosis of acute pulmonary embolism: an evaluation using artificial neural networks. Eur. J. Nucl. Med. Mol. Imaging (2003).
Automated interpretation of ventilation–perfusion lung scintigrams for the diagnosis of pulmonary embolism using artificial neural networks. Eur. J. Nucl. Med. (2000).
An independent evaluation of a new method for automated interpretation of lung scintigrams using artificial neural networks. Eur. J. Nucl. Med. (2001).
How well can radiologists using neural network software diagnose pulmonary embolism? Am. J. Roentgenol. (2000).
Acute pulmonary embolism: cost-effectiveness analysis of the effect of artificial neural networks on patient care. Radiology (1998).
Knowledge-Based Neurocomputing (March 2000).
ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Syst. Man Cybern. (1993).
A guide to the literature on learning probabilistic networks from data. IEEE Trans. Knowl. Data Eng.
Comprehensive analysis of the results of the PIOPED study. J. Nucl. Med.
Modified PIOPED criteria used in clinical practice. J. Nucl. Med.
Interpretation of indeterminate lung scintigrams. Radiology.
Gursel Serpen received his BS degree in engineering in 1983, his MSEE degree in 1987, and his PhD degree in EE in 1993, the latter from Old Dominion University, Norfolk, Virginia, USA. His professional activities include a position as an application/senior software engineer with Integrated Systems, Inc., Santa Clara, California, between 1992 and 1993, and a faculty position with the Electrical Engineering and Computer Science department at the University of Toledo, Ohio, USA, since 1993. His current research interests include machine learning theory, including artificial neural networks and their applications in a variety of domains such as bio-medicine, computer security, and law, as well as hybrid intelligent systems.
Dilip K. Tekkedil graduated with an MSES degree from the Electrical Engineering and Computer Science department at the University of Toledo, Toledo, Ohio, USA, in 1999. His master's studies focused on neuro-fuzzy approaches to intelligent system development for bio-medical applications. Mr. Tekkedil has been working as a software engineer since his graduation, most recently as a senior software engineer with HighData Software Inc. in New Hampshire, USA.
Mike Orra received his BS degree in Computer Science and Engineering in 2003 and his MS degree in Electrical Engineering in 2006 from the University of Toledo, Toledo, Ohio, USA. Mr. Orra's master's thesis was on the use of neural networks in predicting the state of charge of lithium-ion battery cells. He is currently a doctoral candidate with the Electrical Engineering and Computer Science department at the same university. His current research interests include artificial intelligence and acoustic signal analysis.