On the interplay of machine learning and background knowledge in image interpretation by Bayesian networks
Introduction
Bayesian networks have become the state of the art for representing and reasoning with uncertain knowledge of a clinical problem. They have a sound statistical basis, yet allow available background knowledge to be exploited in a way superior to many other formalisms for statistical machine learning. Even when no data at all are available, it is often still possible to develop a Bayesian network, guided by a mixture of expert knowledge and information from the literature. If data are available, one can also learn the network structure and parameters from data. As this holds for any medical domain, it also holds for medical imaging. However, medical imaging has its own characteristics: a range of methods, from image segmentation via region detection to lesion determination, is applied as part of the image processing pipeline. At the very end of this pipeline we find image interpretation; methods for image interpretation are, thus, clearly dependent on the previous processing steps. Some of the characteristics of the image processing steps, such as the fact that image features are continuous, have particular implications for image interpretation, which has its foundation in medical knowledge of structure (histology and anatomy) and function (physiology). As medical images in the end need to tell something about the patient, medical knowledge offers a natural starting point for computer-aided detection. However, exploiting explicit representations of medical knowledge in medical image interpretation has so far met with significant challenges.
These challenges bring us back to the relationship between manual construction of Bayesian networks and learning them from data, a topic discussed repeatedly in the past without giving rise to scientific consensus. New in this paper is that we address this issue from the point of view of image interpretation. We critically examine, by the use of image data, the assumptions made in the expert-knowledge-guided development of a Bayesian network for medical image interpretation. Both the assumptions made in choosing the probabilistic parameters and those made in designing the graphical structure are studied.
The research was carried out in a concrete clinical setting: the interpretation of breast-cancer screening mammograms. Breast-cancer detection is a hard medical image interpretation task. With the digitisation of medical images in the last decade, there has been considerable progress in computer-aided interpretation of mammograms, most of which has come from the development of new pattern-recognition techniques to better detect potentially suspicious breast regions. However, existing systems still exhibit limitations in attaining the required clinical accuracy, i.e. accuracy with respect to the presence or absence of cancer in the patient. The major reason for this is their failure to explicitly represent the working principles and knowledge of human experts: expert radiologists normally compare image parts and different images of the breasts to each other, i.e. they interpret potentially suspicious regions of the breasts in the context of all other available image information. It is only recently that researchers have started to study ways to incorporate such principles into computer-aided detection (CAD) systems [1], [2].
As part of the research we constructed a Bayesian network that incorporates the most important image features, and their relationships, as used by radiologists to interpret mammograms. Thus, the resulting Bayesian network can be looked upon as a knowledge representation of mammogram interpretation in terms of breast tissue architecture and signs of abnormality. As image features are continuous variables, we used Gaussian distributions to model their uncertainty.
In a well-cited paper by Pradhan et al., published in 1996, it was experimentally established that the network structure is the single most important factor determining the Bayesian network's performance [3]. In time, this insight has become general wisdom underlying much of Bayesian network modelling. The results of this paper were in particular compelling as they were based on an extensive study of a variety of large, real-world networks. In our paper, we challenge the conclusions from the paper by Pradhan et al. and aim at offering a more balanced view on this important problem. It is also the right time to reexamine this problem, as considerable progress has been made in Bayesian network technology since 1996. Other recent research [4] also suggests that the problem of the sensitivity of Bayesian networks to imprecision in their parameters is domain-dependent and requires more careful investigation.
We emphasise that the problem of medical image interpretation we tackle in this paper is particularly challenging, as the input to the network is based on image features automatically extracted by a CAD system through image processing, which in itself is a complex task and a subject of ongoing research. Even though the continuous nature of the features obtained in this way is understandable from a physical point of view, their relationships to the clinical abnormalities detected in the image are not straightforward from the radiologist's point of view. In contrast, the features provided by radiologists after visual inspection and interpretation are discrete; they have specific semantics, although they are prone to subjective variation [5], [6]. Furthermore, the manual network contained two features obtained from the CAD system's output that are assumed to have direct causal relationships with the variable that indicates whether or not an abnormality is present. Again, the inclusion of such variables is novel in comparison to available benchmark datasets for breast cancer, and their relationships have not been studied before.
Hence, the novelty of our research lies in the thorough investigation of both the quantitative part (probability distribution) and the qualitative part (structure) of the manual network, to obtain insight into the appropriateness of the assumptions made in developing a Bayesian network for a highly complex task: medical image interpretation. The selected task of mammogram interpretation is sufficiently similar to other complex medical image interpretation tasks to act as an example problem for the research. As breast cancer is a major disorder that is associated with enormous research efforts, techniques for the automated detection of breast cancer reflect the state of the art in the field of CAD.
In this study in particular, we build upon our results from the work presented in [7], where we discretised the continuous mammographic features automatically extracted from the CAD system to check whether the probabilistic parameters in the initial expert network were optimal and correctly reflected reality. It was shown that the parameters play an essential role in the network's performance. Therefore, after preliminary investigations [8], in the current study we provide an extensive and thorough investigation of learning Bayesian network structures, both restricted and unrestricted, from the discretised image data to gain detailed insight into the feature dependencies and independencies assumed in the manual model. The performance of the learned networks is compared with that of the manual network in terms of classification accuracy and knowledge representation.
The structure of the paper is as follows. We start with a review of the theory of Bayesian networks and related work in the areas of discretisation and structure learning in Section 2. Next, in Section 3, we provide some background on mammogram interpretation, present the Bayesian network for mammogram interpretation that was developed by hand, and describe the data used for the experimental part of the research. Our previous work that examines the assumptions about the probabilistic parameters of the Bayesian network is briefly summarised in Section 4. This is done to provide the reader with a good understanding of the choices made about the discretisation of the data used for the research study on structure learning presented in Section 5. Finally, in Section 6 we return to the questions from which the research started and summarise what has been achieved.
Section snippets
Bayesian networks
A Bayesian network (BN) is defined as a pair (G, P), where G is a directed acyclic graph (DAG) G = (V, E) and P is a joint probability distribution of the random variables X_V [9], [10], [11]. There exists a 1-1 correspondence between the nodes V and the random variables X_V; the (directed) edges, or arcs, E ⊆ (V × V) correspond to direct causal relationships between the variables: a node is a parent of a child if there is an arc from the former to the latter. We say that G is an I-map of P if
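To make the definition concrete, the following is a minimal sketch of a discrete Bayesian network stored as a parent map plus conditional probability tables, with the joint distribution computed as the product of each variable's probability given its parents. The variable names (`Abnormal`, `Contrast`, `Spiculation`), the arcs, and all probabilities are hypothetical and chosen for illustration only; they are not the network or the parameters used in the paper, which models continuous features.

```python
from graphlib import TopologicalSorter
from itertools import product

# Hypothetical toy graph G = (V, E): Abnormal -> Contrast, Abnormal -> Spiculation.
# Each node maps to the tuple of its parents.
parents = {
    "Abnormal": (),
    "Contrast": ("Abnormal",),
    "Spiculation": ("Abnormal",),
}

# cpt[v][parent_values][value] = P(v = value | parents = parent_values).
# All numbers are made up for illustration.
cpt = {
    "Abnormal": {(): {True: 0.1, False: 0.9}},
    "Contrast": {(True,): {True: 0.8, False: 0.2},
                 (False,): {True: 0.3, False: 0.7}},
    "Spiculation": {(True,): {True: 0.6, False: 0.4},
                    (False,): {True: 0.05, False: 0.95}},
}

# static_order() lists parents before children and raises CycleError
# if the graph is not acyclic, so it doubles as a DAG check.
order = list(TopologicalSorter(parents).static_order())


def joint(assignment):
    """P(assignment) as the product of P(x_v | x_parents(v)) over all nodes v."""
    p = 1.0
    for v in order:
        parent_values = tuple(assignment[u] for u in parents[v])
        p *= cpt[v][parent_values][assignment[v]]
    return p


# Summing the joint over all 2^3 assignments should give 1.
total = sum(
    joint(dict(zip(order, values)))
    for values in product([True, False], repeat=len(order))
)
p_all_true = joint({"Abnormal": True, "Contrast": True, "Spiculation": True})
```

In this sketch `p_all_true` is simply 0.1 × 0.8 × 0.6, which shows how the graph structure determines which local tables enter the product.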
Mammography, the Bayesian network model and data
We start by reviewing the problem domain of screening mammography and describe in detail the Bayesian network that was constructed for the purpose of mammogram interpretation.
Reappraisal of the probabilistic parameters
As most of the variables modelled by the manual Bayesian network were continuous features, they were represented using conditional Gaussian distributions. A limitation of Gaussian distributions is that they are symmetric, so they cannot capture asymmetries present in the data. Other continuous probability distributions would allow asymmetries to be represented, but again only under particular assumptions; discretisation of the continuous data instead offers a way to fit the
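The symmetry limitation can be illustrated with a small sketch on synthetic data. It assumes a right-skewed feature drawn from an exponential distribution and uses equal-frequency binning as a stand-in discretiser; neither the data nor the binning scheme is taken from the paper, whose own discretisation methods are described in the full text.

```python
import random
import statistics
from bisect import bisect_right

random.seed(0)
# Synthetic right-skewed "feature" values; not real mammographic data.
values = [random.expovariate(1.0) for _ in range(5000)]

# Fit a single Gaussian by matching the sample mean and standard deviation.
mu = statistics.fmean(values)
sigma = statistics.stdev(values)
gauss = statistics.NormalDist(mu, sigma)

# A Gaussian places exactly half of its mass below its mean;
# for right-skewed data the empirical fraction is noticeably larger.
gauss_below_mean = gauss.cdf(mu)
empirical_below_mean = sum(v <= mu for v in values) / len(values)

# The fitted Gaussian also assigns mass to negative values,
# although every sample is non-negative.
gauss_mass_below_zero = gauss.cdf(0.0)

# Equal-frequency discretisation into 4 bins: the cut points are the quartiles.
cuts = statistics.quantiles(values, n=4)
bins = [bisect_right(cuts, v) for v in values]
counts = [bins.count(b) for b in range(4)]

# Unequal gaps between successive cut points reflect the skew in the data.
gaps = [cuts[1] - cuts[0], cuts[2] - cuts[1]]
```

The bins hold roughly equal numbers of samples while the gaps between cut points grow towards the tail, so the discretised variable retains an asymmetry that the fitted Gaussian cannot express.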
Network structure reappraisal by learning
In this section we investigate the structures learnt by various algorithms and compare them with the continuous baseline network. For this purpose, we used the discretised data obtained from the best-performing FI method, as reported in the previous section.
Discussion and conclusions
Our aim was to obtain insight into the validity of the modelling assumptions made when developing a BN for complex medical image interpretation problems based on expert knowledge, with the interpretation of mammograms as a real-world example. Where in other problem domains it might be easier to construct such manual models using knowledge engineering methods, in the domain of image interpretation it is not unlikely that modelling mistakes are made. We carried out this study to find out whether
Acknowledgements
We would like to thank Saskia Robben and Niels Radstake for conducting the initial experiments and providing the preliminary results on discretisation and structure learning. We also thank the reviewers for their useful comments, which helped improve this paper. This work has been funded by the Netherlands Organization for Scientific Research under BRICKS/FOCUS Grant Number 642.066.605.
References (47)
- et al. Supervised and unsupervised discretization of continuous features.
- et al. Towards scalable and data efficient learning of Markov boundaries. International Journal of Approximate Reasoning (2007)
- et al. A comparison of learning algorithms for Bayesian networks: a case study based on data from an emergency medical service. Artificial Intelligence in Medicine (2004)
- et al. Learning effective brain connectivity with dynamic Bayesian networks. Neuroimage (2007)
- et al. Evolving a Bayesian classifier for ECG-based age classification in medical applications. Applied Soft Computing (2008)
- et al. Computer-aided mass detection based on ipsilateral multiview mammograms. Academic Radiology (2007)
- The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition (1997)
- et al. Improved mammographic CAD performance using multi-view information: a Bayesian network framework. Physics in Medicine and Biology (2009)
- et al. A probabilistic framework for image information fusion with an application to mammographic analysis. Medical Image Analysis (2012)
- et al. The sensitivity of belief networks to imprecise probabilities: an experimental investigation. Artificial Intelligence (1996)