Supervoxel classification forests for estimating pairwise image correspondences
Introduction
Establishing correspondences between images is a fundamental and important problem in many medical image analysis tasks. To this end, dedicated image registration techniques have been developed and successfully employed in fully automated analysis pipelines [1]. Many of these techniques work best when applied on particular types of images, such as brain scans, where simple initialisation strategies work well. In general settings, however, the to-be-registered images might capture very different fields of view, as is often the case in pre- and post-operative abdominal scans. In such settings, estimating an initial alignment can be quite challenging if no prior information is available. It can be beneficial to utilise anatomy recognition and landmark detection methods, which provide spatial priors for registration [2]. However, this requires an annotated image dataset for training; obtaining a large number of manually annotated images can be tedious, costly, and time-consuming.
We propose a general method for estimating initial pairwise correspondences between images, which does not require any prior information or manual annotations. To do so, we employ random classification forests [3], but, in contrast to previous work, class labels for training are generated automatically. Our method starts by over-segmenting a pair of images into supervoxels. We then train a forest classifier on one of the images (the source image), using its supervoxel indices as voxel-wise class labels. Applying the forest to the other image (the target image) yields a supervoxel label prediction for each of its voxels. Majority voting is then carried out within each supervoxel of the target image, where each voxel casts a vote as to what the final supervoxel label should be. The final labelling yields correspondences between the supervoxels of the two images. Supervoxels are an ideal representation for semi-densely distributed correspondences, relaxing the one-to-one matching assumption between images. Having a set of initial correspondences between two images, on a supervoxel level, can help solve the initialisation problem for many image analysis tasks such as atlas/patch-based segmentation [4], [5], registration, and atlas construction.
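The majority-voting step described above can be illustrated with a minimal sketch. It assumes the forest has already produced one predicted source-supervoxel label per target voxel; the function name `vote_supervoxel_matches`, the flat-list data layout, and the tie-breaking rule are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter, defaultdict


def vote_supervoxel_matches(target_supervoxel_ids, predicted_source_labels):
    """Map each target supervoxel to the source label most voted for by its voxels.

    Both arguments are flat sequences with one entry per target voxel
    (hypothetical layout chosen for this sketch).
    """
    if len(target_supervoxel_ids) != len(predicted_source_labels):
        raise ValueError("expected one predicted label per target voxel")

    # One vote counter per target supervoxel.
    votes = defaultdict(Counter)
    for target_sv, predicted in zip(target_supervoxel_ids, predicted_source_labels):
        votes[target_sv][predicted] += 1

    # Pick the most frequent predicted label; ties go to the label seen first
    # (an assumption of this sketch).
    return {sv: counts.most_common(1)[0][0] for sv, counts in votes.items()}


# Toy example: six target voxels grouped into two target supervoxels (0 and 1).
target_ids = [0, 0, 0, 1, 1, 1]
forest_predictions = [7, 7, 3, 5, 2, 5]  # predicted source supervoxel per voxel
matches = vote_supervoxel_matches(target_ids, forest_predictions)
print(matches)  # {0: 7, 1: 5}
```

In this toy case, target supervoxel 0 is matched to source supervoxel 7 and target supervoxel 1 to source supervoxel 5. Nothing prevents several target supervoxels from mapping to the same source supervoxel, which is consistent with relaxing the one-to-one matching assumption.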
The main advantage of a supervoxel classification forest (SVF) is that it does not rely on any prior manual annotations, making it possible to train a forest on an unlabelled image. Using supervoxels that follow boundaries makes it possible to match regions that have different shapes, and it avoids the constraints of rectangular patches, which tend to contain elements from multiple anatomical regions.
Random forests [3], as a supervised machine learning technique, have found many successful applications in medical image analysis [6], [7], [8], [9]; this is mainly due to their accuracy, robustness, and scalability. They rely on the availability of labelled images, which is in contrast to the approach taken in this paper: the labels for training are generated automatically. While, traditionally, forests are trained on a dataset containing many images, the idea of encoding a single labelled image (or “atlas”) as a forest [9] has been proposed recently in the context of multi-atlas label propagation. This has inspired our idea of using the atlas-forest approach to encode a single source image into a collection of homogeneous regions, obtained automatically via supervoxelisation. Those supervoxel/region-based labels can then be used to predict matching regions in another target image. Supervoxels – and their 2D counterpart, superpixels – have found many applications in computer vision [10], [11]. They allow the grouping of voxels into locally consistent regions that have similar appearance characteristics, thereby reducing redundancy and computational complexity. Supervoxels are mainly used within segmentation pipelines. We are not aware of previous work that has used supervoxels as label entities in classification forests, in particular, with the aim of establishing image correspondences.
In [2], random classification forests are used to provide spatial priors to initialise image registration and it has been shown that those priors yield improved registration of spine CT images. Their method relies on the availability of annotated images. Our method can be used for the similar task of providing priors for registration, except that there is no need for annotated images for training.
Random forests have been trained on unlabelled datasets before, mainly in the context of density estimation [8] and clustering [3], [12]. For density estimation, the forest, also called a density forest, is trained on unlabelled data by assuming multi-variate Gaussian distributions over feature responses at the split nodes. For clustering, the forest is used to extract a similarity measure between points, where two points are considered similar if they both end up in the same leaf node of a tree. The predictions of all trees are then aggregated to obtain a similarity measure between points. To train the forest to cluster unlabelled data, two dummy labels are introduced: class label 1 is assigned to the observed unlabelled data and class label 2 is assigned to a synthetic dataset. The forest is then trained to distinguish between the observed unlabelled dataset and the synthetic dataset.
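As an illustration of the leaf-based similarity used for clustering, the following sketch assumes that the leaf index reached by each point in each tree is already known. Aggregating by the fraction of trees in which two points share a leaf is one common choice and is an assumption here; the function name `leaf_similarity` and the toy leaf indices are hypothetical.

```python
def leaf_similarity(leaves_a, leaves_b):
    """Fraction of trees in which two points end up in the same leaf node.

    Each argument lists the leaf index reached by one point, one entry per tree.
    """
    if len(leaves_a) != len(leaves_b) or not leaves_a:
        raise ValueError("both points need leaf indices for the same, non-empty set of trees")
    shared = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return shared / len(leaves_a)


# Hypothetical leaf indices for three points across a forest of four trees.
leaf_ids = {
    "p": [2, 5, 1, 4],
    "q": [2, 5, 3, 4],
    "r": [0, 1, 3, 2],
}

sim_pq = leaf_similarity(leaf_ids["p"], leaf_ids["q"])  # share a leaf in 3 of 4 trees
sim_pr = leaf_similarity(leaf_ids["p"], leaf_ids["r"])  # never share a leaf
print(sim_pq, sim_pr)  # 0.75 0.0
```

A pairwise matrix of such values can then be passed to any standard clustering algorithm that accepts a similarity or affinity matrix.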
Section snippets
Problem formulation
The aim of our method is to estimate correspondences between a set of image regions, i.e. supervoxels. Let Ii be an image that is over-segmented into distinct regions that are represented by an indexed family of sets . The image, therefore, consists of supervoxels, with the index set denoting the distinct indices/labels of the supervoxels. Each supervoxel , in turn, is a set of voxels vil. With Ni representing the total number of voxels in the
Experiments and results
We evaluate our proposed method on two different datasets. Dense ground-truth one-to-one correspondences between images are hard to obtain; however, there are datasets available that have sparse correspondences, such as spine CT images, for which the locations of the vertebra centroids are available in the form of manual annotations. We use a publicly available spine CT dataset to quantitatively evaluate our method. In addition, we test our proposed method in a simple multi-atlas label propagation (MALP)
Discussion and conclusion
In this paper, we propose a method for estimating correspondences between images on a supervoxel level using random classification forests. The advantage of our approach is that it does not rely on the availability of prior organ annotations. Training a random forest using automatically generated supervoxels as class labels allows training on unlabelled images. Qualitative evaluations of the estimated correspondences, in a registration initialisation setting and in a simple multi-atlas
Fahdi Kanavati received his M.Sc. in Advanced Computing, with distinction, in 2013, from Imperial College London, United Kingdom. He is currently a PhD student in the biomedical image analysis group, BioMedIA, at Imperial College London. His research interests include medical image analysis, computer vision, and machine learning.
References (21)
- et al., Image registration methods: a survey, Image Vis. Comput. (2003)
- et al., Automatic anatomical brain MRI segmentation combining label propagation and decision fusion, NeuroImage (2006)
- et al., Patch-based segmentation using expert priors: application to hippocampus and ventricle segmentation, NeuroImage (2011)
- et al., Discriminative dictionary learning for abdominal multi-organ segmentation, Med. Image Anal. (2015)
- B. Glocker, D. Zikic, D.R. Haynor, Robust Registration of Longitudinal Spine CT, in: Medical Image Computing and...
- L. Breiman, Random forests, Machine learning, 2001, 5-32, ISSN...
- A. Criminisi, J. Shotton, D. Robertson, E. Konukoglu, Regression forests for efficient anatomy detection and...
- A. Montillo, J. Shotton, J. Winn, J. E. Iglesias, D. Metaxas, A. Criminisi, Entangled decision forests and their...
- et al., Decision forests for classification, regression, density estimation, manifold learning and semi-supervised learning, Learning (2011)
- D. Zikic, B. Glocker, A. Criminisi, Encoding atlases by randomized classification forests for efficient multi-atlas...
Cited by (29)
Dense correspondence of deformable volumetric images via deep spectral embedding and descriptor learning
2022, Medical Image Analysis
From complex to neural networks
2021, Big Data in Psychiatry and Neurology
Multi-scale superpatch matching using dual superpixel descriptors
2020, Pattern Recognition Letters
Citation Excerpt: Therefore, a process applied at such over-segmentation scale can be close to the optimal pixel-wise result. Several works have used superpixels in non-local frameworks, e.g., [12,29], or in unsupervised learning-based superpixel matching approaches using random forests [6,16]. Nevertheless, the geometrical irregularity of such decompositions [11] (i.e., in terms of shape, adjacency or contour smoothness) can become an issue, since neighborhood information is crucial to compute accurate matches in terms of context.
Unsupervised learning-based long-term superpixel tracking
2019, Image and Vision Computing
Citation Excerpt: In summary, two main contributions are proposed towards accurate long-term superpixel tracking. First, unsupervised learning-based superpixel matching is generalized and adapted from medical image processing [16,17] to computer vision in order to find associations along video sequences between consecutive and distant images decomposed into superpixels (Section 2). The approach is carried out using classifiers such as k-nearest neighbors (kNN) or RF [18], incorporates new forward-backward consistency constraints and fully exploits dedicated context-rich features we extended from greyscale [26,16,17] to multi-channel to incorporate neighborhood information on RGB frames.
SQL: Superpixels via quaternary labeling
2019, Pattern Recognition
Citation Excerpt: A variety of computer vision and pattern recognition problems have benefited from above advantages [4]: feature extraction [5], clustering [6], classification [7], segmentation [8–10], saliency detection [11], contour detection [12], stereo computation [13–15], objectness measure [16], proposal generation [17], object localization [18] and object tracking [19–21] to name a few. They also cover some domain specific applications such as remotely sensed image analysis [22,23] and medical image analysis [24,25]. Few approaches produce superpixels that conform to a regular lattice [26–28].
Random forests in medical image computing
2019, Handbook of Medical Image Computing and Computer Assisted Intervention