Medical Image Analysis

Volume 73, October 2021, 102157

Leveraging unsupervised image registration for discovery of landmark shape descriptor

https://doi.org/10.1016/j.media.2021.102157

Highlights

  • Proposes an end-to-end model to obtain a shape descriptor directly from images, alleviating pre- and post-processing needs.

  • A self-supervised neural network that leverages image registration to facilitate landmark discovery.

  • The framework provides model variants to encode additional shape information, a regularization term, and a heuristic for removing redundant landmarks.

  • Discovered landmarks are usable on 2D and 3D images for different downstream tasks ranging from clustering to severity quantification.

Abstract

In current biological and medical research, statistical shape modeling (SSM) provides an essential framework for the characterization of anatomy/morphology. Such analysis is often driven by the identification of a relatively small number of geometrically consistent features found across the samples of a population. These features can subsequently provide information about the population shape variation. Dense correspondence models offer ease of computation and yield an interpretable low-dimensional shape descriptor when followed by dimensionality reduction. However, automatic methods for obtaining such correspondences usually require image segmentation followed by significant preprocessing, which is taxing in terms of both computation and human resources. In many cases, the segmentation and subsequent processing require manual guidance and anatomy-specific domain expertise. This paper proposes a self-supervised deep learning approach for discovering landmarks from images that can be used directly as a shape descriptor for subsequent analysis. We use landmark-driven image registration as the primary task to force the neural network to discover landmarks that register the images well. We also propose a regularization term that allows for robust optimization of the neural network and ensures that the landmarks uniformly span the image domain. The proposed method circumvents segmentation and preprocessing and directly produces a usable shape descriptor from just 2D or 3D images. In addition, we propose two variants of the training loss function that allow prior shape information to be integrated into the model. We apply this framework to several 2D and 3D datasets to obtain their shape descriptors, and we assess how well these descriptors capture shape information by performing different shape-driven applications depending on the data, ranging from shape clustering to severity prediction to outcome diagnosis.

Introduction

Statistical shape modeling (SSM) is an indispensable tool for the analysis of anatomy and biological structures. Such models can be viewed as a composite of two distinct steps: shape representation and shape analysis. Shape representation is a quantifiable description of the shape/structure of a sample from a population of anatomies (usually given as a cohort of images or surface meshes) that is consistent with the population statistics and easy to use for subsequent analysis. There are two prominent families of algorithms for shape representation: (i) landmarks, which express shapes as point clouds that define an explicit correspondence map from one shape to another using invariant points across populations that vary in their form, and (ii) deformation fields, which rely on transformations between images to encode implicit shape information. Shape analysis then uses these shape representations to analyze the population’s statistics; in most cases, the representation is projected onto a low-dimensional space via principal component analysis (PCA). This low-dimensional representation is used as a shape descriptor for subsequent shape analysis. Beyond analyzing the different modes of shape variation captured by this descriptor, it can also be utilized in downstream applications. For instance, the shape descriptor can serve as features to classify different morphological classes (Hufnagel et al., 2007), can quantify the severity of a particular deformity (Bhalodia et al., 2020a), or can be employed to interpret and discover shape characteristics associated with a particular disease (Cates et al., 2014). We consider such downstream applications, whose success depends on how well the shape descriptors characterize the given shape, to showcase the efficacy of the shape descriptor.
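The representation-then-PCA pipeline described above can be sketched on synthetic landmark data. Everything below (the random population, the landmark count, the number of modes) is an illustrative assumption, not data from the paper:

```python
import numpy as np

# Hypothetical population: n_samples shapes, each described by k 2D
# landmarks in correspondence (row i of every shape is the "same"
# anatomical point). The data here is synthetic, for illustration only.
rng = np.random.default_rng(0)
n_samples, k = 20, 8
mean_shape = rng.normal(size=(k, 2))
shapes = mean_shape + 0.1 * rng.normal(size=(n_samples, k, 2))

# Flatten each shape into a 2k-vector and center the population.
X = shapes.reshape(n_samples, -1)
X_centered = X - X.mean(axis=0)

# PCA via SVD: the right singular vectors are the modes of shape variation.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
n_modes = 3
descriptor = X_centered @ Vt[:n_modes].T  # one low-dimensional descriptor per shape

print(descriptor.shape)  # (20, 3)
```

Each row of `descriptor` is a compact shape representation that downstream tasks such as clustering, severity regression, or classification can consume directly.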

Due to their simplicity and computational efficiency, correspondence-based models are the most prominently used models for shape representation. Correspondences are landmarks on the anatomy that are geometrically consistent across the samples of the population. In the earliest work (Thompson, 1917), correspondence was achieved via handpicked landmarks corresponding to distinguishable features. The field has come a long way, with many state-of-the-art correspondence discovery algorithms (Styner et al.; Cates et al., 2007). However, many of these algorithms require segmentation of the anatomy from images as well as heavy pre-processing. Such segmentation and/or pre-processing often comes with significant computational overhead as well as human-resource costs. Segmentation of some anatomies is prone to subjective decisions and hence requires domain expertise. These requirements prevent automated correspondence discovery from being fully end-to-end, i.e., an automated pipeline that, at inference, takes only images as input and produces shape descriptors for analysis.

In recent years, deep learning and neural network models have had a significant impact on both image registration and shape analysis. With their ability to learn complex functions, several methods (Bhalodia et al., 2018; Milletari et al., 2017) have proposed learning correspondences from images, bypassing the need for segmentation and preprocessing. However, these methods are supervised and data-hungry: they require considerable training data with correspondences, which is not always available in clinical applications. They also need anatomy segmentation and preprocessing for the training set, which might not be readily available. Deep networks have also played an essential role in developing computationally fast, unsupervised learning-based algorithms for image registration (e.g., Balakrishnan et al., 2019) that perform on par with state-of-the-art, optimization-based registration methods. However, transformations are not as amenable to shape analysis as correspondences; they often require the construction of a fixed atlas (Joshi et al., 2004). Systems that process image-to-image transformations express shape information in a high-dimensional space. For shape analysis, a low-dimensional space is typically preferred; therefore, these representations are projected onto a low-dimensional space via PCA (or an equivalent for nonlinear spaces), and the resulting modes of shape variation must be analyzed by domain experts to check their usability in downstream applications.

To address the above-stated challenges, we propose an end-to-end system for extracting a shape descriptor from only a population of input images. Ideally, this shape descriptor would not require any post-processing for subsequent analysis. This paper proposes a self-supervised deep learning approach for landmark discovery that uses image registration as the primary task. The proposed method alleviates the need for segmentation and heavy preprocessing (even during model training) to obtain a landmark-based shape descriptor. The discovered landmarks are relatively few in number; hence, they can be used directly for shape analysis, bypassing the post-processing required to convert the representation into a low-dimensional space. The work presented here builds on the preliminary work of Bhalodia et al. (2020b) and significantly extends and improves on it in the following ways:

  • Additional experiments, results, and analysis on several different datasets with associated downstream applications for shape descriptors.

  • We propose two different model variants that can incorporate prior information about shape into the model during training and can implicitly enforce the landmarks to encode such information.

  • We propose an additional image-matching loss function that preserves local structure and allows for cross-modality registration or the use of datasets with significant intensity variation.
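The registration-as-supervision idea underlying the proposed model can be sketched as a simple image-matching loss. The dense displacement field below stands in for the transform that would be induced by predicted landmarks, and the toy images and names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def matching_loss(fixed, moving, displacement):
    """MSE between the fixed image and the moving image warped by a dense
    displacement field of shape (H, W, 2). In the actual model the field
    would be induced by the predicted landmarks; here it is given directly."""
    h, w = fixed.shape
    grid = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    coords = (grid + displacement).transpose(2, 0, 1)  # (2, H, W) sample locations
    warped = map_coordinates(moving, coords, order=1, mode="nearest")
    return float(((warped - fixed) ** 2).mean())

# Toy pair: the moving image is the fixed image shifted one pixel right.
fixed = np.zeros((8, 8)); fixed[2:6, 2:6] = 1.0
moving = np.roll(fixed, 1, axis=1)

identity = np.zeros((8, 8, 2))
undo_shift = identity.copy(); undo_shift[..., 1] = 1.0  # sample one pixel to the right

# A displacement that undoes the shift scores lower (better) than identity.
print(matching_loss(fixed, moving, undo_shift), matching_loss(fixed, moving, identity))
```

Minimizing such a loss over image pairs is what pushes the landmark encoder toward geometrically consistent points: landmarks only register images well if they land on corresponding anatomy.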

Section snippets

Related work

Since the groundbreaking work of D’Arcy Thompson (1917), who utilized manually placed landmarks to study variations in the shapes of fishes, statistical shape modeling (SSM) has become an indispensable tool for medical researchers and biologists. SSM finds applications in various fields such as cardiology (Gardner et al., 2013), neurology (Gerig et al., 2001), growth modeling (Datar et al., 2009), orthopaedics (Harris et al., 2013a), and instrument design (Goparaju et al., 2018). Shape

Methods

This section covers the necessary background for statistical shape modeling and image registration, the proposed model architecture and training, loss functions and optimization, and generalized model variants.

Results

This section shows the results of the proposed methods on different 2D/3D datasets and is divided into subsections corresponding to each dataset. We also demonstrate the usefulness of the landmark-based shape descriptor obtained in each case, paired with a downstream application. This section also includes an analysis of regularization, redundancy removal, and the application of different proposed framework variants. In most cases, the number of epochs is chosen via early stopping

Conclusions

This paper proposes an end-to-end framework for obtaining a usable shape descriptor directly from a set of 2D/3D images. The proposed model is a self-supervised network that works under the assumption that anatomically consistent landmarks will register a pair of images well under a particular class of transformations. The model consists of a landmark encoder, an RBF solver, and a spatial transformer during training. For testing, we only use the landmark encoder to obtain a set of landmarks on a
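The RBF-solver component mentioned above can be sketched as a landmark-driven interpolating warp. This is a minimal sketch assuming a Gaussian kernel and a small ridge term for numerical stability; the paper's actual kernel, regularization, and solver details may differ, and the point sets are illustrative:

```python
import numpy as np

def rbf_warp(src_pts, dst_pts, query, sigma=1.0):
    """Fit an RBF interpolant that carries the source landmarks onto the
    target landmarks, then evaluate the warp at the query points."""
    def kernel(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    # Small ridge term keeps the linear system well conditioned (an assumption).
    K = kernel(src_pts, src_pts) + 1e-8 * np.eye(len(src_pts))
    W = np.linalg.solve(K, dst_pts - src_pts)  # per-landmark displacement weights
    return query + kernel(query, src_pts) @ W

# Landmarks at the corners of the unit square, with perturbed targets.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = src + np.array([[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.1]])

warped = rbf_warp(src, dst, src)
print(np.abs(warped - dst).max())  # interpolation error at the landmarks
```

Evaluating the fitted warp on a dense grid instead of `src` yields the displacement field that a spatial transformer can use to resample the moving image.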

CRediT authorship contribution statement

Riddhish Bhalodia: Conceptualization, Methodology, Software, Writing – original draft, Visualization. Shireen Elhabian: Data curation, Supervision, Writing – review & editing, Funding acquisition. Ladislav Kavan: Supervision, Writing – review & editing. Ross Whitaker: Conceptualization, Supervision, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The National Institutes of Health supported this work under grant numbers NIBIB-U24EB029011, NIAMS-R01AR076120, NHLBI-R01HL135568, NIBIB-R01EB016701, NIBIB-R21EB026061, and NIGMS-P41GM103545. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We would also like to thank Dr. Jesse Goldstein, Dr. Andrew Anderson, Dr. Nassir Marrouche, and Dr. Penny Atkins for making their data available to be used in this

References (45)

  • R. Bhalodia et al., A cooperative autoencoder for population-based regularization of CNN image registration, Medical Image Computing and Computer Assisted Intervention – MICCAI 2019 (2019)

  • R. Bhalodia et al., Self-supervised discovery of anatomical shape landmarks, Medical Image Computing and Computer Assisted Intervention – MICCAI 2020 (2020)

  • E.T. Bieging et al., Left atrial shape predicts recurrence after atrial fibrillation catheter ablation, J. Cardiovasc. Electrophysiol. (2018)

  • D. Boscaini et al., Learning shape correspondence with anisotropic convolutional neural networks, Proceedings of the 30th International Conference on Neural Information Processing Systems (2016)

  • J. Cates et al., Computational shape models characterize shape change of the left atrium in atrial fibrillation, Clin. Med. Insights (2014)

  • J. Cates et al., Shape modeling and analysis with entropy-based particle systems, Proceedings of Information Processing in Medical Imaging (IPMI) (2007)

  • A. Dalca et al., Learning conditional deformable templates with convolutional networks, Advances in Neural Information Processing Systems (2019)

  • A.V. Dalca et al., Unsupervised learning for fast probabilistic diffeomorphic registration, International Conference on Medical Image Computing and Computer-Assisted Intervention (2018)

  • M. Datar et al., Particle based shape regression of open surfaces with applications to developmental neuroimaging, International Conference on Medical Image Computing and Computer-Assisted Intervention (2009)

  • R.H. Davies et al., A minimum description length approach to statistical shape modeling, IEEE Trans. Med. Imaging (2002)

  • D. DeTone et al., SuperPoint: self-supervised interest point detection and description, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2018)

  • H. Du Buf et al., Diatom identification: a double challenge called ADIAC, Proceedings 10th International Conference on Image Analysis and Processing (1999)