Placenta segmentation in ultrasound imaging: Addressing sources of uncertainty and limited field-of-view

Automatic segmentation of the placenta in fetal ultrasound (US) is challenging due to (i) the high diversity of placental appearance, (ii) the restricted image quality of US, which results in highly variable reference annotations, and (iii) the limited field-of-view of US, which prohibits whole-placenta assessment at late gestation. In this work, we address these three challenges with a multi-task learning approach that combines the classification of placental location (e.g., anterior, posterior) and semantic placenta segmentation in a single convolutional neural network. The classification task allows the model to learn from larger and more diverse datasets while improving the accuracy of the segmentation task, in particular under limited training data conditions. With this approach, we investigate the variability in annotations from multiple raters and show that our automatic segmentations (Dice of 0.86 for anterior and 0.83 for posterior placentas) achieve human-level performance when compared to intra- and inter-observer variability. Lastly, our approach can deliver whole-placenta segmentation using a multi-view US acquisition pipeline consisting of three stages: multi-probe image acquisition, image fusion, and image segmentation. This results in high-quality segmentations of larger structures, such as the placenta, that extend beyond the field-of-view of a single probe, with reduced image artifacts.
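As a rough illustration of the multi-task idea (a shared encoder feeding one segmentation head and one placental-location classification head), consider the following minimal PyTorch sketch. Layer sizes, class names, and the 2D setting are illustrative simplifications, not the paper's MTUNet architecture (a full U-Net with skip connections operating on 3D US).

# Minimal sketch of a multi-task segmentation/classification CNN in PyTorch.
# Layer sizes and names are illustrative; the paper's MTUNet is a full U-Net
# with skip connections on 3D US, omitted here for brevity.
import torch
import torch.nn as nn

class MultiTaskSegNet(nn.Module):
    def __init__(self, n_classes_seg=2, n_classes_loc=3):
        super().__init__()
        # Shared encoder: both tasks learn from its features.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Segmentation decoder head.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 16, 2, stride=2), nn.ReLU(),
            nn.Conv2d(16, n_classes_seg, 1),
        )
        # Classification head for placental location (e.g., anterior/posterior).
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, n_classes_loc),
        )

    def forward(self, x):
        features = self.encoder(x)
        return self.decoder(features), self.classifier(features)

# Images with a location label but no manual segmentation can still train the
# shared encoder through the classification loss alone.
net = MultiTaskSegNet()
seg_logits, loc_logits = net(torch.randn(2, 1, 64, 64))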


[Figure caption fragment: Manual: S1.1 vs. S1.2 (intra) and S1.1 vs. S3 (inter); UNet/MTUNet: S1.1 vs. UNet/MTUNet (intra) and S3 vs. UNet/MTUNet (inter). (b) Difference in distributions between manual annotations from three raters and automatic segmentations from UNet, MTUNet, and TMTUNet with MC dropout, measured by the Generalized Energy Distance with IoU as the distance measure, for models trained on sets A, P, and AP and tested on both anterior and posterior placentas. Statistical significance between UNet and MTUNet/TMTUNet is indicated by * (moderate effect size) and ** (strong effect size).]

A.1. Probe holder design
Fig. A.10 shows the design of the two- and three-probe holders, with measurements in mm. The initial design was developed on a second-trimester fetal phantom (Kyoto Kagaku SPACE FAN-ST) and subsequently optimized for comfort and usability in a clinical setting by scanning pregnant volunteers. The result is a flexible system that allows the use of two, three, or even four probes (not used in this study). We fixed the angulation between the probes so that the FoV can be extended with a known spatial alignment of the images. We chose an angle of 30°, which empirically proved sufficient to angulate the probes while maintaining contact between the probe surfaces and the maternal skin. However, other configurations are possible.
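Because the inter-probe angle is fixed by the holder, an initial rigid alignment between the single-probe images is known from the holder geometry. A minimal numpy sketch of such an initialization is given below; the rotation axis, pivot point, and function name are illustrative assumptions, not the paper's fusion implementation.

# Sketch: a known 30-degree angulation between adjacent probes yields an
# initial rigid transform for fusing single-probe volumes into a multi-view
# image. Axis choice and pivot point are illustrative assumptions.
import numpy as np

def probe_transform(angle_deg: float, pivot: np.ndarray) -> np.ndarray:
    """4x4 rigid transform: rotation by angle_deg about the x-axis,
    applied around a pivot point (e.g., the probes' common contact point)."""
    a = np.deg2rad(angle_deg)
    R = np.array([[1, 0,          0        ],
                  [0, np.cos(a), -np.sin(a)],
                  [0, np.sin(a),  np.cos(a)]])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = pivot - R @ pivot  # rotate about the pivot, not the origin
    return T

# Three-probe holder: probes at -30, 0, +30 degrees relative to the center.
pivot = np.array([0.0, 0.0, 0.0])
transforms = [probe_transform(a, pivot) for a in (-30.0, 0.0, 30.0)]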

A.2. Data
We perform a 5-fold cross-validation, where each fold divides the patients into training, validation, and test sets. In each fold, approximately 60% of the data is used for training and 20% each for validation and testing. Different folds contain slightly different numbers of validation and test images (varying by up to 10%) because of the heterogeneity of the data: each patient has a different number of images, with and without manual segmentations, and with and without placental tissue. However, we ensured that the images of an individual patient are never distributed across the training/validation/test sets, that the number of training images with segmentations is the same for posterior and anterior placentas, and that each patient with manual segmentations is part of exactly one test set.
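A patient-wise split of this kind can be implemented with grouped cross-validation, for example with scikit-learn's GroupKFold, as in the sketch below. The variable names and the toy data (four images per patient) are assumptions, and the paper's additional constraints (balanced segmentation counts per placental location) are not enforced here.

# Sketch of the patient-wise split: GroupKFold guarantees that all images of
# one patient fall into the same fold, so no patient leaks from training into
# validation/testing. A further split of the training portion would be needed
# to obtain a separate validation set.
from sklearn.model_selection import GroupKFold

image_ids = [f"img_{i}" for i in range(20)]              # one entry per US image
patient_ids = [f"patient_{i // 4}" for i in range(20)]   # ~4 images per patient

gkf = GroupKFold(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(gkf.split(image_ids, groups=patient_ids)):
    train_patients = {patient_ids[i] for i in train_idx}
    test_patients = {patient_ids[i] for i in test_idx}
    assert train_patients.isdisjoint(test_patients)  # patient-level separation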
Details about the data distribution in the folds can be found in Table A.5.

B.1. Placenta segmentation – Single-view images
Exemplary results show the segmentation behavior when the input images are in-distribution (InD) or out-of-distribution (OoD) data. The multi-task models, especially TMTUNet (row 4), show more robust performance with respect to OoD data. Only TMTUNet is able to correctly localize the placenta in these OoD examples. MTUNet and TMTUNet are also more robust to image artifacts, such as shadows, as shown in the last InD example.

B.2. Placenta segmentation – Multi-view images
Additional exemplary multi-view images are shown in Fig. B.12, together with the corresponding placenta segmentations from MTUNet and the combined attention maps. The placenta is better visualized in the multi-view images, with reduced image artifacts and an extended FoV. The multi-task model MTUNet provides an accurate segmentation, and the combined attention maps localize the placenta well.

B.3. Variability and uncertainty
We investigated the inter- and intra-observer variability of the manual annotation of placental tissue in 3D US. In each fold, we use a subset of the test set for which three manual annotations are available. Fig. B.13(a)-(c) show the agreement of the segmentations as measured by IoU, ASD, and RHD, respectively, and Fig. B.13(d) shows the difference between the manual and automatic distributions (as a measure of uncertainty), measured by the Generalized Energy Distance with the Intersection-over-Union (IoU) as the distance measure.
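For reference, the squared Generalized Energy Distance between the rater distribution and the model's sample distribution is D_GED^2 = 2E[d(S, Y)] - E[d(S, S')] - E[d(Y, Y')], with d(x, y) = 1 - IoU(x, y), where Y, Y' are manual annotations and S, S' are model samples. The following Python sketch computes an empirical estimate; the estimator (means over all pairs, including identical pairs) and the variable names are illustrative and not necessarily the paper's exact implementation.

# Sketch of the Generalized Energy Distance (GED) between the rater
# distribution and the model's sample distribution, with d = 1 - IoU.
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 1.0

def d(a, b):
    return 1.0 - iou(a, b)

def ged_squared(raters, samples):
    """raters: list of binary masks (manual annotations);
    samples: list of binary masks (e.g., MC-dropout segmentations)."""
    cross = np.mean([d(r, s) for r in raters for s in samples])
    intra_r = np.mean([d(r1, r2) for r1 in raters for r2 in raters])
    intra_s = np.mean([d(s1, s2) for s1 in samples for s2 in samples])
    return 2 * cross - intra_r - intra_s

# Example: three raters and five model samples on a toy 2D slice.
rng = np.random.default_rng(0)
raters = [rng.random((64, 64)) > 0.5 for _ in range(3)]
samples = [rng.random((64, 64)) > 0.5 for _ in range(5)]
print(ged_squared(raters, samples))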