Nested star-shaped objects segmentation using diameter annotations

Most current deep learning based approaches for image segmentation require annotations of large datasets, which limits their application in clinical practice. We observe a mismatch between the voxelwise ground truth required to optimize an objective at the voxel level and the commonly used, less time-consuming clinical annotations that seek to characterize the most important information about the patient (diameters, counts, etc.). In this study, we propose a method that reconciles these two objectives, training a network to segment nested star-shaped objects (the carotid artery lumen and wall) using only the diameter annotations commonly made to establish the degree of stenosis.


Introduction
In recent years, deep learning has become ubiquitous in medical image segmentation, outperforming most traditional segmentation methods (Litjens et al., 2017). However, the training of deep learning algorithms typically requires a large amount of annotated data, and therefore substantial manual labor from clinical experts. While images are being annotated at a large scale both in clinical practice and in clinical research, we observe a mismatch between the simpler and less time-consuming annotations that are made for these purposes and the (dense) annotations needed to train segmentation networks. In this paper, we propose a method that aims to reconcile the objectives of those two types of annotations, training a network to segment the carotid artery based on diameter annotations that are commonly made to establish the degree of stenosis.
The presence and the composition of atherosclerotic plaques in the carotid artery are predictive of stroke and coronary events (Bos et al.). In this work, we extend differentiable boundary point extraction to a multi-class setting, which in the case of carotid arteries allows segmenting both the artery lumen and wall, providing more relevant clinical information, and we provide more extensive validation. Our contributions can be summarized as follows:
• we extend the differentiable boundary point extraction developed in Camarasa et al. (2022) to the multi-class segmentation of two nested star-shaped objects;
• we optimize this multi-class segmentation problem using diameter annotations extracted at the maximum narrowing;
• we comprehensively evaluate our method on carotid artery segmentation in two different MR image datasets of multi-center studies of patients who suffered a recent vascular event.
When strong priors on the segmented object exist, some weakly supervised methods can be tailored to efficiently exploit this information and better compensate for the lack of full annotations. For instance, a volume prior can be enforced through a quadratic penalty on the size of the segmented object (Kervadec et al., 2019b) or on the proportion of voxels belonging to the foreground (Bortsova et al., 2018). Sahasrabudhe et al. (2020) and Qu et al. (2019) exploit the structure of nuclei segmentation by either learning to predict the image zoom level (as a proxy task) or building Voronoi cells from dot annotations, respectively. Dorent et al. (2021) assume that the object is continuous and ''draw'' a path between the extreme point annotations by minimizing a geodesic distance. For multi-instance object detection in pathology images, Yang et al. (2020) detect circles using center and radius information at train time, which can be seen as segmentation with a strong circular prior.
Two existing methods could be tweaked for the aforementioned task of carotid artery segmentation using only a diameter annotation. InExtremIS (Dorent et al., 2021) could use the diameter as a valid path without resorting to the complex geodesic distance calculation, but it would not exploit the diameter length itself. CircleNet (Yang et al., 2020) could be modified to predict a single circular segmentation, using the center and diameter information but ignoring the exact boundary of the object. As such, to the best of our knowledge, there exists no method in the current literature that can use the information on the diameters of a vessel lumen and outer wall while segmenting the object precisely.

Carotid artery segmentation
Considering the segmentation of the carotid artery lumen in magnetic resonance (MR) images, Luo et al. (2019) exploited its hyperintensity on TOF-MR images, developing a variation of the level set method. Segmenting both the lumen and the outer wall, Arias-Lorza et al. (2018, 2016) used a geometrical prior of the artery relying on an optimal surface graph-cut algorithm. Both of these methods required a prior estimate of the artery centerline.
The rise of deep learning moved research away from such semi-automatic algorithms towards more generalizable, fully automatic, end-to-end optimizable methods that can apply to many (if not all) segmentation tasks (Isensee et al., 2021). In 2021, a carotid artery segmentation challenge was held based on MR imaging data of the Care II study (Zhao et al., 2017). The participants of the challenge were tasked to automatically segment the outer wall of the internal, external, and common carotid arteries on 3D black-blood MR images of 24 patients. Alblas et al. (2021) won the challenge with a two-stage approach that first locates the centerline and subsequently estimates the lumen and outer-wall contours.
To the best of our knowledge, little research focuses on using relevant weak labels to supervise carotid artery MR segmentation algorithms.

Method
In this section, we present a method to approximate the radii (Section 3.1), the diameters (Section 3.2), and the centroid (Section 3.3) of a star-shaped object. We introduce a weakly-supervised segmentation setting based on diameter annotations (Section 3.4) in which we apply our methodology to the optimization of star-shaped object segmentation probability maps produced by a deep learning model (Section 3.5). For convenience, all mathematical notation is summarized in Appendix A.1, though each symbol is also introduced where it first appears in the manuscript. An object O of an imaging domain Ω is star-shaped if there exists a non-empty set of root points R_O such that any ray originating from a root point r ∈ R_O crosses the boundary ∂O of the object exactly once. This translates mathematically as follows: O ∈ S ⟺ ∃ R_O ≠ ∅ such that ∀ r ∈ R_O, ∀ θ ∈ [0, 2π[, |Δ_θ(r) ∩ ∂O| = 1, where S is the set of star-shaped objects of Ω and Δ_θ(r) = { r + t·(cos(θ), sin(θ)) : t ∈ ℝ⁺ } is the ray originating from r with angle θ.
where g_σ is the probability density function of a Gaussian distribution centered on the beam orientation θ, with standard deviation σ.
giving a σ-parameterized radial estimate of the true radius. The proof of Proposition 3.1.1 can be found in Appendix A.3. The σ-parameterized radial estimates of two nested star-shaped objects can then be obtained by integrating the voxelwise product of the Gaussian beam (g_σ), the segmentation probability map, and a distance map (‖x − r‖). This process is illustrated in Fig.
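The radial estimate can be illustrated with a minimal numpy sketch. Assumption: we replace the Gaussian-beam integration above by nearest-neighbour sampling along a single ray, since for a binarized probability map, summing probabilities along the ray from a root point recovers the distance to the boundary. The function name `radial_estimate` and all its parameters are ours, not the paper's.

```python
import numpy as np

def radial_estimate(prob, root, theta, step=0.5, t_max=100.0):
    """Estimate the radius of a star-shaped object along the ray of
    angle `theta` from `root`, by summing the segmentation probability
    sampled along the ray (nearest-neighbour reads, clipped to the image)."""
    ts = np.arange(0.0, t_max, step)
    yi = np.clip(np.round(root[0] + ts * np.sin(theta)).astype(int), 0, prob.shape[0] - 1)
    xi = np.clip(np.round(root[1] + ts * np.cos(theta)).astype(int), 0, prob.shape[1] - 1)
    return float(np.sum(prob[yi, xi]) * step)

# synthetic star-shaped object: a binary disk of radius 20 centred at (64, 64)
yy, xx = np.mgrid[0:128, 0:128]
disk = ((yy - 64.0) ** 2 + (xx - 64.0) ** 2 <= 20.0 ** 2).astype(float)
r_hat = radial_estimate(disk, (64.0, 64.0), 0.3)  # close to the true radius 20
```

In the paper's formulation the angular Gaussian beam makes this operation smooth over orientations; the single-ray version above only keeps the integration idea.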

Estimating diameters
Definition 3.2.1 (Star-Shaped Diameter). Given a star-shaped object O ∈ S, a root point r ∈ R_O, and an orientation θ ∈ [0, π[, the diameter D is defined as the sum of two opposite radii. Proposition 3.2.1 (Diameter Estimate). Considering a star-shaped object O ∈ S, a root point r ∈ R_O, and an orientation θ ∈ [0, π[, a diameter estimate D̂ can be defined as the sum of two opposite radial estimates. Notice that these definitions of the diameter and of its estimate differ from Camarasa et al. (2022), as they force a given root point r to belong to the diameter. This modification is motivated by the available annotations and the prior knowledge of our specific application, presented in Section 3.4.

Centroid
Definition 3.3.1 (Centroid). We define the centroid C : (Ω → [0, 1]) → Ω of a set S ⊂ Ω as the average of the coordinates of the set, where 1_S is the indicator function of the set S.
The more general notion of the centroid (not limited to star-shaped objects) has already been investigated in the literature and shown successful in the weak supervision of medical image segmentation (Kervadec et al., 2021). In our application, it plays an important role as the root point of the star-shaped objects (Section 3.4) and provides spatial information to our deep learning model (Section 3.5).
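The centroid of Definition 3.3.1 extends naturally from indicator functions to soft probability maps, which is what makes it usable as a differentiable root-point estimate. A minimal numpy sketch (the function name is ours):

```python
import numpy as np

def soft_centroid(prob):
    """Differentiable centroid of a (soft) segmentation map:
    probability-weighted average of the voxel coordinates."""
    yy, xx = np.mgrid[0:prob.shape[0], 0:prob.shape[1]]
    mass = prob.sum()
    return float((yy * prob).sum() / mass), float((xx * prob).sum() / mass)

yy, xx = np.mgrid[0:128, 0:128]
disk = ((yy - 40.0) ** 2 + (xx - 90.0) ** 2 <= 15.0 ** 2).astype(float)
cy, cx = soft_centroid(disk)  # recovers the disk center (40, 90)
```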

Application
In the following, we apply the previously presented theory to the segmentation of multiple registered sequences I : Ω ⊂ ℝ² → ℝ^M (with M the number of sequences) into three non-overlapping subsets: the background subset (B), the inner subset, and the outer subset (O). In our setting, we do not have access to the voxelwise ground truth but to two pairs of annotated landmarks on the object boundaries, indicating the diameters and therewith the degree of stenosis (European Carotid Surgery Trialists' Collaborative Group, 1998) (see Fig. 1). The annotated landmark pairs maximize the narrowing, i.e. they minimize the ratio between the inner and outer diameter. We also have the following prior knowledge about the subsets' topology and nesting (see Fig. 1):
• the inner subset is a star-shaped object;
• the foreground subset (corresponding to the union of the inner and outer subsets) is a star-shaped object;
• the centroid of the inner subset is a root point of both the inner and the foreground subsets.
Our goal is to approximate the unavailable ground truth with a deep learning model, using the annotated landmarks and the available prior knowledge.

Optimization
As the centroid (C), the diameter estimate (D̂), and the σ-parameterized radial estimate are differentiable, they can be used for the optimization of our deep learning model (see Fig. 1).
With a mean squared error loss, the predicted inner-class centroid can be learned using the annotated inner-class centroid as reference, and similarly the inner- and foreground-class (B) diameter estimates using the annotated diameters, where θ is the orientation of the annotations.
Notice that, contrary to Camarasa et al. (2022), logarithms are applied to stabilize the optimization by limiting its range of values. As we will show in the experiments section, additional regularization is desirable. Favoring the agreement of multiple σ-parameterized radial estimates rewards locally circular, binarized (i.e. close to an indicator function) star-shaped object segmentation probability maps, and therefore efficiently achieves this goal.
This mathematically translates into a minimization of the variance of the ratio between the σ-parameterized radial estimates and the radial estimates, computed over a discrete set of angles in [0, 2π[ and a discrete set of σ-parameters in ℝ⁺.
The final combined, weakly-supervised loss is therefore a weighted sum of these components, with three hyper-parameters balancing the different terms.
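The combined loss can be sketched end-to-end under the same single-ray simplification used above. This is not the paper's exact formulation: the centroid and log-diameter terms follow the text, but the regularizer below penalizes the global variance of the radii (rewarding globally circular objects), whereas Eq. (12) compares σ-parameterized radial estimates locally. The default weights mirror the values reported in the implementation details (100, 100, 10); all function names are ours.

```python
import numpy as np

def ray_radius(prob, root, theta, step=0.5, t_max=100.0):
    ts = np.arange(0.0, t_max, step)
    yi = np.clip(np.round(root[0] + ts * np.sin(theta)).astype(int), 0, prob.shape[0] - 1)
    xi = np.clip(np.round(root[1] + ts * np.cos(theta)).astype(int), 0, prob.shape[1] - 1)
    return float(np.sum(prob[yi, xi]) * step)

def soft_centroid(prob):
    yy, xx = np.mgrid[0:prob.shape[0], 0:prob.shape[1]]
    m = prob.sum()
    return np.array([(yy * prob).sum() / m, (xx * prob).sum() / m])

def weak_loss(prob, centroid_ref, diameter_ref, theta_ref,
              lam_c=100.0, lam_d=100.0, lam_r=10.0, n_angles=24):
    c_hat = soft_centroid(prob)
    # centroid term: MSE to the annotated centroid
    loss_c = float(((c_hat - centroid_ref) ** 2).mean())
    # diameter term: squared difference of log-diameters at the annotated orientation
    d_hat = ray_radius(prob, c_hat, theta_ref) + ray_radius(prob, c_hat, theta_ref + np.pi)
    loss_d = float((np.log(d_hat + 1e-8) - np.log(diameter_ref)) ** 2)
    # simplified regularizer: variance of the normalized radii over many angles
    radii = np.array([ray_radius(prob, c_hat, t)
                      for t in np.linspace(0.0, 2 * np.pi, n_angles, endpoint=False)])
    loss_r = float(np.var(radii / (radii.mean() + 1e-8)))
    return lam_c * loss_c + lam_d * loss_d + lam_r * loss_r

yy, xx = np.mgrid[0:128, 0:128]
disk = ((yy - 64.0) ** 2 + (xx - 64.0) ** 2 <= 20.0 ** 2).astype(float)
good = weak_loss(disk, np.array([64.0, 64.0]), 40.0, 0.0)     # matching annotation
shifted = weak_loss(disk, np.array([64.0, 70.0]), 40.0, 0.0)  # wrong centroid reference
```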

Experiments
In this section, we apply the weakly-supervised method presented in Section 3 to the segmentation of the carotid artery lumen (inner subset) and outer wall (outer subset O) on two datasets of MR images presented in Section 4.1. Section 4.3 presents our comparison to the state of the art and an ablation study. Finally, the implementation details can be found in Section 4.4.

Datasets
Care II study. We used a subset of the data from the Care II study (Zhao et al., 2017), a multi-center study which enrolled patients who had a recent ischaemic stroke or transient ischaemic attack. The subset of 24 MRI scans of 24 patients was made available in the context of a carotid artery segmentation challenge.¹ The scans are 3D Motion-Sensitized Driven Equilibrium prepared Rapid Gradient Echo (3D-MERGE) scans. Full lumen segmentations of either the left or right internal and common carotid artery are available in, on average, 12.1% of the slices of a scan. This resulted in 2151 annotated 2D slices over the whole dataset.
PARISK study. We used carotid artery MR images acquired within the multi-center, multi-scanner, multi-sequence PARISK study (Truijman et al., 2014), a large prospective study to improve risk stratification in patients with mild to moderate carotid artery stenosis who recently had a transient ischaemic attack, an amaurosis fugax, or a minor stroke. We used the images of 191 enrolled patients from the four study centers: Amsterdam Medical Center (AMC), Maastricht University Medical Center (MUMC), University Medical Center Utrecht (UMCU), and Erasmus MC (EMC), all in the Netherlands. AMC, MUMC, and UMCU performed the MR imaging with a 3.0-Tesla MR scanner with an eight-channel phased-array coil (Shanghai Chenguang Medical Technologies Co., Shanghai), and EMC used a dedicated four-channel carotid phased-array coil with an angulated setup (Machnet B.V., Roden, the Netherlands). UMCU and MUMC acquired the imaging data of all their patients with an Achieva TX scanner (Philips Healthcare, Best, the Netherlands), AMC acquired the scans of 11 of its patients with an Ingenia scanner (Philips Healthcare, Best, the Netherlands) and of 2 with an Intera scanner (Philips Healthcare, Best, the Netherlands), and EMC acquired the imaging data of all its patients with a Discovery MR 750 system (GE Healthcare, Milwaukee, MI, USA). Each enrolled patient underwent 5 MR sequences. For the AMC, UMCU, and MUMC centers, the imaged sequences were: T1-weighted quadruple inversion recovery (IR) turbo spin echo (SE) (pre- (a.) and post-contrast (b.)), T2-weighted turbo SE (c.), IR turbo field echo (FE) (d.), and time-of-flight fast FE (e.). For the EMC patients, the imaged sequences were: T1-weighted double IR fast SE (pre- (a.) and post-contrast (b.)), T2-weighted turbo SE (c.), spoiled gradient echo (d.), and fast spoiled gradient echo (e.). The letters in parentheses (e.g. (a.)) indicate the matching of similar sequences across centers. More details on the image sequences can be found in Truijman et al. (2014). The MR sequences were semi-automatically registered, first affinely and then elastically, to the T1-weighted pre-contrast sequence. The vessel lumen and outer wall were annotated manually, slice-wise, by trained observers with approximately 3 years of experience, in the T1-weighted pre-contrast sequence. Registration and annotation were performed with the VesselMass software.² The observers annotated the common and internal carotid arteries (either left or right) on the side where the clinical symptoms of atherosclerosis occurred.
¹ https://vessel-wall-segmentation.grand-challenge.org/

Preprocessing and diameter annotation
The following subsection presents the processing common to Care II and PARISK datasets.
Normalization. Input images were normalized at the slice level to a mean intensity of zero and a standard deviation of one, using the official MONAI implementation (Cardoso et al., 2022).
Padding. Images were zero-padded to a common dimension, such that the number of voxels along each dimension matches the smallest possible multiple of 2⁵ = 32 voxels. This results in padded images of dimension 576 × 576 voxels for PARISK and 768 × 160 voxels for Care II.
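The normalization and padding steps can be reproduced with plain numpy (MONAI provides equivalent transforms; the function names here are ours):

```python
import numpy as np

def normalize_slice(img):
    # zero-mean, unit-variance normalization per 2D slice
    return (img - img.mean()) / (img.std() + 1e-8)

def pad_to_multiple(img, multiple=32):
    # zero-pad each spatial dimension up to the next multiple of 2**5 = 32,
    # so the image fits a U-Net with 5 levels of downsampling
    pads = []
    for s in img.shape:
        target = int(np.ceil(s / multiple)) * multiple
        extra = target - s
        pads.append((extra // 2, extra - extra // 2))
    return np.pad(img, pads)

x = np.random.default_rng(0).normal(3.0, 2.0, size=(570, 153))
y = pad_to_multiple(normalize_slice(x))  # shape becomes (576, 160)
```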
Cropping. At training and evaluation time, all 2D slices are cut in half so that each sub-image contains only one carotid artery.
Diameter annotations. We simulated the lumen and full-vessel diameter annotations slice-wise with the following methodology: extraction of the lumen centroid, resampling of the boundary points into 24 points (to ensure the co-linearity of the lumen and full-vessel diameters), and selection of the diameters in the direction of maximum narrowing (as defined in Section 3.4). Examples of the resulting simulated annotations can be found in Fig. 5 of the Appendix.
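The annotation simulation can be sketched as follows: compute the lumen centroid, probe a fixed number of orientations through it, and keep the orientation minimizing the lumen/full-vessel diameter ratio, i.e. the maximum narrowing of Section 3.4. Single-ray sampling again replaces the paper's boundary-point resampling, and all names are ours.

```python
import numpy as np

def ray_radius(mask, root, theta, step=0.25, t_max=100.0):
    ts = np.arange(0.0, t_max, step)
    yi = np.clip(np.round(root[0] + ts * np.sin(theta)).astype(int), 0, mask.shape[0] - 1)
    xi = np.clip(np.round(root[1] + ts * np.cos(theta)).astype(int), 0, mask.shape[1] - 1)
    return float(np.sum(mask[yi, xi]) * step)

def simulate_diameter_annotations(lumen, vessel, n_dirs=24):
    """Among n_dirs candidate orientations through the lumen centroid,
    return the one minimizing the lumen/vessel diameter ratio."""
    yy, xx = np.mgrid[0:lumen.shape[0], 0:lumen.shape[1]]
    c = np.array([(yy * lumen).sum() / lumen.sum(), (xx * lumen).sum() / lumen.sum()])
    best = None
    for theta in np.linspace(0.0, np.pi, n_dirs, endpoint=False):
        d_in = ray_radius(lumen, c, theta) + ray_radius(lumen, c, theta + np.pi)
        d_out = ray_radius(vessel, c, theta) + ray_radius(vessel, c, theta + np.pi)
        ratio = d_in / (d_out + 1e-8)
        if best is None or ratio < best[0]:
            best = (ratio, theta, d_in, d_out)
    return best  # (ratio, orientation, lumen diameter, vessel diameter)

# synthetic vessel: circular outer wall, elongated lumen (thinnest along y)
yy, xx = np.mgrid[0:128, 0:128]
vessel = ((yy - 64.0) ** 2 + (xx - 64.0) ** 2 <= 25.0 ** 2).astype(float)
lumen = (((yy - 64.0) / 6.0) ** 2 + ((xx - 64.0) / 18.0) ** 2 <= 1.0).astype(float)
ratio, theta, d_in, d_out = simulate_diameter_annotations(lumen, vessel)
```

On this example the selected orientation is close to π/2, i.e. the vertical direction where the lumen is thinnest relative to the (constant) vessel diameter.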

Baseline and ablation study
We evaluated our method and the different baselines in a 2D setting, with the lumen and full-vessel diameters annotated per slice, on both the PARISK and Care II datasets. We compare our method to InExtremIS (Dorent et al., 2021) and CircleNet (Yang et al., 2020), as they are the weakly supervised methods most closely related and most relevant to our problem. To perform a fair comparison across all methods and assess the influence of the annotation-based losses, the additional CRF loss (Tang et al., 2018) used by InExtremIS is not included, as all methods could benefit from it. To determine the area not supervised by InExtremIS, we consider a circle centered on the centroid of the lumen annotations, with a diameter equal to the distance between the two full-vessel landmarks plus 2δ, where δ (PARISK: δ = 7 voxels; Care II: δ = 4 voxels) corresponds to the 95th percentile of the slice-wise difference between the maximum full-vessel radius and the radius at the maximum narrowing (computed over the whole dataset). For CircleNet, the spatial dimensions of the predicted heatmap and radius map match the spatial dimensions of the input images rather than being down-sampled as originally proposed by Yang et al. (2020). This slight modification enables us to use the same network architecture for all methods and therefore make a fairer comparison. Additionally, we compared to a fully-supervised U-Net as an upper bound, and performed an ablation study on the different components of the loss in Eq. (13).

Implementation details
All methods are built on top of the base U-Net architecture (Ronneberger et al., 2015), using the official MONAI 0.7 implementation as a starting point (Cardoso et al., 2022). We trained the models with the same Adam optimizer (learning rate 10⁻⁴, β₁ = 0.9, β₂ = 0.99) for 1000 epochs (for the Care II dataset) and 2000 epochs (for the PARISK dataset). We increased the number of epochs for PARISK because not all methods had reached full convergence after 1000 epochs. Our loss components (Eqs. (10), (11) and (12)) do not require any modification of the network architecture or training regime and are implemented as direct losses.
We use 24 radii equally spread over [0, 2π[ and {0, 1} as σ-parameters. The hyper-parameters were empirically set to 100, 100, 10, and 0.15, such that the contributions of the different loss terms are approximately equal, based on the average amplitude of the loss components at training time. As the models are trained to segment a single vessel, for all methods the final segmentation is chosen as the largest connected component of the network's binarized foreground output (obtained using the argmax operation).
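The largest-connected-component post-processing can be written with a simple breadth-first flood fill. In practice a library routine such as `scipy.ndimage.label` would be used; this self-contained version avoids the dependency.

```python
import numpy as np
from collections import deque

def largest_component(mask):
    """Keep only the largest 4-connected component of a binary mask,
    as used to post-process the binarized foreground prediction."""
    mask = mask.astype(bool)
    labels = np.zeros(mask.shape, dtype=int)
    sizes, current = {}, 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue  # pixel already assigned to a component
        current += 1
        labels[sy, sx] = current
        queue, size = deque([(sy, sx)]), 0
        while queue:
            y, x = queue.popleft()
            size += 1
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    queue.append((ny, nx))
        sizes[current] = size
    if not sizes:
        return np.zeros_like(mask)
    return labels == max(sizes, key=sizes.get)

m = np.zeros((32, 32), dtype=bool)
m[2:6, 2:6] = True      # small 16-pixel blob (removed)
m[20:30, 20:30] = True  # 100-pixel blob (kept)
out = largest_component(m)
```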
A more extensive explanation of our implementation details can be found in our publicly available code repository at https://gitlab.com/radiology/aim/carotid-artery-image-analysis/nested-star-shaped-objects

Metrics and evaluation
All methods were trained and evaluated using 4-fold cross-validation (with folds determined at the patient level): 2 folds for training, 1 fold for validation, and 1 fold for testing. Evaluation is performed per slice with the Dice score (DSC), the Hausdorff distance (HD), and the absolute diameter error in the direction of the vessel's maximum stenosis (ADE). We report the per-patient averages on the testing sets and perform a two-sided Wilcoxon signed-rank test, with a significance level of 0.05, to determine whether a baseline (or ablation) differs significantly from our proposed method.
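For reference, the per-slice Dice score on binary masks can be computed as follows (the eps guard and function name are ours):

```python
import numpy as np

def dice_score(pred, ref, eps=1e-8):
    """Dice similarity coefficient between two binary masks:
    2 * |pred ∩ ref| / (|pred| + |ref|)."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    inter = np.logical_and(pred, ref).sum()
    return float(2.0 * inter / (pred.sum() + ref.sum() + eps))

# two 50-pixel masks sharing 30 pixels: Dice = 2*30 / (50+50) = 0.6
a = np.zeros((10, 10), dtype=bool); a[:, :5] = True
b = np.zeros((10, 10), dtype=bool); b[:, 2:7] = True
```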

Results
In this section, we present the results of the ablation study (Section 5.1) and of the comparison to the baseline methods (Section 5.2). For both, the quantitative results can be found in Tables 1 and 2, the qualitative results in Fig. 3, and the influence of the post-processing in Fig. 4.

Ablation study
The ablation study shows, on both the Care II and PARISK datasets, that each component of the loss is important. Removing the regularizer (C + D) still allows the network to locate the lumen and to start retrieving the shape of the vessel, but it under-segments parts of both the lumen and the outer wall, while supervising using only the centroid loss (C) gives poor segmentation performance (Fig. 3).
Fig. 4 demonstrates that keeping the largest connected component as post-processing benefits our full method less than the variant without the regularizer (C + D).

Table 1
Distribution of the metrics computed over the test set of the Care II dataset for the different methods. The reported values correspond to the median and the interquartile range (in brackets). Results in bold indicate the best-performing method (apart from full supervision), and results denoted by a * differ statistically from our proposed method according to a Wilcoxon signed-rank test with a significance level of 0.05. (DSC: Dice Score, HD: Hausdorff Distance, ADE: Absolute Diameter Error.)

Table 2
Distribution of the metrics computed over the test set of the PARISK dataset for the different methods. The reported values correspond to the median and the interquartile range (in brackets). Results in bold indicate the best-performing method (apart from full supervision), and results denoted by a * differ statistically from our proposed method according to a Wilcoxon signed-rank test with a significance level of 0.05. (DSC: Dice Score, HD: Hausdorff Distance, ADE: Absolute Diameter Error.)

Baselines
As could be expected, full supervision outperforms the weakly-supervised methods on most metrics (Tables 1 and 2) and is the upper bound of our comparison. However, the best weakly-supervised methods come close to the fully supervised result, most notably in terms of Dice score on the Care II dataset (Table 1).
The third row of Fig. 3 shows one of the main challenges for both our proposed method and the considered baselines: segmenting the correct vessel when multiple vessels are present in the image. In this example, all methods consistently segment the external instead of the internal carotid artery.
Across datasets and metrics, the proposed method shows significantly better segmentation of the full vessel (B) than InExtremIS (Tables 1 and 2). This translates qualitatively (Fig. 3) into a better understanding of the shape of the outer wall. In terms of lumen segmentation, InExtremIS (after post-processing) seems to outperform our proposed method, though not on every metric and dataset (Tables 1 and 2).
Our method segments the lumen of the carotid artery better than CircleNet on both datasets (Tables 1 and 2). In terms of full-vessel segmentation, both methods seem to perform similarly (Tables 1 and 2). However, the qualitative results (Fig. 3) highlight the difficulty CircleNet has in capturing more complex shapes, such as elongated vessels close to the carotid bifurcation, as it can only predict perfect circles.
It should be noted that post-processing in the form of keeping only the largest connected component is crucial for InExtremIS, which performs poorly without it (Fig. 4). In contrast, this post-processing gives only a modest improvement for both the proposed method and full supervision. CircleNet, by design, outputs a single component and is unaffected by this step.

Discussion
A simple modification of our previously proposed method (Camarasa et al., 2022) transformed the segmentation of a single star-shaped object using its maximum diameter into the multi-class segmentation of nested star-shaped objects using diameter annotations in the orientation of maximum vessel narrowing.
Our ablation study shows the importance of the regularizer, which has two main effects. First and foremost, it pushes the deep learning model to predict a single, binarized, and locally circular object, making use of the prior knowledge at hand. Second, it increases the number of supervised voxels at each backward pass: the Gaussian beams used for the regularizer cover the entire image, while those of the loss over the diameters concentrate in a narrow area in the direction of the vessel's maximum narrowing.
Our method demonstrates strong performance both quantitatively and qualitatively. It can capture more complex shapes than CircleNet (Yang et al., 2020), while being more stable and having a better understanding of the vessel structure than InExtremIS (Dorent et al., 2021).
A limitation of our study is that we used simulated diameter annotations. A subset of these simulated annotations was checked and approved by a medical doctor. Still, annotations made in clinical practice might not always be placed precisely at the maximum narrowing of the vessel, which could slightly reduce the performance of the proposed approach. However, we expect methods relying on the exact supervision of voxels (such as InExtremIS; Dorent et al., 2021) to be even more sensitive to this type of uncertainty in the annotations.
In our setting, all methods have access to slice-wise diameters.In a more realistic setup, the two diameters, necessary to measure the degree of stenosis (European Carotid Surgery Trialists' Collaborative Group, 1998), would be available on only one slice of the whole volume.
Depending on the modality and available software, clinical experts may prefer to annotate diameters in 3D instead of 2D. Although not applicable out of the box to 3D diameters, the presented method can easily be adapted to this type of annotation. This could be achieved by changing the σ-parameterized radial estimate to the spherical instead of the polar coordinate system (Appendix A.3), and defining the Gaussian beam as a function of the azimuth and the inclination instead of the angular coordinate (Definition 3.1.3).
In this paper, we compared our weakly-supervised method to full supervision. The two approaches can also be combined, as they do not require a modification of the network architecture. Similarly to El Jurdi et al. (2021), Karimi and Salcudean (2019), and Kervadec et al. (2019a), our proposed loss (Eq. (13)) could complement classic segmentation losses, which usually optimize a voxel-wise classification. In this way, the results of fully supervised segmentation may be improved for (nested) star-shaped objects, especially when training data is limited. In a semi-supervised fashion, our proposed regularizer (Eq. (12)) could supervise unlabeled data, as it can be computed without annotations.
Our proposed approach bridges segmentation probability maps and boundary Cartesian coordinates in a differentiable manner. In our case, we use this bridge to derive simple objects: diameters. However, it opens the door to more advanced objects in Cartesian space, such as parametric curves (Burdin et al., 1996) or Fourier-based shape descriptors (El-ghazal et al., 2012). These more advanced objects could be used to regularize the optimization of segmentation tasks based on the prior knowledge at hand.
Although we evaluate our approach in this paper on fairly simple shapes, as many vessel cross-sections are elliptical or almost circular, it could be used to model and supervise more complex shapes as well (Appendix A.4, Fig. 6), such as tumors (Menze et al., 2015). This would be of clinical relevance, as the well-established RECIST criterion assesses the progression of tumors based on the evolution of their longest diameter (Schwartz et al., 2016).

Conclusion
We have introduced a fully differentiable approach to locate the centroid and boundary points of a star-shaped object from a segmentation probability map. The method was then successfully applied to train a neural network to segment nested star-shaped objects, supervised by diameter annotations and regularized by exploiting the available shape-prior knowledge. This provides a mathematically sound way to re-use existing clinically relevant annotations. We validated the method on two datasets of MR images of carotid arteries, using only diameter annotations at training time and segmenting both the lumen and the outer wall, and showed segmentation performance approaching that of full voxelwise supervision.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1. Method pipeline. The inner class (in red), the outer class (O, in green), and the foreground class (the union of the inner and outer classes, B).
As an example, multiple versions of the well-established U-Net (Ronneberger et al., 2015) trained with voxelwise annotations have reported good performance on the joint segmentation of the lumen and the outer wall of the carotid artery: Wu et al. (2019) compared different U-Net variations on 2D T1-weighted MR images, Zhu et al. (2021) developed a cascaded residual U-Net approach for 3D MR, and Camarasa et al. (2021) investigated the performance and prediction uncertainty of a deep learning method trained on a multi-sequence MR image dataset. Combining a naïve-Bayesian method with a level-set method, Liu et al. (2006) segmented the different components of the plaque, while van Engelen et al. (2015) preferred a support vector machine model. Zhang et al. (2019) compared established machine learning models (random forest, gradient boosting decision tree, and artificial neural network) applied to the pixel-wise classification of the plaque components.

Fig. 4 .
Fig. 4. Boxplots showing the Dice score (averaged per patient) before and after post-processing, for both the lumen and the full vessel (B). Each box extends from Q1 to Q3 and displays the median. The whiskers extend from the box by 1.5 times the interquartile range.

Fig. 6 .
Fig. 6. Application of the differentiable boundary point extraction to randomly generated star-shaped objects (green: generated star-shaped object; red: extracted boundary points).