An intensity-based self-supervised domain adaptation method for intervertebral disc segmentation in magnetic resonance imaging

Background and objective: Accurate intervertebral disc (IVD) segmentation is crucial for diagnosing and treating spinal conditions. Traditional deep learning methods depend on large annotated datasets, which are hard to acquire. This research proposes an intensity-based self-supervised domain adaptation method that uses unlabeled multi-domain data to reduce reliance on large annotated datasets. Methods: The study introduces a method that uses intensity-based self-supervised learning for IVD segmentation in magnetic resonance imaging (MRI) scans. The approach is well suited to IVD segmentation because it captures the subtle intensity variations that characterize spinal structures. The model is a dual-task system that simultaneously segments IVDs and predicts intensity transformations. This intensity-focused method is easy to train and computationally light, which makes it practical in diverse clinical settings. Trained on unlabeled data from multiple domains, the model learns domain-invariant features and handles intensity variations across different MRI devices and protocols. Results: Testing on three public datasets showed that the model outperforms baseline models trained on single-domain data. It handles domain shifts and achieves higher accuracy in IVD segmentation. Conclusions: This study demonstrates the potential of intensity-based self-supervised domain adaptation for IVD segmentation. It suggests new directions for research on improving generalizability across datasets with domain shifts, which can be applied to other medical imaging fields. Supplementary Information: The online version contains supplementary material available at 10.1007/s11548-024-03219-7.


Additional Experiments
This section reports the results of the additional experiments carried out in this study. The proposed strategy (t1t2s-int) was compared with the baseline model (a U-Net trained only on the source dataset). For this comparison, we conducted a further experiment to investigate the feasibility of segmenting individual intervertebral discs (IVDs) within the spine, in order to capture the differences in the spatial configuration and morphology of each IVD. To achieve this, we employed a principal component analysis (PCA) approach to divide the overall IVD segmentation into separate segmentations of the individual discs. Figs. 1 and 2 illustrate the results obtained by both models for individual IVD segmentation.
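The text does not detail how PCA was used to separate the discs. One plausible reading, sketched here as an assumption and not as the authors' implementation, is to extract the connected components of the binary IVD mask, estimate the first principal axis of all foreground pixel coordinates (roughly the direction of the spine), and order the components by projecting their centroids onto that axis. The function names `connected_components`, `principal_axis`, and `split_discs`, the 2D list-of-lists mask, and the `min_pixels` parameter are all hypothetical.

```python
import math
from collections import deque


def connected_components(mask):
    """Return the 4-connected foreground regions of a 2D binary mask as pixel lists."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def principal_axis(points):
    """First principal component of 2D (row, col) points as a unit vector."""
    n = len(points)
    mean_y = sum(y for y, _ in points) / n
    mean_x = sum(x for _, x in points) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in points) / n
    var_x = sum((x - mean_x) ** 2 for _, x in points) / n
    cov_yx = sum((y - mean_y) * (x - mean_x) for y, x in points) / n
    # Closed-form orientation of the leading eigenvector of the 2x2 covariance matrix.
    # theta lies in (-pi/2, pi/2], so the row component is non-negative (axis points down the image).
    theta = 0.5 * math.atan2(2 * cov_yx, var_y - var_x)
    return math.cos(theta), math.sin(theta)


def split_discs(mask, min_pixels=1):
    """Split a binary IVD mask into per-disc pixel lists, ordered along the principal axis."""
    components = [c for c in connected_components(mask) if len(c) >= min_pixels]
    if not components:
        return []
    axis_y, axis_x = principal_axis([p for c in components for p in c])

    def position_along_spine(pixels):
        centroid_y = sum(y for y, _ in pixels) / len(pixels)
        centroid_x = sum(x for _, x in pixels) / len(pixels)
        return centroid_y * axis_y + centroid_x * axis_x

    return sorted(components, key=position_along_spine)


# Toy mask (hypothetical data): three small "discs" along a slightly tilted column.
mask = [
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
discs = split_discs(mask)
print(len(discs), [pixels[0] for pixels in discs])
```

Ordering by projection onto the principal axis, instead of by row index alone, keeps the disc order meaningful when the spine is tilted or curved in the image plane. Discs that touch in the mask would be merged by this sketch and would need an extra separation step.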
Fig. 1 reports the results for dataset T1, on which t1t2s-int shows better overall performance, with higher median values and lower interquartile ranges (IQRs) for the Dice similarity coefficient (DSC), sensitivity (Sen), and specificity (Spec). As regards the Hausdorff distance (HD), U-Net shows higher median values for the discs T10/T11, T12/L1, and L2/L3, with a wider IQR than t1t2s-int.
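For reference, the four metrics can be sketched with their standard definitions. This is a minimal illustration on 2D binary masks stored as lists of lists. The function names are hypothetical, and the choice to compute HD over all foreground pixels in pixel units is an assumption, since the text does not state whether HD is boundary-based or expressed in millimetres. Higher is better for DSC, Sen, and Spec, whereas lower is better for HD.

```python
import math


def confusion_counts(pred, truth):
    """Count true/false positives and negatives between two binary masks of equal shape."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice(pred, truth):
    """DSC = 2TP / (2TP + FP + FN); defined as 1.0 when both masks are empty."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def sensitivity(pred, truth):
    """Sen = TP / (TP + FN)."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn)


def specificity(pred, truth):
    """Spec = TN / (TN + FP)."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp)


def hausdorff(pred, truth):
    """Symmetric Hausdorff distance between the foreground pixel sets (brute force)."""
    a = [(y, x) for y, row in enumerate(pred) for x, v in enumerate(row) if v]
    b = [(y, x) for y, row in enumerate(truth) for x, v in enumerate(row) if v]
    if not a or not b:
        raise ValueError("Hausdorff distance is undefined for an empty mask")

    def directed(src, dst):
        # Largest distance from a point in src to its nearest neighbour in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Toy example (hypothetical data, not taken from the study).
truth = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
pred = [[0, 1, 1, 1], [0, 1, 0, 0], [0, 0, 0, 0]]
print(dice(pred, truth), sensitivity(pred, truth), specificity(pred, truth), hausdorff(pred, truth))
```

Because specificity counts true negatives, it is dominated by the background in images where the discs occupy few pixels.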
Fig. 2 displays the results for dataset T2. Performance is markedly lower than on T1, but the trends are similar: t1t2s-int performs better than U-Net in terms of median values and IQRs for DSC, HD, and Sen. For Spec, on the other hand, the two models show comparable median values for all discs except L2/L3, L3/L4, and L5/S.
Examining the performance across individual discs, Fig. 1 clearly shows that the proposed methodology outperforms the baseline in terms of stability. Specifically, it exhibits consistently favorable median and IQR values across the different IVDs on T1. Notably, the proposed methodology maintains its superior performance even for challenging IVDs such as T10/T11 and L5/S, which are typically more difficult to segment because of their position in the image. This highlights the robustness of the proposed approach and makes it well suited for accurate and reliable IVD segmentation across different locations within the image. For T2 the scenario is slightly different, as the complexity of this dataset is reflected in the variability of the results from one disc to another. In particular, the discs T10/T11, T12/L1, and L5/S yielded the lowest DSC for both U-Net and t1t2s-int, although the latter outperforms the former in all cases. A similar trend can be observed for HD and Sen.
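The per-disc comparison above rests on the median and IQR of each metric. A minimal sketch of that summary is given below. The helper name `median_iqr`, the inclusive quartile convention, and the DSC values are assumptions made for illustration, and none of the numbers come from the study.

```python
import statistics


def median_iqr(values):
    """Return (median, interquartile range) of a list of per-subject scores."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q3 - q1


# Hypothetical per-subject DSC values for two discs.
dsc_per_disc = {
    "L3/L4": [0.88, 0.90, 0.91, 0.92, 0.93],
    "L5/S": [0.70, 0.78, 0.82, 0.85, 0.90],
}
for disc, scores in dsc_per_disc.items():
    median, iqr = median_iqr(scores)
    print(f"{disc}: median={median:.2f}, IQR={iqr:.2f}")
```

A lower IQR at a similar or higher median is what the text refers to as greater stability.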

Training curves
In this section, we report and discuss the performance of the model by showing the plots of the loss function (Fig. 3) and of the DSC metric (Fig. 4) for both the training and validation phases across epochs.
As shown in Fig. 3, the Dice loss declines rapidly in the initial epochs of training, and the validation loss closely tracks this trend. As the epochs progress, both the training and validation losses steadily decrease and then stabilize, suggesting that the model reaches convergence. Correspondingly, Fig. 4 shows a steady increase in the DSC for both training and validation. Over the epochs, the DSC gradually increases and eventually reaches a plateau, which indicates that the model achieves good segmentation performance.
In both figures, the training and validation curves remain close to each other, with no significant divergence, suggesting that the model is not overfitting. The lack of overfitting is further corroborated by the consistent performance on the validation set, which implies that the model generalizes well to unseen data.
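The exact form of the Dice loss is not given in the text. A common soft formulation, sketched here under the assumption of a small smoothing constant `eps` and flattened per-pixel foreground probabilities, is one minus the soft Dice overlap. The function name `soft_dice_loss` and the default value of `eps` are hypothetical.

```python
def soft_dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss: 1 - (2 * sum(p * t) + eps) / (sum(p) + sum(t) + eps).

    probs   -- predicted foreground probabilities in [0, 1], flattened
    targets -- binary ground-truth labels (0 or 1), flattened
    eps     -- smoothing constant that avoids division by zero (assumed value)
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


targets = [1, 1, 0, 0]
print(soft_dice_loss([1.0, 1.0, 0.0, 0.0], targets))  # perfect prediction, loss near 0
print(soft_dice_loss([0.5, 0.5, 0.5, 0.5], targets))  # uninformative prediction, loss near 0.5
print(soft_dice_loss([0.0, 0.0, 1.0, 1.0], targets))  # inverted prediction, loss near 1
```

With hard (binary) predictions and a negligible `eps`, this loss equals one minus the DSC, which is why the curves in Fig. 3 and Fig. 4 mirror each other.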

Figure 3: Training and validation loss across epochs. The loss used to train the model is the Dice loss.

Figure 4: Dice similarity coefficient (DSC), used as the evaluation metric, for the training and validation phases across epochs.