CAVE: Cerebral Artery-Vein Segmentation in Digital Subtraction Angiography

Cerebral X-ray digital subtraction angiography (DSA) is a widely used imaging technique in patients with neurovascular disease, allowing for vessel and flow visualization with high spatio-temporal resolution. Automatic artery-vein segmentation in DSA plays a fundamental role in vascular analysis with quantitative biomarker extraction, facilitating a wide range of clinical applications. The widely adopted U-Net applied on static DSA frames often struggles with disentangling vessels from subtraction artifacts. Further, it falls short in effectively separating arteries and veins as it disregards the temporal perspectives inherent in DSA. To address these limitations, we propose to simultaneously leverage spatial vasculature and temporal cerebral flow characteristics to segment arteries and veins in DSA. The proposed network, coined CAVE, encodes a 2D+time DSA series using spatial modules, aggregates all the features using temporal modules, and decodes them into 2D segmentation maps. On a large multi-center clinical dataset, CAVE achieves a vessel segmentation Dice of 0.84 ($\pm$0.04) and an artery-vein segmentation Dice of 0.79 ($\pm$0.06). CAVE surpasses traditional Frangi-based K-means clustering (P<0.001) and U-Net (P<0.001) by a significant margin, demonstrating the advantages of harvesting spatio-temporal features. This study represents the first investigation into automatic artery-vein segmentation in DSA using deep learning. The code is publicly available at https://github.com/RuishengSu/CAVE_DSA.


Clinical background
Cerebrovascular diseases are a major contributor to global mortality and long-term disability (Roth et al., 2020). These diseases encompass a range of conditions, including ischemic stroke due to vessel occlusion, stenosis, and aneurysms. In order to diagnose and treat such conditions, dynamic imaging of cerebral blood vessels is conducted using X-ray digital subtraction angiography (DSA). DSA provides a means of visualizing blood flow dynamics and changes in vasculature appearance over time (Figure 1), thereby offering valuable information for diagnosis, procedural navigation, therapeutic decision-making, and evaluation of treatment outcomes.
DSA images are conventionally examined visually by neuroradiologists and interventionalists, which can be laborious, subjective, qualitative, and vulnerable to error. Automatic segmentation of arteries and veins promises to assist in this assessment by highlighting and quantifying vascular changes, providing a foundation for a range of downstream clinical applications, including quantitative evaluation of endovascular thrombectomy, automatic emboli detection, and image guidance for real-time endovascular navigation. For example, automatic artery-vein segmentation can be valuable for providing a venous roadmap for navigation during transvenous procedures, such as dural venous sinus stenting or transvenous embolization of dural arteriovenous fistulas or arteriovenous malformations. The segmented arteries and veins provide a wealth of quantitative data that can be used to extract and analyze various blood flow-related biomarkers for peri-operative decision-making and post-operative prognosis.
DSA is a dynamic imaging technique that provides a visual representation of blood flow over time through a series of consecutive frames (Figure 1). Although existing semantic segmentation networks could be utilized for end-to-end artery-vein segmentation, most methods only consider static frames. When addressing artery-vein segmentation in DSA series, which are 2D+time image sequences, the temporal dimension is relevant. We hypothesize that effectively incorporating spatio-temporal flow dynamics is key to achieving accurate artery-vein segmentation in DSA. The aim of this work, therefore, is to harness the high-resolution contrast flow dynamics of DSA for improved artery-vein segmentation through spatio-temporal learning techniques.

Related work
Vessel segmentation has been an extensively studied field in medical imaging for over two decades, with recent advancements propelled by deep learning. Recent literature on vessel segmentation (Moccia et al., 2018) primarily addresses retinal (Fraz et al., 2012; Chen et al., 2021a), lung (Sluimer et al., 2006; Van Rikxoort and Van Ginneken, 2013; Tan et al., 2021), and cardiovascular imaging (Huang et al., 2023a,b). These studies demonstrate the adaptability of segmentation methods across various anatomical structures and imaging modalities. In brain imaging, various methods have been proposed to segment cerebral vessels on different image modalities such as MRA (Phellan and Forkert, 2017; Phellan et al., 2017; Robben et al., 2016), CT angiography (CTA) (Fu et al., 2020; Su et al., 2020), and DSA (Liu et al., 2018; Su et al., 2021; Vepa et al., 2022; Zhang et al., 2020), each modality posing unique challenges and necessitating tailored approaches. The prevalent use of U-Net (Ronneberger et al., 2015) in these studies underscores its versatility. However, existing methods mainly rely on spatial vasculature features learned from static 2D DSA frames without considering the complete series. It has been shown that U-Net tends to generate false positives in the presence of subtraction artifacts that appear similar to blood vessels (Zhang et al., 2020).
Artery-vein segmentation has been primarily explored in fundus images (Hemelings et al., 2019) and non-contrast lung CT images (Qin et al., 2021) using U-Net. However, the segmentation of cerebral arteries and veins remains an under-explored area, with limited studies on 4D CTA (Meijs et al., 2020), 3D MRA images (Hilbert et al., 2020), and 3D-DSA (Raz et al., 2021) using 3D U-Net models. These methods predominantly leverage spatial features (Hilbert et al., 2020) or manually crafted temporal parameters (Meijs et al., 2020) to differentiate between arteries and veins. In a previous study (Van Asperen et al., 2022), we investigated the automatic classification of arteries and veins using $k$-means clustering, contingent on vessel enhancement and binarization via the Frangi filter. The quality of vessel segmentation through the Frangi filter depends on the choice of the binarization threshold, a parameter that exhibits high variability across different DSA images. Nevertheless, to the best of our knowledge, no fully automatic and robust artery-vein segmentation algorithm has been developed for DSA images yet.

Contributions
This work presents CAVE, the first automatic method for artery-vein segmentation in DSA using deep learning, establishing a new benchmark. CAVE takes a 2D+time video sequence with variable length as input and produces 2D artery-vein segmentations as output. We leverage U-Net for spatial vasculature representation and temporal modules to simultaneously learn temporal cerebral contrast flow characteristics. On a multi-center clinical dataset, we demonstrate the utility of deep learning in cerebral artery-vein segmentation. CAVE may facilitate fast, accurate, and objective interpretation of cerebral vasculature in DSA, thus assisting endovascular interventions in clinical practice.
The remainder of this paper is organized as follows: Section 2 describes the proposed artery-vein segmentation method and Section 3 details the experimental dataset and annotation process. The extensive experiments and results are presented in Section 4, and further discussed in Section 5. Finally, Section 6 summarizes the conclusions of this study.

Methods
A cerebral digital subtraction angiography (DSA) series consists of a sequence of X-ray images captured post-contrast media injection, subtracted from the initial frame (pre-contrast image). This sequence presents the arrival of contrast media and the cerebral blood flow dynamics over time. The primary objective of this work is to develop an automated method for segmenting arteries and veins from a given DSA series, obtaining 2D artery-vein label masks as shown in Figure 1.
Rather than relying on pixel-wise time-intensity curves (TICs) or vasculature appearance separately, we propose to simultaneously leverage spatial and temporal features to segment arteries and veins in an end-to-end way. The proposed network architecture (Figure 2) takes DSA series of varying lengths as input. In the encoding path, each frame undergoes the same set of convolutional operations with shared weights, and the resulting features are temporally encoded to capture the spatio-temporal flow dynamics across all frames. These aggregated features are concatenated in the decoder path to generate high-resolution segmentation maps, resulting in a two-channel binary image that represents the segmented arteries and veins. The model is fully convolutional, thus allowing for inputs of varying sizes and temporal resolutions.

Spatial learning
The proposed CAVE (Figure 2) employs a U-Net-like architecture for spatial encoding and decoding of DSA frames, which comprises five down layers (yellow boxes) and five up layers (green boxes). Each layer utilizes double-convolution blocks with instance normalization and ReLU activation. Max pooling and bilinear upsampling are utilized for the contracting and expanding paths respectively. The number of feature channels starts at 64, doubles after each max pooling to offset the loss of spatial information, and halves after each bilinear upsampling to reconstruct spatial resolution. In Figure 2, it is only for illustration purposes that the down blocks appear to be replicated. In practice, all frames of the DSA series share the same down block for feature extraction, thereby avoiding an exponential increase in training parameters.
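As an illustration of this schedule, the following sketch (a hypothetical helper, not part of the published code) enumerates the feature-map shapes along the contracting path, assuming max pooling halves the resolution between consecutive down layers:

```python
def encoder_shapes(height, width, base_channels=64, depth=5):
    """Bookkeeping sketch of the contracting path: each down layer outputs a
    feature map, then channels double and spatial resolution halves before
    the next layer (an illustrative simplification of the described U-Net)."""
    shapes = []
    c, h, w = base_channels, height, width
    for _ in range(depth):
        shapes.append((c, h, w))
        c, h, w = c * 2, h // 2, w // 2
    return shapes
```

Under these assumptions, a 1024 × 1024 DSA frame yields feature maps from (64, 1024, 1024) at the first down layer to (1024, 64, 64) at the deepest one.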

Temporal learning
We extend our approach beyond spatial representation learning by incorporating temporal learning techniques to improve the accuracy of artery-vein segmentation in DSA. Specifically, we explore three state-of-the-art techniques: convolutional GRU (ConvGRU) (Ballas et al., 2015), convolutional LSTM (ConvLSTM) (Shi et al., 2015), and the temporal transformer (TT) based on Vaswani et al. (2017). The ConvGRU and ConvLSTM architectures are designed to address the vanishing gradient problem (Hochreiter, 1998) in long sequences. The ConvGRU employs update and reset gates, while the ConvLSTM uses forget, input, and output gates for a similar purpose. On the other hand, the transformer model leverages its attention mechanism to focus selectively on various segments of the input sequence, which could be particularly beneficial for understanding temporal relationships in the application of artery-vein segmentation in DSA. These techniques allow for the processing of input series of varying lengths and encoding them into fixed-sized representations, thereby enabling temporal aggregation of information. We apply this feature extraction and temporal aggregation process in all five layers, which operate on multi-scale image features.
Recurrent neural networks (RNNs), such as GRU and LSTM, have been widely used to learn sequential information flow using a gating mechanism. In this work, we tailor ConvGRU and ConvLSTM to distinguish vessels from subtraction artifacts. GRU and LSTM models are employed to process sequential frame data, capturing temporal dependencies. The high-level architecture is shown in Figure 2. The module takes as input the spatial features of size $B \times T \times C \times H \times W$ generated by the U-Net encoders for each frame, and produces aggregated feature maps of size $B \times C \times H \times W$ that align in dimensions with the corresponding decoder branch. $B$, $T$, $C$, $H$, and $W$ represent the batch size, frame length, channel size, image height, and image width respectively. In this work, the ConvGRU and ConvLSTM modules each consist of two convolutional RNN layers with a kernel size of 3 × 3 and with hidden dimensions equal to the input dimensions. Such an end-to-end trained module is designed to learn complex spatio-temporal features. It is worth noting that this module allows a variable number of input frames.
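To make the gated recurrence concrete, the following NumPy sketch aggregates a (T, C, H, W) feature sequence with GRU-style update and reset gates. It is a simplified stand-in for the paper's ConvGRU: the learned 3 × 3 convolutions are replaced by 1 × 1 channel-mixing matrices (the weight names `Wz`, `Uz`, etc. are illustrative), and batch handling is omitted:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_gru_aggregate(frames, Wz, Uz, Wr, Ur, Wh, Uh):
    """Fold a (T, C, H, W) feature sequence into one (C, H, W) map with a
    GRU recurrence. Each W*/U* is a (C, C) matrix applied as a 1x1
    convolution (channel mixing) -- a simplification of 3x3 ConvGRU kernels."""
    T, C, H, W = frames.shape
    h = np.zeros((C, H, W))
    for t in range(T):
        x = frames[t]
        # einsum 'oc,chw->ohw' applies a 1x1 convolution over channels
        z = sigmoid(np.einsum('oc,chw->ohw', Wz, x) + np.einsum('oc,chw->ohw', Uz, h))
        r = sigmoid(np.einsum('oc,chw->ohw', Wr, x) + np.einsum('oc,chw->ohw', Ur, h))
        h_cand = np.tanh(np.einsum('oc,chw->ohw', Wh, x) + np.einsum('oc,chw->ohw', Uh, r * h))
        h = (1 - z) * h + z * h_cand  # gated update of the hidden state
    return h
```

Because the hidden state has a fixed shape, any number of frames collapses to a single (C, H, W) map, which matches the variable-length property noted above.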
Besides RNNs, the Transformer (Vaswani et al., 2017) has been shown to be effective in numerous deep learning tasks, including image segmentation. We developed a temporal transformer module, as sketched in Figure 2, to learn aggregated temporal features through temporal attention. The temporal transformer module, with its temporal attention capability, selectively emphasizes pertinent frames in a DSA series to enhance the differentiation between arteries, veins, and subtraction artifacts based on temporal flow dynamics in DSA. The module takes as input the U-Net-encoded feature maps of all frames, and aggregates temporal features along the time dimension with a kernel of $1 \times 1 \times T$. The resulting feature map maintains the same size as a single-frame feature map, and is then concatenated with the decoding branch to construct the high-resolution segmentation map. In this work, the temporal transformer module consists of one multi-head attention layer with a kernel size of 1 × 1 and with hidden dimensions equal to the input dimensions.
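A minimal single-query version of such temporal attention can be sketched in NumPy as follows; the learned query `q` and key projection `Wk` are illustrative simplifications of the multi-head layer described above:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_pool(frames, q, Wk):
    """Collapse a (T, C, H, W) feature sequence along time with one attention
    query per pixel. q is a learned (C,) query and Wk a (C, C) key projection;
    this is a didactic single-head simplification, not the paper's exact layer."""
    T, C, H, W = frames.shape
    keys = np.einsum('oc,tchw->tohw', Wk, frames)            # per-frame keys
    scores = np.einsum('c,tchw->thw', q, keys) / np.sqrt(C)  # (T, H, W) logits
    weights = softmax(scores, axis=0)                        # attention over time
    return np.einsum('thw,tchw->chw', weights, frames)       # weighted temporal sum
```

Each pixel thus receives a convex combination of its feature vectors across frames, so frames where contrast is present can dominate the aggregated map.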

Loss function
As suggested in Isensee et al. (2021), we define the loss function for vessel segmentation $\mathcal{L}_{vessel}(p, y)$ as a combination of a cross-entropy loss and a Dice loss (Kervadec and de Bruijne, 2023):

$$\mathcal{L}_{vessel}(p, y) = \mathcal{L}_{CE}(p, y) + \mathcal{L}_{Dice}(p, y),$$

where $p$ and $y$ denote the predicted class probability and the reference binary label respectively. For artery-vein segmentation, we similarly define the loss function $\mathcal{L}_{av}(p, y)$ as

$$\mathcal{L}_{av}(p, y) = \mathcal{L}_{CE}(p, y) + \mathcal{L}_{Dice}(p_{a}, y_{a}) + \mathcal{L}_{Dice}(p_{v}, y_{v}).$$

The subscripts $a$ and $v$ denote artery and vein respectively. More specifically, the cross-entropy loss $\mathcal{L}_{CE}(p, y)$, defined as

$$\mathcal{L}_{CE}(p, y) = -\sum_{i} \left[ y_{i} \log p_{i} + (1 - y_{i}) \log(1 - p_{i}) \right],$$

measures the difference between the reference labels and predicted class probabilities. The Dice losses $\mathcal{L}_{Dice}(p_{a}, y_{a})$ and $\mathcal{L}_{Dice}(p_{v}, y_{v})$ are defined as

$$\mathcal{L}_{Dice}(p, y) = 1 - \frac{2 \sum_{i \in \Omega} p_{i} y_{i}}{\sum_{i \in \Omega} p_{i} + \sum_{i \in \Omega} y_{i}},$$

where $\Omega$ denotes the subset of the image space where $y$ is positive.
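A direct NumPy translation of this combined loss might look as follows; the equal weighting of the cross-entropy and Dice terms, the averaging of the two cross-entropy channels, and the unrestricted Dice sums are our assumptions rather than the paper's exact configuration:

```python
import numpy as np

def cross_entropy(p, y, eps=1e-7):
    """Pixel-wise binary cross-entropy between probabilities p and labels y."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def dice_loss(p, y, eps=1e-7):
    """Soft Dice loss: 1 minus the soft Dice overlap (summed over all pixels
    here, a simplification of the restricted-domain formulation)."""
    return 1.0 - 2.0 * (p * y).sum() / (p.sum() + y.sum() + eps)

def av_loss(p_a, y_a, p_v, y_v):
    """Artery-vein loss: cross-entropy over both channels (averaged, an
    assumed weighting) plus one Dice term per class."""
    ce = 0.5 * (cross_entropy(p_a, y_a) + cross_entropy(p_v, y_v))
    return ce + dice_loss(p_a, y_a) + dice_loss(p_v, y_v)
```

A perfect prediction drives all three terms toward zero, while the Dice terms keep the loss informative when vessel pixels are heavily outnumbered by background.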

Data
The study uses data from the MR CLEAN Registry (Jansen et al., 2018), an observational cohort study that included patients with acute ischemic stroke from sixteen centers in the Netherlands between March 2014 and December 2018. From this multi-center clinical registry, we manually segmented arteries and veins on 97 DSA series from different patients with either anterior-posterior (AP) or lateral views. The DSA series were acquired using various imaging systems (e.g., Philips, GE, and Siemens). The size of the individual frames of the acquired DSA series is 1024 × 1024 pixels. The series have different lengths, ranging from 10 to 50 frames, and varying temporal resolutions ranging from 0.5 to 4 frames per second.
The artery-vein annotations were created using an in-house developed tool in MeVisLab (Heckel et al., 2009) by four trained clinical students. To reduce inter-observer variability, the annotations were further refined by another trained student. An experienced radiologist was available for consultation during annotation. As visualized in Figure 1, the annotation results in two segmentation images, with arterial annotation in red and venous annotation in blue. Overlapping pixels carry both labels, as the two annotations are independent. These annotations serve as the reference standard in this work.
We randomly split the dataset into training, validation, and testing sets with a ratio of 50%-20%-30% at the patient level. This resulted in 52 DSA series for training, 19 for validation, and 26 for testing. The models were trained using RMSprop optimization (Hinton et al., 2012) and a ReduceLROnPlateau (Paszke et al., 2019) scheduler with a patience of 10 epochs, a decay factor of 0.5, and an initial learning rate of $1 \times 10^{-5}$. An early stopping strategy was applied with a patience of 50 epochs and a maximum of 1000 epochs.

Baselines
To benchmark the performance of CAVE and comprehensively assess the added value of simultaneous spatio-temporal learning in distinguishing vessels from subtraction artifacts, we implemented three representative solutions and evaluated CAVE against them.
Frangi+$k$-means. We implemented a conventional two-step artery-vein classification method (Van Asperen et al., 2022) that combines the Frangi filter and $k$-means clustering. First, the Frangi filter is applied to the static minimum intensity projection (MinIP) of an input DSA, followed by fixed thresholding to obtain a binary vessel mask. Subsequently, all the vessel pixels within the mask are classified into arteries or veins. This clustering is achieved through $k$-means clustering with $k = 2$ using a set of intensity-based features derived from the time-intensity curve (TIC) of each pixel. These features include the area under the TIC curve (AUC), peak intensity, time to peak (TTP), arrival time, peak width, variance, standard deviation, and maximum uptake slope of the TIC. To ensure uniformity in the clustering process and mitigate the influence of varying magnitudes, all features are normalized to the range [0, 1]. Finally, the cluster with the lowest average TTP is identified as arterial, while the one with the highest average TTP is designated as venous.
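The clustering stage can be illustrated with a small NumPy sketch. It computes a subset of the listed TIC features (AUC, peak intensity, TTP), normalizes them to [0, 1], runs a minimal k-means with k = 2, and labels the lower-TTP cluster as arterial. All function names are illustrative, and the deterministic TTP-based seeding is our simplification:

```python
import numpy as np

def tic_features(tics, fps=1.0):
    """Three of the per-pixel TIC features named above (AUC, peak intensity,
    time to peak), min-max normalized to [0, 1]. tics has shape (pixels, T)."""
    auc = tics.sum(axis=1) / fps
    peak = tics.max(axis=1)
    ttp = tics.argmax(axis=1) / fps
    f = np.stack([auc, peak, ttp], axis=1).astype(float)
    lo, hi = f.min(axis=0), f.max(axis=0)
    return (f - lo) / (hi - lo + 1e-9)

def kmeans2(features, iters=20):
    """Minimal k-means with k = 2, seeded at the extremes of the TTP column
    (an assumed deterministic initialization, not the original's)."""
    centers = features[[features[:, -1].argmin(), features[:, -1].argmax()]]
    for _ in range(iters):
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(2):
            if (labels == k).any():
                centers[k] = features[labels == k].mean(axis=0)
    return labels

def classify_artery_vein(tics):
    """Cluster pixels on TIC features; the cluster with the lower mean
    time-to-peak is labeled arterial, the other venous."""
    labels = kmeans2(tic_features(tics))
    ttp = tics.argmax(axis=1)
    artery = 0 if ttp[labels == 0].mean() < ttp[labels == 1].mean() else 1
    return np.where(labels == artery, "artery", "vein")
```

On synthetic early-peaking versus late-peaking curves, the TTP feature dominates the clustering, which is exactly the temporal cue the baseline relies on.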
U-Net semantic segmentation. Apart from classical machine learning, a standard U-Net can be directly utilized to identify arteries and veins by leveraging spatial vascular features. We implemented a deep learning baseline using U-Net (Ronneberger et al., 2015; Zhang et al., 2020). The architecture is similar to that in Figure 2, with the contracting path replaced by stand-alone down blocks. To have all vessels present in the spatial dimensions, this U-Net approach uses the MinIP image of all frames of a DSA series as input. Consequently, U-Net relies purely on spatial vasculature features to differentiate between arteries and veins, without considering the temporal contrast flow features.

U-Net+$k$-means.
To utilize both spatial and temporal features for artery-vein segmentation, we further developed a two-stage baseline approach that cascades U-Net and $k$-means clustering, sequentially leveraging spatial and temporal information. The U-Net encodes a MinIP image and produces a binary vessel segmentation. The vessel pixels are subsequently clustered into arteries and veins via $k$-means clustering on the time-intensity curves.

Evaluation metrics
We assess these methods on a hold-out test set composed of 26 DSA sequences from various patients from two perspectives: vessel segmentation and artery-vein segmentation (Table 1). We report the Dice coefficient, accuracy, sensitivity, and specificity for vessel segmentation, defined as

$$\text{Dice} = \frac{2TP}{2TP + FP + FN}, \quad \text{Acc} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \text{Sens} = \frac{TP}{TP + FN}, \quad \text{Spec} = \frac{TN}{TN + FP},$$

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives respectively.
For artery-vein segmentation, we additionally evaluate the artery Dice (A-Dice), vein Dice (V-Dice), and multi-class Dice (M-Dice). We define the artery Dice and vein Dice as

$$\text{A-Dice} = \frac{2TP_{a}}{2TP_{a} + FP_{a} + FN_{a}} \quad (9)$$

and

$$\text{V-Dice} = \frac{2TP_{v}}{2TP_{v} + FP_{v} + FN_{v}}, \quad (10)$$

where the subscripts $a$ and $v$ denote artery and vein respectively. TP, FP, and FN are true positives, false positives, and false negatives respectively. We define the multi-class Dice (M-Dice) as the mean of the two:

$$\text{M-Dice} = \frac{\text{A-Dice} + \text{V-Dice}}{2}. \quad (11)$$

Besides, we compute statistical significance using the paired Wilcoxon test on the Dice coefficients.
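For reference, the per-class Dice computations can be expressed compactly as follows; `multiclass_dice` assumes M-Dice is the unweighted mean of A-Dice and V-Dice:

```python
import numpy as np

def dice(pred, ref):
    """Dice = 2*TP / (2*TP + FP + FN) for boolean masks."""
    tp = np.logical_and(pred, ref).sum()
    fp = np.logical_and(pred, ~ref).sum()
    fn = np.logical_and(~pred, ref).sum()
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0  # empty masks count as perfect

def multiclass_dice(pred_a, ref_a, pred_v, ref_v):
    """M-Dice as the mean of artery and vein Dice (assumed definition)."""
    return 0.5 * (dice(pred_a, ref_a) + dice(pred_v, ref_v))
```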

Quantitative analysis
In Table 1, we present the results of CAVE and other representative solutions on the test set in terms of vessel segmentation and artery-vein segmentation. Regarding vessel segmentation, both U-Net and CAVE significantly outperform the Frangi+$k$-means approach in terms of Dice coefficient with a margin of 10% (P<0.001). Notably, CAVE substantially surpasses U-Net by 3.6% (P=0.023) and the Frangi+$k$-means approach by 14% (P<0.001), demonstrating the effectiveness of incorporating a simultaneous spatio-temporal vascular and flow representation for separating vessels from subtraction artifacts or other static instruments.

Table 1: Performance of CAVE and other existing methods in vessel and artery-vein segmentation on the test set. Acc: accuracy, Sens: sensitivity, Spec: specificity, A-Dice: Artery Dice (Eq. 9), V-Dice: Vein Dice (Eq. 10), M-Dice: Multi-class Dice (Eq. 11). The accuracy, sensitivity, and specificity of artery-vein segmentation are over both artery and vein classes.

Table 2: Performance of CAVE with respect to various temporal resolutions in frames per second (fps).

fps    | 1          | 0.5        | 0.25
M-Dice | 0.79±0.056 | 0.76±0.060 | 0.64±0.12
With respect to artery-vein segmentation, the advantage of spatio-temporal learning is even more prominent, as shown in Table 1. While Frangi+$k$-means, by relying solely on pixel-wise temporal characteristics, achieves a multi-class Dice coefficient of 0.60 (±0.084), spatial feature-based deep learning (U-Net) obtains a higher Dice coefficient of 0.67 (±0.060). The U-Net+$k$-means approach, which leverages spatial and temporal features in two sequential stages, achieves a Dice coefficient of 0.69 (±0.050). In contrast, by integrating both vasculature appearance and flow dynamics in an end-to-end spatio-temporal deep learning model, CAVE yields a significantly higher multi-class Dice coefficient of 0.79 (±0.057) compared to U-Net (P<0.001) and Frangi+$k$-means (P<0.001). Overall, CAVE demonstrates significant improvements in artery-vein segmentation by incorporating the temporal contrast flow aspects of DSA in end-to-end learning. We observe no statistically significant differences among ConvGRU, ConvLSTM, and the temporal transformer.
In terms of computational efficiency, CAVE takes on average 0.57 (±0.2) seconds to segment a complete DSA series on an NVIDIA 2080 Ti GPU with 11 GB of memory.

Impact of temporal resolution
We investigate the influence of temporal characteristics, specifically the number of frames of a DSA series and the temporal frequency, on artery-vein segmentation performance. As shown in Figure 5, we do not find a notable correlation between the average Dice coefficient and the frame count of DSA series in the test set, evidenced by a Pearson correlation coefficient of 0.05 (P=0.81). This could be attributed to the fact that all DSA series in our dataset have the same temporal frequency and are complete. In contrast, we observe a notable reduction in segmentation performance when the DSA series are temporally downsampled for testing, as shown in Table 2. This suggests that the proposed CAVE method effectively utilizes temporal flow characteristics to enhance segmentation performance.

Qualitative analysis
To provide a comprehensive comparison between the proposed CAVE and U-Net, we present representative visualizations and qualitative evaluations. CAVE is particularly effective in distinguishing cerebral vessels from subtraction artifacts and static instruments, thanks to its ability to learn temporal contrast flow dynamics. In Figure 3, we provide visual comparisons of vessel segmentation results using U-Net and CAVE. The error maps, shown in columns c and e, use orange to indicate false positives and light blue to indicate false negatives. We identify two scenarios. First, in regions #1 and #3, where U-Net mistakenly recognizes subtraction artifacts and static instruments as vessels (column c), CAVE correctly avoids such misclassifications (column e). Furthermore, CAVE successfully identifies venous vessels based on their temporal characteristics, as shown in regions #2 and #4, even when they are surrounded by subtraction artifacts, whereas U-Net fails to detect them in the presence of background noise.
Figure 4 presents two examples illustrating the performance of U-Net and CAVE in artery-vein segmentation. Compared to the manual annotations (column a), CAVE (column e) shows fewer errors than U-Net (column c). While U-Net performs well in recognizing large vessels such as proximal arteries and the superior sagittal sinus (SSS), it struggles to correctly classify distal arteries and veins, as highlighted in regions #1 and #2. In contrast, CAVE achieves accurate classification of both small and large vessels, demonstrating its effectiveness in capturing spatial and temporal features. Additional visual comparisons of these methods are available in the appendix.

Discussion
In this work, we proposed a fully automated deep learning-based method for artery-vein segmentation in cerebral DSA images. Through quantitative and qualitative comparisons, we have demonstrated the added value of simultaneous spatio-temporal learning (CAVE) over spatial learning (U-Net), temporal learning ($k$-means), and sequential two-stage spatial and temporal learning (U-Net+$k$-means).
This application-tailored solution innovatively leverages the distinctive temporal contrast flow characteristics in DSA to precisely identify cerebral arteries and veins while alleviating misclassifications of subtraction artifacts and other surgical instruments. The primary focus of the analyses revolves around evaluating the inherent benefits of harnessing temporal information. For quantitative assessments, we employ the foundational U-Net as our benchmark. Given the rapidly evolving landscape of deep network architectures, U-Net may not be positioned among the most recent models in the current literature. Recent state-of-the-art networks, such as UNet++ (Zhou et al., 2018), PointRend (Kirillov et al., 2020), nnU-Net (Isensee et al., 2021), TransUNet (Chen et al., 2021b), or CTransNet (Wang et al., 2022), may outperform U-Net when leveraging purely spatial features from MinIP images. Nevertheless, adapting the baseline spatial learning module and capitalizing on the abundant temporal flow dynamics are orthogonal and mutually reinforcing efforts to improve performance.

CAVE is designed to take complete DSA series with variable lengths and output an overall 2D artery-vein segmentation map for an input DSA series. The explored TLM modules are flexible in terms of input dimensions. These modules are capable of handling DSA series with variable lengths, while alternative methods such as 3D U-Net, direct feature concatenation, or temporal convolutional networks (TCN) (Lea et al., 2016) typically cannot. Utilizing models that require a fixed frame length necessitates temporal sliding windows or temporal padding/cropping. Such models would also require temporally varying annotations for training.
In comparison with the hold-out strategy, K-fold cross-validation may provide a more robust assessment of the methods described in this manuscript. We tested the proposed CAVE (with ConvGRU) method on five random data splits and obtained M-Dice coefficients of 0.79 (±0.057), 0.79 (±0.052), 0.78 (±0.043), 0.78 (±0.049), and 0.78 (±0.057). No significant performance differences (P=0.4, Kruskal-Wallis test) were observed among these splits. This indicates that the hold-out method allows for reliable performance estimates. Considering this observation, alongside the substantial computational demands of cross-validation, we opted for the hold-out strategy in subsequent performance evaluations.
In this study, we observe that, despite the differences in network design, the TLM modules (i.e., ConvGRU, ConvLSTM, and temporal transformer) perform comparably in the application of artery-vein segmentation on our dataset. This could be attributed to their shared capability to capture temporal dependencies. While the temporal transformer is often considered superior (Vaswani et al., 2017; Devlin et al., 2018; Dai et al., 2019), its effectiveness may not consistently extend to all scenarios. In our setting, what matters is the temporal information itself: all three modules are able to extract the relevant temporal patterns, and the exact network design and configuration do not make a notable difference. This aligns with a broader understanding that the practical impact of architectural advantages can vary greatly depending on the specifics of the task and data.
We note that the segmentation Dice scores of veins are on average lower than those of arteries across all methods. This observation can be attributed to several factors: 1) venous vessels typically exhibit relatively lighter intensities and more ambiguous boundaries, making them challenging to predict accurately; 2) the quality of vein annotations may also be comparatively lower due to increased annotation difficulty; 3) there tends to be a scarcity of venous-phase frames, especially in cases of incomplete DSA series where image acquisition is prematurely stopped by the operator.
Building upon this work, future research may investigate the influence of patient motion on these spatio-temporal techniques for artery-vein segmentation. From the clinical perspective, there is significant value in conducting thorough validations to assess the clinical generalizability of CAVE.

Conclusion
We have presented CAVE, a deep learning-based cerebral vessel and artery-vein segmentation method for digital subtraction angiography. It integrates both spatial vascular appearance and temporal contrast flow dynamics in a unified end-to-end framework, producing high-quality multi-class segmentations from cerebral DSA series with variable lengths. Experimental results on a multi-center clinical dataset demonstrate that CAVE significantly outperforms existing methods. Qualitative analyses further underpin the advantages of CAVE in distinguishing vessels from subtraction artifacts. CAVE has the potential to facilitate vessel-based quantitative analyses for clinical diagnosis, prognosis, and treatment planning in endovascular interventions.

Figure 1 :
Figure 1: Task illustration of artery-vein segmentation in DSA.

Figure 2 :
Figure 2: Network architecture for artery-vein segmentation in DSA. Input is a 2D+time DSA series with variable series length. Output is a two-channel segmentation image with artery and vein represented by red and blue colors respectively. The temporal learning module (TLM) could be GRU, LSTM, or temporal transformer (TT).

Figure 3 :
Figure 3: Example visualizations of cerebral vessel segmentation results of U-Net and the proposed CAVE. Column a: manual annotation of vessels overlaid on the MinIP image; column b: segmentation output of U-Net; column c: U-Net error map with false positives (orange) and false negatives (light blue); column d: segmentation output of CAVE; column e: CAVE error map with false positives (orange) and false negatives (light blue).

Figure 4 :
Figure 4: Two example visualizations (in rows) of artery-vein segmentation. Column a: manual annotation of arteries (red) and veins (blue) overlaid on the MinIP image; column b: segmentation output of U-Net; column c: U-Net error map with false positives (orange) and false negatives (light blue); column d: segmentation output of CAVE; column e: CAVE error map with false positives (orange) and false negatives (light blue).

Figure 5 :
Figure 5: Association between the segmentation Dice and the number of frames of DSA series in the test set.