MAFE-Net: retinal vessel segmentation based on a multiple attention-guided fusion mechanism and ensemble learning network

The precise and automatic recognition of retinal vessels is of utmost importance in the prevention, diagnosis and assessment of certain eye diseases, yet this challenging detection mission carries nontrivial uncertainty due to intricate factors such as uneven and indistinct curvilinear shapes, unpredictable pathological deformations, and non-uniform contrast. Therefore, we propose a unique and practical approach based on a multiple attention-guided fusion mechanism and ensemble learning network (MAFE-Net) for retinal vessel segmentation. In conventional UNet-based models, long-distance dependencies are not explicitly modeled, which may cause partial scene information loss. To compensate for this deficiency, various blood vessel features are extracted from retinal images by an attention-guided fusion module. In the skip connection part, a spatial attention module is applied to remove redundant and irrelevant information; this structure helps to better integrate low-level and high-level features. The final step involves a DropOut layer that randomly removes some neurons to prevent overfitting and improve generalization. Moreover, an ensemble learning framework is designed to detect retinal vessels by combining different deep learning models. To demonstrate the effectiveness of the proposed model, experiments were performed on the public datasets STARE, DRIVE, and CHASEDB1, achieving F1 scores of 0.842, 0.825, and 0.814, and Accuracy values of 0.975, 0.969, and 0.975, respectively. Compared with eight state-of-the-art models, the designed model produces satisfactory results both visually and quantitatively.


Introduction
Segmentation of blood vessels in retinal images plays a unique and crucial role in the initial prevention, diagnosis, and evaluation of ocular diseases, as such diseases typically cause changes in retinal vascular morphology [1,2]. Although retinal images are commonly used for the diagnosis and treatment of certain illnesses, manual segmentation of blood vessels is an extraordinarily challenging task due to low contrast, complex curvilinear structures, and irregular illumination [3,4], as shown in Fig. 1. In particular, this subjectivity may lead to inconsistent results among different doctors during the segmentation stage, which may hinder clinical diagnosis [5]. As a result, various deep learning frameworks can be used to automatically and non-invasively capture micro-vessels and furnish rich vascular features from retinal images, effectively assisting clinicians in diagnosing and treating various eye diseases [6,7]. However, no unifying model has yet emerged for retinal vessel segmentation.
In order to assist clinicians in diagnosing and treating ocular diseases, many distinct algorithms have been presented to acquire a wide variety of features for retinal vessel segmentation. These segmentation approaches can be divided into three categories. The first category manually designs the feature extraction layers to highlight the curvilinear structures, such as the Hessian matrix [8], second-order image derivatives [9], stick-based filters [10,11], and dynamic evolution models [12]. The second category uses deep learning methods to segment retinal vessels, such as the UNet network [13], CENet network [14], NAUNet network [15], ConvUNext network [16], CS2Net network [17], SA-UNet network [18], and GDF-Net network [19]. The third category combines deep learning and manually designed feature layers to improve the segmentation accuracy of retinal images, such as D-GaussianNet [20], LIOT [21], and the combination of an ODoS filter and an IterNet network [22]. Although these methods have achieved good results, it is still difficult to detect weak retinal vessels [1].
The traditional strategy is to extract features based on unique shapes and structures for retinal vessel segmentation. Based on this theory, Lesage et al. [8] proposed the Hessian matrix to precisely segment curvilinear structures, but it may lead to incomplete and broken blood vessels. Using a different approach, Li et al. [23] presented a hardware-oriented approach to enhance fundus vascular images, but its segmentation efficiency was poor for thin blood vessels. In general, manual methods of designing feature layers can only extract a few features from retinal images and cannot accurately detect blood vessels.
In order to make up for the inability of traditional methods to accurately extract fundus information, many deep learning models have been designed to segment blood vessels. The milestone segmentation network FCN [24] may lead to serious information loss and was quickly superseded by researchers. Its improved version, UNet [13], introduces a skip connection operation to reduce information loss, but the information fusion in the skip connection part has its limitations: the consecutive pooling operations or convolution striding reduce the feature resolution, making it difficult to learn increasingly abstract feature representations. To remedy this shortcoming, Gu et al. [14] designed CENet, which used dilated convolution to extract features of the target objects and introduced DAC and ASPP to reduce information loss. However, dilated convolution can produce gridding effects, making it difficult to segment small blood vessels. Similarly, Han et al. [16] presented a unique ConvUNext architecture, which has significant advantages in expanding receptive fields and removing irrelevant information from retinal images. Unfortunately, these methods [14,16] have very high computational power requirements and cannot meet the demand for fast segmentation. To reduce time consumption, Yang et al. [15] designed a lightweight NAUNet; by adding a channel attention mechanism to the lightweight UNet model, it can segment blood vessels quickly. Using the same strategy, Guo et al. [18] proposed the SA-UNet architecture to filter out irrelevant information, but it could not accurately segment weak blood vessels. Recently, inspired by the self-attention mechanism [25,26], Mou et al. [17] introduced CS2-Net to capture global information from fundus images. However, these deep learning algorithms [14-18] mainly pursued powerful structures and frameworks while ignoring the complexity of the algorithms.
Recently, many deep learning models have been combined with manually designed feature layers to segment retinal vessels. For example, Alvarado-Carrillo et al. [20] applied a combination of Gaussian filters and adaptive parameters to segment retinal images. Although good results were achieved, this consumed too much memory and time. Using a different approach, Shi et al. [21] designed the LIOT scheme to detect blood vessels. However, it may cause information loss during the image transformation stage. Recently, Peng et al. [22] improved this model [21] by combining an ODoS filter with an IterNet network, achieving good results in retinal vessel detection. Nevertheless, these methods require parameter tuning for the most effective feature extraction; if the parameters are poorly chosen, the desired effect may not be achieved.
However, most retinal vessel segmentation approaches are improvements on the UNet model, which may result in the loss of some scene information. To cope with this challenge, the self-attention mechanism [25,26] has gradually been integrated into various deep learning frameworks for image segmentation. Unlike former frameworks, the essence of the self-attention mechanism is to extract global information and reduce the loss of image information. Unfortunately, the self-attention mechanism takes a long time to compute. Hence, we integrate the self-attention mechanism with convolutional neural networks to develop a simple, practical and efficient algorithm for retinal vessel detection. Furthermore, an ensemble learning model can effectively integrate the advantages of multiple deep learning models so as to segment retinal vessels more accurately. Therefore, an ensemble learning network is designed to improve retinal vessel segmentation.
In this study, we present a unique and practical framework based on a multiple attention-guided fusion mechanism and ensemble learning network (MAFE-Net) for blood vessel segmentation. Firstly, the attention fusion module is applied to reduce scene information loss. Secondly, a spatial attention module is employed to effectively combine low-level and high-level features while suppressing redundant and irrelevant information. Thirdly, a DropOut layer is employed to prevent overfitting of the presented framework. Finally, an ensemble learning framework is designed to improve retinal vessel segmentation. Our work can be summarized as follows:
1. A lightweight neural network comprising four different encoders and decoders is applied to improve segmentation performance. Furthermore, Batch Normalization (BN) and DropOut modules are introduced during the convolution operations, preventing the improved MAFE-Net from overfitting.
2. To address the limitations of convolution, a self-attention fusion module is employed to aggregate spatial and channel attention units in parallel. This facilitates the extraction of global information from retinal images, thereby compensating for the aforementioned defects.
3. The skip connection component incorporates a spatial attention module to enhance the model's ability to acquire pertinent features by filtering out extraneous information and assigning greater importance to relevant information.
4. To further improve the segmentation performance, an ensemble learning strategy is used to combine multiple deep learning models for retinal vessel segmentation.
5. An efficient curvilinear structure detection method is developed through the joint application of the ensemble learning framework, attention fusion module, and DropOut layer.

The authors of [30] proposed a global self-attention mechanism approach for retinal vessel segmentation. However, such models [17,30] exhibit a common drawback of high computational resource consumption.

Ensemble learning
Ensemble learning, as a fundamental approach, aims to leverage the strengths of multiple models in order to achieve desirable generalization performance [31]. The amalgamation of predictions from various individual models has proven to be an efficacious technique for improving model performance [32]. Consequently, ensemble learning has gained significant traction in the medical domain, finding extensive applications in the detection of lung and colon cancer [33], heart disease [34], COVID-19 [35], thyroid nodules [36], and retinal vessel segmentation [37]. In terms of retinal image segmentation, Fraz et al. [38] used an ensemble system to segment blood vessels, but the segmentation accuracy could not meet clinical requirements. To improve the segmentation accuracy, Wang et al. [39] adopted a winner-takes-all classifier to acquire the best classification performance. Taking a different approach, Du et al. [40] presented an ensemble strategy to fuse different deep learning models for retinal vessel segmentation and achieved satisfactory results. Although the ensemble strategy may increase the computational complexity, it can greatly improve the segmentation results.

Datasets
Three publicly available datasets were used to demonstrate the effectiveness of MAFE-Net: DRIVE [41], STARE [42], and CHASEDB1 [38]. Table 1 shows the specific details of these datasets.

Overview of the proposed method
In this work, a multiple attention-guided fusion network (MAF-Net) is presented for retinal vessel segmentation. As shown in Fig. 2(a), an attention fusion module (AFM) is introduced to extract global information, while a spatial attention module (SAM) is applied to remove redundant information. In order to better understand the designed model, all details are presented in Table 2. Furthermore, an ensemble learning network, MAFE-Net, is designed to integrate four different deep learning models (UNet [13], SA-UNet [18], CS2-Net [17] and MAF-Net) for retinal vessel segmentation, as shown in Fig. 2(b). The difference between MAF-Net and MAFE-Net is that MAFE-Net integrates the advantages of multiple models, which allows for better segmentation of fundus images.

Attention fusion module
The dual-attention mechanism, initially designed by Fu et al. [43], enhances contextual information to compensate for scene information lost during downsampling. Building upon the findings of previous studies [17,18,43], the MAF-Net incorporates a self-attention fusion module [17].
It is widely acknowledged that the utilization of local features obtained from neural networks may lead to classification errors, primarily due to their limited ability to model fundus images at a global level. In order to address this issue, a dual-attention mechanism is introduced. This mechanism aims to capture and incorporate the global information present in retinal images, as depicted in Fig. 3. By operating in parallel, the mechanism mitigates potential interference between the attention mechanisms and facilitates the comprehensive extraction of intricate details of retinal vessels. Here, the spatial attention module captures long-range correlations and extracts global information from fundus images, while the channel attention mechanism primarily focuses on weighting information, assigning higher weights to channels that contain valuable data and reducing the weights of channels with less pertinent information.
The issue of global feature extraction can be effectively addressed by employing the spatial attention mechanism [17,18,43], which improves the model's ability to learn the underlying global features. To improve the segmentation performance of retinal vessels along both the horizontal and vertical axes, the convolution operations are substituted with horizontal and vertical convolutions, respectively. In order to accurately capture the blood vessels in retinal images, we decompose the fundus image along the horizontal and vertical directions and use 1×3 and 3×1 convolutions to replace the traditional 3×3 convolution, so that more information about the edges of the blood vessels can be obtained. As depicted in Fig. 4, this mechanism takes input features F ∈ R C×H×W (where N = H×W) and linearly maps them to generate three matrices Qy, Kx, and V ∈ R C×H×W. The variables Qy and Kx pertain to the vertical and horizontal directional characteristics. Additionally, C indicates the number of feature channels, while H and W respectively represent the height and width dimensions. Consequently, the spatial attention map is obtained with a softmax function.

S(x, y) = exp(Kx · Qy) / Σ_{x=1..N} exp(Kx · Qy)

where S(x, y) denotes the relationship between the xth position and the yth position. The spatial attention map effectively captures the vascular structures across various spatial regions, with higher similarity leading to larger values in the feature map. Additionally, another feature V is acquired by employing a 1×1 convolutional layer and reshaping it accordingly. Consequently, the output of the spatial attention branch can be written as

Fs(y) = Σ_{x=1..N} S(x, y) V(x) + F(y)

where the residual term F preserves the original features. Using the spatial attention mechanism, blood vessel segmentation can be improved by extracting global features from retinal images.

The classification of retinal image channels into distinct categories can contribute to the identification of interconnected internal semantic characteristics, thereby extracting pertinent information from various channels to enhance semantic expression capability. The channel attention mechanism is illustrated in Fig. 5. It multiplies the input Fx ∈ R C×H×W with its transpose F_y^T ∈ R C×H×W to produce the channel attention feature map:

C(x, y) = exp(Fx · Fy^T) / Σ_{x=1..C} exp(Fx · Fy^T)

where C(x, y) denotes the relationship between the xth channel and the yth channel. Through the computation of the feature map, channels exhibiting high similarity are enhanced, while channels with low similarity are suppressed. Subsequently, the Softmax activation function is employed to distinguish between background and vascular structures in retinal images. Consequently, the output of the channel attention branch is

Fc(y) = Σ_{x=1..C} C(x, y) F(x) + F(y).

The incorporation of the channel attention mechanism enlarges the distinction among the various channels, thereby enhancing the overall efficacy of the model. Consequently, the dual-attention mechanism combines the two branches in parallel.
The utilization of the dual-attention mechanism exhibits commendable efficacy in enhancing feature representation, thereby facilitating the acquisition of more accurate segmentation outcomes.
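The parallel spatial-channel attention computation above can be sketched in NumPy. This is a minimal sketch under simplifying assumptions: the learned Q, K, and V projections (1×3, 3×1, and 1×1 convolutions in the actual module) are replaced by identity mappings, and any learnable fusion weights are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

C, H, W = 4, 8, 8
N = H * W
F = np.random.default_rng(0).normal(size=(C, H, W))

# --- spatial attention: relate every position x to every position y ---
Q = F.reshape(C, N)   # stand-in for the learned Q projection
K = F.reshape(C, N)   # stand-in for the learned K projection
V = F.reshape(C, N)   # stand-in for the learned V projection
S = softmax(Q.T @ K, axis=-1)                 # (N, N) position-affinity map
spatial_out = (V @ S.T).reshape(C, H, W) + F  # weighted sum plus residual

# --- channel attention: relate every channel x to every channel y ---
Fm = F.reshape(C, N)
Cmap = softmax(Fm @ Fm.T, axis=-1)            # (C, C) channel-affinity map
channel_out = (Cmap @ Fm).reshape(C, H, W) + F

# parallel fusion: the two branch outputs are combined by summation
fused = spatial_out + channel_out
```

Because both affinity maps are softmax-normalized, each output position (or channel) is a convex combination of all others, which is what allows the module to aggregate global context beyond a convolution's local receptive field.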

Spatial attention module
The UNet model addresses the information loss caused by downsampling by integrating image information from both the encoder and decoder sides. Nevertheless, this strategy blends low-level and high-level features, resulting in limited efficacy. In fact, this amalgamation may introduce superfluous and unrelated noise into high-level features, consequently leading to inadequate detection, particularly of blood vessels [44]. Inspired by previously published models [15,18,44], a spatial attention module [18] is incorporated into the skip connection component to eliminate superfluous information. The underlying approach is visually depicted in Fig. 6. Assuming input features F ∈ R H×W×C, two features Favgpool ∈ R H×W×1 and Fmaxpool ∈ R H×W×1 are generated by average pooling and max pooling. Following this, the spatial attention feature map is generated through convolution and Sigmoid operations, and the output features Fs ∈ R H×W×C are obtained by the element-wise product of the attention feature map and the input features. This operation improves the segmentation accuracy of retinal images. The mathematical representation is

Fs = σ( f7×7([Favgpool; Fmaxpool]) ) ⊗ F

where f7×7(·) stands for a convolution operation with kernel size 7, [·;·] denotes channel-wise concatenation, and σ(·) denotes the Sigmoid function.
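This pooling-convolution-Sigmoid pipeline can be sketched as follows. It is a simplified NumPy illustration rather than the actual implementation: the 7×7 kernel is randomly initialized here, whereas in the network it is learned.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def skip_spatial_attention(F, kernel):
    """F: (H, W, C) skip features; kernel: (7, 7, 2) conv weights."""
    avg_pool = F.mean(axis=-1, keepdims=True)   # (H, W, 1)
    max_pool = F.max(axis=-1, keepdims=True)    # (H, W, 1)
    pooled = np.concatenate([avg_pool, max_pool], axis=-1)  # (H, W, 2)

    # naive 7x7 'same'-padded convolution for clarity
    H, W, _ = F.shape
    pad = 3
    padded = np.pad(pooled, ((pad, pad), (pad, pad), (0, 0)))
    attn = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            attn[i, j] = np.sum(padded[i:i + 7, j:j + 7, :] * kernel)

    attn = sigmoid(attn)[..., None]   # (H, W, 1) attention map in (0, 1)
    return F * attn                   # reweight every channel per position

rng = np.random.default_rng(1)
F = rng.normal(size=(16, 16, 8))
kernel = rng.normal(size=(7, 7, 2)) * 0.1
out = skip_spatial_attention(F, kernel)
```

Because the attention map lies in (0, 1), the module can only attenuate features, which is how irrelevant encoder activations are suppressed before fusion with the decoder.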

DropOut module
Although a data augmentation operation is applied in the presented framework, overfitting may still occur when there is an inadequate number of retinal images for training [45]. To address this issue, the DropOut module is introduced to encourage the network to acquire more resilient and efficient features [19]. Unlike the conventional convolution module, the incorporation of DropOut and BN (Batch Normalization) into the convolution layers, as depicted in Fig. 7, accelerates the neural network's convergence while mitigating model overfitting. Specifically, the DropOut module randomly deactivates some neurons during training. These operations reduce the dependence of the neural network on specific features, enabling it to learn more robust and universal features and thereby preventing overfitting [46].
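A minimal NumPy sketch of the two regularization operations follows. Note two labeled assumptions: the BN affine parameters (gamma, beta) are omitted, and `p` below is the drop probability, while whether the paper's best DropOut parameter of 0.9 denotes the keep or drop probability depends on the framework convention.

```python
import numpy as np

rng = np.random.default_rng(42)

def batch_norm(x, eps=1e-5):
    """Normalize over the batch axis (gamma=1, beta=0 for illustration)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def dropout(x, p, training=True):
    """Inverted dropout: zero a fraction p of activations, rescale the rest."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

x = rng.normal(size=(8, 32))   # a batch of 8 feature vectors
h = batch_norm(x)
h = dropout(h, p=0.1)          # i.e. a keep probability of 0.9
```

The rescaling by 1/(1-p) keeps the expected activation magnitude unchanged, so the same forward pass can be used at inference time with dropout disabled.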

Ensemble learning
To further improve the segmentation performance, an ensemble learning network [33,47] named MAFE-Net is designed to combine multiple lightweight deep learning models (UNet, SA-UNet, CS2-Net and MAF-Net) for retinal vessel detection, as presented in Fig. 8. As the first architecture, we applied a lightweight UNet model to learn the contextual information from retinal vessel images and produce a predicted probability map; its skip connection part is used to reduce spatial information loss, as presented in Fig. 8(a).
The second implemented architecture is the lightweight SA-UNet model presented by Guo et al. [18]. As shown in Fig. 8(b), a spatial attention module in the skip connection part removes redundant information and assigns more weight to relevant information. Moreover, a DropOut layer is used to prevent the SA-UNet network from overfitting.
Different from the previously introduced UNet and SA-UNet models, Mou et al. introduced the CS2-Net model [17], which includes two types of attention modules to further integrate local features with global features for retinal vessel segmentation improvement. Unlike the original CS2-Net model, we adopted a lightweight CS2-Net for model integration, as shown in Fig. 8(c).
The MAF-Net model, illustrated in Fig. 8(d), represents the fourth implemented architecture.This model incorporates the attention fusion module to extract diverse blood vessel features.Additionally, a spatial attention module is integrated into the skip connection to eliminate redundant information, while the DropOut layer is utilized to randomly discard certain neurons.These modifications can promote the MAF-Net to learn more effective features, resulting in improved retinal vessel segmentation.
To construct the ensemble learning framework, convolutional layers are applied to integrate the predicted probabilities from the above deep learning models into the final segmentation result. Finally, a Sigmoid activation function is applied to acquire the segmentation results of retinal vessels; the details are demonstrated in Fig. 8(e).
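The fusion step can be sketched as a learned weighted combination of the four probability maps followed by a Sigmoid. The weights and bias below are hypothetical stand-ins for the trained 1×1 convolution, and the random maps stand in for real model outputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
H, W = 32, 32
# Probability maps from the four base models
# (UNet, SA-UNet, CS2-Net, MAF-Net), stacked along a new axis.
probs = rng.random(size=(4, H, W))

# A 1x1 convolution over the stacked maps is a learned weighted sum.
weights = np.array([0.2, 0.3, 0.2, 0.3])   # hypothetical learned weights
bias = -0.1                                 # hypothetical learned bias
logits = np.tensordot(weights, probs, axes=1) + bias

ensemble = sigmoid(logits)                  # final vessel probability map
mask = (ensemble >= 0.5).astype(np.uint8)   # thresholded binary segmentation
```

Unlike simple averaging, training the combination weights lets the ensemble rely more on whichever base model is most reliable for a given dataset.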

Implementation details
There are some publicly available retinal image datasets on the internet, but each dataset only includes a few dozen fundus images. To avoid overfitting of the neural network, we introduce a data augmentation strategy. Specifically, random rotation, added Gaussian noise, and color jittering are applied to the retinal fundus images to increase the number of pictures. In addition, all the algorithms use the same number of epochs, batch size, and learning rate: the number of epochs is 100 and the batch size is 8. The details are described in Table 3.
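The three augmentations can be sketched as follows. This is a simplified stand-in: rotation is restricted to 90-degree steps to avoid interpolation, and the noise and jitter parameter ranges are illustrative assumptions, not the values used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(7)

def augment(img):
    """Simplified stand-ins for the three augmentations described above."""
    # Random rotation (90-degree steps only, so no interpolation is needed)
    img = np.rot90(img, k=rng.integers(0, 4), axes=(0, 1))
    # Additive Gaussian noise
    img = img + rng.normal(0.0, 0.02, size=img.shape)
    # Color jitter: random brightness shift and contrast scaling
    brightness = rng.uniform(-0.1, 0.1)
    contrast = rng.uniform(0.8, 1.2)
    img = (img - img.mean()) * contrast + img.mean() + brightness
    return np.clip(img, 0.0, 1.0)

img = rng.random(size=(64, 64, 3))   # a dummy RGB fundus patch in [0, 1]
aug = augment(img)
```

Applying such transforms on the fly gives each epoch a slightly different view of the same few dozen images, which is what makes the small datasets usable for training.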

Evaluation metrics
To illustrate the segmentation efficacy of the designed MAF-Net and MAFE-Net, conventional evaluation metrics such as accuracy (Acc), F1-score, sensitivity (SE), and specificity (SP) are employed to compare these methods with numerous state-of-the-art approaches. The metrics are defined as

Acc = (TP + TN) / (TP + TN + FP + FN)
SE = TP / (TP + FN)
SP = TN / (TN + FP)
F1 = 2TP / (2TP + FP + FN)
Among these metrics, TP denotes the count of pixels correctly identified as vessels, FP the count of background pixels erroneously identified as vessels, TN the count of pixels correctly identified as background, and FN the count of vessel pixels incorrectly identified as background.
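These definitions translate directly into code; a minimal NumPy implementation on a toy ground-truth/prediction pair:

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Pixel-wise metrics for a binary vessel mask (1 = vessel)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)     # vessel pixels predicted as vessel
    fp = np.sum(pred & ~gt)    # background pixels predicted as vessel
    tn = np.sum(~pred & ~gt)   # background pixels predicted as background
    fn = np.sum(~pred & gt)    # vessel pixels predicted as background
    acc = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn)        # sensitivity (recall)
    sp = tn / (tn + fp)        # specificity
    f1 = 2 * tp / (2 * tp + fp + fn)
    return acc, se, sp, f1

gt = np.array([[1, 1, 0, 0], [0, 1, 0, 0]])
pred = np.array([[1, 0, 0, 0], [0, 1, 1, 0]])
acc, se, sp, f1 = segmentation_metrics(pred, gt)   # acc = 6/8 = 0.75
```

Because vessel pixels are a small minority of each fundus image, Acc alone is optimistic; F1 and SE are the more informative measures of thin-vessel detection.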

Visual inspection
To illustrate the effectiveness of the presented MAF-Net and MAFE-Net in weak object segmentation, our models were validated on three datasets. As described, Fig. 9(a) and (b) show the original retinal images and their corresponding labels. Figure 9(c-l) presents the retinal vessel segmentation results of the UNet [13], ResUNet, Attention-UNet, R2UNet, UNet++ [48], SA-UNet [18], CS2Net [17], LIOT [21], MAF-Net, and MAFE-Net models, respectively, in which the areas marked with red boxes contain partially weak objects. Compared with eight state-of-the-art models, the presented framework exhibits impressive performance in weak vessel segmentation; in other words, the presented model achieves the maximum SE value.

Ablation studies
Ablation studies were performed to evaluate the effectiveness of the various modules in the MAF-Net. Specifically, we compared the DropOut layer with the spatial attention module (SAM), the BN layer (DB), and the attention fusion module (AFM). The results, presented in Table 5, confirm the significance of each module in enhancing the algorithm's performance on the STARE and DRIVE datasets. Notably, the removal of any module resulted in a decrease in both F1 and Acc values for the entire model. This observation supports our claim that each module contributes to improved accuracy in fundus vascular image segmentation, albeit with a marginal increase in computational requirements.

The inclusion of the DropOut layer within the MAF-Net framework serves to mitigate overfitting. However, the parameters associated with the DropOut layer can also influence the performance of neural networks. To demonstrate this influence, we performed experiments on the STARE, DRIVE and CHASEDB1 datasets, utilizing diverse parameter values for validation. The results, depicted in Table 6, show that setting the DropOut parameter to 0.5, 0.7, and 0.9 yielded varying experimental outcomes. Notably, when the DropOut parameter was set to 0.9, both F1 and Acc achieved their maximum values.

Effect of the model parameters on the presented framework
To prove that our proposed model is lightweight, we have conducted extensive comparisons with other methods in terms of model parameters. As shown in Table 7, the model parameters of the presented MAF-Net and MAFE-Net are 0.51M and 5.88M, respectively. To further demonstrate the effectiveness of the algorithm, we combined the three datasets to form a new dataset, in which 50 images were used for training and 38 images for testing. As shown in Table 8, the presented method achieves higher F1 and Acc values than many of the compared methods.

Discussion
This study introduces a novel approach for segmenting retinal images by employing multiple attention mechanisms and ensemble learning networks. The approach has proven effective even in scenarios where vessels are thin, weak, and inhomogeneous. The framework offers several advantages and distinct characteristics. Firstly, it enhances the UNet model by incorporating the DropOut module and BN (Batch Normalization) module within the convolutional neural network. These modifications not only improve the performance of the MAFE-Net but also maintain segmentation accuracy. Secondly, in order to address the constraints posed by convolution and attain comprehensive modeling of retinal images, this study introduces a dual-attention mechanism to extract global information. This mechanism enables effective interaction between local and global information, thereby enhancing the simultaneous extraction of information from retinal images and ultimately improving the accuracy of segmenting minute vessels. Thirdly, a spatial attention module is applied to eliminate extraneous information from the encoder components, thereby ensuring the effective fusion of decoder images and enhancing the accuracy of retinal image segmentation. Additionally, an ensemble learning framework is devised to enhance the blood vessel segmentation performance by integrating multiple distinct deep learning models. The precise segmentation of blood vessels in fundus images can significantly assist medical professionals in the diagnosis and treatment of various retinal diseases.
The presented framework was validated on three publicly available datasets. Compared with eight state-of-the-art models, visual examination and quantitative assessment demonstrate the exceptional performance of the presented framework in accurately segmenting thin, weak, and inhomogeneous blood vessels. The main reasons are as follows: (1) The presented MAF-Net and MAFE-Net incorporate a dual attention mechanism that effectively extracts retinal vessel information from both the spatial and channel domains. (2) The spatial attention module implemented in this framework eliminates irrelevant information originating from the encoder side, thereby preventing the transmission of irrelevant information to the decoder side.
(3) The reduction in the number of channels within the algorithm enhances its running speed without compromising the segmentation accuracy of retinal images.
However, the proposed method may result in fragmented and incomplete segmentation of small blood vessels due to the uneven contrast and substantial variations in lighting observed in retinal images. This can be observed in Fig. 10, where a comparison with the ground truths reveals that the proposed method may lead to an incomplete representation of certain small blood vessels. Furthermore, the emphasis of the MAFE-Net lies in the development of deep architectures, neglecting the capture of shape features pertaining to retinal vessels. Despite these limitations, the presented deep learning framework demonstrates efficacy in detecting delicate structures.

Conclusion
The purpose of this study is to effectively segment retinal vessels, with a specific focus on the difficulties associated with segmenting vessels that are weak, thin, and inhomogeneous. To mitigate the issue of overfitting in the UNet model, DropOut and Batch Normalization techniques are introduced. Furthermore, considering that convolutional neural networks cannot fully capture the complex information in retinal vessel images, this research integrates spatial-channel attention modules to concurrently extract comprehensive knowledge from retinal images. However, skip connections may pass irrelevant information from the encoder side to the decoder side; to alleviate this problem, a spatial attention module is introduced. Moreover, to further improve the blood vessel segmentation performance, an ensemble learning framework is designed to combine multiple different deep learning models to detect retinal vessels. The presented methodology is assessed using the publicly accessible datasets DRIVE, STARE, and CHASEDB1. The experimental findings indicate that the presented MAF-Net and MAFE-Net frameworks exhibit commendable performance in retinal vessel segmentation compared with several contemporary approaches. However, it is worth noting that although our algorithm successfully segments certain weak vessels, there may still be instances of vessel breakage. In the future, we will attempt to apply all models using a shared encoder with different branches (UNet, SA-UNet, CS2Net, MAF-Net) to construct a new deep learning framework for fundus image segmentation, so as to effectively reduce the issue of vessel breakage.

Fig. 1 .
Fig. 1. Fundus vascular images. The first row contains four original images, and the second row displays the corresponding areas inside the white boxes. The third row provides the corresponding ground truths for the images in the first row. Similarly, the fourth row offers the corresponding ground truths for the local regions in the second row.

Fig. 3 .
Fig. 3. The attention fusion module (AFM), in which the spatial attention mechanism and the channel attention mechanism are combined to enhance contextual information for scene information compensation.

Fig. 4 .
Fig. 4. The structure of the spatial attention mechanism.

Fig. 5 .
Fig. 5. The structure of the channel attention mechanism.

Fig. 7 .
Fig. 7. DropOut module. The diagram shows the incorporation of DropOut and BN (Batch Normalization) modules into the convolution layers.

Fig. 8 .
Fig. 8. Overview of the deep learning network framework. (a) UNet structure; (b) SA-UNet structure; (c) CS2-Net structure; (d) MAF-Net structure; (e) the ensemble learning network of our proposed MAFE-Net, which integrates the above four deep learning models.

Fig. 9 .
Fig. 9. Experimental results with different state-of-the-art models on DRIVE, STARE, and CHASEDB1 datasets.

Table 4 . Quantitative evaluation with different methods.
As described in Table 4, the presented MAFE-Net model exhibited the highest F1 and Acc values when compared to eight state-of-the-art methods, in which the highlighted UNet, SA-UNet, and CS2Net networks are integrated into a new deep learning framework named USC-Net. Meanwhile, the proposed MAF-Net model demonstrated the highest SE values. The experimental results demonstrate that the presented MAF-Net and MAFE-Net models segment retinal vessels extremely well.