Minimizing-Entropy and Fourier Consistency Network for Domain Adaptation on Optic Disc and Cup Segmentation

Automated segmentation of the optic disc (OD) and optic cup (OC) from different datasets plays an important role in the diagnosis of glaucoma and greatly saves human resources in both data annotation and image segmentation. However, the domain shift between different datasets suppresses the generalization ability of the segmentation network, especially damaging the performance of segmentation in the target domain, which is unlabeled. Therefore, using a transfer learning algorithm or domain adaptation method to enhance the migration ability of segmentation models has become an essential step and has attracted the attention of many researchers. In this paper, we propose an unsupervised domain adaptation network, called the Minimizing-entropy and Fourier Domain Adaptation network (MeFDA), to narrow the discrepancy between the source and target domains and prevent the degradation of segmentation performance. First, we perform adversarial optimization on the entropy maps of the predicted segmentation results to alleviate the domain shift. Then, direct entropy-minimization optimization is applied to the unlabeled target domain data to improve the credibility of the prediction segmentation maps. To enhance the prediction consistency of the target domain data, we augment the target domain dataset through the Fourier transform by replacing the low-frequency part in the target images with that of the source images. Then, a semantic consistency constraint is imposed on the raw images and augmented images of the target domain to improve the prediction consistency of the segmentation model, thereby further narrowing the discrepancy between the source and target domains. Experiments on several public retinal fundus image datasets prove the superiority of MeFDA compared with state-of-the-art methods, and the ablation study analyzes the importance of the different proposed components.


I. INTRODUCTION
Glaucoma, a collective term for a group of eye diseases, usually causes damage to the optic nerve at the back of the eye and is among the most common causes of blindness worldwide [1]. In the diagnosis and treatment of fundus lesions such as glaucoma, it is usually necessary to accurately detect the symptoms of the lesion at an early stage to prevent irreversible visual damage. The cup-to-disc ratio (CDR) is generally considered a basic criterion for diagnosing glaucoma and is calculated as the ratio of the vertical cup diameter (VCD) to the vertical disc diameter (VDD) [2]. Since experts often use
color fundus images to measure CDR to assess glaucoma, precise OD and OC segmentation is an essential step in glaucoma diagnosis [3].
Since manual OD and OC segmentation is a costly, time-consuming and troublesome task, previous studies [4]-[7] have focused on automated OD and OC segmentation with computer methods to improve the efficiency and accuracy of diagnosis. However, the generalization ability of a segmentation model across different datasets is unsatisfactory due to domain shift, as Fig. 1 shows. Recently, a large number of unsupervised domain adaptation frameworks [8]-[11] have been designed to narrow the distribution discrepancy between the source and target domains in semantic segmentation tasks, aiming to make a model trained on the labeled source domain images achieve better testing performance on the unlabeled target domain images.
FIGURE 1. Demonstration of domain adaptation on OD and OC segmentation. As shown in the gray dotted frame on the right, the OC and OD are the inner and outer oval areas, respectively. The domain shift is attributed to the visual differences between images from different datasets.
To address the unsupervised cross-domain issue of OD and OC segmentation, some methods [12]-[14] have considered output space alignment and feature alignment to alleviate domain shift. These methods focus only on reducing the domain discrepancy but ignore constraints on the unsupervised target domain data, preventing the trained segmentation model from achieving better performance in the target domain. In addition, most of the abovementioned methods adopt an adversarial training mechanism to narrow the domain discrepancy, which complicates the model training process, and they lack more concise mechanisms to optimize domain alignment.
To address the issues mentioned above, we propose an unsupervised domain adaptation network called the Minimizing-entropy and Fourier Domain Adaptation network (MeFDA) for OD and OC segmentation in color fundus images. First, the entropy minimization principle [15] is considered in the optimization of the segmentation prediction map.
Specifically, the predicted segmentation results of the source domain images tend to have low entropy in most areas, with high entropy values only along the borders of the segmented objects, which means that the predictions on the source domain images are more certain [9]. In contrast, the prediction maps of the target domain images contain high-entropy noise in many areas, as Fig. 2 shows, which indicates that the predictions on the target domain images are more uncertain. Based on this observation, forcing the prediction entropy map of the target domain to be close to the low-entropy map of the source domain prediction is a feasible way to improve the segmentation performance.
Following [12], we adopt an adversarial learning mechanism to align the data distribution of different domains to alleviate the domain shift. Based on the adversarial domain adaptation methods, we adopt a separate entropy minimization constraint on the segmentation results of target domain images to force the segmentation model to generate prediction results with higher confidence in the target domain.
In addition, inspired by [16], we propose a Fourier consistency constraint to improve the semantic consistency of the segmentation model. Specifically, we use the Fourier transform to map the images of the source and target domains to the frequency space. Then, we replace the low-frequency part of the target domain images with that of the source domain images since the low-frequency information always represents the background, lighting, brightness and other style information of the images. The generated images are considered new target domain samples and participate in the training of the network. Finally, a cross-entropy constraint is imposed on the predicted segmentation maps of the raw and augmented target domain images to maintain the consistency between them.
The Fourier consistency constraint not only enhances the prediction consistency of the segmentation model on the target data but also narrows the discrepancy between the domains through the exchange of frequency-domain information. Together with the idea of entropy minimization, the Fourier consistency augmentation and constraint open up another way of alleviating the domain shift.
Relying on the above two strategies, we design an effective framework for cross-domain OD and OC semantic segmentation. In general, the contributions of this paper are as follows:
• We propose an unsupervised domain adaptation network for OD and OC segmentation in color fundus images, named the Minimizing-entropy and Fourier Domain Adaptation network (MeFDA). Experiments on several public retinal fundus image datasets demonstrate the superiority of the proposed method compared with state-of-the-art methods.
• We simultaneously adopt directly minimized optimization of the prediction entropy maps of the target domain and adversarial optimization of the prediction entropy maps of the source and target domains, aiming to improve the certainty of the target domain prediction and narrow the domain discrepancy.
• We generate augmented target domain images with a certain source domain style through Fourier transformation to extend the target domain and impose a consistency constraint on the raw and augmented target domain images, which further narrows the discrepancy between the source and target domains.

II. RELATED WORKS
Studies applying deep learning methods to color fundus images have been very extensive in recent years and can be divided into three categories: lesion detection and segmentation, biomarker segmentation and disease diagnosis [2]. The task of biomarker segmentation includes optic cup (OC) and optic disc (OD) segmentation [17]-[19], vessel segmentation [20]-[22], fovea segmentation [23] and the discrimination of arteries and veins [24], [25]. The OC and OD segmentation task is fundamental to the diagnosis and treatment of glaucoma [2]. Compared with OD segmentation, OC segmentation is more difficult because the OC boundary is more subtle and harder to determine. OD and OC segmentation is amenable to most image segmentation methods due to the elliptical shape of the objects in the image. In previous studies, Mohan et al. [26] combined an FCN [27] with atrous convolutions to realize automated OD segmentation. In their later work [28], a P-Net was proposed, which took a downscaled image as input to obtain preliminary segmentation results and sent the output to a Fine-Net to guide further segmentation. Fu et al. [29] proposed an OC and OD segmentation network called M-Net, which contained a multiclass loss based on the Dice loss to address the challenges of data imbalance and multilabeling; the authors also introduced a polar transformation to obtain spatial consistency. Shah et al. [30] proposed the weak region of interest model-based segmentation (WRoIM) and parameter-shared branched network (PSBN) based on U-Net [31] to implement OC and OD segmentation with fewer model parameters. Wang et al. [32] and Jiang et al. [19] used Faster R-CNN [33] and Mask R-CNN [34] as references to construct OD and OC segmentation networks. However, the abovementioned methods generalize poorly when facing different datasets due to the discrepancy between domains.
Domain adaptation is a research branch in the field of deep learning and mainly aims to reduce discrepancies between domains. Many previous studies [8], [11], [35], [36] introduced domain adaptation methods into semantic segmentation tasks. Most studies are based on adversarial training [37]- [39] to construct semantic segmentation networks to achieve good performance. However, the process of adversarial training lacks stability to some extent. In addition to adversarial learning, many recent works [40]- [42] focus on using cyclic-consistency methods to transform the source domain image to the target domain and reduce the difference between domains.
Recently, some unsupervised domain adaptation methods have succeeded in the task of cross-domain OC and OD segmentation. Wang et al. [13] designed a patch-based output space adversarial learning framework (pOSAL) that inputs the target and source domains into a lightweight network to extract ROIs. Liu et al. [43] proposed a collaborative feature ensembling adaptation (CFEA) framework for the unsupervised cross-domain OC and OD segmentation task, which contains a source domain network, a target domain network and a target teacher network. Wang et al. [12] introduced a boundary and entropy-based adversarial learning (BEAL) framework that applies adversarial training to the boundary and entropy maps of images from the source and target domains; they used two discriminators to distinguish which domain the boundary maps and entropy maps belong to. Kadambi et al. [44] proposed a WGAN domain adaptation framework that captures, with a patch discriminator, the average divergence between the distributions of source and target segmentation outputs over the spatial dimensions. Chen et al. [14] proposed an input and output space alignment network (IOSUDA) that performs feature alignment in both the input and output spaces of the image segmentation network using an adversarial learning strategy. Lei et al. [45] proposed an unsupervised domain adaptation method based on image synthesis and feature alignment (ISFA), which combines GAN-based image synthesis with content and style feature alignment (CSFA) to alleviate the domain shift. Guo et al. [46] proposed a coarse-to-fine adaptive Faster R-CNN framework that contains a spatial attention-based region alignment (SRA) module to achieve coarse-grained adaptation in a class-agnostic way and a prototype-based semantic alignment (PSA) module to minimize the distances between global prototypes of the same category from different domains.
Following these previous studies, we design our novel network MeFDA, which utilizes entropy minimization and Fourier consistency augmentation to alleviate the domain shift between the target domain and the source domain and enhance the semantic consistency of the segmentation network in OC and OD segmentation.

III. METHODOLOGY
In this section, we introduce the main designs of the proposed Minimizing-entropy and Fourier Domain Adaptation network (MeFDA) for optic disc (OD) and optic cup (OC) segmentation in color fundus images, of which the overall framework can be seen in Fig. 3. We mainly take two measures to reduce the domain shift: 1) the optimization of minimizing entropy is applied in the network, including direct optimization in the target domain and adversarial optimization between domains; 2) the Fourier transform is used to augment the target domain images with low-frequency information exchange with the source images, and a loss is designed to maintain the prediction consistency of the raw and augmented target images.
Formally, let $X_s, X_t \subset \mathbb{R}^{H \times W \times 3}$ denote the collections of source and target domain images, and let $Y_s, Y_t \subset \mathbb{R}^{H \times W}$ denote the collections of ground truth segmentation maps of the source and target domains ($Y_t$ is not used in the training process of this task). For each image $x_t \in X_t$, the ultimate goal of the network is to generate a predicted segmentation map $\hat{y}_t$ that is as close as possible to $y_t \in Y_t$. Since only the ground truth segmentation maps of the source domain $y_s \in Y_s$ are available during training, we train the segmentation network on the labeled source data and the unlabeled target data, along with domain adaptation methods that narrow the discrepancy between the domains and strengthen the adaptability of the network to the target domain.

A. OPTIMIZATION OF MINIMIZING THE ENTROPY
In the network, we use two optimization methods to minimize the entropy in the prediction maps of the target domain: direct minimization and adversarial minimization. Shannon entropy [47] was adopted in our network.

1) DIRECT MINIMIZATION
We first impose the constraint of direct entropy minimization on the target domain. For the predicted segmentation map $\hat{y}_t$ of each target domain image $x_t$, the corresponding pixel-wise entropy map $E(x_t)$ can be calculated as

$E_i(x_t) = -\sum_{c=1}^{C} \hat{y}_t^{(i,c)} \log \hat{y}_t^{(i,c)}$  (1)

where $\hat{y}_t^{(i,c)}$ denotes the predicted probability of class $c$ at the $i$-th pixel of the target segmentation map. The direct entropy minimization loss $l_{de}$ is defined as the sum over all pixels of the entropy map:

$l_{de} = \sum_{i} E_i(x_t)$  (2)

where $E_i(x_t)$ denotes the value of the $i$-th pixel in $E(x_t)$. The average direct entropy minimization loss $L_{de}$ can be defined as

$L_{de} = \frac{1}{n} \sum_{x_t \in X_t} l_{de}$  (3)

where $n$ denotes the number of target domain images. Although the ground truth segmentation maps of the target domain are not available during training, the direct entropy minimization constraint pushes the predicted segmentation maps of the target domain toward higher confidence.
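As a minimal sketch of the entropy map $E(x_t)$ and the direct loss $l_{de}$ (assuming per-pixel class probabilities of shape (C, H, W); the function names are ours, not the paper's):

```python
import numpy as np

def entropy_map(prob):
    """Pixel-wise Shannon entropy E(x_t) of a (C, H, W) probability map.

    A small epsilon guards against log(0)."""
    return -(prob * np.log(prob + 1e-8)).sum(axis=0)

def direct_entropy_loss(prob):
    """l_de: sum of the entropies over all pixels of one image."""
    return entropy_map(prob).sum()
```

Averaging `direct_entropy_loss` over a batch of target images gives $L_{de}$.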

2) ADVERSARIAL MINIMIZATION
The direct entropy minimization loss alone has the following disadvantages: 1) it minimizes the sum of the pixel values of the prediction entropy map, which neglects the relationships between local semantics; 2) the network lacks domain adaptation between the source and target domains and is thus unable to effectively utilize the information of the source domain.
Generally, under the supervision of labeled source data, segmentation networks generate low-entropy predictions for source domain data. Since the source and target domains have similarity in semantic structure, intuitively, we can make the predicted entropy maps of the target domain close to that of the source domain to minimize the prediction entropy of the target domain.
We adopt an adversarial minimization loss to force the target domain's entropy distribution to be similar to that of the source domain, thereby minimizing the entropy of the target prediction segmentation maps. For the source domain image $x_s$, the network predicts its segmentation map $\hat{y}_s$, and the entropy map is calculated as

$E_i(x_s) = -\sum_{c=1}^{C} \hat{y}_s^{(i,c)} \log \hat{y}_s^{(i,c)}$  (4)

To use entropy adversarial training, a discriminator $D$ is designed to distinguish whether an entropy map comes from the source domain or the target domain. The main objective of adversarial learning is to train the segmentation network to deceive the discriminator while training the discriminator, so that the predicted target entropy maps become closer to those of the source domain. Let $L_{bce}$ denote the binary cross-entropy loss; the loss of the discriminator $L_D$ is

$L_D = \frac{1}{m} \sum_{x_s \in X_s} L_{bce}(D(E(x_s)), 1) + \frac{1}{n} \sum_{x_t \in X_t} L_{bce}(D(E(x_t)), 0)$  (5)

where $m$ and $n$ denote the numbers of images in the source and target domains, respectively. Simultaneously, we use the domain adversarial entropy minimization loss $L_{ae}$ to deceive the discriminator $D$ and encourage the target domain entropy maps to be similar to the source domain entropy maps:

$L_{ae} = \frac{1}{n} \sum_{x_t \in X_t} L_{bce}(D(E(x_t)), 1)$  (6)
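The two adversarial objectives can be sketched as follows, taking the discriminator's per-pixel outputs (values in (0, 1)) as given; the function names are ours, and a real implementation would backpropagate $L_D$ only into $D$ and $L_{ae}$ only into the segmenter:

```python
import numpy as np

def bce(pred, label):
    """Binary cross-entropy L_bce for a map of discriminator outputs."""
    pred = np.clip(pred, 1e-7, 1 - 1e-7)
    return -(label * np.log(pred) + (1 - label) * np.log(1 - pred)).mean()

def discriminator_loss(d_src, d_tgt):
    """L_D: D should output 1 on source entropy maps and 0 on target ones."""
    return bce(d_src, 1.0) + bce(d_tgt, 0.0)

def adversarial_entropy_loss(d_tgt):
    """L_ae: the segmenter tries to make target entropy maps look 'source'."""
    return bce(d_tgt, 1.0)
```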

B. FOURIER CONSISTENCY BETWEEN DOMAINS
To further alleviate the shift between the source and target domains, inspired by [16], we adopt a Fourier consistency method to augment the images in the target domain. Specifically, we define $\mathcal{F}$ as the Fourier transform and $\mathcal{F}^{-1}$ as the inverse Fourier transform. In our method, we apply the fast Fourier transform (FFT) [48] algorithm to a single image $x$ to obtain the amplitude map $\mathcal{F}^A(x)$ and the phase map $\mathcal{F}^P(x)$. We randomly choose a source domain image $x_s$ and a target domain image $x_t$ and replace the low-frequency part of $x_t$ with that of $x_s$ to assemble the Fourier-augmented target domain image $x_t^F$. As Fig. 4 shows, we assume the amplitude map is shifted so that the zero frequency lies at its center, which we take as the coordinate origin $(0, 0)$. We draw a square box $\alpha$ with side length $a$ at the center of the amplitude map. For the amplitude maps $\mathcal{F}^A(x_s)$ and $\mathcal{F}^A(x_t)$ of $x_s$ and $x_t$, we cut the region inside the square box $\alpha$ from $\mathcal{F}^A(x_s)$ and paste it onto the corresponding region of $\mathcal{F}^A(x_t)$ to obtain the new amplitude map $\mathcal{F}^A(x_t^F)$. The new Fourier-augmented image $x_t^F$ is calculated as

$x_t^F = \mathcal{F}^{-1}\big(\mathcal{F}^A(x_t^F), \mathcal{F}^P(x_t)\big)$  (7)

For the Fourier-augmented image $x_t^F$, its ground truth segmentation map is still $y_t$, but its style is much more similar to that of $x_s$, while its semantic structure remains that of $x_t$. Therefore, the domain shift can be drastically alleviated with the use of $x_t^F \in X_t^F$. In our method, both $X_t$ and $X_t^F$ are used as target samples and participate in the optimization of minimizing the entropy. In addition, because the target domain image $x_t$ and its Fourier-augmented image $x_t^F$ yield two predicted segmentation maps $\hat{y}_t$ and $\hat{y}_t^F$, we use a pixelwise cross-entropy loss as the Fourier consistency segmentation loss $l_t$ to maintain their consistency:

$l_t = -\sum_i \left[ \hat{y}_t^{(i)} \log \hat{y}_t^{F(i)} + \big(1 - \hat{y}_t^{(i)}\big) \log \big(1 - \hat{y}_t^{F(i)}\big) \right]$  (8)

where $\hat{y}_t^{(i)}$ and $\hat{y}_t^{F(i)}$ denote the $i$-th pixel of the predicted segmentation maps of $x_t$ and $x_t^F$, whose values are float values in the range $[0, 1]$.
The total Fourier consistency segmentation loss $L_t$ is

$L_t = \frac{1}{n} \sum_{x_t \in X_t} l_t$  (9)
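The low-frequency amplitude swap can be sketched with numpy's FFT routines; this is a minimal illustration under our own conventions (channel-last images, box side proportional to the image side via `alpha`), not the paper's exact implementation:

```python
import numpy as np

def fourier_augment(x_t, x_s, alpha=0.01):
    """Replace the low-frequency amplitudes of target image x_t with those
    of source image x_s (both H x W x 3 arrays of equal shape).

    alpha controls the relative size of the centered square box whose
    amplitudes are swapped; the phase of x_t is kept unchanged."""
    fft_t = np.fft.fftshift(np.fft.fft2(x_t, axes=(0, 1)), axes=(0, 1))
    fft_s = np.fft.fftshift(np.fft.fft2(x_s, axes=(0, 1)), axes=(0, 1))
    amp_t, pha_t = np.abs(fft_t), np.angle(fft_t)
    amp_s = np.abs(fft_s)

    h, w = x_t.shape[:2]
    b_h, b_w = max(1, int(alpha * h)), max(1, int(alpha * w))
    c_h, c_w = h // 2, w // 2
    # Swap the centered low-frequency square of the amplitude spectrum.
    amp_t[c_h - b_h:c_h + b_h, c_w - b_w:c_w + b_w] = \
        amp_s[c_h - b_h:c_h + b_h, c_w - b_w:c_w + b_w]

    # Recombine the new amplitude with the original target phase.
    fft_new = amp_t * np.exp(1j * pha_t)
    x_f = np.fft.ifft2(np.fft.ifftshift(fft_new, axes=(0, 1)), axes=(0, 1))
    return np.real(x_f)
```

When `x_s` and `x_t` coincide, the swap is a no-op and the image is recovered up to floating-point error, which is a convenient sanity check.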

C. THE OVERALL FRAMEWORK
In our network, the backbone $F_b$ is used as the segmentation network, taking fundus images as inputs and outputting the corresponding segmentation maps. Since only the ground truth segmentation maps of the source domain are available during training, we use the source domain data to optimize $F_b$ with a pixelwise cross-entropy loss $l_s$:

$l_s = -\sum_i y_s^{(i)} \log \hat{y}_s^{(i)}$  (10)

where $y_s^{(i)}$ and $\hat{y}_s^{(i)}$ denote the $i$-th pixel of the ground truth segmentation map and the predicted segmentation map, respectively. The final $L_s$ can be calculated as

$L_s = \frac{1}{m} \sum_{x_s \in X_s} l_s$  (11)

First, the target domain is augmented by the Fourier consistency method of Section III-B so that it contains more images similar to the source domain images. Then, the source and target domain images are fed into the segmentation network optimized with the source domain, their predicted segmentation maps are obtained, and the entropy maps are calculated from these predictions. We apply the entropy-minimizing strategies of Section III-A to the calculated entropy maps. The overall loss function of the segmentation network is

$L_{seg} = L_s + L_{ae} + \lambda_{de} L_{de} + \lambda_t L_t$  (12)

where $\lambda_{de}$ and $\lambda_t$ denote the weights of $L_{de}$ and $L_t$.
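As a small illustration of Eq. (12), the overall objective is a weighted sum of the four terms (the function name is ours):

```python
def total_segmentation_loss(l_s, l_ae, l_de, l_t, lam_de, lam_t):
    """L_seg = L_s + L_ae + lambda_de * L_de + lambda_t * L_t (Eq. 12)."""
    return l_s + l_ae + lam_de * l_de + lam_t * l_t
```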
IV. EXPERIMENTS
A. DATASETS
• The REFUGE [49] challenge dataset is composed of 1200 retinal color fundus photographs, acquired by ophthalmologists or technicians from patients sitting upright, using fundus cameras. It is divided into a training set (400 images), a validation set (400 images) and a testing set (400 images). The validation and test sets of the REFUGE dataset come from the same fundus cameras, while the training set comes from different fundus cameras.
• The Drishti-GS [50] dataset contains 101 Indian retinal fundus images, each of which includes manual labels from four ophthalmologists with different levels of clinical experience. It is divided into a training set (50 images) and a testing set (51 images) in our experimental setting.
• The RIM-ONE-r3 [51] dataset includes 169 Spanish ONH images. In our experimental setting, 99 images were used for training and 60 images for testing.
To better evaluate the performance of the proposed MeFDA, following [12], we used the training set of the REFUGE challenge dataset as the source domain dataset, while Drishti-GS [50] and RIM-ONE-r3 [51] were used as the target domain datasets.
Examples of samples from the datasets are given in Fig. 1, and the statistics of the datasets are shown in Table 1. As shown in Fig. 1, the fundus images from different domains exhibit discrepancies in background, lighting and the size of the optic disc and cup. The domain gap between RIM-ONE-r3 and REFUGE is larger than that between Drishti-GS and REFUGE.

B. IMPLEMENTATION DETAILS
In this paper, we used DeepLabv3+ [52] with MobileNetV2 [53] pretrained on the ImageNet dataset [54] as the backbone architecture and trained the entire framework in an end-to-end manner without any warm-up phases. We trained for 200 epochs in total with a minibatch size of 8 on a server with one NVIDIA 1080 Ti GPU. The segmentation network was optimized with the Adam [55] optimizer with an initial learning rate of 1e−3, decayed by a factor of 0.2 every 100 epochs. The discriminator $D$ was optimized by the SGD algorithm with a learning rate of 2.5e−5. To expand the training dataset, we adopted several data augmentations, consisting of flipping, random rotation, contrast adjustment, elastic transformation, random erasing, and adding Gaussian noise, following the previous methods [12], [13]. For the hyperparameters $\lambda_{de}$ and $\lambda_t$ in Eq. (12), we set $\lambda_t = 0.1$ and $\lambda_{de} = \frac{2}{1+\exp(-\gamma \cdot p)} - 1$, where $p \in (0, 1)$ indicates the training progress and $\gamma = 10$. For the hyperparameter $\alpha$ of the Fourier transform, we set the value to 0.01, which denotes the ratio of the side length of the square box to that of the amplitude map. It is worth noting that, in addition to the task of domain adaptation on OD and OC segmentation, the hyperparameter settings in our framework can also be applied to other domain adaptation tasks for semantic segmentation [16], [56].
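The dynamic weight schedule for $\lambda_{de}$ can be sketched as follows; we read the schedule in the text as $2/(1+\exp(-\gamma \cdot p)) - 1$ with $\gamma = 10$, where the symbol $\gamma$ is our notation for the constant that multiplies the training progress:

```python
import math

def lambda_de(p, gamma=10.0):
    """Dynamic weight for the direct entropy loss.

    p in (0, 1) is the training progress; the weight ramps smoothly
    from 0 at the start of training toward 1 at the end."""
    return 2.0 / (1.0 + math.exp(-gamma * p)) - 1.0
```

This keeps the direct entropy constraint weak early on, when predictions are unreliable, and strengthens it as training proceeds.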

C. QUANTITATIVE ANALYSIS
We analyze the segmentation performance of the proposed MeFDA framework quantitatively based on the Dice coefficient (DI) values [13] of OD and OC. The Dice coefficient evaluates the pixel-wise agreement between the predicted segmentation region $P$ and the ground truth $Y$ and is defined as

$DI = \frac{2 \times TP}{2 \times TP + FP + FN}$  (13)

where TP, FP and FN denote the numbers of true positive, false positive and false negative pixels, respectively. As shown in Table 2, to validate the effectiveness of the proposed MeFDA, we compare our method against the supervised method (upper bound) and several representative unsupervised cross-domain segmentation methods: 1) TD-GAN [57], a task-driven generative adversarial network that performs simultaneous style transfer and medical image semantic parsing; 2) [58], a pixel-level adversarial domain adaptation method, which consists of both global and category-specific feature alignment; 3) WGAN [44], a Wasserstein-distance-based adversarial domain adaptation framework for optic disc-and-cup segmentation; 4) [59], an adversarial-based cross-domain eye vasculature segmentation approach; 5) OSAL-pixel and pOSAL [13], patch-based output space adversarial methods, which propose a morphology-aware segmentation loss and utilize adversarial learning to generate consistent predictions in a shared output space for images of the source and target domains; 6) BEAL [12], a boundary- and entropy-based adversarial domain adaptation approach, which utilizes adversarial learning to encourage the boundary predictions and mask probability entropy maps of the target domain to be closer to those of the source domain and achieved state-of-the-art results in previous works.
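The Dice coefficient above can be computed directly from the pixel counts; a minimal sketch for binary masks (the function name is ours):

```python
def dice_coefficient(pred, gt):
    """DI = 2*TP / (2*TP + FP + FN) for two same-length binary masks
    (e.g. flattened 0/1 arrays for one class, OD or OC)."""
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    denom = 2 * tp + fp + fn
    # Convention: two empty masks count as perfect agreement.
    return 2 * tp / denom if denom else 1.0
```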
We inherited the results of the above methods from BEAL [12]. From the results in Table 2, we make the following key observations: 1) Since the domain gap between RIM-ONE-r3 and REFUGE is larger than that between Drishti-GS and REFUGE, the domain adaptation task for RIM-ONE-r3 is more challenging. Nevertheless, compared with the state-of-the-art unsupervised cross-domain fundus segmentation method BEAL [12], the proposed method achieves improvements of 1.35% and 1.22% in $DI_{cup}$ and $DI_{disc}$, respectively, which validates the superiority of the proposed MeFDA in tackling unsupervised cross-domain segmentation tasks. 2) Considering that the domain discrepancy between the Drishti-GS and REFUGE images is smaller, the domain adaptation task for Drishti-GS is easier, and previous methods have already achieved competitive performance close to the upper bound. Nevertheless, the proposed MeFDA still outperforms the state-of-the-art BEAL method [12] in the optic cup segmentation task and achieves competitive performance in the optic disc segmentation task.

D. QUALITATIVE ANALYSIS
We select some visualization results from target domains of our method MeFDA in Fig. 5, along with the visualization results of the previous state-of-the-art network BEAL [12].
In Fig. 5, columns 1-2 show the original fundus images and the ground truth segmentation maps. To the right are the predicted segmentation results and the OC and OD entropy maps generated by BEAL (columns 3-5) and our MeFDA (columns 6-8). From the visualization results, we observe the following: 1) in OD segmentation, both methods achieve ideal performance, and the entropy maps of both methods tend to be clean; 2) in OC segmentation, the OC entropy maps of BEAL [12] are noisier than those of our MeFDA, which demonstrates that our method can effectively improve OC segmentation performance through the constraints of the direct entropy minimization loss and the Fourier consistency loss on the target domain; 3) in general, our segmentation results better fit the ground truth of the raw images in the target domain, indicating that the proposed MeFDA is effective for the task of OD and OC segmentation.

E. ABLATION STUDY
Aiming to analyze the performance of the proposed MeFDA more comprehensively, we compare the effectiveness of several variants in an ablation study. Table 3 shows the ablation study on Drishti-GS and RIM-ONE-r3. In Table 3, the proposed method is denoted as ''MeFDA'', ''MeFDA w/o E'' represents a variant eliminating the entropy minimization constraint, ''MeFDA w/o F'' represents a variant eliminating the Fourier consistency constraint, and ''MeFDA w/o F, E'' denotes the variant eliminating both of them. Through the ablation study, we summarize the following observations and analysis: 1) The performance of ''MeFDA w/o F, E'' falls behind the other variants since it simply adopts a segmentation loss on the source domain and an entropy adversarial loss to perform the basic domain adaptation segmentation task. 2) The Fourier consistency constraint can greatly improve the cross-domain segmentation performance since it not only enhances the semantic consistency of the target domain data but also improves the information exchange between the source and target domains. 3) The entropy minimization constraint can enhance pixel discriminability in the target domain by optimizing the predicted segmentation maps of the target domain toward higher confidence. 4) The proposed ''MeFDA'' obtains the best performance by combining the above designs.

F. PARAMETER ANALYSIS
In this section, we give an empirical analysis of the trade-off weights $\lambda_{de}$ and $\lambda_t$ in Eq. (12) and the Fourier augmentation parameter $\alpha$ in Fig. 4. When adjusting one of the hyperparameters, the rest are set to the same values as those in Sec. IV-B. The trade-off weight $\lambda_t$ controls the influence of the Fourier consistency loss $L_t$. As shown in Fig. 6, we vary $\lambda_t$ from 0.05 to 0.2 with a step of 0.05 and find that the proposed method achieves the best performance when $\lambda_t$ is set to 0.1, which also applies to most semantic segmentation tasks [56]. In Fig. 7, we present results for $\lambda_{de} = 0.001, 0.01, 0.1$ and for the dynamic setting $\frac{2}{1+\exp(-\gamma \cdot p)} - 1$, in which $\lambda_{de}$ increases from 0 to 1 as training proceeds. The results in Fig. 7 show that the dynamic setting achieves the best performance: in the early stage of training, an overly strict direct entropy minimization constraint harms the model because the prediction results are not yet confident enough, whereas in the later stages, when the model has been optimized well, a strong constraint strengthens the discriminability of the samples. In Fig. 8, we analyze the influence of the Fourier augmentation parameter $\alpha$, which controls the degree of style transfer. The proposed method achieves the best performance when $\alpha$ is set to 0.01; a smaller $\alpha$ cannot transfer enough background information, while a larger $\alpha$ destroys the original texture information of the image.

G. DISCUSSION AND FUTURE WORK
In this work, we propose the Minimizing-entropy and Fourier Domain Adaptation network (MeFDA) to deal with the unsupervised cross-domain OD and OC segmentation task. The entropy minimization strategy aims to improve the pixel-level prediction confidence of unlabeled target samples, and the Fourier consistency constraint enhances the prediction consistency of target samples. There are two future directions for the proposed method: 1) existing methods lack additional constraints on the source domain data, even though the segmentation performance on source domain samples often decreases during the domain adaptation process; 2) most existing methods treat semantic segmentation as pixel-level classification, ignoring the relations and structural information among pixels.

V. CONCLUSION
This work proposed a novel unsupervised domain adaptation network for cross-domain OD and OC segmentation, called the Minimizing-entropy and Fourier Domain Adaptation network (MeFDA). The network mainly utilizes two principles to narrow the domain discrepancy: entropy minimization and Fourier consistency. In the part of entropy minimization, we use both direct entropy optimization and adversarial entropy optimization. Direct entropy optimization improves the confidence of prediction segmentation results, and adversarial optimization confuses the entropy maps of the source and target domains to narrow the discrepancy between the domains. In addition, we utilize Fourier transform to generate augmented target domain images with a certain source domain style in order to further alleviate the domain shift and then impose a constraint on the prediction segmentation maps of the raw and augmented images of the target domain to maintain their consistency. In the domain adaptation task of OC and OD segmentation, experiments on several public retinal fundus image datasets verified the effectiveness of the proposed MeFDA compared against the state-of-the-art methods.
SHAO-PENG XU received the Ph.D. degree from Tianjin Medical University, China. He is currently working as the Deputy Chief Physician at the Department of Cardiology, Tianjin Medical University General Hospital. His major is coronary intervention and the annual PCI operation volume is approximately 700. He is also a Master Tutor at Tianjin Medical University.
TIAN-BAO LI is currently pursuing the master's degree with Tianjin University, Tianjin, China. His research interests include computer vision and machine learning.
ZHE-QI ZHANG is currently pursuing the master's degree with Tianjin University, Tianjin, China. His research interests include machine learning and medical image analysis.
DAN SONG received the Ph.D. degree in computer science and technology from Zhejiang University, China. She is currently an Associate Professor at the School of Electrical and Information Engineering, Tianjin University. Her research interests include computer graphics and computer vision.