Breast Region Segmentation Based on U-Net++ from DCE-MRI Image Sequences

Background analysis of breast cancer can characterize the progress and state of a tumour, and it relies on whole-breast segmentation from MRI images. The focus of this paper is to construct a breast region segmentation pipeline from MRI image series as a step towards automatic breast cancer diagnosis. Breast region segmentation based on traditional and deep learning methods has been studied for several years, but most methods have not achieved results satisfactory for the subsequent background analysis. In this paper, we propose a pipeline for whole breast region segmentation based on U-Net++, which achieves a better result than the traditional U-Net model, the most commonly used model in medical image analysis, and a better IoU than CNN models. We evaluated the U-Net++ model against the traditional U-Net; our experiments demonstrate that U-Net++ with deep supervision achieves a higher IoU than the U-Net model.


Introduction
Breast cancer is a prevalent and deadly tumour; to date, it remains the leading terminal illness among women worldwide [1]. Early detection of breast cancer is crucial for subsequent tumour evaluation and treatment, and MRI plays an essential role as a diagnostic tool [2,3]. Breast cancer usually occurs in the fibroglandular area of breast tissue, and early treatment can reduce tumour mortality. For this purpose, dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is more reliable for early detection of breast cancer than mammography and ultrasound [4]. Traditionally, manual analysis of MRI data is time-consuming, labour-intensive, and even error-prone across different readers. Although several systems have been developed for the detection and diagnosis of breast lesions, and are claimed to greatly improve clinicians' work efficiency, most of these systems and methods solve only the specific small problems they confront, and the automatic detection of breast lesions remains an open problem.
Generally, a DCE-MRI image often includes all of the organs in the thorax and abdomen, such as the lungs, the liver, and surrounding muscles. Separating these organs is an essential and critical task for doctors in the subsequent clinical analysis, yet it is extremely tedious and time-consuming for a person to separate all of the organs from a large number of diverse MRI data sets. Automatic segmentation protocols are therefore in great demand across different hospital centres. Conventionally, low-level image processing methods have been employed for this task with good results, such as thresholding, morphological operations, and B-spline curve fitting [5][6][7][8]. Ensemble methods such as the combination of edge detection and edge linking can estimate the chest wall, and dynamic programming with preprocessing can extract the breast region with better results than the aforementioned methods [9][10][11][12]. All of these traditional methods achieve good results on certain types of data set, but fail on some specific data and cannot be extended to other domains [7,13,14]. In addition, contour-based methods extract object boundaries by exploiting the intensity discontinuities at the edges of organs. Shi et al. proposed an automatic segmentation method for DCE-MRI images that combines K-means clustering and morphological operations before a 3D level-set model [15], which established the basis of the contour-based pipeline. However, these methods suffer from contour initialization and parameter tuning, and computing time is another drawback. By contrast, approaches such as thresholding, clustering, and graph-based methods are grouped as region-based approaches. Most of these methods rely on an appropriate homogeneity criterion to group pixels into regions, but over-segmentation and noise greatly affect their results [15].
In recent years, with the development of deep learning, convolutional neural networks (CNNs) have been widely used in medical image analysis pipelines [16]. The fully convolutional network (FCN) first achieved end-to-end segmentation on both natural and medical images [17,18]. SegNet and U-Net can also extract high-level features and perform pixel-level organ segmentation [19][20][21][22]. In this paper, we propose a new breast body segmentation method based on U-Net++, a U-Net-derived model, which achieves an IoU gain over the traditional U-Net model. Our contributions are as follows:
- We employed the U-Net++ model to construct a breast region segmentation pipeline.
- We compared the accuracy of U-Net++ and the traditional U-Net; the results favour U-Net++.
- We also established an intensity evaluation basis for background extraction as a preprocessing step for tumour segmentation.

Dataset and Aims
In this paper, we collected DCE-MRI image series from 165 breast cancer patients, each series containing 60~80 single images. Doctors then labelled the breast region in each of these images, and we take these labels as the gold standard.
Figure 1. Breast region DCE-MRI image data and labels: the upper four are the original MRI images, and the lower four are the corresponding labels.
From these 165 patients we collected 13254 image-label pairs, which we divided into training, testing, and validation sets in a 6:3:1 ratio.

U-net model
U-Net is an effective platform for medical image segmentation, and most organ segmentation tasks are performed with it. As shown in Fig. 2, U-Net consists of an encoder part (left side) and a decoder part (right side). The encoder follows a convolutional network architecture: each stage consists of two unpadded 3x3 convolutions, each followed by a ReLU, and a 2x2 max pooling operation with stride 2 for down-sampling. At each down-sampling step the number of feature channels is doubled. Each decoder stage consists of an up-convolution with a 2x2 kernel for up-sampling, followed by two 3x3 convolutions with ReLU. At the final layer a 1x1 convolution maps each 64-component feature vector to the desired number of classes.
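The shape arithmetic of the encoder described above can be traced with a short sketch. The starting resolution of 572x572 and the initial 64 channels follow the original U-Net paper; they are illustrative assumptions, not values stated in this work.

```python
def unet_encoder_shapes(h, w, c0=64, depth=4):
    """Trace feature-map shapes through the U-Net encoder.

    Each stage applies two unpadded 3x3 convolutions (each removes a
    2-pixel border) followed by 2x2 max pooling with stride 2; the
    channel count doubles at every down-sampling step, as in the text.
    """
    shapes = []
    c = c0
    for _ in range(depth):
        h, w = h - 4, w - 4        # two 3x3 valid convolutions
        shapes.append((c, h, w))   # feature map before pooling
        h, w = h // 2, w // 2      # 2x2 max pool, stride 2
        c *= 2                     # double the channels
    return shapes

print(unet_encoder_shapes(572, 572))
# -> [(64, 568, 568), (128, 280, 280), (256, 136, 136), (512, 64, 64)]
```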
(a) U-Net architecture (b) U-Net++ architecture
Figure 2. Comparison of the U-Net model and the U-Net++ model

U-net++ model
Like U-Net, U-Net++ consists of an encoder and a decoder, here connected through nested dense convolutional blocks. The goal of U-Net++ is to bridge the semantic gap between the feature maps of the encoder and decoder prior to fusion. Fig. 2 shows the backbone and skip pathways of U-Net++ in detail. The left part of U-Net++ is an encoder architecture, or backbone, followed by a decoder. The distinction between U-Net and U-Net++ lies in the re-designed skip pathways connecting the encoder and decoder, together with deep supervision.

Connection pathways
The U-Net++ model has several re-designed skip connection pathways between encoder and decoder, which pass through dense convolution blocks at each pyramid feature level. These dense convolution blocks bring the semantic level of the encoder feature maps as close as possible to that of the decoder feature maps. The connection pathways thereby transform the optimization problem into an easier one, since the decoder receives semantically similar feature maps.
The skip pathway is formulated as follows:

$$x^{i,j} = \begin{cases} H\left(x^{i-1,j}\right), & j = 0 \\ H\left(\left[\left[x^{i,k}\right]_{k=0}^{j-1},\; U\left(x^{i+1,j-1}\right)\right]\right), & j > 0 \end{cases} \quad (1)$$

where $x^{i,j}$ is the output of node $X^{i,j}$, with $i$ indexing the down-sampling layer along the encoder and $j$ indexing the convolution layer of the dense block along the skip pathway. $H(\cdot)$ is a convolution layer followed by an activation function, $U(\cdot)$ is an up-sampling layer, and $[\,\cdot\,]$ denotes concatenation. Generally speaking, nodes at level $j = 0$ receive only one input, from the previous layer of the encoder; nodes at level $j = 1$ receive two inputs, both from two consecutive levels of the encoder part; and nodes at level $j > 1$ receive $j + 1$ inputs, of which $j$ come from the preceding nodes at the same level of the skip pathway and one is the up-sampled output of the lower pathway. Eq. 1 describes how the feature maps travel through the skip pathways of U-Net++.
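The input pattern of Eq. 1 can be enumerated with a small sketch. This is an illustrative helper, not part of the method; it lists which node outputs are concatenated for a given $x^{i,j}$, with an "up" tag marking the up-sampled input.

```python
def node_inputs(i, j):
    """Inputs to U-Net++ node x^{i,j} under Eq. 1.

    j = 0: a single input, the encoder output x^{i-1,0} from the
           previous level.
    j > 0: the j same-level predecessors x^{i,0..j-1} plus the
           up-sampled node U(x^{i+1,j-1}), i.e. j + 1 inputs in total.
    """
    if j == 0:
        return [(i - 1, 0)]
    return [(i, k) for k in range(j)] + [("up", i + 1, j - 1)]

# Node x^{0,3} of a 4-level U-Net++ fuses four inputs:
print(node_inputs(0, 3))
# -> [(0, 0), (0, 1), (0, 2), ('up', 1, 2)]
```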

Deep supervision
In the U-Net++ model, deep supervision enables the model to operate in two modes: an accurate mode, in which the outputs of all segmentation branches are averaged, and a fast mode, in which the final segmentation map is selected from only one of the segmentation branches, enabling model pruning and a speed gain. Because of the nested skip pathways, U-Net++ generates full-resolution feature maps at multiple semantic levels, which makes deep supervision possible. To this end, we use a joint loss function combining binary cross-entropy and the dice coefficient at each semantic level:

$$\mathcal{L}(Y, \hat{Y}) = -\frac{1}{N} \sum_{b=1}^{N} \left( \frac{1}{2} \cdot Y_b \cdot \log \hat{Y}_b + \frac{2 \cdot Y_b \cdot \hat{Y}_b}{Y_b + \hat{Y}_b} \right) \quad (2)$$

where $\hat{Y}_b$ denotes the flattened predicted probabilities and $Y_b$ the flattened ground truth of the $b$-th image, and $N$ is the batch size.
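A NumPy sketch of this joint loss for a single branch is shown below, under the assumption that the products and sums in Eq. 2 are taken pixel-wise over each flattened mask; under deep supervision this quantity would be averaged over all supervised branches.

```python
import numpy as np

def joint_loss(y_true, y_pred, eps=1e-7):
    """Joint binary cross-entropy + dice loss averaged over a batch.

    y_true, y_pred: arrays of shape (N, P) holding N flattened
    ground-truth masks and predicted probabilities. The epsilon clip
    keeps log() and the dice denominator finite.
    """
    y_pred = np.clip(y_pred, eps, 1 - eps)
    # (1/2) * Y_b * log(Y_hat_b), averaged over pixels of each image
    ce = 0.5 * np.mean(y_true * np.log(y_pred), axis=1)
    # 2 * Y_b * Y_hat_b / (Y_b + Y_hat_b), the soft dice overlap
    dice = (2 * np.sum(y_true * y_pred, axis=1)) / (
        np.sum(y_true, axis=1) + np.sum(y_pred, axis=1) + eps)
    # negative mean over the batch, matching the -1/N factor in Eq. 2
    return -np.mean(ce + dice)
```

A perfect prediction drives the loss towards -1 per image (dice overlap of 1, vanishing cross-entropy term), while disagreement raises it.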

Experiments and Results
For the data set, we use the DCE-MRI image series of the 165 breast cancer patients for model evaluation; they cover several breast states from different scan layers. We perform background normalization and illumination correction on these DICOM image series before feeding them into the networks. For comparison, we establish a testbed using a customized U-Net architecture and a U-Net++ architecture with a similar number of parameters, so that the performance of each architecture is not affected by the parameter count. We use the dice coefficient and intersection over union (IoU) in the optimization loss function, and an early-stop mechanism on the validation set, the same as in the original U-Net++. The Adam optimizer with a learning rate of 3e-5 is adopted for training. All convolutional layers along each skip pathway use 3x3 kernels with k filters (k = 32 x 2^i). For deep supervision, a 1x1 convolutional layer followed by a sigmoid activation function is appended to each of the target nodes. The segmentation results are listed in Table 1: U-Net++ with deep supervision obtains a higher IoU than the traditional U-Net model, whereas the number of parameters of U-Net++ is larger than that of the traditional U-Net, so U-Net++ requires more computing resources. Figure 3 shows the segmentation results, from which we conclude that U-Net++ produces better segmentations than U-Net. For the same scale of network architecture, U-Net++ obtains better results than U-Net.
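The training setup above can be summarized as a configuration sketch with a simple early-stop mechanism on the validation loss. The patience value is a hypothetical choice, since the text does not state one; the optimizer, learning rate, kernel size, and filter schedule mirror the values given in this section.

```python
class EarlyStopper:
    """Early-stop mechanism on the validation loss.

    Stops training when the validation loss has not improved for
    `patience` consecutive epochs.
    """
    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

TRAIN_CONFIG = {          # values stated in the experiments section
    "optimizer": "Adam",
    "learning_rate": 3e-5,
    "kernel_size": (3, 3),
    "filters_per_level": [32 * 2 ** i for i in range(5)],  # k = 32 x 2^i
}
```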

Conclusion
To address the need for better breast region segmentation in DCE-MRI image series, we proposed a U-Net++ pipeline for this task. In this context, we compared the network architecture and