Multi-phase level set algorithm based on fully convolutional networks (FCN-MLS) for retinal layer segmentation in SD-OCT images with central serous chorioretinopathy (CSC)

Abstract: Retinal layer thickness, as a function of spatial position in the optical coherence tomography (OCT) image, is an important diagnostic indicator for many retinal diseases. Reliable segmentation of the retinal layers is necessary for extracting useful clinical information. However, manual segmentation of these layers is time-consuming and prone to bias. Furthermore, speckle noise, low image contrast, retinal detachment, and irregular morphological features make automatic segmentation challenging. To alleviate these challenges, we propose in this paper a new coarse-to-fine framework combining a fully convolutional network (FCN) with a multi-phase level set (named FCN-MLS) for the automatic segmentation of nine boundaries in retinal SD-OCT images. In the coarse stage, the FCN learns the characteristics of specific retinal layer boundaries and classifies four retinal layers. These boundaries are then extracted, and the remaining boundaries are initialized from prior information about retinal layer thickness. To prevent overlap of the segmentation interfaces, a regional restriction technique is used in the multi-phase level set to evolve the boundaries and achieve fine segmentation of the nine retinal layers. Experimental results on 1280 B-scans show that the proposed method can segment the nine retinal boundaries accurately. Compared with manual delineation, the overall mean absolute boundary location difference and the overall mean absolute thickness difference were 5.88 ± 2.38 μm and 5.81 ± 2.19 μm, respectively, showing good consistency with manual segmentation by physicians. Our method also outperforms state-of-the-art methods.

Segmentation of retinal structures in SD-OCT is crucial for the diagnosis, prediction, and monitoring of retinal diseases. Central serous chorioretinopathy (CSC) is the fourth most common retinopathy, following age-related macular degeneration, diabetic retinopathy, and branch retinal vein occlusion. Impaired retinal pigment epithelium (RPE) barrier function leads to serous RPE and/or neuroretinal detachment. On OCT images, the main manifestation is an arching of the boundary of the myoid and ellipsoid of inner segments (BMEIS); the arched area is the CSC lesion area (see Fig. 1(b)), while Fig. 1(a) shows a B-scan of a normal retina. Most patients are young adults, presenting with mildly decreased vision, objects that appear deformed and smaller, changes in color vision, dark spots in the center of the visual field, and a hyperopic refractive shift caused by serous detachment in the macular area. The disease relapses easily, and repeated episodes can cause irreversible damage to visual function. It is therefore important to identify the CSC boundaries clearly, which can help doctors diagnose, predict, and monitor central serous chorioretinopathy. However, manual segmentation of OCT retinal layers is tedious, time-consuming, and often irreproducible. Therefore, many computer-aided segmentation methods have been developed.
Among the fixed mathematical model-based methods, Mayer et al. [2] presented a method to determine the RNFL layer thickness in OCT images based on anisotropic noise suppression and deformable splines. Mujat et al. [3] presented a retinal nerve fiber layer segmentation algorithm for frequency domain data that can be applied to scans from both normal healthy subjects and glaucoma patients. Farsiu et al. [4] presented an automatic method for detection and segmentation of drusen in retinal images captured via high-speed SD-OCT systems. Niu et al. [5] presented an automated algorithm to segment GA regions in SD-OCT images. These active contour-based methods effectively solve the problem of topological changes in the retinal structure, but they rely on initialization and require strong prior information about retinal layer structure and thickness. Chiu et al. [6] presented an automatic approach for segmenting seven retinal layers in SD-OCT images using graph theory and dynamic programming. Chiu et al. [7] presented a generalized framework for segmenting closed-contour anatomical and pathological features using graph theory and dynamic programming. LaRocca et al. [8] presented an automatic approach for segmenting corneal layer boundaries in SD-OCT images using graph theory and dynamic programming. Quellec et al. [9] presented an automated method for segmenting 3D fluid and fluid-associated abnormalities in the retina from 3D OCT retinal images of subjects suffering from exudative age-related macular degeneration. Keller et al. [10] introduced a metric in graph search and demonstrated its application to segmenting retinal OCT images of macular pathology. Tian et al. [11] presented an automatic OCT retinal image analysis algorithm to segment OCT volume data in the macular region accurately. Srinivasan et al. 
[12] presented an automatic approach for segmenting retinal layers in SD-OCT images using sparsity-based denoising, support vector machines, graph theory, and dynamic programming. Karri et al. [13] presented an algorithm for layer-specific edge detection in retinal OCT images through a structured learning algorithm that simultaneously identifies individual layers and their corresponding edges. However, these graph-based surface segmentation and contour modeling methods require professionally designed, application-specific transformations, including cost functions, constraints, and model parameters.
Machine learning-based methods (e.g., support vector machines, neural networks, and random forests) are trained to extract features from each layer or its boundaries in order to determine the layer boundaries [14,15]. Lang et al. [14] built a random forest classifier to segment eight retinal layers in macular cube images acquired by OCT. The random forest classifier learns the boundary pixels between layers, producing an accurate probability map for each boundary, which is then processed to finalize the boundaries. McDonough et al. [15] proposed a method by which the boundaries of retinal layers in OCT images can be identified from a simple initial user input. Their method is trained to identify points within each layer, from which the boundaries between the retinal layers are estimated; it thus focuses on training neural networks to identify the layers themselves rather than the boundaries. With the development of deep learning, deep neural networks [16][17][18] have become a very powerful tool in computer vision. Deep learning methods matured from the convolutional neural network (CNN) proposed by Krizhevsky et al. [19], which automatically learns a series of increasingly complex features and the associated classifier directly from the training data. In recent years, CNN-based methods have been proposed as an alternative to traditional machine learning for classifying OCT images as normal or pathological. Fang et al. [20] proposed a new convolutional neural network and graph-based search method for automatically segmenting nine layer boundaries in non-exudative AMD OCT images; this was the first CNN to segment the inner layers of the retina. Shah et al. [21] presented a method for multiple surface segmentation using CNNs and applied it to retinal layer segmentation in normal and intermediate AMD OCT scans at the B-scan level in a single pass. Apostolopoulos et al. 
[22] proposed a fully convolutional neural network architecture that combines dilated residual blocks in an asymmetric U-shape configuration and can segment multiple layers of highly pathological eyes in one shot, demonstrating its effectiveness on patients with advanced AMD. Ben-Cohen et al. [23] explored a U-Net-based fully convolutional model, training the network to segment the different layers, extracting a probability map for each layer from each test case, and then using a Sobel edge filter together with graph search to detect each layer's boundary. Roy et al. [24] proposed a new fully convolutional deep architecture (ReLayNet) for semantic segmentation of retinal OCT B-scans into 7 retinal layers and fluid masses, and validated its effectiveness on a publicly available benchmark. Although these frameworks proved effective, they depend on the availability of large annotated data sets.
There have also been successful applications of level set algorithms. Yazdanpanah et al. [25] presented a semi-automated segmentation algorithm to detect intra-retinal layers in OCT images acquired from rodent models of retinal degeneration; a multi-phase framework with a circular shape prior is adopted to model the boundaries of the retinal layers and estimate the shape parameters using least squares, and a contextual scheme is employed to balance the weights of the different terms in the energy function. Novosel et al. [26] presented a segmentation method that operates on attenuation coefficients and incorporates anatomical knowledge about the retina, simultaneously segmenting the retinal layers via a new flexible coupling approach. Novosel et al. [27] presented an approach to jointly segment retinal layers and lesions in eyes with topology-disrupting retinal diseases using a loosely coupled level set framework.
The prior information about the middle boundaries of the retinal layers is clear, and the relative positional relationship between these boundaries is minimally affected by retinopathy. Therefore, this prior knowledge can be used to initialize the level set functions, which alleviates the main disadvantage of level set algorithms, namely their reliance on initialization, and allows the level set algorithm and the deep learning network to be combined effectively. Inspired by this, we propose a new segmentation model for retinal layers in SD-OCT images with CSC. The combination of the FCN model and the multi-phase level set method (named FCN-MLS) produces semantically accurate predictions and detailed segmentation maps with reasonable computational efficiency. We first decompose the training OCT images into blocks, which are divided into five classes. The FCN automatically extracts features centered on the retinal layer boundaries, and a soft-max layer is trained to separate the five classes. The remaining boundaries are initialized from prior information about retinal layer thickness. Finally, we use a multi-phase level set model to evolve toward the final boundaries to be segmented. In summary, our work makes two main contributions: (1) The multi-phase level set model is combined with the FCN network to implement coarse-to-fine segmentation of nine retinal layers simultaneously.
(2) To alleviate problems caused by low contrast and blurred layer boundaries, regional restriction is employed to avoid overlap of the segmentation interfaces and obtain accurate segmentation of the retinal layers. In the following, we first introduce the deep-learning-based model for OCT image segmentation; two evaluation metrics, the mean absolute boundary location difference (mabld) and the mean absolute thickness difference (matd), are then employed to evaluate the experimental results on OCT images of normal eyes and eyes with central serous chorioretinopathy.

Method
The proposed framework can be split into three stages: pre-processing, FCN-based layer boundary classification, and layer segmentation with the distance regularized level set evolution (DRLSE) algorithm. The flowchart of the proposed method is illustrated in Fig. 2.

Pre-processing
In practical applications, the high scattering of biological tissue means that OCT image quality is strongly affected by speckle noise, which blurs image details and reduces image clarity. In order to preserve more boundary information when denoising the retinal image, we use the edge-preserving anisotropic bilateral filter proposed by Tomasi et al. [28] with σ = 2 and half-size = 4.
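As a minimal sketch of this pre-processing step (a plain NumPy bilateral filter; the range parameter sigma_r and the function name are illustrative, not from the paper):

```python
import numpy as np

def bilateral_filter(img, sigma_s=2.0, sigma_r=0.1, half_size=4):
    """Edge-preserving bilateral filter (after Tomasi & Manduchi).

    Each output pixel is a weighted average of its neighbors, where the
    weight combines spatial closeness (sigma_s) and intensity similarity
    (sigma_r), so strong layer boundaries are preserved while speckle
    noise in homogeneous regions is smoothed away.
    """
    img = img.astype(np.float64)
    pad = np.pad(img, half_size, mode="reflect")
    out = np.zeros_like(img)
    norm = np.zeros_like(img)
    ys, xs = np.mgrid[-half_size:half_size + 1, -half_size:half_size + 1]
    spatial = np.exp(-(ys**2 + xs**2) / (2 * sigma_s**2))  # spatial kernel
    H, W = img.shape
    for dy in range(-half_size, half_size + 1):
        for dx in range(-half_size, half_size + 1):
            shifted = pad[half_size + dy:half_size + dy + H,
                          half_size + dx:half_size + dx + W]
            # range kernel: down-weight neighbors with dissimilar intensity
            w = spatial[dy + half_size, dx + half_size] * \
                np.exp(-((shifted - img) ** 2) / (2 * sigma_r**2))
            out += w * shifted
            norm += w
    return out / norm
```

The spatial kernel depends only on pixel distance, while the range kernel suppresses contributions from across an intensity edge; this is what lets the filter smooth speckle without blurring layer boundaries.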

Coarse layer boundary classification by FCN
The FCN proposed by Long et al. [29] transforms the fully connected layers of a CNN into convolution layers and adds deconvolution layers after the last convolution layer to restore the original image size. The category of each pixel is computed through the soft-max layer, so the output of the FCN has the same size as the original image. We adopted the FCN-8s network structure proposed by Long et al. [29], based on VGG-16 [30], to segment the retinal layers coarsely. The model consists of 13 convolutional layers, 5 pooling layers, ReLU layers [31], and 3 fully connected layers with corresponding deconvolution layers. The cross-entropy loss function is adopted in the FCN network. The basic formula is

L = -Σ_{i,j} Σ_l ω y_l log(p_l),

where ω is the weight assigned to the pixel; y_l equals 1 if the pixel x belongs to class l, and 0 otherwise; p_l is the predicted probability of class l; and i, j range over the size of the image.
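A minimal NumPy sketch of this pixel-wise weighted cross-entropy (the function and argument names are illustrative, not from the paper's MatConvNet implementation):

```python
import numpy as np

def weighted_cross_entropy(probs, labels, weights):
    """Pixel-wise weighted cross-entropy: L = -sum_{i,j} w * log p_true.

    probs:   (H, W, C) soft-max probabilities per pixel.
    labels:  (H, W) integer class index per pixel (the one-hot y_l).
    weights: (H, W) per-pixel weight omega.
    """
    H, W, C = probs.shape
    # select the probability of each pixel's true class (y_l = 1 term)
    p_true = probs[np.arange(H)[:, None], np.arange(W)[None, :], labels]
    return -np.sum(weights * np.log(p_true + 1e-12))  # eps avoids log(0)
```

Only the probability of the true class contributes, because y_l is zero for every other class; the per-pixel weight ω can be used to emphasize rare boundary classes.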

Layer segmentation by level set functions
The initialization of the level set functions is illustrated in Figs. 3 and 4; Fig. 3 shows a schematic diagram of the level set function initialization and its projection onto a two-dimensional plane. In the evolution of the level set function, an edge-based geometric active contour model is adopted to obtain the segmentation of the remaining boundaries. Following the DRLSE formulation, the energy function (Eq. (2)) is defined as

E(φ) = μ R_p(φ) + λ L(φ) + α S(φ),

where R_p(φ) is the distance regularization term, L(φ) = ∫_Ω g δ(φ) |∇φ| dx is the length term, and S(φ) = ∫_Ω g H(−φ) dx is the area term. Here I is an image on a domain Ω, G_σ is a Gaussian kernel with standard deviation σ, and the edge indicator is

g = 1 / (1 + |∇(G_σ * I)|²).

The Dirac delta function δ and the Heaviside function H in L and S are approximated by the following smooth functions δ_ε (Eq. (6)) and H_ε (Eq. (7)), as in many level set methods:

δ_ε(x) = (1/(2ε)) [1 + cos(πx/ε)] for |x| ≤ ε, and 0 otherwise;
H_ε(x) = (1/2) [1 + x/ε + (1/π) sin(πx/ε)] for |x| ≤ ε, 1 for x > ε, and 0 for x < −ε.

Note that δ_ε is the derivative of H_ε, i.e., H_ε′ = δ_ε. The parameter ε is usually set to 1.5. The algorithm minimizes the energy function (Eq. (2)). In the evolution process, when the zero-level contour reaches a retinal layer boundary, the image gradient becomes large and g approaches 0, so the contour stops at the boundary.
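The edge indicator that halts the evolution at layer boundaries can be sketched in NumPy as follows (a plain implementation of the definition above; function names are illustrative):

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing G_sigma * I using 1-D convolutions."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    # convolve rows then columns (reflect padding avoids border artifacts)
    pad = np.pad(img, r, mode="reflect")
    rows = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, pad)
    cols = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, rows)
    return cols[r:-r, r:-r]

def edge_indicator(img, sigma=1.5):
    """g = 1 / (1 + |grad(G_sigma * I)|^2): ~1 in flat regions, -> 0 at edges."""
    smoothed = gaussian_blur(img.astype(np.float64), sigma)
    gy, gx = np.gradient(smoothed)
    return 1.0 / (1.0 + gx**2 + gy**2)
```

In flat retinal regions g stays near 1 and the contour keeps moving; at a strong layer boundary the smoothed gradient is large, g collapses toward 0, and the evolution speed vanishes there.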

Regional restriction
Because the gradient information at the middle retinal layer interfaces is weak, the optimal solutions obtained by evolving multiple level set functions may overlap. Therefore, to avoid overlapping interfaces, we adopt regional restriction coupling (see Fig. 5): the evolution region of each level set function is constrained so that it is not lower than the minimum of the upper neighboring boundary and, at the same time, not higher than the maximum of the lower neighboring boundary of the interface being segmented. As the level set functions of the upper and lower interfaces continue to evolve, the admissible evolution region of the interface being segmented is updated in real time.
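A minimal sketch of this regional restriction, assuming each boundary is stored as one axial position per A-scan column (helper names are illustrative, not from the paper):

```python
import numpy as np

def restrict_boundary(candidate, upper, lower):
    """Clamp a candidate boundary so it stays between its upper and lower
    neighboring boundaries (1-D arrays of axial positions, one per A-scan,
    with depth increasing downward, so upper <= lower)."""
    return np.clip(candidate, upper, lower)

def evolution_mask(shape, upper, lower):
    """Binary mask of the admissible evolution region for one interface:
    rows between the upper and lower neighboring boundaries, per column."""
    H, W = shape
    rows = np.arange(H)[:, None]
    return (rows >= upper[None, :]) & (rows <= lower[None, :])
```

As the neighboring boundaries evolve, the mask is recomputed each iteration, which is what keeps the interfaces from crossing even where the image gradient gives little guidance.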

Experimental results and analysis
The FCN model is implemented with the MATLAB MatConvNet toolkit. The CPU is an Intel(R) Core i7-4980HQ at 2.80 GHz, and the GPU is an NVIDIA GeForce GTX 980M. The hyperparameters during training are set as follows: 50 epochs in total, a batch size of 10, and an initial learning rate of 0.0001. Ten SD-OCT cubes are selected as experimental subjects, including 5 abnormal eyes with CSC (640 B-scans) and 5 normal eyes (640 B-scans), giving a data set of 1280 B-scan images in total. All cubes were obtained using a Cirrus SD-OCT device (Carl Zeiss Meditec, Inc., Dublin, CA); each SD-OCT cube contains 128 contiguous 512 × 1024-pixel B-scan images. Manual segmentation of the retinal boundaries by an experienced grader is taken as the ground truth. 860 B-scans were randomly selected as the neural network training set, 360 B-scans as the validation set, and the remaining 60 B-scans as the test set. Figure 6 shows an example of classification by the FCN and the corresponding extracted boundaries. Class 0 represents the background; Class 1 the ILM layer; Class 2 the area between RNFL-GCL and BMEIS; Class 3 the CSC lesion area; and Class 4 the CL2 layer between IB-OPR and OB-RPE, for a total of 5 classes (see Fig. 6(a)). The five extracted boundaries are ILM, RNFL-GCL, BMEIS, IB-OPR, and OB-RPE (see Fig. 6(b)); the remaining boundaries, GCL-IPL, IPL-INL, INL-OPL, and OPL-HFL, are then initialized (see Fig. 6(c)).

Quantitative evaluation
In order to quantitatively evaluate the performance of the proposed method, we compared our segmentation results with the FCN-based method proposed by Long et al. [29], the graph-based segmentation proposed by Chiu et al. [6], OCT Explorer [34,35], and ReLayNet proposed by Roy et al. [24]. The results show that the proposed method agrees with the manual delineations more closely than the other methods.

Evaluation of the layer boundary position
We use the mean absolute boundary difference (mabd) metric proposed by Niu et al. [36] to estimate the boundary difference between segmentation methods, which can be defined as

mabd_i = (1/N) Σ_{b=1}^{N} |C¹_{b,i} − C²_{b,i}|,

where N is the width of each B-scan, and C¹_{b,i} and C²_{b,i} are the axial coordinates of the i-th boundary at column b of the B-scan, obtained from the manual segmentation and from the method being compared, respectively.
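This metric can be sketched directly in NumPy (function name illustrative; it also returns the standard deviation used in the tables below):

```python
import numpy as np

def mabd(manual, predicted):
    """Mean absolute boundary difference for one boundary i.

    manual, predicted: 1-D arrays giving the axial position (in pixels)
    of the boundary at each of the N A-scan columns of a B-scan.
    Returns (mabd_i, standard deviation of the absolute differences).
    """
    diff = np.abs(np.asarray(manual, dtype=np.float64) -
                  np.asarray(predicted, dtype=np.float64))
    return diff.mean(), diff.std()
```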
The standard deviation (SD) values can be expressed as

SD_i = sqrt( (1/N) Σ_{b=1}^{N} ( |C¹_{b,i} − C²_{b,i}| − mabd_i )² ),

with the same notation as above. We performed comparisons on 4 normal SD-OCT cubes and 5 SD-OCT cubes with CSC lesions. The mean absolute boundary position differences and standard deviations are shown in Tables 1 and 2, respectively. As can be seen from Tables 1 and 2, the overall mean absolute boundary position difference of the proposed method on normal retinal images and images with CSC lesions is 5.88 ± 2.38 μm and 5.59 ± 3.21 μm, respectively, which is significantly better than Chiu et al. [6], OCT-Explorer [34,35], Long et al. [29], and Roy et al. [24]. For the segmentation of OCT images with CSC lesions, the results produced by OCT-Explorer are inaccurate. Since the publicly available implementation of the graph-based algorithm [6] does not segment the GCL-IPL and IB-OPR boundaries, the corresponding mean absolute differences are marked as "empty". The segmentation of the middle retinal boundaries by the FCN network alone is not ideal, although for boundaries with distinct gradient changes it outperforms the Chiu et al. [6] and OCT-Explorer [34,35] algorithms. For low-contrast OCT images, Roy et al. [24] segment slightly better than the FCN network, but still cannot separate the IPL and INL layers well. Our method first uses the FCN to segment the boundaries with distinct gradient changes, so these specific boundaries are segmented best. The remaining boundaries are then located from prior knowledge of layer thickness and the classification result, which is more accurate. In the gradient-based level set evolution, the regional restriction technique avoids overlap between interfaces, and the segmentation there is also excellent. Therefore, both the overall mean absolute boundary difference of our method and the mean absolute boundary difference of each individual boundary have the smallest error, i.e., the highest similarity to the ground truth.

Quantitative evaluation of retina thickness
Since fundus disease changes are mainly manifested as changes in retinal layer thickness, reliable measurement of retinal thickness is critical for quantifying the extent of fundus disease. We use the mean absolute thickness difference (matd) metric proposed by Niu et al. [36] to estimate the layer difference between segmentation methods. The thickness of each retinal layer is computed from the boundary positions of its adjacent boundaries, and the metric can be defined as

matd_j = (1/N) Σ_{a=1}^{N} |T¹_{a,j} − T²_{a,j}|,

where N is the number of A-scans in each B-scan, j indexes the retinal layer, and T_{a,j} is the thickness of layer j at A-scan a, computed from the positions of the layer's two bounding boundaries, i.e., T_{a,j} = C_{a,j+1} − C_{a,j}. Tables 3 and 4 show the mean absolute thickness differences and standard deviations of the retinal layer thickness compared with the other methods. Initializing all boundaries directly is not ideal, because the initialization process requires many interpolation and fitting operations; we therefore use the FCN network to segment five critical boundaries with significant gradient changes. The second row shows that the classification is more accurate. Each column corresponds to the same B-scan: the first column shows a normal retinal image, and the second and third columns show retinal images with CSC lesions in the foveal region. Since the gradient changes between the specific boundaries are distinct and do not interfere with each other, segmenting fewer boundaries gives better results. However, there is still slight misclassification in the fovea area, which the subsequent fine level set segmentation resolves. To evaluate the quality of the segmentation directly, Fig. 8 shows cross-sectional images from the same cube segmented by four different methods. It can be seen from the figure that the method of Chiu et al. 
[6], based on graph theory, gives the worst segmentation results, while OCT-Explorer [34,35] performs poorly at the upper boundary of the CSC lesion. It is worth noting that the overall segmentation of the ReLayNet network proposed by Roy et al. [24] is reasonable, but the GCL-IPL and IPL-INL boundaries overlap. This is why most deep learning networks do not segment this layer: the average thickness of the IPL layer is too small, its gradient information is weak, and it is difficult to distinguish when training on low-contrast OCT images. This again shows the importance of regional restriction. In addition, the IB-OPR boundary cannot be segmented from gradient information alone. In contrast, our approach is better suited to CSC lesion segmentation and is the closest to the manual segmentation. Since layer segmentation is performed over the entire image area, we computed a 2D position map (see Fig. 9) for each retinal boundary; the position of each layer boundary is given by its depth from the top of the cross-sectional image. Figures 9(a)-9(i) show the 2D position maps of the ILM, RNFL-GCL, GCL-IPL, IPL-INL, INL-OPL, OPL-HFL, BMEIS, IB-OPR, and OB-RPE boundaries, respectively. We subtract the position of the ILM boundary from the position of the RPE boundary to obtain the total retinal thickness map (see Fig. 9(j)), where the scale represents the distance between the two boundaries. In Fig. 9, the color bar on the right side of each panel encodes the coordinate value of the boundary: the deeper the yellow, the larger the value. Starting from the ILM boundary, the value of each subsequent boundary gradually increases. In addition, the values at the center of the position maps are small, which indicates that the boundaries around the fovea are correctly segmented.
A cube represents one patient's data set, and we compute a 3-D visualization of each boundary segmentation over the entire cube, shown in Fig. 10. Figures 10(a)-10(i) show the 3-D visualizations of the segmentation results for the nine boundaries ILM, RNFL-GCL, GCL-IPL, IPL-INL, INL-OPL, OPL-HFL, BMEIS, IB-OPR, and OB-RPE, respectively, and Fig. 10(j) shows the 3-D visualization of the segmentation results for all boundaries together.
It can be seen from Figs. 10(a)-10(i) that the coordinate values of the successive 3-D visualizations decrease, indicating that the position of each segmented boundary in the OCT retinal image is progressively lower. In addition, Fig. 10(j) makes it easy to compare the positional relationships of all the segmented boundaries and shows that there is no overlap between them.
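The per-layer thickness computation behind the quantitative evaluation above can be sketched as follows (a NumPy sketch assuming boundaries are stored as one axial position per A-scan; names are illustrative):

```python
import numpy as np

def layer_thickness(boundaries):
    """Per-A-scan layer thicknesses from stacked boundary positions.

    boundaries: (J+1, N) array, where row j holds the axial position of
    boundary j at each of the N A-scans; layer j lies between boundaries
    j and j+1. Returns a (J, N) array of thicknesses T_{a,j}.
    """
    return np.diff(np.asarray(boundaries, dtype=np.float64), axis=0)

def matd(manual_boundaries, predicted_boundaries):
    """Mean absolute thickness difference, one value per retinal layer."""
    t1 = layer_thickness(manual_boundaries)
    t2 = layer_thickness(predicted_boundaries)
    return np.mean(np.abs(t1 - t2), axis=1)
```

Because thickness is a difference of adjacent boundary positions, any constant offset shared by both boundaries cancels, which is why the thickness metric complements the boundary-position metric.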

Conclusion
In this paper, a new automatic joint segmentation model for SD-OCT retinal layers and CSC lesions is proposed. The framework is based on an FCN network and a multi-phase level set method. First, the retinal layers are coarsely classified by the FCN neural network to initialize the level set functions. Then, based on prior information about the retinal layers and their thicknesses, the multi-phase level set method automatically segments nine boundaries in the OCT images. This model guarantees continuous retinal layer boundaries in all images and provides an efficient segmentation model for low-quality images with low contrast and blurred boundaries. Compared with traditional level set methods, it avoids overlap of the retinal layer boundaries, and compared with using a neural network model alone, it substantially reduces the need for large annotated training data sets.