Development and validation of a deep learning algorithm for distinguishing the nonperfusion area from signal reduction artifacts on OCT angiography

Abstract: The capillary nonperfusion area (NPA) is a key quantifiable biomarker in the evaluation of diabetic retinopathy (DR) using optical coherence tomography angiography (OCTA). However, signal reduction artifacts caused by vitreous floaters, pupil vignetting, or defocus present significant obstacles to accurate quantification. We have developed a convolutional neural network, MEDnet-V2, to distinguish NPA from signal reduction artifacts in 6 × 6 mm² OCTA. The network achieves strong specificity and sensitivity for NPA detection across a wide range of DR severity and scan quality.

In our previous work, MEDnet detected NPA well on 6 × 6 mm² OCTA images, but it was susceptible to severe signal reduction artifacts caused by opacities anterior to the retina, pupil vignetting, or defocus. Signal reduction due to shadow and defocus affects OCTA with a wider field of view more profoundly, decreasing the specificity of NPA detection. In this study, we evaluate a new algorithm that can distinguish NPA from signal reduction artifacts.

Data acquisition
All OCTA scans were acquired over a 6 × 6 mm² region using a 70-kHz commercial OCT AngioVue system (RTVue-XR; Optovue, Inc.) centered at 840 nm with a full-width half-maximum bandwidth of 45 nm. Two repeated B-scans were taken at each of 304 raster positions, and each B-scan consisted of 304 A-lines. The OCTA data were computed using the split-spectrum amplitude decorrelation angiography (SSADA) algorithm [28]. Retinal layer boundaries [Fig. 1(A)] were segmented using a guided bidirectional graph search (GB-GS) algorithm [29]. Angiograms of the superficial vascular complex (SVC) [Fig. 1(B-C)] and reflectance images of the inner retina [Fig. 1(D-E)] were generated by projecting OCTA/OCT data within the slab of interest [7]. The thickness maps of the inner retina [Fig. 1(F)] were generated by projecting the distances between the inner limiting membrane (upper boundary) and the outer plexiform layer (lower boundary), excluding the contribution from retinal fluid.

Network architecture
In our previous work, we regarded the NPA detection task as separating two categories: perfusion and nonperfusion. Although MEDnet limits interference of signal reduction by including the OCT reflectance image of the SVC slab in the network, the constraint of this two-category approach results in a failure to distinguish capillary dropout from severe signal reduction artifacts, especially in scans with low signal strength (<55). The new approach proposed in this study assigns three possible categories: perfusion, nonperfusion, and signal reduction. Figure 2 illustrates the architecture of MEDnet-V2. The input to the network consists of three parts [Fig. 2(B-D)]. Before feeding the en face image of the inner retinal tissue reflectance [Fig. 2(A)] into the network, we applied a multi-Gaussian filter (Eq. (1)) to produce a reflectance intensity map [Fig. 2(B)] and remove artifacts (e.g., due to large vessels) and noise:

$$M = \frac{1}{n}\sum_{i=1}^{n} G_{\sigma_i} * \frac{I}{\bar{I}} \quad (1)$$

where the $G_{\sigma_i}$ are Gaussian kernels with empirically chosen standard deviations $\sigma_i$, * is the convolution operator, $I$ is the image matrix, and $\bar{I}$ is the mean value of the image.
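As an illustration of the multi-Gaussian filtering step, the sketch below averages Gaussian-blurred copies of the mean-normalized reflectance image. The standard deviations here (2 and 8 pixels) are placeholders, since the paper's empirical values are not given in the text.

```python
import numpy as np

def gaussian_kernel1d(sigma):
    """Build a normalized 1-D Gaussian kernel with radius 3*sigma."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur via two 1-D convolutions (rows, then columns)."""
    k = gaussian_kernel1d(sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    return blurred

def reflectance_intensity_map(img, sigmas=(2, 8)):
    """Average multiple Gaussian-filtered copies of the mean-normalized
    reflectance image, suppressing large-vessel artifacts and noise.
    The sigma values are illustrative, not the paper's empirical ones."""
    norm = img / img.mean()
    return np.mean([gaussian_blur(norm, s) for s in sigmas], axis=0)
```

Larger sigmas smooth out vessel shadows more aggressively at the cost of spatial detail, which is why a sum over several scales is used rather than a single kernel.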
In the reflectance intensity map, both the fovea and shadow-affected areas show low reflectance. To distinguish them, we fed the thickness map of the inner retina [Fig. 2(C)], which shows low values around the fovea, to a subnetwork to remove its impact on the detection of signal reduction artifacts. After passing through two convolutional subnetworks [Fig. 2(E1-E2)], the features from the reflectance intensity map and the inner retinal thickness map were added. The decision to add the features from these maps was made empirically, after determining that the network performed better with this operation than with alternatives (e.g., concatenation). The en face angiogram of the SVC [Fig. 2(D)] was processed by the third subnetwork, and its features were merged with the combined reflectance and thickness features. MEDnet-V2 is comprised of three convolutional subnetworks, each with an identical structure [Fig. 3(A)]. Each subnetwork uses a U-Net-like architecture. We modified the multi-scale module [Fig. 3(B)] and encoder-decoder modules of the previous version of MEDnet to enhance the feature representation capabilities of the network by making it deeper. In each encoder and decoder block, we replaced plain connection blocks with residual blocks [Fig. 3(C-D)] from ResNet [10]. After each convolution layer, we added a batch normalization layer to accelerate training and reduce overfitting.
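The residual-block-with-batch-normalization pattern described above can be reduced to a minimal numpy sketch. Linear transforms stand in for the 3×3 convolutions, and the batch normalization is an inference-style simplification without learned scale/shift; only the structure (normalize, activate, add the identity shortcut) follows the design in the text.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def batch_norm(x, eps=1e-5):
    # Normalize each feature to zero mean / unit variance over the batch
    # (simplified: no learned gamma/beta parameters).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def residual_block(x, weight1, weight2):
    """Two weight layers, each followed by batch normalization, with the
    identity shortcut from ResNet added before the final activation."""
    y = relu(batch_norm(x @ weight1))
    y = batch_norm(y @ weight2)
    return relu(y + x)  # skip connection: gradient flows around the block
```

The shortcut is what lets the deeper encoder-decoder blocks train stably: each block only has to learn a residual correction to its input rather than a full transformation.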

Generation of ground truth and training data
To obtain accurate ground truth maps for training, three certified graders (trained technicians) manually delineated NPA and signal reduction artifacts using in-house graphical user interface software [Fig. 4(A)]. The software allowed graders to delineate NPA and signal reduction artifacts simultaneously on the SVC angiogram, using the reflectance intensity map as a reference for the signal reduction artifacts. To generate the final ground truth map from the three manual delineations, a voting method was employed: the category receiving the majority of votes (≥2/3) decided each pixel's identity (background (perfusion area), NPA (green), or signal reduction artifact (yellow)) [Fig. 4(B-C)]. For undecidable pixels, i.e., where each grader assigned a different category, the final ground truth was determined by discussion among the three graders. The data set was collected from 180 participants in a clinical diabetic retinopathy (DR) study (76 healthy controls, 34 participants with diabetes without retinopathy, 31 participants with mild or moderate non-proliferative DR (NPDR), and 39 participants with severe DR). Two repeat volume scans were acquired from the same eye of each participant. OCTA scans were also acquired from 13 healthy volunteers, with 6 repeat volume scans (one reference scan, two scans with manufactured shadow, and three defocused scans with different diopters) acquired from each volunteer (Table 1). To increase the number of training samples, we applied several data augmentation operations, including addition of Gaussian noise (mean = 0, sigma = 0.5), salt-and-pepper noise (salt = 0.001, pepper = 0.001), horizontal flipping, and vertical flipping.
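The per-pixel voting rule can be sketched as follows. The integer label coding (0 = perfusion background, 1 = NPA, 2 = signal reduction artifact) is an assumption for illustration; pixels where all three graders disagree are flagged for adjudication, mirroring the discussion step described above.

```python
import numpy as np

# Assumed label coding: 0 = perfusion (background), 1 = NPA,
# 2 = signal reduction artifact.
def majority_vote(grader_maps):
    """Per-pixel vote over three graders' label maps. A pixel takes the
    category that receives >= 2/3 of the votes; pixels on which all three
    graders disagree are marked -1 for adjudication by discussion."""
    maps = np.stack(grader_maps)                  # shape (3, H, W)
    out = np.full(maps.shape[1:], -1, dtype=int)  # -1 = undecided
    for label in (0, 1, 2):
        agree = (maps == label).sum(axis=0) >= 2  # majority for this label
        out[agree] = label
    return out
```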

Loss function and optimizer
In healthy eyes, the NPA is limited to the macular area, accounting for a small proportion of the overall angiogram. Even in eyes with DR, NPA constitutes a minority of the angiogram. However, signal strength reduction can affect en face angiograms at any location. This constitutes a serious category imbalance problem in this segmentation task. To address this, we designed a weighted Jaccard coefficient loss function (Eq. (2)). This loss function L imposes a different weight on each category to adjust the category balance:

$$L = 1 - \sum_{i=1}^{N} w_i J_i, \qquad J_i = \frac{\sum_x y_i(x)\,\hat{y}_i(x) + \alpha}{\sum_x y_i(x) + \sum_x \hat{y}_i(x) - \sum_x y_i(x)\,\hat{y}_i(x) + \alpha} \quad (2)$$

Here, N is the number of categories, $w_i$ is the weight of the i-th category associated with the Jaccard coefficient $J_i$, x denotes the position of each pixel, $y(x)$ is the ground truth, $\hat{y}(x)$ is the output of the network, and α is a smoothing factor set to 100. In this task, we set the weights of the three categories (perfusion area, NPA, and signal reduction artifacts) to w = (0.25, 0.5, 0.25).
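A minimal numpy sketch of this weighted Jaccard loss, assuming the standard smoothed (soft) Jaccard form with the weights and smoothing factor stated above; a training implementation would express the same arithmetic in TensorFlow ops so it is differentiable.

```python
import numpy as np

def weighted_jaccard_loss(y_true, y_pred, weights=(0.25, 0.5, 0.25), alpha=100.0):
    """y_true, y_pred: arrays of shape (pixels, categories) holding one-hot
    ground truth and softmax predictions. Computes a smoothed Jaccard
    coefficient per category and combines them with per-category weights."""
    loss = 0.0
    for i, w in enumerate(weights):
        t, p = y_true[:, i], y_pred[:, i]
        inter = np.sum(t * p)
        union = np.sum(t) + np.sum(p) - inter
        j = (inter + alpha) / (union + alpha)   # alpha smooths empty classes
        loss += w * (1.0 - j)
    return loss
```

Giving NPA twice the weight of the other two categories pushes the optimizer to care about the rare class, which is the point of the weighting scheme.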
We used the Adam algorithm [30], a stochastic gradient-based optimizer, with an initial learning rate of 0.001 to train our network by minimizing the weighted Jaccard coefficient loss function. An additional global learning rate decay strategy was employed to reduce the learning rate during training: we reduce the learning rate l to 0.9l when the loss shows no decrease after 10 epochs. This decay stops when the learning rate l falls below 1 × 10⁻⁶. The training process also stops when both the learning rate and the loss stop declining. To initialize the convolution kernels we used He normal initialization [31].
We implemented MEDnet-V2 in Python 3.6 with Keras (TensorFlow backend) on a PC with an Intel i7 CPU, an NVidia GeForce GTX 1080Ti graphics card, and 32 GB of RAM.
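The decay rule described above corresponds to what Keras's ReduceLROnPlateau callback provides (factor=0.9, patience=10, min_lr=1e-6). As a dependency-free illustration of that logic:

```python
class PlateauDecay:
    """Reduce the learning rate to 0.9x its value when the loss has not
    improved for `patience` consecutive epochs; stop decaying once a further
    reduction would drop below `min_lr`."""
    def __init__(self, lr=1e-3, factor=0.9, patience=10, min_lr=1e-6):
        self.lr, self.factor = lr, factor
        self.patience, self.min_lr = patience, min_lr
        self.best = float("inf")  # best loss seen so far
        self.wait = 0             # epochs since last improvement

    def step(self, loss):
        """Call once per epoch with the current loss; returns the LR to use."""
        if loss < self.best:
            self.best, self.wait = loss, 0
        else:
            self.wait += 1
            if self.wait >= self.patience and self.lr * self.factor >= self.min_lr:
                self.lr *= self.factor
                self.wait = 0
        return self.lr
```

In a Keras training loop the same behavior comes from passing `ReduceLROnPlateau(factor=0.9, patience=10, min_lr=1e-6)` in the callbacks list.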

Performance evaluation
We applied six-fold cross validation to evaluate the performance of MEDnet-V2 on the entire data set. The data set was split into six subsets, with training and test data always drawn from different eyes. Six networks were trained on five of these six subsets alternately and validated on the remaining one. The performance of the network might be affected by several factors, principally the severity of the disease and a low OCT signal strength index (SSI). We therefore stratified the test set in two ways, by disease severity and by SSI, dividing the scans of each stratification into 4 sub-groups along a gradient of disease severity or SSI. We calculated 4 measures (accuracy, specificity, sensitivity, and Dice coefficient (Eq. (3))) and NPA for each sub-group:

$$\mathrm{Dice} = \frac{2\,TP}{2\,TP + FP + FN} \quad (3)$$

where TP is true positives (correctly predicted NPA pixels), TN is true negatives (perfusion area and signal reduction artifacts are considered negatives), FP is false positives (perfusion or signal-reduced area segmented as NPA), and FN is false negatives (NPA segmented as either perfusion or artifact). The specificity approached unity across disease state and SSI, indicating nearly perfect segmentation of healthy tissue. Sensitivity and Dice coefficient deteriorated in more severe cases, because cumulative error increases with the increasing complexity and size of NPA. In the SSI group, sensitivity and Dice coefficient showed no obvious decline as SSI decreased, which means our network was robust to low-quality images and avoided introducing an artificial trend into the NPA measurements. In fact, sensitivity rose slightly with decreasing SSI. This can be explained as an artifact in the data: we found that lower SSI correlates with larger NPA, and according to Eq. (3), a smaller NPA makes the metrics more sensitive to segmentation error since the number of true positives (TP) is low. Signal reduction artifacts originate in a variety of ways.
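The four evaluation measures follow directly from a pixel-wise confusion between predicted and ground truth maps. A minimal numpy sketch, assuming the same integer label coding used for the ground truth (0 = perfusion, 1 = NPA, 2 = artifact):

```python
import numpy as np

def npa_metrics(y_true, y_pred, npa_label=1):
    """Binary metrics for the NPA category. Per the paper's definitions,
    perfusion and signal reduction artifact pixels both count as negatives."""
    t = (y_true == npa_label)
    p = (y_pred == npa_label)
    tp = np.sum(t & p)     # NPA correctly detected
    tn = np.sum(~t & ~p)   # non-NPA correctly rejected
    fp = np.sum(~t & p)    # perfusion/artifact called NPA
    fn = np.sum(t & ~p)    # NPA called perfusion/artifact
    return {
        "accuracy": (tp + tn) / t.size,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "dice": 2 * tp / (2 * tp + fp + fn),
    }
```

Note how a small number of NPA pixels (low TP) makes sensitivity and Dice volatile, which is the effect invoked above to explain the SSI trend.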
As a supplement to the data set, several typical signal reduction artifacts were simulated on healthy controls. In clinical cases, the signal reduction artifacts are considerably more complex than simulated ones. Shadows on en face angiograms may connect to the center of the macula [Fig. 7(A1-D1)], and several kinds of signal reduction artifacts can overlap [Fig. 7(A2-D2)]. Furthermore, since NPA and signal reduction artifacts can occur anywhere in OCTA scans of eyes with DR, the two may co-occur [Fig. 7(D3-D5)]. Even when signal reduction artifacts overlapped with NPA [Fig. 7(D4-D5)], our network still produced accurate predictions.

Repeatability
We measured the repeatability using the pooled standard deviation (Eq. (4)) and coefficient of variation (Eq. (5)) in healthy controls and DR cases with two intra-visit repeated scans (Table 3), and compared it to manual delineation by retinal experts:

$$P = \sqrt{\frac{1}{N}\sum_{i=1}^{N} s_i^2} \quad (4)$$

$$C = \frac{P}{\frac{1}{N}\sum_{i=1}^{N} \mu_i} \quad (5)$$

where P is the pooled standard deviation, C is the coefficient of variation, N is the number of eyes, $s_i$ is the NPA standard deviation of the two repeat scans within the same visit from the same eye, and $\mu_i$ is the mean NPA of the two repeat scans within the same visit from the same eye. Our method shows high repeatability, with a smaller coefficient of variation than grading by human experts. In the DR group, the mean and standard deviation of NPA were larger than in healthy controls, and the pooled standard deviation was correspondingly higher, consistent with cumulative error increasing with NPA size (Table 3).
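Assuming the conventional pooled-variance form of Eq. (4) and normalization by the mean NPA in Eq. (5), the repeatability statistics can be sketched for paired repeat measurements as:

```python
import numpy as np

def pooled_std(scan1, scan2):
    """Pooled standard deviation over N eyes, each contributing two repeat
    NPA measurements. s_i is the sample SD of each eye's repeat pair."""
    s = np.std(np.stack([scan1, scan2]), axis=0, ddof=1)  # per-eye repeat SD
    return np.sqrt(np.mean(s**2))

def coeff_of_variation(scan1, scan2):
    """Pooled SD normalized by the mean NPA over all repeats and eyes."""
    mu = np.mean(np.stack([scan1, scan2]))
    return pooled_std(scan1, scan2) / mu
```

A smaller coefficient of variation on the same repeat pairs is what the comparison against expert grading measures.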

Defocus
Defocus can cause signal reduction across the entire scan. We scanned 13 healthy volunteers, acquiring repeat scans with different degrees of defocus. Even in the condition with the largest defocus and a low signal strength index, the NPA measurements produced by MEDnet-V2 were not affected.

Discussion
In this paper, we developed and validated MEDnet-V2, a deep learning algorithm for detecting NPA, a key biomarker of DR, while distinguishing it from signal reduction artifacts. Previous NPA detection algorithms were vulnerable to such artifacts, a problem exacerbated in scans of low quality, so NPA measurements could not be utilized unless affected scans were excluded or corrected. Earlier approaches either did not account for signal reduction artifacts, as in [2] and Nesper et al. [33], or required manual correction, as in Alibhai et al. [3]. Similarly, our previous algorithm, which was also based on a deep learning approach, failed to distinguish severe signal reduction artifacts from true NPA, but recent work of our group has demonstrated that shadow artifacts in OCTA can be detected [34]. MEDnet-V2 represents a significant improvement in that it can accurately distinguish between true NPA and signal reduction artifacts without requiring user input. The ability to discriminate signal reduction artifacts from NPA gives us access to data from lower-quality scans that might otherwise have been unusable.
MEDnet-V2 achieves these results through several architectural decisions. We used a U-Net-like architecture that enabled MEDnet-V2 to obtain a stable training process while achieving high resolution in the output results. In our previous work, MEDnet showed excellent feature extraction capability. As we expanded the size of the network by embedding new structures from state-of-the-art networks, MEDnet-V2 correspondingly acquired a stronger ability to extract an expanded cohort of features. To obtain an accurate ground truth for NPA and signal reduction artifacts, we developed in-house graphical user interface software to help certified graders delineate a ground truth map. To suppress delineation errors caused by individual subjectivity, we counted each grader's classification as a vote and took the majority opinion as the final ground truth map to train our network. Our experimental results indicate that MEDnet-V2 gives excellent performance (Dice coefficient > 0.87) on scans of different disease severity and defocus.
Although MEDnet-V2 achieves good performance on most OCTA scans, some factors may cause segmentation to fail. The inner retinal thickness map, used to help the network distinguish the foveal avascular zone from low-reflectance areas, is vulnerable to inaccuracy in retinal layer segmentation, for instance due to the presence of structural abnormalities like edema. Although we automatically excluded the edema area when calculating the thickness map, this process may still need minor manual adjustments for complex cases. Similarly, the layer segmentation algorithm [29] can fail in eyes with severe anatomic abnormalities. Finally, when NPA and signal reduction artifacts coincide, the algorithm can generate false negatives by segmenting affected areas as artifact even though NPA is present.
In future work we can attempt to resolve these issues. There are also several possibilities for broadening MEDnet-V2's functionality. MEDnet-V2 was trained and tested on 6 × 6 mm² macular angiograms of the SVC, but it is known that NPA in DR can manifest first in the periphery [35]. As OCT systems continue to develop, larger fields of view are becoming available. Semi-automated NPA detection has already been demonstrated in wide-field OCTA [3,25], but these approaches currently require substantial human intervention. Extending MEDnet-V2's capabilities to obtain a fully automated wide-field NPA detection algorithm could be particularly useful. Finally, MEDnet-V2 currently only functions on SVC scans, but DR causes NPA in other plexuses as well. Integrating the network with projection-resolved OCTA [36,37] could extend NPA detection to the deeper retinal plexuses.

Conclusions
In summary, we proposed a deep-learning-based solution, which we named MEDnet-V2, to address the problem of signal reduction artifacts in the detection and quantification of capillary dropout in the retina using OCTA. The network takes three input images and outputs a distribution map of the nonperfusion area and signal reduction artifacts. Features of signal reduction artifacts and NPA were extracted separately before being fused together, which is the key to this network's favorable performance.

Disclosures
Oregon Health & Science University (OHSU), Acner Camino, David Huang and Yali Jia have a significant financial interest in Optovue, Inc. These potential conflicts of interest have been reviewed and managed by OHSU.