Reconstruction of high-resolution 6×6-mm OCT angiograms using deep learning

Typical optical coherence tomographic angiography (OCTA) acquisition areas on commercial devices are 3×3 or 6×6 mm. Compared to 3×3-mm angiograms with proper sampling density, 6×6-mm angiograms have significantly lower scan quality, with reduced signal-to-noise ratio and worse shadow artifacts due to undersampling. Here, we propose a deep-learning-based high-resolution angiogram reconstruction network (HARNet) to generate enhanced 6×6-mm superficial vascular complex (SVC) angiograms. The network was trained on data from 3×3-mm and 6×6-mm angiograms from the same eyes. The reconstructed 6×6-mm angiograms have significantly lower noise intensity and better vascular connectivity than the original images. The algorithm did not generate false flow signal at the noise level present in the original angiograms. The image enhancement produced by our algorithm may improve biomarker measurements and qualitative clinical assessment of 6×6-mm OCTA.


Introduction
Optical coherence tomographic angiography (OCTA) is a non-invasive imaging technology that can capture retinal and choroidal microvasculature in vivo [1]. Clinicians are rapidly adopting OCTA for evaluation of various diseases, including diabetic retinopathy (DR) [2,3], age-related macular degeneration (AMD) [4,5], glaucoma [6,7], and retinal vessel occlusion (RVO) [8,9]. High-resolution and large-field-of-view OCTA improve clinical observations, provide useful biomarkers, and enhance the understanding of retinal and choroidal microvascular circulations [10-13]. Many enhancement techniques have been applied to improve OCTA image quality, including a regression-based bulk motion subtraction algorithm [14], multiple en face image averaging [15,16], enhancement of morphological and vascular features using a modified Bayesian residual transform [17], and quality improvement with elliptical directional filtering [18]. These approaches can improve vessel continuity and suppress background noise on angiograms with proper sampling density (i.e., sampling density that meets the Nyquist criterion). However, while commercial systems offer a range of fields of view, only 3×3-mm angiograms are adequately sampled for capillary resolution, because the scanning speed of the OCTA system limits the number of A-lines included in each cross-sectional B-scan. Conventional image enhancement techniques like those mentioned above are not effective on under-sampled 6×6-mm angiograms. This is unfortunate, since the larger scans, with their reduced resolution, are in greater need of enhancement. The difficulty of enlarging the field of view without sacrificing resolution is a significant issue for the development of OCTA technology, as its field of view is significantly smaller than that of modalities such as fluorescein angiography (FA).
Recently, deep learning has achieved dramatic breakthroughs, and researchers have proposed a number of convolutional neural networks (CNNs) for OCTA image processing [19-26]. As an important branch of image processing, super-resolution image reconstruction and enhancement has also benefited from deep-learning-based methods [27-31]. Here, we propose a high-resolution angiogram reconstruction network (HARNet) to reconstruct high-resolution angiograms of the superficial vascular complex (SVC). We evaluated the reconstructed high-resolution OCTA for noise level in the foveal avascular zone (FAZ), contrast, vascular connectivity, and false flow signal. We also demonstrate that HARNet is capable of improving not just under-sampled 6×6-mm angiograms, but 3×3-mm angiograms as well.

Data acquisition
The 6×6- and 3×3-mm OCTA scans of the macula used in this study were acquired with 304×304 A-lines using a 70-kHz commercial OCTA system (RTVue-XR; Optovue, Inc.). Two repeated B-scans were taken at each of the 304 raster positions, and each B-scan consisted of 304 A-lines. The split-spectrum amplitude-decorrelation angiography (SSADA) algorithm was used to generate the OCTA data [32]. A guided bidirectional graph search algorithm was employed to segment retinal layer boundaries [33] (Fig. 1 A1 and B1). 3×3- and 6×6-mm angiograms of the SVC (Fig. 1 A2 and B2) were generated by maximum projection of the OCTA signal in a slab including the nerve fiber layer (NFL) and ganglion cell layer (GCL). We normalized the pixel value range in both structural OCT and OCTA to 0-1 using min-max normalization (Eq. 1):
$$S(i, j) = \frac{I(i, j) - \min(I)}{\max(I) - \min(I)} \tag{1}$$
where I(i, j) is the original pixel value and S(i, j) is the normalized pixel value at location (i, j), and min(·) and max(·) are the minimum and maximum pixel values of the whole image, respectively.
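The min-max normalization described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the study's code: the function name `min_max_normalize` and the handling of a constant image are assumptions made for the example.

```python
import numpy as np

def min_max_normalize(image):
    """Rescale pixel values to the range 0-1 using the image's own min and max."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A constant image has no dynamic range; returning zeros is an
        # assumed convention, not something specified in the text.
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

# Small demonstration on a 2x2 array.
raw = np.array([[10.0, 20.0], [30.0, 50.0]])
normalized = min_max_normalize(raw)
```

Because the minimum and maximum are taken over the whole image, a single very bright pixel compresses the rest of the range, which is worth keeping in mind when comparing normalized intensities between scans.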

Network architecture
Our network structure is composed of a low-level feature extraction layer, high-level feature extraction layers, and a residual layer (Fig. 2). Input to the network consists of SVC angiograms. The network first extracts shallow features from the input image through one convolutional layer with 128 channels. Then the high-level features are extracted through four convolutional blocks. Each convolutional block is composed of 20 convolutional layers (C1-C20) with 64 channels. The kernel size in all the convolutional layers is 3×3 pixels. Skip connections concatenate the output and input of each convolutional block as the input to the next convolutional block. The output and input of the last convolutional block are concatenated and then fed to the residual layer. The residual layer contains one channel, which produces the residual image. The residual image and input image are summed to produce the final reconstructed output image. For the most part, low-resolution and high-resolution images have the same low-frequency information, so the output consists of the original input plus the residual high-frequency components predicted by HARNet. By only learning these high-frequency components, we were able to improve the convergence rate of HARNet [27]. After each convolutional layer, excluding the residual layer, we added a rectified linear unit (ReLU) [34].
We trained HARNet to reconstruct 6×6-mm angiograms using their densely sampled 3×3-mm equivalents as the target. To do so, we first used bicubic interpolation to scale the size of the 6×6-mm SVC angiograms (Fig. 3A) by a factor of 2, so that they would be on the same scale as a 3×3-mm scan. Then we used intensity-based automatic image registration [35] (Fig. 3D) to register the 3×3-mm angiograms (Fig. 3C) with the scaled 6×6-mm angiograms (Fig. 3B). Finally, we cropped the overlapping region from each by taking the maximum inscribed rectangle to construct the input for HARNet and the ground truth (Fig. 3E and F).
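To make the layer arrangement in this section concrete, the following sketch traces how many feature channels enter each convolutional block under one reading of the skip connections, namely that each block's input is concatenated with its 64-channel output to form the input of the next block. That reading, the single-channel input, the inclusion of bias terms, and the function names are assumptions; the sketch does no image processing and is not the authors' Keras model.

```python
KERNEL = 3              # 3x3 kernels in every convolutional layer (from the text)
SHALLOW_CHANNELS = 128  # shallow feature extraction layer
BLOCK_CHANNELS = 64     # channels of each layer inside a block
LAYERS_PER_BLOCK = 20
NUM_BLOCKS = 4

def conv_params(c_in, c_out, k=KERNEL):
    """Weights plus biases of one k x k convolution (bias terms are assumed)."""
    return k * k * c_in * c_out + c_out

def trace_channels():
    """Return (channels entering each block, channels entering the residual
    layer, total trainable parameters) under the assumptions stated above."""
    total = conv_params(1, SHALLOW_CHANNELS)  # one-channel angiogram as input
    block_in = SHALLOW_CHANNELS
    block_inputs = []
    for _ in range(NUM_BLOCKS):
        block_inputs.append(block_in)
        c = block_in
        for _ in range(LAYERS_PER_BLOCK):
            total += conv_params(c, BLOCK_CHANNELS)
            c = BLOCK_CHANNELS
        # Skip connection: concatenate the block's input with its output.
        block_in = block_in + BLOCK_CHANNELS
    total += conv_params(block_in, 1)  # residual layer with one output channel
    return block_inputs, block_in, total

block_inputs, residual_in, n_params = trace_channels()
print(block_inputs, residual_in, n_params)
```

Under this reading the concatenations grow the input of each successive block by 64 channels, so the residual layer sees features from every block as well as the shallow features. A different reading of the skip connections would change these counts.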

Loss function
We trained the network on a ground truth composed of the original 3×3-mm angiograms filtered with a bilateral filter. To minimize the difference between the output of the network and the ground truth, the loss function used in the learning stage was a linear combination of the mean square error (MSE; Eq. 2) and the structural similarity (SSIM; Eq. 3) index [36,37]. MSE measures the pixel-wise difference, and SSIM is based on three comparison measurements: reflectance amplitude, contrast, and structure:
$$\mathrm{MSE} = \frac{1}{w h}\sum_{i=1}^{w}\sum_{j=1}^{h}\left(X(i, j) - Y(i, j)\right)^{2} \tag{2}$$
$$\mathrm{SSIM} = \frac{(2\mu_X \mu_Y + C_1)(2\sigma_{XY} + C_2)}{(\mu_X^{2} + \mu_Y^{2} + C_1)(\sigma_X^{2} + \sigma_Y^{2} + C_2)} \tag{3}$$
where w and h refer to the width and height of the image, X and Y refer to the output of HARNet and the ground truth, respectively, µ_X and µ_Y are their mean pixel values, σ_X and σ_Y are their standard deviations, and σ_XY is their covariance. The values of the constants C_1 = 0.01 and C_2 = 0.03 were taken from the literature [37]. The loss function (Eq. 4) was a linear combination of the MSE and the SSIM.
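A minimal NumPy sketch of a loss of this kind is given below. Several choices are assumptions made for illustration: SSIM is computed once over the whole image rather than in local windows, the SSIM term enters the loss as 1 − SSIM so that a lower value is better, and the weight `alpha` is a hypothetical placeholder because the weights of the linear combination are not reproduced in this text. The constants are used exactly as quoted above; in the original SSIM formulation the values 0.01 and 0.03 are usually scaling factors from which the constants are derived, so this is also a simplification.

```python
import numpy as np

C1, C2 = 0.01, 0.03  # constants as quoted in the text

def mse(x, y):
    """Mean squared pixel-wise difference."""
    return float(np.mean((x - y) ** 2))

def global_ssim(x, y, c1=C1, c2=C2):
    """SSIM from whole-image statistics (single window; an assumption)."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

def combined_loss(x, y, alpha=0.5):
    """alpha is a hypothetical weight; the study's weights are not given here."""
    return alpha * mse(x, y) + (1.0 - alpha) * (1.0 - global_ssim(x, y))

rng = np.random.default_rng(0)
target = rng.random((38, 38))
prediction = np.clip(target + 0.05 * rng.standard_normal((38, 38)), 0.0, 1.0)
loss_same = combined_loss(target, target)
loss_noisy = combined_loss(prediction, target)
```

Combining the two terms lets the MSE penalize intensity errors pixel by pixel while the SSIM term rewards agreement in local mean, contrast, and structure.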

Subjects and training parameters
The data set used in this study consisted of 298 eyes scanned from 196 participants. Each eye was scanned with both a 3×3-mm and a 6×6-mm scan pattern. Ten healthy eyes from 10 participants were intentionally defocused and used in defocusing experiments. Of the remaining 288 paired scans, we used 210 (randomly selected) for training and reserved the rest for testing (N=78). The performance of the network on testing data was evaluated separately on eyes with diabetic retinopathy (N=53) and healthy controls (N=25). Finally, false-flow generation experiments also used 10 cases from the test set of healthy eyes. We used several data augmentation methods to expand the training dataset, including horizontal flipping, vertical flipping, transposition, and 90-degree rotation. For training, we used 38×38-pixel sub-images. Thus the 1050 images in the training dataset after augmentation can be decomposed into 174,555 sub-images, which are extracted from cropped SVC angiograms with a stride of 19. An Adam optimizer [38] with an initial learning rate of 0.01 was used to train HARNet by minimizing the loss. We used a global learning rate decay strategy in which the learning rate was reduced by 90% when the loss showed no decline after 2 epochs, provided the rate was greater than 1 × 10⁻⁶. Training ceased when the loss did not change by more than 1 × 10⁻⁵ in 3 epochs. The training batch size was 128.
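The augmentation and sub-image extraction described in this section can be sketched as follows. The function names are illustrative, and the assumption that each training image is kept alongside its four augmented copies is inferred from the arithmetic (210 × 5 = 1050) rather than stated explicitly. Patches that would run past the image border are simply skipped, which is also an assumption.

```python
import numpy as np

def augment(image):
    """Original image plus the four augmentations named in the text."""
    return [
        image,
        np.fliplr(image),   # horizontal flip
        np.flipud(image),   # vertical flip
        image.T,            # transposition
        np.rot90(image),    # 90-degree rotation
    ]

def extract_patches(image, size=38, stride=19):
    """Slide a size x size window over the image with the given stride."""
    rows, cols = image.shape
    return [
        image[r:r + size, c:c + size]
        for r in range(0, rows - size + 1, stride)
        for c in range(0, cols - size + 1, stride)
    ]

# Demonstration on a small synthetic array (not real angiogram data).
demo = np.arange(76 * 95, dtype=np.float64).reshape(76, 95)
augmented = augment(demo)
patches = extract_patches(demo)
```

With a stride of half the patch size, neighbouring sub-images overlap by 50%, so every pixel away from the border contributes to several training samples.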
We implemented HARNet in Python 3.6 with Keras (TensorFlow backend) on a PC with 16 GB of RAM, an Intel i7 CPU, and two NVIDIA GeForce GTX 1080Ti graphics cards.
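The learning-rate decay and stopping rules described in this section can be illustrated with a small, framework-free simulation that replays a list of per-epoch losses. The exact bookkeeping (for example, how "no decline" and "did not change" are counted from epoch to epoch) is an assumption, as are the function and parameter names; in Keras, similar behaviour is commonly obtained with the built-in plateau and early-stopping callbacks.

```python
def run_schedule(losses, lr=0.01, factor=0.1, patience=2, min_lr=1e-6,
                 stop_delta=1e-5, stop_patience=3):
    """Return (learning rate used in each epoch, epoch at which training
    stops or None) for a given sequence of per-epoch losses."""
    best = float("inf")
    epochs_without_decline = 0
    epochs_without_change = 0
    previous = None
    rates = []
    for epoch, loss in enumerate(losses):
        rates.append(lr)
        # Decay rule: cut the rate by 90% after `patience` epochs
        # without a decline, as long as the rate is above the floor.
        if loss < best:
            best = loss
            epochs_without_decline = 0
        else:
            epochs_without_decline += 1
            if epochs_without_decline >= patience and lr > min_lr:
                lr = max(lr * factor, min_lr)
                epochs_without_decline = 0
        # Stopping rule: the loss changed by no more than `stop_delta`
        # for `stop_patience` consecutive epochs.
        if previous is not None and abs(previous - loss) <= stop_delta:
            epochs_without_change += 1
        else:
            epochs_without_change = 0
        previous = loss
        if epochs_without_change >= stop_patience:
            return rates, epoch
    return rates, None

demo_losses = [1.0, 0.5, 0.5, 0.5, 0.4, 0.4, 0.4, 0.4]
rates, stop_epoch = run_schedule(demo_losses)
```

In this made-up loss sequence the rate is cut once after the first plateau and again after the second, and training stops when the loss has stayed unchanged for three consecutive epochs.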

Results
To validate the performance of our algorithm, we used a test dataset composed of 78 paired original 3×3- and 6×6-mm angiograms and evaluated the reconstructed 3×3-mm and 6×6-mm angiograms using three metrics: noise intensity in the FAZ, global contrast, and vascular connectivity. In addition, we performed experiments on defocused SVC angiograms, angiograms with different simulated noise intensities, and DR angiograms.

Noise intensity
In healthy eyes, the FAZ is avascular, so to obtain an estimate of the noise intensity I_Noise, we consider the pixel values S(i, j) in a 0.3-mm-diameter circle R centered in the FAZ, where (i, j) denotes the pixel position. Compared to the original 6×6-mm angiograms, the noise intensity in reconstructed 6×6-mm SVC angiograms was significantly reduced (Table 1).
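Selecting the pixels inside the circular region R can be sketched as below. The sketch only gathers the pixel values within the circle; the statistic that turns them into the noise intensity is not reproduced in this text and is deliberately left out. The function name, the square-pixel assumption, and the use of the image centre as the FAZ centre in the demonstration are all illustrative assumptions.

```python
import numpy as np

def circular_roi_values(image, center_rc, diameter_mm, fov_mm):
    """Return the pixel values inside a circle of the given physical diameter.

    center_rc is the (row, column) of the circle centre; fov_mm is the
    physical width of the image. Square pixels are assumed.
    """
    rows, cols = image.shape
    mm_per_pixel = fov_mm / cols
    radius_px = 0.5 * diameter_mm / mm_per_pixel
    rr, cc = np.ogrid[:rows, :cols]
    mask = (rr - center_rc[0]) ** 2 + (cc - center_rc[1]) ** 2 <= radius_px ** 2
    return image[mask]

# Synthetic 304x304 image standing in for a 6x6-mm angiogram; the FAZ
# centre is taken to be the image centre purely for demonstration.
demo = np.zeros((304, 304))
demo[152, 152] = 1.0
roi = circular_roi_values(demo, center_rc=(152, 152), diameter_mm=0.3, fov_mm=6.0)
```

At 304 pixels across 6 mm, the 0.3-mm circle has a radius of about 7.6 pixels, so the estimate rests on fewer than two hundred pixels per scan unless the image has been upsampled first.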

Contrast
The global contrast of the SVC angiograms produced by the network was measured by the root-mean-square (RMS) contrast [39],
$$C_{\mathrm{RMS}} = \sqrt{\frac{1}{A}\sum_{(i, j)}\left(S(i, j) - \mu\right)^{2}}$$
where A is the total area of the SVC angiogram and µ is its mean value. Our results indicate there is no significant difference in image contrast between original and reconstructed angiograms (Table 1).
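RMS contrast is the standard deviation of the pixel values about the image mean, which makes it straightforward to compute. The following NumPy sketch assumes the area A is counted in pixels; the function name and the two test patterns are illustrative.

```python
import numpy as np

def rms_contrast(image):
    """Root-mean-square deviation of pixel values from the image mean."""
    image = np.asarray(image, dtype=np.float64)
    return float(np.sqrt(np.mean((image - image.mean()) ** 2)))

flat = np.full((4, 4), 0.5)                    # uniform image: no contrast
checker = np.indices((4, 4)).sum(axis=0) % 2   # alternating 0/1 pattern
flat_contrast = rms_contrast(flat)
checker_contrast = rms_contrast(checker)
```

Because the measure is global, it does not distinguish vessel signal from noise: removing noise and sharpening vessels can pull the value in opposite directions.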

Vascular connectivity
We also assessed vascular connectivity. To do so, we first binarized the reconstructed angiograms (Fig. 4 A2-D2) using a global adaptive threshold method [40], then skeletonized the binary map to obtain the vessel skeleton map (Fig. 4 A3-D3). Connected flow pixels were defined as any contiguous flow region with a length of at least 5 pixels (including diagonal connections), and the vascular connectivity was defined as the ratio of the number of connected flow pixels to the total number of pixels on the skeleton map [32]. The connectivity of the reconstructed angiograms (Fig. 4 B, D) was significantly improved over the originals (Fig. 4 A, C; Table 1).
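Given a skeleton map, the connectivity ratio described above can be sketched with a breadth-first search over 8-connected neighbours. Two interpretations are assumed here: the "length" of a region is taken to be its number of skeleton pixels, and the denominator is the number of skeleton (flow) pixels rather than all image pixels. Binarization and skeletonization are assumed to have been done beforehand, and the function name is illustrative.

```python
from collections import deque

import numpy as np

def vascular_connectivity(skeleton, min_length=5):
    """Fraction of skeleton pixels that lie in 8-connected components of at
    least `min_length` pixels."""
    skeleton = np.asarray(skeleton, dtype=bool)
    total = int(skeleton.sum())
    if total == 0:
        return 0.0
    rows, cols = skeleton.shape
    visited = np.zeros_like(skeleton, dtype=bool)
    connected = 0
    for r, c in zip(*np.nonzero(skeleton)):
        if visited[r, c]:
            continue
        # Breadth-first search over the 8-neighbourhood (diagonals included).
        queue = deque([(r, c)])
        visited[r, c] = True
        size = 0
        while queue:
            cr, cc = queue.popleft()
            size += 1
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and skeleton[nr, nc] and not visited[nr, nc]):
                        visited[nr, nc] = True
                        queue.append((nr, nc))
        if size >= min_length:
            connected += size
    return connected / total

demo = np.zeros((8, 8), dtype=bool)
demo[1, 1:7] = True               # 6-pixel segment: counted as connected
demo[4, 2] = demo[5, 3] = True    # 2-pixel diagonal fragment: not counted
ratio = vascular_connectivity(demo)
```

Short fragments lower the ratio, so noise that breaks capillaries into isolated pieces shows up directly as reduced connectivity.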

Performance on defocused angiograms
In order to further verify that our algorithm can improve the image quality of low-quality scans, we also evaluated its performance on defocused angiograms. To obtain defocused scans, we first performed autofocus to optimize the focal length to get optimal scans, and then manually adjusted the focal length to obtain angiograms defocused by 3 diopters. Finally, 10 defocused 3×3-mm angiograms and 12 defocused 6×6-mm angiograms were obtained. Defocused angiograms have lower signal-to-noise ratios than correctly focused angiograms, and vessels also appear dilated.
The results show that angiograms reconstructed from defocused 3×3- and 6×6-mm angiograms had lower noise intensity and better connectivity than scans acquired under optimal focusing conditions (Fig. 5, Table 2). Therefore, our algorithm is also applicable to defocused angiograms and improves the quality of such scans. Since defocus leads to a general reduction in scan quality, this result also implies that our algorithm could be used to clean low-quality scans.

Assessment of false flow signal
One concern in OCTA reconstruction is the generation of false flow signal. Because OCTA reconstruction methods are designed to enhance vascular detail, they are susceptible to mistakenly enhancing background that may randomly share some features with true vessels. In order to evaluate whether HARNet produces such artifacts, we selected ten good-quality 3×3-mm angiograms from 10 healthy eyes and then produced denoised angiograms by applying a simple Gabor and median filter to the original 3×3-mm angiograms (Fig. 6 A1). Then we added Gaussian noise to the denoised angiograms using different parameters (µ, σ) (Fig. 6 B1-E1). We varied µ and σ separately in increments of 0.005 from 0.001 to 0.1 and from 0.001 to 0.05, respectively, to obtain 2000 noisy 3×3-mm SVC angiograms with different noise intensities (0-2100). Next, we input the denoised and noisy angiograms into the network to obtain reconstructed angiograms from each (Fig. 6 A2-E2). The false flow signal intensity, I_flow signal, was computed over R, the same physiologically flow-free 0.3-mm-diameter circle within the FAZ used previously. We found our algorithm did not generate false flow signal when the noise intensity was under 500, which is far above the noise intensity measured in original 3×3-mm (117.15 ± 97.40) and 6×6-mm (22.98 ± 30.19) angiograms (Fig. 7).
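The noise sweep in this experiment can be sketched as follows. Stepping µ from 0.001 to 0.1 and σ from 0.001 to 0.05 in increments of 0.005 gives 20 and 10 values, respectively, so 200 parameter pairs applied to 10 angiograms yield the 2000 noisy images mentioned above. Clipping the noisy image back to the 0-1 range, the random seed, and the function names are assumptions made for this example, and the flat image stands in for a denoised angiogram.

```python
import numpy as np

def parameter_values(start, stop, step):
    """Values start, start + step, ... that do not exceed stop."""
    count = int((stop - start) / step) + 1
    return [start + step * k for k in range(count)]

def add_gaussian_noise(image, mu, sigma, rng):
    """Add Gaussian noise with mean mu and standard deviation sigma."""
    noisy = image + rng.normal(loc=mu, scale=sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)  # clipping to 0-1 is an assumption

mus = parameter_values(0.001, 0.1, 0.005)
sigmas = parameter_values(0.001, 0.05, 0.005)

rng = np.random.default_rng(0)
clean = np.full((304, 304), 0.2)  # placeholder for a denoised angiogram
noisy_versions = [add_gaussian_noise(clean, mu, sigma, rng)
                  for mu in mus for sigma in sigmas]
```

Sweeping the mean and the standard deviation separately makes it possible to see whether false flow, if it appears, is driven by a raised background level or by pixel-to-pixel fluctuation.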

Performance on DR angiograms
Many diseases present outside of the central area of the macula. Enhancing the resolution and image quality of larger field-of-view angiograms may improve the measurement of disease biomarkers such as non-perfusion area and vessel density, thereby further helping ophthalmologists diagnose such diseases. However, since features in diseased eyes may differ from those in healthy eyes, it is possible that image reconstruction algorithms could suffer from reduced performance on such images. To investigate, we examined reconstructed 6×6-mm angiograms (Fig. 8) of eyes with DR, a leading cause of blindness [41]. Although the 6×6-mm angiograms of eyes with DR have higher noise intensity than those of healthy eyes, the results show that the reconstructed DR angiograms also demonstrate improvements in noise intensity, contrast, and connectivity comparable to those of healthy controls (Table 3).

Discussion
Image analysis of low-quality or under-sampled OCTA is challenging in several respects. Noise affects the visibility of small blood vessels, especially capillaries, leading to artifactual vessel fragmentation. Motion and shadow artifacts are common, and amplified by under-sampling. OCTA quality, then, can have a significant impact on the judgment of ophthalmologists or researchers. To help mitigate this concern, several noise reduction and image enhancement procedures have been proposed. To reduce noise and enhance vascular connectivity, datasets are sometimes obtained by acquiring multiple images of the same location over time, making it possible to apply various averaging techniques [15,16,42,43]. However, the acquisition of larger and larger amounts of data makes the total acquisition time longer, increasing the probability of image artifacts caused by eye motions and introducing additional difficulty for clinical imaging.
Filtering is also often applied to OCTA images to improve image quality [18,44], but typical problems in data filtering are reduced image resolution and the loss of capillary signal. Other noise reduction strategies suffer similar issues. For instance, a regression-based algorithm [14] that can remove decorrelation noise due to bulk motion in OCTA has been reported. Although image contrast was improved by this method, the drawback is worse vessel continuity, and it also suffers the loss of capillaries with weak signal. In this study, our proposed method can not only reduce noise and enhance connectivity, but also improve the capability to resolve capillaries in large-field-of-view scans. The two most common scan patterns used in research and the clinic are 3×3-mm and 6×6-mm [45,46]. While the smaller 3×3-mm OCTA can obtain higher image quality due to the denser scanning pattern, its small field of view is a major limitation. Our algorithm's ability to enhance 6×6-mm OCTA is a step toward compensating for this limitation. We achieved this enhancement by training a network to reconstruct images by learning features from the high-definition 3×3-mm images. This means that we did not need to manually segment vasculature to generate the ground truth, or generate high-definition scans by using a new scanning protocol in a prototype [19]. Therefore, our approach is a practical method to enhance 6×6-mm images by using an acquired 3×3-mm image, and it could in principle also be extended to even larger fields of view with sparser sampling. Such enhancement via intelligent software could prove to be a superior method for achieving high-quality, large-field scans, since hardware solutions (for example, increasing sampling density or incorporating adaptive optics) quickly lead to prohibitive cost and imaging times.
Improving image quality and resolution may in turn promote better measurements of disease biomarkers such as non-perfusion area and vessel density; by extending improved image quality to a larger field-of-view we also increase the chance that we will detect pathology since disease can manifest outside of the central macular region usually imaged with OCTA [13,47].
We investigated the quality of our algorithm's output by evaluating reconstructed angiograms with three metrics: noise intensity in the FAZ, global contrast, and vessel connectivity. The angiograms obtained by our algorithm have almost no noise in the FAZ (0.01 ± 0.00), and vascular connectivity was likewise increased in the HARNet-processed images. In addition to these quantitative improvements, we consider the HARNet output images to appear qualitatively cleaner than the unprocessed input. We also performed experiments on defocused SVC angiograms, and the results show that the algorithm can improve such scans, which is an indication of robustness and broad utility. To demonstrate that the restored flow signal in the reconstructed angiograms is real, we tested whether false flow signal is generated by using angiograms with different simulated noise intensities. The results show that our algorithm did not generate false flow signal when the noise intensity was under 500. This value far exceeds the noise intensity in the clinically realistic OCTA angiograms examined in this study. Because the noise intensity in the FAZ and in the inter-capillary space is similar, we also expect that artifactual vessels should not be generated outside of the FAZ.
HARNet improved the quality of both 3×3- and 6×6-mm OCTA angiograms according to the metrics examined in this study. Specifically, HARNet enhanced the quality of under-sampled 6×6-mm OCTA, while other enhancement algorithms perform poorly on such scans [48,49]. Interestingly, while HARNet was trained to reconstruct high-resolution 6×6-mm angiograms from sparsely sampled scans, the network also improved 3×3-mm images. In particular, the angiograms reconstructed from defocused scans compared favorably to equivalent images acquired at optimal focus for both scanning patterns. This implies that HARNet is effective as a general OCTA image enhancement tool, outside of the specific context of 6×6-mm angiogram reconstruction. Additionally, the image improvement provided by HARNet is more than just cosmetic, as demonstrated by the improvement in vessel connectivity. Although beyond the scope of this study, we speculate that other OCTA metrics (e.g., non-perfusion area or vessel density) may also prove to be more accurately measured on HARNet-reconstructed images.
There are some limitations to this study. Since we trained HARNet by using optimally sampled, centrally located 3×3-mm angiograms, features specific to the periphery, such as the grating-like vascular structure of the radial peripapillary capillaries (Fig. 9 C1), could not be learned during training. HARNet therefore may introduce features that are physiologically specific to the central macula into more peripheral regions (Fig. 9 B2, C2). Likewise, HARNet may remove features specific to peripheral regions, particularly if there are disease-specific features that are more prevalent in the periphery than in the macula, such as neovascularization elsewhere, which tends to occur along the major vessels, away from the central macula. Unfortunately, due to the lack of a high-resolution ground truth for the region outside the central macula, we can only speculate on this issue. HARNet also currently works on only one vascular complex (the superficial), but the intermediate and deep capillary plexuses, as well as the choriocapillaris, are important in several diseases [50-55]. Reconstruction of these vascular layers would also be beneficial; however, issues such as shadowing that present preferentially in low-density scanning patterns are only exacerbated in these deeper layers. This makes image reconstruction in these locations significantly more challenging. Finally, to completely characterize HARNet, it will also be important to assess performance on pathological scans. While our data indicate that HARNet can also perform well on DR angiograms, there are of course many other diseases that could be examined for a more thorough assessment. Furthermore, a complete investigation of HARNet's performance on these diseases would include the extraction of relevant biomarkers to determine whether they are more or less accurately measured on reconstructed images. Future work can focus on these shortcomings.

Conclusions
We proposed an end-to-end image reconstruction technique for high-resolution 6×6-mm SVC angiograms based on high-resolution 3×3-mm angiograms. The high-resolution 6×6-mm angiograms produced by our network had lower noise intensity and better vasculature connectivity than original 6×6-mm SVC angiograms, and we found our algorithm did not generate false flow signal at realistic noise intensities. The enhanced 6×6-mm angiograms may improve the measurements of disease biomarkers such as non-perfusion area and vessel density.

Disclosures
Oregon Health & Science University (OHSU) and Yali Jia have a significant financial interest in Optovue, Inc. These potential conflicts of interest have been reviewed and managed by OHSU.