Volumetric non-local-means based speckle reduction for optical coherence tomography

: We present a novel tomographic non-local-means based despeckling technique, TNode, for optical coherence tomography. TNode is built upon a weighting similarity criterion derived for speckle in a three-dimensional similarity window. We present an implementation using a two-dimensional search window, enabling the despeckling of volumes in the presence of motion artifacts, and an implementation using a three-dimensional window with improved performance in motion-free volumes. We show that our technique provides eﬀective speckle reduction, comparable with B-scan compounding or out-of-plane averaging, while preserving isotropic resolution, even to the level of speckle-sized structures. We demonstrate its superior despeckling performance in a phantom data set, and in an ophthalmic data set we show that small, speckle-sized retinal vessels are clearly preserved in intensity images en-face and in two orthogonal, cross-sectional views. TNode does not rely on dictionaries or segmentation and therefore can readily be applied to arbitrary optical coherence tomography volumes. We show that despeckled esophageal volumes exhibit improved image quality and detail, even in the presence of signiﬁcant motion artifacts.


Introduction
Optical coherence tomography (OCT) is an imaging technique which produces high-resolution cross-sectional images of biological tissues [1]. Due to the coherent nature of OCT, its tomographic images present speckle [2][3][4] which degrades image quality and hinders visual interpretation. Consequently, the reduction of speckle has been a perennial topic of interest in the OCT community . Indeed, the suppression of speckle and speckle-like noise while preserving the visibility of fine-scale structure is an active area of research for most coherent imaging techniques, including magnetic resonance imaging [28-30], synthetic aperture radar (SAR) [31-33] and ultrasound imaging [34,35]. The main goal of speckle reduction techniques is to increase image quality, thereby facilitating visual interpretation and, in the case of medical applications, improving diagnostic utility. Speckle reduction in coherent imaging is typically achieved by performing an incoherent averaging of uncorrelated speckle realizations of the imaged object [5][6][7][8][9]. Incoherent averaging reduces the speckle contrast, and as the number of uncorrelated realizations increases, the averaged result approaches the equivalent incoherent image [36]. However, for static objects where the scatterer distribution does not change in time, the incoherent addition of uncorrelated speckle realizations carries an implicit resolution loss with respect to the original information content. The search for improved speckle reduction techniques relies on obtaining uncorrelated speckle patterns reducing the impact on the spatial resolution of the image.
Speckle-reduction techniques for OCT can be classified in two general families: hardware-based and post-processing algorithms. Hardware-based approaches modify the acquisition system to produce uncorrelated speckle patterns within or between B-scans. Examples include angular [5,6], spatial [7,8] and frequency compounding [9]. It is easy to see the loss of information content in each implementation: in angular compounding, the raw data contains information corresponding to a higher numerical aperture (NA) than the compounded image -although to create the high-NA image coherent addition of multiple angles is required; in spatial compounding, neighboring sample locations are merged into a single pixel; in frequency compounding, the raw data contains information corresponding to a higher axial resolution. Moreover, the addition of components in commercial OCT systems and special acquisition schemes required in hardware-based techniques is not a straightforward task [8]. In Ref. [7], strain induced on the sample changes the arrangement of the scatterers inside each resolution volume. This spatial approach can, in principle, preserve the spatial resolution while decreasing the speckle contrast. However, the number of uncorrelated speckle patterns that can be realistically created without incurring resolution loss is small [7].
Perhaps the most common embodiment of speckle reduction in OCT relies on the acquisition of repeated B-scans comprising the same sample structure but uncorrelated speckle. This is a special case of spatial compounding and has long been used in commercial ophthalmic OCT systems with success. However, if two B-scans are acquired at exactly the same spatial location, the speckle patterns must necessarily be highly correlated and compounding does not reduce the speckle contrast [10,11]. When compounding B-scans, speckle reduction comes from two sources: in perfused tissue, like the choroid, speckle decorrelation arises mainly from moving red blood cells. In avascular tissue, speckle decorrelation is due to small lateral offsets given by small errors in the eye tracking system.
Available speckle suppression post-processing algorithms for OCT can be classified in general in three families: transformation based, sparse representations and spatial domain. Although many advanced post-processing algorithms have been developed and applied to OCT, it is surprising that most approaches focus on improving individual B-scans, without regard to the volumetric data structure [12][13][14][15]. Frame-to-frame analysis after filtering is generally hindered and further volumetric processing is difficult. Additionally, most of these algorithms treat speckle as additive Gaussian noise through logarithmic conversions, ignoring its specific statistical properties [16,17].
Among post-processing algorithms, transform-based approaches represent images in a different domain by using wavelet or curvelet transforms. In the curvelet or wavelet domain, the coefficients representing the image and speckle are assumed to be distinguishable. Choosing the correct threshold for the speckle coefficients, however, is difficult [13,17,21] and the wrong selection of coefficients causes point-like structures to disappear in the filtered images while preserving the contrast between tissue layers [17]. Sparse representation techniques in OCT adopt a special scanning approach in which few B-scans are captured with high nominal signal-to-noise ratio (SNR) and used to improve the quality of low SNR B-scans. For example, multiscale sparsitybased tomographic denoising (MSBTD) [12] utilizes the high-SNR images to train dictionaries used in the denoising of neighboring low-SNR B-scans. MSBTD was further improved by sparsity based simultaneous denoising and interpolation [18], where the dictionaries were developed from a sparse representation of previous datasets consisting of pairs of low-SNR / low-resolution and high-SNR / high-resolution images, thus not requiring the peculiar scanning system of the MSBTD. More recently, segmentation based sparse reconstruction has enhanced the filter performance by using segmentation of retinal layers to construct specific structural dictionaries, taking advantage of similarity in the patches within segmented layers [19]. Nevertheless, the reliance on a dictionary limits applicability: previously unobserved structures do not have dictionary counterparts and are therefore poorly addressed. Other methods, such as blockmatching and 3D filtering (BM3D) [23][24][25][26]37], are combinations between transform-based and sparse representation algorithms. Although these approaches may function well in specific instances, they tend to over-smooth regions, and visible artifacts are commonly found in the final images.
Resolution-preserving spatial-domain techniques are mainly represented by non-local methods, where the evaluation of each pixel is performed within a large neighborhood based on the existing redundancy of natural images [14, 27-32, 38, 39]. Some approaches, such as the non-local means (NLM) algorithm, denoise images by comparing patches in a surrounding neighborhood and mapping the similarities between patches into weights in order to perform a weighted maximum likelihood estimation of the noise-free image. The NLM algorithm has been implemented in OCT in a probabilistic context (PNLM) [14]. However, PLNM treats speckle as additive Gaussian noise and speckle nulls are seen to corrupt the probability calculation. These incorrectly characterized speckle nulls remain after the filtering process, degrading image quality [14].
We have developed an NLM-based algorithm for OCT -Tomographic Non-local-means despeckling (TNode)-built upon the polarimetric and interferometric non-local framework developed by Deledalle et al. [31,32,38] for SAR images. In analogy with the SAR work, we treat speckle based on its physical properties and use proper speckle statistics in order to adequately estimate weights in the despeckling process. The core idea is to determine small volumetric patches in the tomogram that represent different speckle realizations of the same underlying object, such as small blood vessels. The incoherent averaging is performed non-locally from these patches. Ideally, because only uncorrelated speckle realizations of the same object are compounded, resolution is preserved and speckle contrast reduced. From speckle statistics, we employ a general context of single speckle realization and multiple speckle realizations tomograms to determine the speckle-free intensity. This general context allows TNode to be used in-vivo with singlelook frame analysis and easily adapted to a context where multiple speckle realizations are acquired.
As opposed to the approach used in SAR, we exploit the volumetric information available in OCT by making use of 3D similarity windows to retrieve the weights from the volumetric patch-similarity. We have developed two kinds of search windows: a 2D search window oriented to include the fast-axis of transverse scanning for imaging in-vivo and a 3D window for motionartifact-free or motion-corrected volumetric tomograms. The 2D search window allows speckle reduction even in the presence of motion artifacts as the weighted estimation from the NLM algorithm searches for the most-similar neighbors [32], unaffected by relative distances as long as the motion is smaller than the search window size. The 3D search window enhances the contrast of point-like structures while preserving resolution in cross-sectional images and improving the quality of en-face projections.
Our 3D similarity window enhances the selection of similar patches. Many tissue structures appear one-dimensional (e.g. vessels, nerves) in traditional OCT imaging due to resolution limitations [40,41], as shown in Fig. 1. Such structures are not generally aligned with the scanning direction and therefore appear as point-like structures in cross-sectional views. State-of-the-art filtering is incapable of correctly identifying these structures in a single B-scan, hence detail is lost and only tissue layers and other larger structures are preserved. Our 3D similarity window ensures that one-and two-dimensional structures are correctly characterized and consequently preserved upon despeckling. In essence, TNode considers the 3D geometrical properties of volumetric structures while performing the averaging. We do this voxel-per-voxel in a feature-agnostic way, avoiding the need for image segmentation or dictionaries. Our results show that TNode volumes present high-quality, cross-sectional views oriented along both slow and fast axes of transverse scanning, as well as in en-face views. TNode B-scans resemble those obtained from out-of-plane averaging, without the associated out-of-plane resolution loss.

Speckle statistics
OCT imaging exhibits speckle because of the superposition of coherent light backscattered with random phases [36]. Tomograms with a single realization of speckle have unity speckle contrast C: the speckle fluctuations are of the order of the signal itself and the image of the underlying object is severely degraded. To quantify the deleterious effect of speckle in coherent imaging, we define the signal-to-speckle ratio, SSR, as the inverse of the speckle contrast: SSR= 1/C =Ī/θ, whereĪ is the speckle-free object intensity and θ the speckle standard deviation. We also define the signal-to-noise ratio (SNR) as SNR= I s /I n , where I s represents the OCT noise-free signal intensity and I n is the intensity of the noise. In this formulation, I n encompasses all sources of noise (e.g. shot, thermal, excess photon, digitization, etc.) in the OCT system [2,42,43].
We use speckle statistics in order to find the speckle-free object intensity by determining the parametric distribution that is characteristic of speckle. On an intensity basis, speckle is described by an exponential probability distribution [2,36], which is described by a single parameter. Assuming the object has an incoherent, speckle-free intensity θ(x) at pixel x, the probability of measuring an intensity I(x) under coherent imaging for an arbitrary speckle realization is where θ(x) is the distribution parameter. In an exponential distribution, θ(x) is the distribution mean and standard deviation, and as explained above, it equals the mean intensityĪ(x) of a large set of speckle realizations. Therefore, the correct determination of θ(x) is equivalent to findinḡ I(x), the underlying object intensity under incoherent imaging. Speckle suppression can be understood as an estimation of the parameter θ(x) from data with speckle by using uncorrelated speckle patterns. In OCT systems that acquire uncorrelated speckle patterns using spatial, angular, frequency or any other type of compounding, the averaged intensity from L speckle patterns has a gamma probability distribution [36] where Γ is the gamma function. Equation (2) describes a distribution with SSR= √ L, and reduces to Eq. (1) for L = 1.

Non-local denoising and speckle suppression
In general terms, non-local methods reduce noise by performing a weighted intensity average among a set of regions within a vicinity [31, 32, 38, 39]. These regions, called patches, are not necessarily adjacent, but are ideally from anywhere in the image. The restriction to a bounded vicinity is generally driven by performance considerations. Non-local methods base the denoising process on the probability of two given patches to be different realizations of the noise with the same underlying object. In this case, the pixel values of the two patches are described by a single probability distribution with a common set of parameters and these parameters describe the underlying noise-free object [14,15,[28][29][30]35].
In the context of non-local despeckling, the goal is to determine the probability that the pixel x being despeckled has the same underlying object -its speckle is described by the same distribution parameter-as any other pixel x inside a neighborhood or search window. To define the probability of having a common parameter θ, the intensity I(x) in the pixel x is compared with the intensity I(x ) in the set of neighbors. For each comparison, the pair of pixels are considered similar if their intensities match distributions with a common parameter θ(x, x ), hence I(x) and I(x ) have a high probability of being two realizations of speckle of the same object. The pair of pixels are considered dissimilar if their intensities have a high probability of being observations of two distributions with different parameters θ(x) and θ(x ), and therefore different underlying objects [38]. This problem can be rephrased as a hypothesis test [38]: In practice, comparing just two pixel intensities does not provide a robust measure of similarity. For this reason, non-local methods consider a group of pixels x, a patch, centered at the pixel of interest x. Hereafter, we use a bold symbol to denote a vector quantity with element values assigned to each pixel inside a patch. The set of pixels x are compared pairwise with the pixels x centered at x . Similarly, θ(x) represents a vector of distribution parameters, and I (x) a vector of pixel intensities, for the patch centered at x. Note that all pixels inside the patch are not considered to share the same parameter θ. The patch size, also called similarity window, defines the smallest structure that will be considered during comparisons [39]. Large patch sizes improve the robustness of the similarity metric but reduce the probability of finding similar patches because the compared structures are large. Therefore, the patch size should contain enough independent speckles to have a robust similarity metric but not too many to compromise the effectiveness of the despeckling process.
A similarity criterion makes use of the two hypotheses above to map a pair of speckle patches (x, x ) into a single real value. The process consists of determining the similarity, C r , pixelwise and then compounding the probabilities inside a patch into a log-probability ∆(x, x ): where the purpose of the logarithm is to transform the product of the probabilities into a sum for all patch pixels. It is known that the optimal criterion, C r , based on the hypothesis in Eq. (3), is the likelihood ratio test [38]: where the numerator is the probability p that intensities in x and x have a common parameter θ(x, x ) and the denominator is the probability that the intensities have different parameters θ(x) and θ(x ). Note that there is one probability per pixel, and therefore L is a vector containing a criterion value for each pixel in the patch. This vector is transformed into a single scalar for each patch using Eq. (5).
In practice, the parameters θ(x), θ(x ) and θ(x, x ) in Eq. (6) are unknown. To address this, we use the generalized likelihood ratio (GLR) L G , which solves this problem by replacing θ(x), θ(x ) and θ(x, x ) by their maximum likelihood estimates (MLEs)t 1 ,t 2 andt 1,2 , respectively [38], A detailed derivation of the GLR is shown in Ref.
[38]. The essence of Eq. (7) is that if the intensities in compared pixels are likely to have the same parameter from a unique distribution, the numerator increases and the GLR value tends to a maximum. However, if the intensities of the pixels are more likely to be given by different individual distributions, the denominator decreases the GLR value. In the case of identical patches, L G = 1. As the MLE for the gamma distribution in Eq.
In TNode, we use GLR in order to map the similarities between patches into weights. Now, referring to locations in the tomogram, the pixels x and x become vectors containing the three coordinates ì x = (x, y, z) and ì x = (x , y , z ), which indicate the locations in the volume. Throughout this work, z corresponds to the depth dimension, x to the lateral fast scanning axis and y the lateral slow scanning axis. In these terms, Eq. (5) becomes: where ì τ is a 3D shift vector indicating locations in each patch such that ì x + ì τ and ì x + ì τ span the pixels in the neighborhood of x and x . The components of ì p correspond to the patch half-size, with total elements P = (2p x + 1) × (2p y + 1) × (2p z + 1). The windows indicated by ì x + ì τ and ì x + ì τ represent the similarity windows. We define a 3D similarity window that accounts for the volumetric nature of OCT tomograms. This 3D similarity window considers one-dimensional structures in the volumetric data that appear as point-like structures in cross-sectional views and ensures their preservation upon despeckling since they are correctly characterized in the volumetric patch similarity metric.
After computing the log-probability using Eq. (9), the weights are obtained by recovering the compound probability modified by the parameter h [38]: where h > 0 controls the overall distribution of weight values. Assuming the pixel x is being despeckled, the weights w( ì x, ì x ) are calculated for all pixels within a vicinity and then used in the averaging process to retrieve the speckle-free intensity at ì x. Note that w( ì x, ì x ) is a scalar. NLM performs a weighted average over all pixels ì x + ì τ in an extended vicinity ì v, called a search window, centered at ì x and with total size V = (2v x + 1) × (2v y + 1) × (2v z + 1) [32, 39]. The search window defines a subvolume where the central pixel ì x of the similarity window can be located, and therefore the number of similarity windows contained in the search window is always V regardless of the similarity window size. TNode is based on a weighted maximum likelihood estimation of the speckle-free intensityÎ( ì x) in terms of a weighted average from the intensity with speckle I( ì x ) asÎ In the event of having an object in the similarity window with no similar neighbors within the search window, weights will be small and thus no despeckling is performed. Now, in order to better explain the effect of the h parameter, assume, for simplicity, two one-dimensional patches consisting of only two pixels: patch A (x A1 , x A2 ) and patch B (x B1 , x B2 ). Explicitly, Eq. (10) states Considering that 0 ≤ L G ≤ 1: if h is near unity then the weights remain unchanged; a larger h increases the probability ratio for each pixel; and, all probability ratios approach 1 when h approaches ∞. This implies that when h is large, all weights increase and their contributions are equalized, which increases averaging. When h is very large the behavior approaches that of local averaging with a kernel size equal to the search window. Finally, the weight definition in Eq. (10) is unaffected by the relative distance between patches; it is influenced only by the patch-similarity, making TNode completely non local. In general, the weight for the self similarity is significantly larger than any other weight {x |x x } and therefore, for effective despeckling, we have found that h needs to take a value around 50 h 100. In most cases, it is more effective to replace the self similarity by a value given by all the other values in the search window, such as a multiple of the maximum [28]. This produces a more consistent despeckling, reduces the value of h needed and homogenizes the despeckling behavior for different samples. In this work, we replace the self similarity by w( ì Figure 2 describes the idea behind TNode. As presented in Eq. (2), the exponential distribution followed by speckle in singlelook intensities (L = 1) turns into a Rayleigh distribution if an average between two images is produced (L = 2); it transforms into a Gaussian-like distribution when more intensities are averaged [36]. In the ideal case, after acquiring a large number of realizations (L → ∞), the mean intensity corresponds to the parameter of the distribution, as shown in Fig. 2(a). However, this requires the acquisition of multiple speckle realizations. The NLM algorithm uses patches inside the image [ Fig. 2(b)] in order to produce a mean intensity similar to that obtained after averaging many speckle realizations [32]. The similarity windows depicted in Fig. 2(b) are shown for illustrative purposes. Ideally, the search window should be comprised of the entire data set; in practice, the search window is limited to a given subvolume by memory usage and maximum allowable computation time. The circular patches, 1 to 3, and the square patches, 4 to 7, in Fig. 2(b) illustrate how different patches in the image look alike, and averaging their pixels can produce a despeckled image. Unlike in radar imaging, OCT has a wide SNR range, reaching 50dB or more. In regions with low SNR, the similarity criterion in Eq. (8) may produce artificially high values, as the noise in the tomogram arises from a unique distribution parameterized by the average noise intensity with the same statistical properties as speckle -a circular complex Gaussian random variable. To account for this, we modify the filtering parameter, h, to achieve the same level of despeckling across a wide range of SNRs. Guided by the functional relationship between the correlation coefficient of OCT signals and their SNR [44], we propose the following form for h: where h 0 is the base despeckling parameter and h 1 is an SNR-dependent parameter. With this criterion, in regions with low-SNR h approaches h 0 when SNR→ 0, increasing the selectivity in the averaging process. In high-SNR regions, h increases (h = h 0 + h 1 for SNR→ ∞), relaxing the similarity criterion and therefore equalizing the weights of similar neighboring patches.

Ophthalmic OCT imaging
We tested the performance of TNode in an eye scan of a healthy volunteer acquired with a Spectralis OCT2 Imaging System (Heidelberg Engineering, Germany). This spectral-domain OCT system uses a super luminescent diode with 870 nm central wavelength, 50 nm bandwidth and an A-line acquisition rate of 85 kHz. We selected a region of interest (ROI) consisting of 257 B-scans with 512 A-lines per B-scan and 640 depth samples. We produced a multilook tomogram by recording four speckle realizations (L = 4). The axial pixel size of the reconstructed tomograms is 3.9 µm in air, the A-line spacing is 14 µm and the B-scan spacing is 28 µm. We used a similarity window with τ = 7 × 7 × 7 pixels, equivalent to ≈ 4 × 7 × 2 speckles. We despeckled the data set using a two-dimensional search window (TNode 2D with v = 41 × 1 × 21 pixels, 861 locations containing ≈ 180 distinct speckles) oriented in the fast transverse scanning plane prior to correction of motion artifacts between B-scans. We also performed despeckling after motion correction with a three-dimensional search window (TNode 3D with v = 15 × 5 × 11 pixels, 825 locations containing ≈ 170 distinct speckles). We set the base filtering parameter h 0 = 0 and the SNR-dependent parameter h 1 = 40 for singlelook and h 1 = 100 for multilook tomograms. Figure 3 presents an individual B-scan and the results of despeckled B-scans with the following algorithms: out-of-plane averaging (OOPA), block matching and 3D filtering (BM3D) [37], probabilistic non-local means (PNLM) [14] and our TNode with both 2D and 3D search windows. We used default parameters in all algorithms and produced OOPA with seven adjacent B-scans after correction of motion artifacts. After motion correction, neighboring B-scans exhibited similar structures and the spatial compounding produced by OOPA reduced speckle successfully. However, since 1D structures may be located in different positions across B-scans, there is potential for loss in spatial resolution. This effect can be seen in the insets (red box). BM3D, as a processing algorithm designed for white noise suppression, requires transformation of speckle into additive noise through a logarithmic conversion. This may result in artifacts and speckle remains visible in the filtered B-scans. PNLM relies on characterizing speckle nulls by measuring the probability of having signal or speckle in each pixel; undetected speckle nulls remain visible in the despeckled B-scans. TNode-filtered B-scans resemble the visual appearance of OOPA but without the associated degradation of resolution. Similar results are obtained with the 2D search window with motion artifacts and the 3D window after correction of motion. However, TNode in 3D enhances layer homogeneity after despeckling, as evidenced by the photoreceptor layer (orange box). For a comparison of the tomograms after despeckling see Visualization 1.
In order to show the resolution-preserving and volumetric nature of TNode, we focus now on blood vessels in the ganglion cell layer of the eye scan after despeckling the tomogram. Generally, OCT angiography would be used for visualizing blood vessels. Here, we take advantage of their one-dimensional structure to evaluate the resolution after despeckling. To facilitate visualization of the volumetric data, we flattened the tomogram, referenced to the retinal pigment epithelium layer, and performed correction of motion artifacts. Figure 4 presents a comparison of en-face and cross-sectional views of the eye scan, showing the original volume and results of spatial compounding with OOPA and after despeckling with TNode 3D. As OOPA takes advantage of the similarity between neighboring B-scans, retinal layers are well-defined after despeckling. The orange and green arrows show large blood vessels which are difficult to identify in single cross-sections but easier to identify in en-face projections. After OOPA, these large vessels become more visible due to their high scattering compared to avascular tissue. However, because their location shifts among B-scans, OOPA produces significant blurring after the averaging process. In contrast, TNode prevents this blurring as the similarity is evaluated non-locally to produce the weighted average.
The red box in Fig. 4 highlights an ROI containing two blood vessels, closely spaced along the x axis. In the original tomogram, the vessels appear as two speckles, slightly brighter than the surrounding tissue. There is simply not enough information in a single B-scan to infer the true identity of these two speckles. In the OOPA case, the resolution loss associated with averaging multiple B-scans is evident: the two vessels appear as a single blurred bright area. In contrast, as TNode exploits volumetric structures when despeckling, the geometrical distribution of the vessels is considered in the averaging. This ensures that bright pixels containing information belonging to a one-dimensional structure are preserved in both axes while surrounding speckle is suppressed. The blue box highlights an ROI where a blood vessel appears as a bright speckle. After using OOPA, the resolution loss and the low contrast produces a blurred bright area. The Singlelook OOPA

PNLM BM3D
TNode 2D TNode 3D same vessel is clearly visible after using TNode. Note that even with anisotropic sampling along axes, the en-face view and longitudinal cross-section zy exhibit a homogeneous speckle reduction without evident resolution degradation. As TNode can be used in tomograms captured with multiple realizations of speckle and previously averaged tomograms, we tested its performance after averaging the four realizations acquired with the eye scan. Since perfused tissue varies speckle dynamically during acquisition, most of the improvement involved in the multiple realizations and multilook process is achieved near vascular tissue, while static avascular tissue remains largely unaffected. Figure 5 is a comparison between single and multilook en-face views in the outer plexiform layer of the same eye scan after flattening. We have adjusted the dynamic range of the images in the multilook case, where we have found an increase in the mean intensity after producing the compound intensities. The mean intensity of the multilook tomogram is 2.7 dB higher compared to the singlelook case, and TNode 3D (c). TNode preserves the two speckle-sized vessels shown in the red box as two independent structures, while OOPA merges them into a single structure. Resolution in the yz projection (small blood vessel in blue box) is mostly preserved in TNode. Scale bars correspond to 300µm in each direction. therefore we shifted the intensity range to match this increase. This modification has been applied to the multilook tomogram after using TNode as well. Figure 5 clearly shows how perfused tissue in the multilook images has improved contrast with respect to the singlelook images. Moreover, after despeckling with TNode, vessels are preserved. Structures strongly corrupted by the presence of speckle are clearly noticeable after despeckling, as indicated in the boxes in Fig. 5. See Visualization 1 for a comparison of the en-face views of the tomogram.

Gastrointestinal OCT
The voxel-per-voxel approach of TNode, without the need for dictionaries, enables its use in any imaging application of OCT. To demonstrate this, we used TNode to despeckle a gastrointestinal OCT (GI-OCT) clinical data set, acquired in a recently completed clinical registry [45], with an NvisionVLE Imaging System (NinePoint Medical, Inc., Bedford, Massachusetts) [46]. This is a frequency-domain OCT system with a wavelength-swept laser centered at 1310 nm and with a 90 nm sweep range and a 50 kHz A-line repetition rate. We selected an ROI of 300 B-scans with 1024 A-lines per B-scan (out of the 4096 A-lines per B-scan in the full volume) and 900 pixels in depth. The data set was processed with OOPA using 7 adjacent B-scans, the PNLM filter with default parameters and TNode 3D, in all cases, non-uniform rotational distortion (NURD) Fig. 5. En-face view comparison of a single layer in the outer plexiform layer with single and multilook tomograms: singlelook (a) and multilook (b) tomograms with speckle, and corresponding singlelook (c) and multilook (d) after using TNode 3D. Boxes highlight structures corrupted by speckle that appear clearer after TNode. See Visualization 2 for a comparison of the tomogram. correction [47] was previously performed. The axial pixel size of the reconstructed tomograms is 6 µm in air. The OCT system has a transverse e −2 beam radius of 33 µm, the A-line spacing is ≈ 15 µm and the B-scan spacing is ≈ 50 µm. Because of the highly anisotropic lateral sampling, we used a similarity window with τ = 11 × 7 × 7 pixels equivalent to ≈ 7 × 7 × 2 speckles, and a search window with v = 13 × 5 × 9 pixels (585 locations containing ≈ 120 distinct speckles). The parameters were set to h 0 = 0 and h 1 = 45. Figure 6 presents the comparison of the ROI cross-sectional view with the obtained results. As motion artifacts, even after NURD correction, are routine in endoscopic OCT because of patient involuntary motion, OOPA-processed B-scans have very poor quality, with structures, such as glands, and tissue morphology appearing significantly blurred and with reduced contrast. The PNLM filtered B-scan shows speckle nulls which were unrecognized during the calculation of the corruption probability. Additionally, because of the high SNR range in GI-OCT data, which is significantly higher than in ophthalmic data, some high-SNR regions still present significant speckle, while low-SNR regions have an over-smoothed appearance. Our results indicate that TNode 3D produces the highest homogeneity in the B-scan while preserving most of the morphological information, even with evident patient motion. The TNode 3D B-scan shows a significant reduction of speckle, preserving contrast and maintaining the structural and morphological characteristics of tissue. A homogeneous reduction of speckle is evident across the whole B-scan. For a comparison of the 300 B-scans see Visualization 3.

Phantom data and quantitative quality metrics
Quantitative image metrics, such as the contrast-to-noise ratio or the peak signal-to-noise ratio, rely on having known homogeneous regions or access to the underlying speckle-free intensity. When applied to natural images, these metrics are prone to misinterpretation because oversmoothed images with lack of detail get highly rated. In order to calculate meaningful image quality metrics, we performed an experiment with a three-dimensional structured phantom made from cellulose. The detailed cellulose structure was obtained in an OCT scan by immersing the phantom in water. This allowed us to define regions with signal produced exclusively by the phantom and void regions defined as the background. Next, we added intralipid at 20% volume concentration to the water bath to create scattering in the cellulose voids. We acquired 2 volumes at different intralipid concentrations, corresponding to 1% and 0.3%. The volumes were acquired with a custom-built wavelength-swept source OCT system, with 1310 nm central wavelength, 130 nm bandwidth and an A-line repetition rate of 54 KHz. We selected an ROI having 256 B-scans with 256 A-lines per B-scan and 256 depth samples, acquired with an objective lens with 35 µm e −2 beam diameter. The reconstructed tomograms had an A-line spacing of 10 µm, a B-scan spacing of 10 µm and an axial pixel size of 6 µm in air. In order to quantitatively compare the performance of TNode with other speckle-reduction algorithms, we despeckled the phantom tomograms with OOPA, BM3D and PNLM. For TNode, we used a τ = 7 × 7 × 7 pixels similarity window, equivalent to ≈ 4 × 7 × 2 speckles, a v = 41 × 1 × 21 pixels (861 locations containing ≈ 180 distinct speckles) two-dimensional search window for TNode 2D and a v = 15 × 5 × 11 pixels three-dimensional search window for TNode 3D (825 locations containing ≈ 170 distinct speckles). Although the objective lens has the same lateral resolution in the x and y axes, the Brownian motion of the lipid particles produced decorrelated speckle patterns across B-scans regardless of the B-scan spacing; this effect was considered in the calculations above. We set the base filtering parameter h 0 = 0 and the SNR-dependent parameter h 1 = 38. We produced OOPA with seven adjacent B-scans and set default parameters for PNLM and BM3D. For BM3D and PNLM tomograms images were first transformed into logarithmic scale, while in OOPA and TNode tomograms were kept on a linear scale.
With clearly defined homogeneous areas (corresponding to cellulose voids filled with intralipid), we defined three two-dimensional ROIs corresponding to small cellulose structures surrounded by intralipid for all possible projections: A is an x y (en-face) ROI, B is an yz ROI and C an xz ROI [see Fig. 7(a)]. The phantom in water was used to define a mask for the pixels containing the cellulose structure, which when immersed in intralipid had an intensity I S ( ì x), and the inverse mask for the background containing intralipid with intensity I I L ( ì x). Logarithmic counterparts to these intensities are denoted by L [L X ( ì x) ≡ 10 log 10 I X ( ì x)]. Based on these, we calculated the following quality metrics: 1. The average contrast between cellulose and intralipid, defined asĪ S /Ī I L , where¯denotes the average among all the pixels ì x in a given ROI. This average implicitly removes the speckle fluctuations and therefore, when calculated in a singlelook tomogram, defines a speckle-free ideal contrast. Despeckling algorithms should preserve this contrast as much as possible. where σ(L) corresponds to the standard deviation of the intensity L in an ROI. CNR describes the contrast between two regions of different intensities in presence of noise, and uses logarithmic intensity to characterize this metric in a similar manner to when tomograms are examined visually. Despeckling algorithms should heavily reduce σ I L , increasing the CNR. σ S is expected to decrease. However, its value will be dominated by changes in the intensity of the structure itself. Nevertheless, a reduction in the overall contrast in the image as a result of despeckling will contribute to a reduction of the CNR.

The equivalent number of looks (ENL)
[49] is defined as E N L = SSR 2 =Ī 2 I L /σ 2 I L , and provides an estimation of the number of uncorrelated speckle patterns used to achieve a given image quality (see Eq. (2). As such, this is a metric related solely to the quality of the despeckling. To guarantee enough data for proper statistics, a larger dedicated background ROI was used to calculate the ENL.  TNode 2D and TNode 3D show a similar preservation of the cellulose structure with a superior degree of speckle reduction. Figure 8(a) shows the despeckling results for ROI B at two intralipid concentrations. An increase in the intralipid concentration reduces the contrast with the cellulose. In order to assess the resolution loss after despeckling, we measured the full width at half-maximum (FWHM) of the cellulose structure across the lines indicated in Fig. 8(a) and (b) at 0.3% intralipid concentration. The FWHM for the singlelook tomograms was measured at a location away from a speckle null. The FWHM in water corresponds to the FWHM expected for the PSF of the objective lens, once the confocality is taken into account (35µm/ √ 2/ √ 2 ln 2 = 21µm FWHM). Our results indicate that the FWHM increases from ≈20 µm to ≈30 µm after immersion in intralipid. However, it does not further increase after despeckling with BM3D, PNLM and TNode. The increase in FWHM between water and intralipid could be attributed to the change in the liquid-air interface geometry given by surface tension, which would deform the cellulose structure between the two measurements. In the case of OOPA, as expected, the FHWM of the structure is increased by a factor of ≈ 2 compared with other despecking algorithms. This is not a definitive conclusion, because assessing resolution in scattering media is a complex task. Alternative experiments, such as evaluating the ability to separate two sources inside a scattering medium, implies the use of phantoms with extremely large homogeneous regions, which facilitate despeckling unrealistically.  We measured the contrast, CNR and ENL for the two intralipid concentrations. The numerical results are tabulated in Table 1. Overall, the contrast drops less than 1 dB for most algorithms and therefore this performance metric is comparable for all of them. For the 0.3% intralipid, the visually strong performance of TNode is numerically supported by the results of the CNR and ENL. TNode 3D had consistently the highest CNR among all algorithms, meaning that details in the image are preserved and are visually more evident after despeckling. Interestingly, despite the drop in contrast and CNR of the singlelook tomogram going from 0.3% to 1% intralipid, TNode 3D is able to generate a despeckled ROI with a similar CNR for both concentrations. In terms of pure speckle suppression, TNode significantly outperforms all other techniques as quantified by the ENL. We believe our phantom experiments -as well as the ophthalmic data-support the conclusion that the possible resolution loss of TNode in a realistic sample, with intertwined regions of fine structures and homogeneous scattering, is very small. Together with its excellent despeckling performance as quantified by the ENL, its improvement of CNR and its preservation of contrast, TNode outperforms all other state-of-the-art techniques for despeckling OCT scans, improving the interpretation of OCT volumes of scattering media.

Conclusion
In conclusion, we have presented a novel despeckling technique for OCT, TNode, which is based on the non-local means algorithm with proper speckle statistics. TNode uses a 3D similarity window that leverages the volumetric nature of OCT data and accounts for the full three-dimensional geometry of structures normally found in OCT imaging. By using the generalized likelihood ratio to retrieve the volumetric patch similarity, we used a weighted maximum likelihood estimation to compute the speckle-free intensity tomogram, while permitting the use of single and compound tomograms with multiple speckle realizations. By accounting for the high dynamic range of OCT tomograms, we achieved homogeneous reduction of speckle at all depths.
In the retinal data set, visual comparison of the results obtained with different despeckling techniques demonstrated the strong performance of TNode in terms of resolution and speckle reduction. Cross-sectional and en-face views after despeckling showed the preservation of resolution in all dimensions, while conserving even speckle-sized structures. In singlelook and multilook tomograms the contrast improvement accomplished with TNode was evident. Even when applied to OCT data having significant motion artifacts, such as images acquired endoscopically in the gastrointestinal tract, TNode showed excellent speckle reduction while preserving resolution. With the phantom data, we demonstrate the potential of TNode for resolution and structure preservation and reduction of speckle, by using different metrics to quantitatively evaluate the quality of filtered images.
Since speckle statistical properties are common to all coherent imaging techniques, TNode could be used in other volumetric techniques, such as ultrasound imaging. However, we believe its strongest impact could come with its use as a pre-processing step in advanced OCT signal processing. For example, OCT intensity-based tractography [50-52] could greatly benefit from a resolution-preserving despeckling technique to improve the tracking process. Polarizationsensitive OCT in the Stokes formalism could also greatly benefit from TNode [53-56], where it could mitigate the negative effect of speckle nulls in retardation and optical axis calculations and reduce the associated lateral resolution degradation.

Supporting material
We have made a Matlab implementation of TNode, Code 1, available at Ref.
[57] as well as on http://octresearch.org, including the full source code of TNode 2D and TNode 3D, as well as exemplary raw datasets for testing purposes.