Three-dimensional curvelet-based dictionary learning for speckle noise removal of optical coherence tomography

: Optical coherence tomography (OCT) is a recently emerging non-invasive diagnostic tool useful in several medical applications such as ophthalmology, cardiology, gastroenterology and dermatology. One of the major problems with OCT pertains to its low contrast due to the presence of multiplicative speckle noise, which limits the signal-to-noise ratio (SNR) and obscures low-intensity and small features. In this paper, we recommend a new method using the 3D curvelet based K-times singular value decomposition (K-SVD) algorithm for speckle noise reduction and contrast enhancement of the intra-retinal layers of 3D Spectral-Domain OCT (3D-SDOCT) images. In order to beneﬁt from the near-optimum properties of curvelet transform (such as good directional selectivity) on top of dictionary learning, we propose a new plan in dictionary learning by using the curvelet atoms as the initial dictionary. For this reason, the curvelet transform of the noisy image is taken and then the noisy coeﬃcients matrix in each scale, rotation and spatial coordinates is passed through the K-SVD denoising algorithm with predeﬁned 3D initial dictionary that is adaptively selected from thresholded coeﬃcients in the same subband of the image. During the denoising of curvelet coeﬃcients, we can also modify them for the purpose of contrast enhancement of intra-retinal layers. We demonstrate the ability of our proposed algorithm in the speckle noise reduction of 17 publicly available 3D OCT data sets, each of which contains 100 B-scans of size 512 × 1000 with and without neovascular age-related macular degeneration (AMD) images acquired using SDOCT, Bioptigen imaging systems. Experimental results show that an improvement from 1.27 to 7.81 in contrast to noise ratio (CNR), and from 38.09 to 1983.07 in equivalent number of looks (ENL) is achieved, which would outperform existing state-of-the-art OCT despeckling methods.

Many methods have been proposed over the years to address speckle noise reduction from the OCT images. An extensive literature review on OCT denoising methods has been provided by Kafieh et al [5], which is concluded in Table 1 (see [5], and references therein for more details). OCT despeckling methods are categorized as OCT signal denoising on complex domain (before producing the magnitude of OCT interference signal) or on magnitude domain (named as hardware/software-based methods in Table 1, respectively). The magnitude domain techniques can be applied either directly to the raw OCT image or to the transformed data (using an appropriate sparse representation). The traditional speckle filtering methods in raw image domain such as median and Lee filtering [6][7][8][9], adaptive median and Wiener filtering [10,11] provide inadequate noise reduction under high-level speckle noise, as well as cause loss of meaningful subtle features. In the recent years, numerous more advanced methods have been proposed for speckle noise reduction, such as anisotropic diffusion-based techniques [12][13][14], wavelet-based methods [15], denoising using dual-tree complex wavelet transform [16] and curvelet transform [17], sparsity-based denoising [18,19], complex wavelet-based K-SVD dictionary learning technique (CWDL) [5], deep convolutional neural network based methods [20][21][22] and robust principal component analysis (RPCA)-based method [23].
In this paper, a novel speckle noise reduction algorithm is developed, which is optimized to reduce the speckle noise of OCT images while preserving edge sharpness. For this purpose, we introduce a new K-SVD dictionary learning strategy in the curvelet transform domain for speckle noise reduction of 3-D OCT images. By taking advantage of this sparse multiscale directional transform, we propose a new plan in dictionary learning by denoising and modifying each noisy curvelet subband with a pre-defined initial 3-D sparse dictionary. The 3-D initial dictionary for each 3-D curvelet subband is independently selected from thresholded coefficients in the same scale, rotation and spatial coordinates of the image. This method does not need any high-SNR scans (averaged versions of repeated scans) for dictionary learning used in other works [18,19].
The paper is organized as follows. Section 2 provides an introduction to 3-D digital curvelet transform (3DCUT). In Section 3 our proposed method, which is a dictionary learning method by using 3DCUT atoms as start dictionary, is described and the results and performance evaluation are presented in Section 4. Finally, the conclusions are given in Section 5. Table 1. Available denoising methods in OCT images [5] Denoising Method Hardware based methods

Modification in optical setup
Alternation in incident angle of the laser beam [24][25][26][27] Alternation in the recording angle of back reflected light [28] Alternation in the frequency of the laser beam [29]

Adjustment in
Weighted averaging schemes [30] imaged subject itself Registration of multiple frames by cross correlation [31] Eye tracking systems [32] Software Based Methods

Three-dimensional (3D) digital curvelet transform
The curvelet transform provides a sparse representation of objects in natural images. The basis functions of this joint time-frequency transform have good directional selectivity and are highly anisotropic [53]. The directional selectivity of curvelets and the spatially localized property of each curvelet can be employed to preserve the image features along specific directions in each subband. Following this reasoning, curvelets are appropriate basis elements (atoms) for representing multidimensional objects, which are smooth apart from singularities along smooth manifolds of codimension 1 [54].
Although the direct analyzing of 3-D data as a volume and considering the 3-D geometrical nature of the data is computationally expensive, it has been shown that 3-D analysis of 3-D data outperforms 2-D slice-by-slice analyzing of data [55]. The parabolic scaling, good directional selectivity, tightness and sparse representation properties of 3-D curvelet singularities, provide new opportunities for processing and analysis of large scale medical data-sets. 3-D curvelet elements are plate-like shapes of 2 −j/2 in two directions and width about 2 −j in the orthonormal direction and are suitable for demonstrating smooth 3D objects with singularities along smooth surfaces which is well adapted for represntating smooth intra-retinal surfaces in 3D-OCT images. The 3-D discrete curvelet transform is a 3-D extension of 2-D curvelet transform proposed by Candes et al [54]. In 2-D, the curvelet dictionary is generated by translation (b ∈ 2 ) and rotation (θ) of the basic element φ a,0,0 : where R θ = cos θ − sin θ sin θ cos θ is 2×2 rotation matrix with angle 8. φ a,0,0 ∈ L 2 ( 2 ) in Fourier domain (φ a,0,0 (ξ) = U a (ξ)) is proposed as a polar wedge used for building curvelet atoms [56] as follows:φ By sampling of scales a j = 2 −j , j ≥ 0, orientations θ j,l = πl2 − j/2 2 , l = 0, 1, . . . , 4.2 j/2 − 1 ( x / x denote the smallest integer being greater/smaller than or equal to x), and locations b j,l k = b j,l k1,k2 = R −1 θ j,l ( k 1 2 j , k 2 2 j/2 ) T , k 1 , k 2 ∈ Z, the curvelet coefficients are defined as: The a-scaled window U a provides the polar tiling of the frequency plane while the Cartesian arrays are desired for digital analysis of the images. In this base, the Cartesian windowŨ j (ξ) is proposed [56], which recognizes the frequencies in the trapezoid So, the Cartesian counterpart of the curvelet coefficients can be obtained as: where k j = (k 1 2 −j , k 2 2 − j/ 2 ) T , (k 1 , k 2 ) T ∈ Z 2 andb j, l k = S −T θ j, l (k 1 2 −j , k 2 2 − j/2 ) = S −T θ j, l k j and S θ = 1 0 − tan θ 1 . Using common rectangular grid instead of tilted grid, the Cartesian curvelets can be calculated as: where k j = (k 1 2 −j , k 2 2 − j/ 2 ) T taken on values on a rectangular grid, is substituted byb j, l k = S −T θ j, l k j in Eq. (4). Similar to 2-D, the Cartesian windowŨ j (ξ) in 3-D is defined byŨ j (ξ) =Ũ j (ξ 1 , ξ 2 , ξ 3 ) that isolates the frequencies in the truncated pyramid With the angles θ j,l and υ j,m the 3D shear matrix is defined as S θ j,l ,υ j,m = In 3-D, according to 6 faces of the unit cube, each Cartesian corona has 6 components regularly partitioned into wedges with same volume (Fig. 1). The curvelet function in the cone (ξ 1 , ξ 2 , ξ 3 ) : 0<ξ 1 j,l,m k )) (9) So, its Fourier transform would be: whereφ j,0,0,0 (ξ) =Ũ j (ξ) . Analogously as in [5], the curvelet coefficients are calculated as follows: In this paper, a new implementation of 3-D fast curvelet transform (3DFCT) [57,58] with a reduced redundancy factor and strong directional selectivity at the finest scale (comparing to the wrapping-based implementation proposed in curvelab Toolbox [53,54]) is proposed. For this purpose, by taking curvelet coefficients: 1. Cartesian coronization is performed to decompose the object into dyadic coronae based on concentric cubes. Each corona is subdivided into trapezoidal regions conforming the usual parabolic scaling as shown in Fig. 1.
2. The 3-D coefficients are obtained by applying an inverse 3-D FFT to each wrapped wedge as shown in Fig. 1, which appropriately fits into a 3-D rectangular parallelepipeds of dimensions∼ (2 j , 2 j/2 , 2 j/2 ) centered at the origin.

Proposed OCT denoising
OCT denoising can improve the image quality in favor of an accurate analysis of image information such as interpretation of intra-retinal layers (e.g., the result of accurate detection of these layers is dependent on edge enhancement through image denoising [59,60]). Unprocessed OCT images, similar to ultrasound images, have a rough appearance due to the presence of speckle [48], which contaminates the image features. Speckle noise is not pure noise and may carry important information correlated with it, which should be separated from the noise. However, much of the speckle noise can be suppressed after applying an appropriate despeckling algorithm; making image features more clear and resulting in a more accurate OCT image analysis.
In our proposed method we show the ability of dictionary learning for image denoising in transform domain instead of the image domain. First, 3D curvelet transform is applied to the 3D noisy OCT image, and then in each subband in the 3D curvelet domain, the coefficient matrix is denoised using K-SVD for dictionary learning.
Since the curvelet coefficients give an almost optimal sparse representation of the image, we have only a few large coefficients, which reflect the basic structures of the image and the remaining coefficients are close to zero [54]. This transform maps signals and noise into different regions and the total energy of the signal is concentrated in a small number of curvelet coefficients.
The choice of the start dictionary plays an important role in the performance of the aforesaid model. In order to prevent the solution of the optimization problem from falling into the local minimums because of the non-convexity of cost function, it is important to start from a wisely selected dictionary [18]. Since the Spectral-Domain OCT (SDOCT) images are affected by speckle noise, the quality of the trained dictionary from the noisy image may be degraded due to the inefficiency of the initial dictionary extracted from the corrupted image itself, which subsequently results in a suboptimal image restoration. Another choice to have a near-optimal solution is to learn the dictionary from the noiseless/ high SNR images. Since in practice such an ideal noise-free image may not be available, the thresholded curvelet coefficients of the noisy image in each subband is selected as initial dictionary for K-SVD based denoising of the 3-D curvelet coefficients in the same subband. So, each initial dictionary will have a different size, depending on the size of the coefficient matrix in each scale, rotation and spatial coordinates. Selecting the initial dictionary to be variable in size instead of traditional fixed forms will effectively represent different structures in the image. By increasing the scale of the curvelet coefficients matrix (or reducing in resolution), the block size is also increased while in high resolutions the block size is reduced, resulting in better representation of particular structures of the image.
The proposed method for the initial dictionary selection is as follows: • Forward DCUT: Produce the curvelet coefficients C(j,l,p) (j is the scale, l is the orientation, and p is the spatial coordinates) by applying 3-D curvelet transform.
• Initial Denoising: Apply a threshold T j,l,p to each curvelet coefficient such that: To set the threshold T j,l,p , we use a traditional strategy called k-sigma method [61], in which T j,l,p =kσ 1 σ 2 , where k is a tunable parameter, σ 1 is the noise standard deviation obtained from a region that does not have any image features (background region), and σ 2 is the standard deviation of noise in the related curvelet subband [61].
At this stage by taking the inverse 3-D curvelet transform of the thresholded curvelet coefficients, the initial high SNR enhanced image is obtained as shown in Fig. 2. In order to get a substantial increase in SNR without taking inverse 3-D curvelet transform, the initial dictionary is selected as follows: • Initial Dictionary: Each thresholded 3D-coefficient matrix in each scale, rotation and spatial coordinates is selected as the initial dictionary of the same subband. Since selecting a global dictionary is not effective to demonstrate different structures, we select the initial dictionary from thresholded curvelet coefficients in different scales, rotations, and spatial coordinates. After finding the appropriate 3-D initial dictionary, D, for each subband, the noisy curvelet coefficient matrix of the noisy image in the same scale, rotation and spatial coordinates is despeckled as follows: • Final K-SVD Denoising: If α i represents a sparse vector of coefficients for the i th denoised patch and R i shows the matrix extracting patch C i from the curvelet coefficients C at location i, and Y and C indicate the noisy and denoised version of the curvelet coefficients respectively, the following problem should be solved: where λ and µ i are scalar multipliers for minimization of the cost function.
-Repeat J times (J is the number of training iterations): • Sparse coding based on Orthogonal Matching Pursuit (OMP) to compute the representation vectors α i for each patch R i C.
The parameter σ is the noise level and g is the noise gain set to g = 1.15. • Using the extracted representation vectors in the above stage, the dictionary D is updated (one column at each time) based on the K-SVD algorithm. • Set each coefficient matrix in scale j, rotation l and spatial coordinate p with: The parameter λ is dependent on the noise level and is set to λ=30/σ.
• Contrast enhancement: Since curvelet transform is able to successfully deal with curve singularities and edge discontinuities, it can be employed for edge enhancement in natural images. So, in order to enhance the contrast of intra-retinal layer boundaries, before taking the 3D inverse discrete curvelet transform (3D-IDCUT), denoised curvelet coefficients are modified for the purpose of edge enhancement [62,63]. For this reason, the following function is defined to modify the values of the curvelet coefficients (k c (C j,l,p )): In this equation N = 0.2M, where M is the maximum value of curvelet coefficients in the relative subband.
• Converting to image domain: Then, the enhanced image is reconstructed from the denoised and modified curvelet coefficients by applying IDCUT.
The outline of the whole denoising procedure is shown in Fig. 3. A summary of the proposed algorithm for despeckling and enhancing of OCT images is as follows:

Results
We tested our algorithm on 17 publically available 2D and also 17 3D OCT images [19,64] with and without non-neovascular AMD. All volumetric scans, a square ∼6.6 × 6.6 mm volume scan with 100 B-scans of size 512×1000 and 1000 A-scans, were acquired using SDOCT, Bioptigen imaging systems (In the 3D implementation of our proposed method, for computational time saving and memory constraint, selected B-scans are resized to achieve B-Scans of size 256×512). For comparison, a 2D implementation of the proposed method [65] is also applied to the 2D images. So, we take the 2D curvelet transform of the noisy image, then each coefficient matrix is despeckled based on the 2D curvelet-based K-SVD dictionary learning (by selecting each thresholded curvelet coefficient matrix of the noisy image in each scale and rotation as the initial dictionary of the same subband). Before taking the inverse curvelet transform, the despeckeled coefficients are also modified in order to further enhance the intra-retinal layer boundaries. Figure 4 demonstrates the samples of the 2D initial selected dictionaries used for 2D K-SVD based denoising of each curvelet subbands. The dictionary size in (a) and (b) is 16×128, which is used for denoising of the curvelet coefficient matrix in scale 6, and in (c), (d) is 16×256 for the coefficient matrix in scale 7. In order to quantitatively compare the efficiency of different denoising algorithms, the following parameters are computed: -Contrast-to-Noise Ratio (CNR), which measures the contrast between a feature of interest and the background noise [66].
-Texture Preservation (TP), which is a measure of retaining texture in a region of interest (ROI) [67] (TP would be close to 0 for severely flattened image and in the best case remains close to 1).
-Equivalent Number of Looks (ENL), which measures smoothness in areas that should appear homogeneous, but are contaminated by speckle. A strong speckle smoothing leads to a large ENL [67].
-Edge Preservation (EP), which shows the degree of edge blurring inside the ROI based on the methods discussed in [67].
-Mean to Standard-deviation Ratio (MSR), which measures the mean to the standard deviation of the foreground regions [68].
-The Structural Similarity Index (SSIM), which is a perceptual metric that quantifies image quality impairement caused by processing [69] Figure 5 shows the qualitative performance of the proposed method on two selected slices from two retinal 3D-SDOCT. The 3D implementation of the proposed method is applied on 17 3D data set and MSRs, CNRs, TPs, ENLs and EPs obtained from ten ROIs for each B-scan (similar to the foreground ellipse boxes #2-11 in Fig. 6) are calculated and averaged. The 2D implementation of the proposed method is also evaluated on 17 available 2D and each slice of 17 available 3D data sets from AMD patients. Table 2 shows the mean and std of the CNR, MSR, ENL, TP and EP for different despeckling techniques against our proposed method [19]. The EP and TP values in this table show the ability of the proposed method in edge preserving and maintaining image structures. These measurements have smaller values close to 0 when the edges inside the ROI are more blurred and the image structures are more flattened. The implementation of algorithm in MATLAB requires around 31 minutes of computation time for denoising of each 3D image (with 100 B-scans of size 256×512) using a desktop with an Intel (R) Core i7 CPU and 4 GB of RAM. (15) , K-SVD [38], MSBTD (37) , NWSR (68) , 3D CWDL (5) and Proposed Method.

Original
Tikhonov (30) K-SVD (5) MSBTD (19) 3D CWDL (5) NWSR (68) 2D Implementa-tion of Proposed Method [65] Proposed Method Mean ± STD (CNR) We also computed the SSIM of despeckled image with our proposed method for 17 subjects that their averaged (high-SNR) B-scans are available. Figure 7 shows the local structure similarity map calculated between a denoised image by our algorithm and its corresponding averaged (high-SNR) B-scan (Large values of local SSIM appear as bright pixels). The averaged global SSIM for these 17 subjects was 0.71. In order to see the effect of the type of transform, we substituted the curvelet transform with shearlet transform [70] and denoised each shearlet subband by applying hard thresholding function. Similar to the proposed strategy in the previous section, the threshold is chosen based on k-sigma method [61]. As shown in Fig. 8, some ringing artifacts are appeared in the reconstructed image by thresholded coefficients. Just like our proposed method in the previous section, the thresholded coefficients for each scale and direction are selected as the initial dictionaries in our K-SVD based denoising method. For this reason, each image is decomposed by using four-level shearlet decomposition in which there are l = 3, 3, 4 and 4 numbers of shearing directions at each level, respectively. Figure 9 shows decomposed coefficients in the eight directional subbands (N = 2 l ) of the first level.  In the available implementation of shearlet toolbox [71], the size of the coefficient matrix in each subband is the same as that of the original image. So, by selecting this matrix as the initial dictionary, the computational time of K-SVD denoising is increased and the reconstructed image by applying the inverse shearlet transform to the K-SVD-based denoised coefficients suffer from blurring effdect, as shown in Fig. 10.

Discussion
In this paper, we introduced a new atomic representation for speckle noise removal from 3D OCT images and showed the advantage of the proposed method over other prevalent methods. Using curvelet transform and decomposing the image to different scale and applying the K-SVD dictionary learning in transform domain for each subband will be well representing retinal layers with different thicknesses and orientations. Also selecting the initial dictionary to be variable in size instead of traditional fixed form in K-SVD dictionary learning will effectively represent different structures in the image. At the same time with noise suppression the intraretinal boundaries will be enhanced by using Eq. (16). In comparison to the recently published 2D and 3D complex wavelet based dictionary learning (CWDL) method [5] that uses initial fixed size complex wavelet-based dictionary, the drawback of the proposed method in [5] is its time complexity (it takes more than 5 hours for denoising each 3D OCT image including 50 B-scans of size 256×512). The comparison of the quantitative performance of our algorithm with the proposed method in [5] indicates the similar performance of these methods in CNR (7.81 in our vs. 7.31 in [5]) and SNR (14.36 in our vs. 14.45 in [5]), EP (0.96 in our vs. 0.91). However the TP of the proposed algorithm is higher (0.7534 in our vs. 0.41 in [5]), which indicates the absence of unwanted flattening and edge blurring inside the region of interests. Figure 11 shows the results of our proposed method for sample B-scans correspond to AMD eyes. As shown in this figure, the visual interpretation of the proposed method (e.g., in the intra-retinal layer boundaries) also confirms the superiority of the proposed method in comparison with other OCT denoising methods.
The proposed method does not require high-SNR scans for learning noise-free dictionary or any repeated scans or averaged versions of scans used in other works [18,19]. In addition, Fig. 12 shows the results of applying 3D dictionary learning in the image domain for different values of σ (σ=5, 15,25) in comparison to our proposed method. As shown in this figure, the selection of σ will influence the results of dictionary learning in the image domain.
The threshold T j,l,p in Eq. (12) is determined based on a traditional strategy called k-sigma method. As shown in Fig. 13, by increasing this threshold (k = 2), we suppress more noises as well as thin edges in the image, on the other hand by decreasing this value (k = 0.01) some ring artifact will be appeared in the reconstructed image.
By taking curvelet transform of an image and modification of its coefficients, the light areas in an image will become lighter and dark areas will become darker which in turn will generally increase the contrast of reconstructed image. As shown in Fig. 12, the sigma parameter influence the resulted image by K-SVD algorithm. The larger values of this value results in loss of structural information and over-smoothing of layers (especially RPE and choroidal layers) in the reconstructed image by modified curvelet coefficients. Selecting the initial dictionary to be The denoising results using K-SVD method [41]. (c) The denoising results using the Tikhonov method [33]. (d) The denoising results using MSBTD method [19]. (e) The denoising results using AWBF method [47]. (f) The denoising results using NWSR algorithm [72]. (g) Results of the proposed method. variable in size instead of traditional fixed forms will effectively represent different structures in the image. By increasing the scale of the curvelet coefficients matrix (or reducing in resolution), the block size is also increased while in high resolutions the block size is reduced, resulting in better representation of particular structures of the image.
Decomposition level and the number of scales used in curvelet transform will change the shape of wedge like curvelets at each subband and the size of coefficient matrix will be decreased by increasing the decomposition level. Fine details in an image will be well represented by decreasing the size of coefficient matrix at each scale and rotation. Also reducing the size of coefficient matrix will increase the speed of our proposed K-SVD based denoising algorithm in which the initial dictionary was selected from thresholded coefficient matrix at each scale. Figure 14 shows the effect of decomposition level of curvelet transform in the resulted enhanced image by proposed method.
As shown in Fig. 15, the proposed algorithm (with no changes in parameters) can well afford to suppress the noise in images taken by various OCT imaging systems such as Topcon, Nidek and Cirrus HD-OCT (Carl Zeiss Meditec).

Conclusion
Speckle noise in OCT tomograms results in an incorrect recognition and interpretation of morphological characteristics such as the width of intra-retinal layers, and the shape of structural features such as drusens, macular holes, macular edema, nerve fiber atrophy and cysts (that can be used as imaging markers in the clinical investigation and diagnostics of retinal diseases). So, in order to suppress noise while preserving and enhancing edges and to preserve the geometric properties of the main structures of the image based on exploiting the regularity of edges, we introduced a new 3-D curvelet-based K-SVD algorithm and demonstrated its instrumentality in the speckle reduction of OCT datasets. We discussed the application of dictionary learning along with curvelet transform for denoising normal and AMD 3D SDOCT retinal images. As the curvelet transform gives a sparse representation of objects and is computationally efficient in dealing with geometric features like line and surface singularities, and in order to use multiscale directional properties of this transform, we introduced a new dictionary learning strategy in the curvelet domain. The K-SVD algorithm may fall in local minimums during dictionary updating because of highly non-convex functional terms in its formula. So, instead of redundant traditional dictionaries, we selected our initial dictionary from the thresholded curvelet coefficients that are less affected by noise.
We also modify each curvelet coefficient after updating each coefficient matrix in each scale and direction in order to further enhance intra-retinal layer boundaries and abnormal features. Our proposed method also does not need any high-SNR / repeated / averaged versions of scans for dictionary learning as in some cases there is no access to such scans. Designing 3D dictionaries for 3D OCT images ensures a considerable improvement in the denoising results. In addition, applying 3D curvelet transform and decomposing the full size 3D image to low size matrices (subbands), and denoising these subbands in the curvelet domain by K-SVD-based dictionary learning will significantly reduce the computation time.
We observed that the proposed curvelet-based K-SVD algorithm outperforms other OCT despeckling methods. This method has potential to increase the accuracy of available segmentation methods especially for the automatic identification of abnormalities such as cystoid spaces in 3D OCT data and the resulted image can be used to accurately detect intra retinal layer boundaries.