Cluster-based filtering framework for speckle reduction in OCT images

Abstract: Optical coherence tomography (OCT) has become a popular modality in dermatology due to its moderate resolution and penetration depth. OCT images, however, contain a grainy pattern called speckle. To date, a variety of filtering techniques have been introduced to reduce speckle in OCT images. However, further improvement is required to reduce edge smoothing and the deterioration of small structures in OCT images after despeckling. In this manuscript, we present a novel cluster-based speckle reduction framework (CSRF) that consists of a clustering method followed by a despeckling method. Since edges are the borders between adjacent clusters, the proposed framework leaves the edges intact. Moreover, the multiplicative speckle noise can be modeled as additive noise within each cluster. To evaluate the performance of CSRF and demonstrate its generic nature, one clustering method, k-means (KM), and two pixelwise despeckling algorithms, the Lee filter (LF) and the adaptive Wiener filter (AWF), are used. The results indicate that CSRF significantly improves the performance of the despeckling algorithms. These improvements are evaluated on in vivo images of healthy human skin using two numerical assessment measures: signal-to-noise ratio (SNR) and structural similarity index (SSIM).


Introduction
Optical coherence tomography (OCT) is an optical imaging modality comparable to ultrasound imaging, except that OCT uses light while ultrasound uses sound waves [1,2]. OCT is used for high-resolution cross-sectional imaging and is based on low-coherence interferometry [3]. The interferometry relies on the temporal and spatial coherence of optical waves that are backscattered from the tissue [4]. If the central wavelength of the light source is equal to or larger than the scattering compartments within the sample under investigation, the interference of the reflected light with different amplitudes and phases generates a grainy texture in the image called speckle. Speckle degrades the quality of OCT images and conceals clinically important features [5]. By suppressing the speckle, the quality of the images is improved and the diagnostically relevant features become more visible.
Methods for speckle reduction are divided into two main categories: hardware-based methods and software-based methods [6]. The main hardware-based speckle reduction methods are compounding techniques [7,8]. It has been shown that averaging reduces the noise by a factor of √N, where N is the number of B-scan images averaged, provided that the images are sufficiently uncorrelated [9]. In 2012, Szkulmowski et al. proposed a shifting beam method that has been utilized for speckle reduction of synthetic aperture radar (SAR), ultrasound, and OCT images [6]. In this method, scan beams are shifted orthogonally to both the light beam propagation direction and the lateral scanning direction. The images are then averaged. In another method, introduced by Wang et al. in 2013 [10], the probe beam is decentered from the pivot of the scanning mirror to create multiple images that are finally averaged to obtain a single enhanced image [11][12][13].
Software-based approaches (also called digital filters) process the acquired images offline [14][15][16][17]. Total variation (TV) [18] and the block matching and 3D filter (BM3D) [19] are two popular despeckling methods. TV estimates new pixel values by minimizing the amount of variation in the image, ignoring the small-scale anatomical structures in the image. Expanding the idea of TV, Wu et al. in 2015 [20] estimated the despeckled image based on local statistics of the speckle. In 2004, Allende et al. proposed a despeckling method that works by outlier detection in local patches followed by cluster analysis within each patch [21]. Each pixel in the patch is then assessed as a normal pixel or an outlier. In the end, outlier pixels are eliminated while the normal pixels are left with minimal changes. While this method yields adequate images, it does not guarantee edge preservation. Lee [22] developed a local linear minimum mean square error filter, also known as the Lee filter (LF), which is a locally adaptive estimation of the Wiener filter (WF). Under the assumption of an additive noise model, the filter works pixel-wise and estimates the new pixel values based on local statistics. Although the filter successfully reduces noise, it suffers from edge smoothing effects [23]. To avoid edge smoothing, the filter needs to estimate the local statistics in an edge-aware fashion. Jin et al. [23] proposed an adaptive Wiener filter (AWF) that estimates the despeckled pixel values in a way that avoids over-smoothing of the edges. Assuming that the image and the speckle noise are stationary Gaussian processes, they model the image and the noise with the nonstationary mean and nonstationary variance model [24,25]. In this method, each pixel value is estimated based on the local mean and local variance [26]. Although the local mean and variance are determined adaptively, this method still suffers from edge smoothing, as it is unable to detect the edges effectively. In 2007, Ozcan et al. discussed several digital filtering methods to decrease the speckle in OCT images [26]. The authors implemented six digital filtering methods: the enhanced LF [27], the hybrid median filter [28], the Kuwahara filter, wavelet filtering [29], methods based on artificial neural networks [16,17,30], and AWF [31]. From the comparison of the obtained results, they concluded that the enhanced LF and the WF improve the signal-to-noise ratio (SNR) and the quality of the OCT images.
The major challenge of current speckle reduction methods is the deterioration of small structures and edge smoothing in the image. The goal of a speckle reduction algorithm is to deconvolve the noise from the original image [4]. Although some algorithms find the optimum solution to the deconvolution problem, the original formulation of these algorithms requires the power spectrum of the noisy image as well as the gold standard (GS) image, which are not available in practice. Practical versions of WF (e.g., LF and AWF) are developed under the assumption that the noise model is additive. This assumption does not apply to multiplicative speckle noise in OCT images. Moreover, LF and AWF assume the noise and image to be locally stationary. This assumption is only valid in homogenous regions and is not valid around the edges. To restore the edges in the despeckled image, the filtering algorithm needs to detect the edges effectively and avoid applying the filter across them. Most current algorithms have not been successful in this regard.
In this study, we developed a cluster-based speckle reduction framework (CSRF) to prevent edge smoothing and small structure deterioration by despeckling in a cluster-wise fashion. For this purpose, we first detect the edges by clustering pixels in the OCT image (using conventional clustering techniques). A despeckling method (i.e., adaptive filtering) is then applied to pixels from individual clusters to preserve the edges. Because the filtering only smooths the clusters and not the cluster edges, the borders of the clusters are enhanced, and a pattern similar to gray level quantization may be created in the image; this pattern can easily be removed by mean filtering. The framework is generalizable to any combination of clustering and despeckling methods. Since this study presents a general framework for OCT despeckling, the experiments focus on demonstrating the effectiveness of this new paradigm.

OCT sys
Taking the logarithm from Eq. (1), the multiplicative noise is converted to an additive noise: The Taylor expansion of ( ) where $X_0$ and $S_0$ refer to Integrating Eq. (3) with Eq. (4), we obtained: This results in: Eq. (7) is obtained: If the cluster is a homogenous region with similar pixel values, then $a$ and $b$ are considered equal for all $(i_0, j_0)$ in the cluster. Intuitively, multiplicative noise changes in proportion to the pixel intensities. In other words, higher pixel intensities correspond to higher intensity noise. In a homogenous region where the intensity variation is negligible, the multiplicative noise can be considered uncorrelated with the pixel intensities and locations. From Eq. (7) we can see that the multiplicative form of the noisy image (Eq. (1)) can be approximated by an additive model. Therefore, all the despeckling methods developed based on an additive noise model (e.g., LF and AWF) can effectively be used in CSRF.
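As a hedged illustration of why a multiplicative model can be approximated by an additive one inside a homogenous cluster, the following sketch assumes that Eq. (1) has the form $Y = XS$ (image $X$, speckle $S$) and linearizes the product directly about a reference point $(X_0, S_0)$ to first order. The symbols $a$, $b$, and $c$ are chosen to match the text; this is one possible route under that assumption and not necessarily the exact derivation of Eqs. (2) to (7).

```latex
\begin{align}
Y &= X S \\
  &\approx X_0 S_0 + S_0\,(X - X_0) + X_0\,(S - S_0) \\
  &= \underbrace{S_0}_{a}\,X \;+\; \underbrace{X_0}_{b}\,S \;+\; \underbrace{(-X_0 S_0)}_{c}
\end{align}
```

The neglected term is $(X - X_0)(S - S_0)$, which is small when intensities vary little within the cluster; in that case $a$, $b$, and $c$ are approximately constant over the cluster and the model is additive in $S$.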
Locally adaptive filtering methods (e.g., LF and AWF) assume the image and noise to be locally stationary. This assumption is only valid in regions with homogenous optical properties [35]. Stationarity is not preserved around the edges, since both the image and the multiplicative speckle noise change rapidly near the edges [22,23,35]. In CSRF, on the other hand, each cluster is treated as an individual image with homogenous optical properties. Since individual clusters contain no edges, the image and noise can be regarded as locally stationary within each cluster.

Clustering
We used an unsupervised clustering method to cluster pixels in the OCT images. Figure 2(b) illustrates the clustering algorithm. In the clustering algorithm, each pixel is considered a data point with two features: pixel intensity (PI) and attenuation coefficient (AC). The AC is estimated for each pixel in the feature extraction phase. The data points are then clustered by the k-means (KM) algorithm. The clustering refinement step filters the clustering results to eliminate small clusters. The details of the KM clustering algorithm are beyond the scope of this study, and we provide only a brief description of the method. We refer readers to references [36,37] for a more detailed description of the algorithm.
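The clustering stage described above can be sketched as follows. This is a minimal illustration in plain Python, assuming each pixel has already been converted to a two-element feature vector [PI, AC]; the function name `kmeans`, the random initialization from data points, and the stopping rule are illustrative choices, and the refinement step that removes small clusters is not shown.

```python
import random

def kmeans(points, n_clusters, n_iter=50, seed=0):
    """Cluster feature vectors (e.g. [PI, AC] per pixel) with plain k-means."""
    rng = random.Random(seed)
    # Initialise the centers with randomly chosen data points (an assumption).
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(n_clusters),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        converged = new_labels == labels
        labels = new_labels
        if converged:
            break
    return labels, centers
```

In practice the two features would likely be rescaled to comparable ranges before clustering, since the Euclidean distance is otherwise dominated by the feature with the larger numeric range.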

Feature extraction
Each pixel is assigned a set of features, i.e., PI and AC, to describe its optical properties. AC was estimated using the approach of Vermeer et al. [38]. Equation (8)   By including pixel locations in the feature set, the clustering method is forced to be sensitive to the position of pixels during clustering. This is in contrast to the goal of clustering in CSRF, which is to detect the edges of homogenous regions. Moreover, column positions in OCT images are of no significant value for clustering, since regions with similar optical properties are normally stretched along the imaging surface. Likewise, the row position for thin and curved layers does not provide sufficient discriminability among pixels from different layers or regions.
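For illustration, the AC feature can be sketched with the commonly cited depth-resolved estimator associated with Vermeer et al., $\mu[i] \approx \frac{1}{2\Delta}\log\left(1 + I[i] / \sum_{j>i} I[j]\right)$, where $\Delta$ is the axial pixel size. The sketch below assumes this form, linear-scale (not logarithmic) intensities, and a hypothetical function name; the form of Eq. (8) used in this study may differ.

```python
import math

def attenuation_coefficients(a_line, pixel_size):
    """Depth-resolved attenuation estimate for one A-line of linear intensities.

    a_line[0] is the shallowest pixel; pixel_size is the axial pixel size.
    """
    n = len(a_line)
    mu = [0.0] * n
    tail = 0.0  # sum of intensities strictly deeper than the current pixel
    for i in range(n - 1, -1, -1):
        if tail > 0.0:
            mu[i] = math.log(1.0 + a_line[i] / tail) / (2.0 * pixel_size)
        # The deepest pixel has no signal below it, so its estimate is left at 0.
        tail += a_line[i]
    return mu
```

Applying the function to every column of a B-scan gives one AC value per pixel, which can then be paired with the pixel intensity to form the two-element feature vector.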

K-means clustering
The K-Means method works by assigning a cluster center to each cluster $i$, $m_i^{(k)}$, which is the mean of the data points corresponding to that cluster. The algorithm starts ($k = 0$) with $M$ randomly initialized cluster centers, $m_i^{(0)}$. Next, it iterates between the assignment step and the update step. In the assignment step, a data point is assigned to one of the clusters according to its distance to cluster center ( )   are then normalized so that they sum to 1. To transform the multiplicative speckle noise into additive noise when applying the original LF and AWF, we take the logarithm of the OCT image if the OCT image is not already logarithmic. This step is omitted when these filtering methods are used in CSRF.
To incorporate the Lee filter into CSRF, the pixels in where ( ) The noise variance, $\sigma_n^2$ in Eq. (11), is computed for each cluster individually. Estimating a different noise variance for each cluster accounts for the correlation between noise and pixel intensity, which is due to the multiplicative nature of speckle noise. In the remainder of the manuscript, CSRF-LF refers to LF incorporated into CSRF.
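A minimal sketch of the cluster-wise idea, assuming the standard Lee estimate $\hat{x} = m + \frac{\max(v - \sigma_n^2, 0)}{v}(y - m)$ with the local mean $m$ and variance $v$ computed only from window pixels that share the central pixel's cluster label. The function name, the plain-list image representation, and the `noise_var` dictionary (one noise variance per cluster label, however it is estimated) are assumptions for illustration.

```python
def csrf_lee(image, labels, noise_var, half=4):
    """Lee filter whose local statistics use only same-cluster neighbours.

    image, labels: 2-D lists of equal shape; noise_var: dict label -> variance.
    half=4 corresponds to a 9 x 9 window.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(rows):
        for j in range(cols):
            lab = labels[i][j]
            # Collect window pixels that belong to the same cluster.
            vals = [
                image[p][q]
                for p in range(max(0, i - half), min(rows, i + half + 1))
                for q in range(max(0, j - half), min(cols, j + half + 1))
                if labels[p][q] == lab
            ]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            # Gain is 0 where local variance is at or below the noise level
            # and approaches 1 where signal variance dominates.
            gain = max(var - noise_var[lab], 0.0) / var if var > 0 else 0.0
            out[i][j] = mean + gain * (image[i][j] - mean)
    return out
```

Because pixels from other clusters never enter the local statistics, a sharp boundary between two clusters is not averaged across, which is the edge-preserving behaviour the framework relies on.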

Adaptive Wiener filter
Jin et al. proposed a modified version of LF and called it AWF [23]. In this approach, different weights are assigned to pixels in where $a$ and $\epsilon^2$ are parameters of the filter, and $K(i, j)$ is the normalization factor such that all weights sum to 1. Moreover, the weight of the central pixel is set to zero. To incorporate AWF into CSRF, we assign zero weight to pixels whose cluster label differs from that of the central pixel. In the remainder of this paper, CSRF-AWF refers to AWF incorporated into CSRF. Algorithm 1 provides a detailed description of the algorithm used in CSRF.
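The weighting scheme can be sketched for a single pixel as follows. The weight form $w \propto 1 / (1 + a\,\max(\epsilon^2, \Delta^2))$, with $\Delta$ the intensity difference to the central pixel, is an assumption based on the description of $a$, $\epsilon^2$, and $K(i, j)$ above; the function name, default parameter values, and the fallback for pixels with no usable neighbour are likewise illustrative.

```python
def csrf_awf_pixel(image, labels, i, j, noise_var, a=1.0, eps2=1.0, half=4):
    """Cluster-aware adaptive Wiener estimate for the pixel at (i, j)."""
    rows, cols = len(image), len(image[0])
    centre, lab = image[i][j], labels[i][j]
    vals, wts = [], []
    for p in range(max(0, i - half), min(rows, i + half + 1)):
        for q in range(max(0, j - half), min(cols, j + half + 1)):
            if (p, q) == (i, j) or labels[p][q] != lab:
                continue  # zero weight: central pixel or a different cluster
            diff2 = (centre - image[p][q]) ** 2
            vals.append(image[p][q])
            wts.append(1.0 / (1.0 + a * max(eps2, diff2)))
    if not wts:
        return centre  # no usable neighbour; leave the pixel unchanged
    total = sum(wts)  # plays the role of 1 / K(i, j)
    mean = sum(w * v for w, v in zip(wts, vals)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(wts, vals)) / total
    gain = max(var - noise_var, 0.0) / var if var > 0 else 0.0
    return mean + gain * (centre - mean)
```

Skipping pixels with a different label is equivalent to giving them zero weight before normalization, so the remaining weights still sum to 1 after division by `total`.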

Mean filtering
In the final step of CSRF, a mean filter with a 3 by 3 window is applied to the despeckled images to remove a pattern that we call the quantization pattern. This pattern is generated when all the cluster edges are left completely intact. Although some of the cluster edges correspond to the edges of tissue layers, there are cluster edges that do not represent clinically significant edges. These less significant edges are responsible for the layered pattern in the output images (i.e., the quantization pattern), and they are smoothed by the mean filter. We chose a small window size for the mean filter to avoid over-smoothing the more significant edges.

Gold standard denoised image
In order to evaluate the performance of CSRF, we need to compare its denoising performance with a state-of-the-art denoising technique. One of the most straightforward approaches for OCT despeckling is B-scan averaging. In this study, the gold standard images are generated by averaging 170 successive scans from the same site. Due to movement artefacts, it is necessary to register all images to a reference image before averaging. In this study, the reference image is chosen arbitrarily from the 170 B-scans (see Fig. 4(b) for an example), and the other images are registered to it using the enhanced correlation coefficient registration algorithm [39]. The registered B-scans are then averaged (see Fig. 4(a) for an example of a gold standard B-scan). It is important to note that the averaging approach, while yielding appealing results, is computationally expensive and requires a longer acquisition time (with minimal movement), and is therefore not a practical solution.

Denoising assessment metrics
In order to compare the results from CSRF with the GS denoising approach, we used two quantitative assessment metrics: signal-to-noise ratio (SNR) and structural similarity index (SSIM). The equations defining these quality metrics are provided in Eq. (14) and Eq. (15), respectively.
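PSNR compares the signal of the OCT image to its background noise [40]. The SSIM score measures image quality based on the structural similarity between the GS and despeckled images [41]:
$$\mathrm{SSIM}(I_{GS}, \hat{I}) = \frac{(2\mu_{I_{GS}}\mu_{\hat{I}} + C_1)(2\sigma_{I_{GS}\hat{I}} + C_2)}{(\mu_{I_{GS}}^2 + \mu_{\hat{I}}^2 + C_1)(\sigma_{I_{GS}}^2 + \sigma_{\hat{I}}^2 + C_2)} \quad (15)$$
where $I_{GS}$, $I$, and $\hat{I}$ are the GS, noisy, and estimated (despeckled) images, respectively; $\mu$ and $\sigma^2$ denote mean and variance, $\sigma_{I_{GS}\hat{I}}$ is the covariance, and $C_1$ and $C_2$ are small stabilizing constants.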
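As a rough illustration of the SSIM metric, the sketch below evaluates the standard SSIM expression over the whole image as a single window. The constants $C_1 = (0.01L)^2$ and $C_2 = (0.03L)^2$, with $L$ the dynamic range, are the commonly used defaults and are an assumption here, as is the single-window simplification; published SSIM scores are usually averaged over local windows.

```python
def global_ssim(ref, est, dynamic_range=255.0):
    """Single-window SSIM between two equally sized images given as flat lists."""
    n = len(ref)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_r = sum(ref) / n
    mu_e = sum(est) / n
    var_r = sum((r - mu_r) ** 2 for r in ref) / n
    var_e = sum((e - mu_e) ** 2 for e in est) / n
    cov = sum((r - mu_r) * (e - mu_e) for r, e in zip(ref, est)) / n
    return ((2 * mu_r * mu_e + c1) * (2 * cov + c2)) / (
        (mu_r ** 2 + mu_e ** 2 + c1) * (var_r + var_e + c2)
    )
```

Here `ref` would be the flattened GS image and `est` the flattened despeckled image; identical inputs give a score of 1, and the score decreases as the structure of the two images diverges.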

Results and discussion
To evaluate the performance of CSRF, OCT skin images were acquired from fourteen different body sites of a healthy, 25-year-old, male volunteer. The OCT machine is FDA approved. The institutional review board at Wayne State University (Independent Investigational Review Board, Detroit, MI) approved the study protocol, and informed consent was obtained from the volunteer before enrollment in the study. The body sites included anterior neck, buccal region, calves, chest, dorsum of foot, dorsum of hand, ears, forearm, forehead, lips, nose, orbit, palm, and upper back. For each of the fourteen body sites, 170 B-scans were acquired from 4 to 5 different locations, generating 56 data sets for evaluation. The proposed despeckling method was applied to only one of the B-scans in each data set, giving 56 B-scans in total.
The initial experiments showed that the number of clusters significantly affected the performance of the framework. In order to determine the optimum number of clusters, we investigated the effect of the number of clusters in a small subset of the OCT data sets (10 data sets). In these experiments, the quantitative and qualitative results indicated that with small numbers of clusters (fewer than 10) the major edges were preserved, but the small structures deteriorated. We hypothesized that the number of clusters should be large enough that the specific patterns of the small structures can be clustered separately. Our results showed that the optimum performance was achieved with 20 clusters per image. However, we believe that the optimum number of clusters might vary between imaging devices, since each imaging device has a different level of noise. We also believe that the optimum number of clusters is subject-independent, due to the anatomical similarities between subjects. We recommend testing different numbers of clusters when images from other body sites are used.
Our results from 56 data sets showed that, on average, integrating AWF with CSRF improves the SNR and SSIM metrics by 13.63 dB and 0.04, respectively (see Fig. 3(a) and (b)). The results also show an average improvement of 13.88 dB in SNR and 0.05 in SSIM when LF is integrated with CSRF (see Fig. 3(c) and (d)).
When testing different window sizes (i.e., 3 × 3 to 13 × 13), we observed that a window size of 9 by 9 pixels for the despeckling methods yields the best qualitative and quantitative results. Window sizes smaller than 9 by 9 pixels did not effectively improve the quality of the images, while window sizes greater than 9 by 9 pixels smoothed the edges and deteriorated small structures. It is worth mentioning that a window size of 9 by 9 pixels may not be appropriate for OCT images of other organs, e.g., the retina. For the mean filter in the third and final step of CSRF, a 3 × 3 window was used to smooth the borders of the clusters and alleviate the quantization pattern. Note that a window size of 3 × 3 has a negligible effect on major edges, i.e., the ones that are diagnostically important.
As shown in Fig. 4(c), the results of AWF filtering show a noisy pattern around the edges. A checkerboard pattern is also observed throughout the image, which might be due to the inability of AWF's pixel weighting approach (see Eq. (13)) to differentiate between edges and speckle. Moreover, the small structures shown in the red box in Fig. 4(c) are deteriorated. Notably, CSRF-AWF appears to resolve the checkerboard pattern. CSRF-AWF outperforms the original AWF method both in edge preservation and in smoothing the homogenous regions (see Fig. 4(d)). Figure 4(e) shows that CSRF-AWF significantly outperforms AWF in eliminating speckle. The major edges are deteriorated after the application of AWF, while CSRF-AWF leaves significant edges almost intact. As shown in Fig. 4(f), the original LF does not effectively reduce speckle around edges or small structures, and the despeckled image contains small black dots. In the results of CSRF-LF, however (Fig. 4(g)), such patterns have been removed. The images despeckled by CSRF-LF are smoother than those obtained from the original LF.
This improvement, however, is not as significant as that seen in CSRF-AWF compared to AWF. The results in Fig. 4(h) indicate that LF fails to eliminate the speckle pattern (manifested as rapid changes of pixel intensity along the A-line) and deteriorates the edges, while CSRF-LF largely eliminates the speckle and preserves the major edges. Fig. 5 shows despeckling results for several OCT images of skin taken from different body sites of a 25-year-old healthy male.

Computational performance of CSRF
The computational complexity of CSRF is the sum of the complexity of the clustering algorithm and that of the despeckling algorithm. The complexity of the KM clustering method is on the order of $O(iknd)$, where $n$ is the number of pixels, $k$ the number of clusters, $i$ the number of iterations until the clustering converges, and $d$ the dimensionality of the input features. Although CSRF increases the complexity of the despeckling algorithms, the clustering algorithms are highly parallelizable and a number of multicore implementations of these methods are available [42][43][44]. Moreover, there are implementations of the KM method that use graphics processing units [42,43], with which the execution time of the algorithm is significantly reduced.
The computational complexity of the filters (AWF and LF) does not depend on the number of clusters or the structure of the image (i.e., the shape of the clusters). Regardless of these factors, each pixel is processed only once; therefore, the computational complexity of the filtering step depends only on the filtering approach used (in this case LF and AWF). On average, despeckling each image took 5.3 seconds on a personal computer with a Core i7-6700K central processing unit and 32 gigabytes of memory. Processing each image using the averaging method (the approach that we used to obtain the GS images) took 42 minutes on average.

Final remarks and future work
Although speckle (noise) decreases the image quality, blurs the image, and conceals the diagnostically relevant features in OCT images, it carries submicron structural information about the tissue being imaged. Therefore, an intelligent despeckling algorithm is required to make the image more legible while preserving its major features. In this study, we developed a cluster-based filtering framework called CSRF. The framework enhances the performance of despeckling algorithms by applying them to regions with the same optical properties, i.e., clusters. Our results show that images despeckled with the proposed framework are qualitatively and quantitatively improved (see Fig. 3 and Fig. 4). Adding other statistical features of OCT images to the clustering feature vector, including first and higher order statistics, should be explored in future work. Other optical properties, such as the scattering coefficient and anisotropy factor, and geometrical properties, such as shape and thickness, can also be used in the clustering algorithm to improve its performance. We observed that integrating a common despeckling method, e.g., AWF, into CSRF significantly increases its despeckling capabilities (see Fig. 3 and Fig. 4). Visual inspection of the despeckled images supported the results from a qualitative standpoint. Another important finding was the consistent improvement of both quality metrics, PSNR and SSIM, when the original filtering methods were used in CSRF.
Although the scope of this study is merely to introduce CSRF as a despeckling framework and validate its effectiveness, we believe that the improvement of CSRF-based methods over the original methods indicates the great potential of this framework for despeckling. The framework proposes a new viewpoint on OCT despeckling. Future studies could focus on developing new filtering methods that are tailored to CSRF.

Conclusion
In this paper, we proposed a cluster-based speckle reduction framework (CSRF) to reduce speckle in OCT images. The method was successfully tested on 56 sets of in vivo OCT images of human skin. The results showed an average improvement in PSNR and SSIM of 13.88 dB and 0.05, respectively, for the Lee filter, and of 13.63 dB and 0.04, respectively, for the adaptive Wiener filter. The proposed method was tested on OCT images of skin; however, it could readily be applied to OCT images of other sites, e.g., the retina. If a more sophisticated clustering algorithm is used, this framework could help further tailor speckle reduction algorithms to enhance the visibility of a specific characteristic in OCT images.

Funding
Michelson Diagnostics; Wayne State University Startup fund.