Speckle Statistics of Biological Tissues in Optical Coherence Tomography

The speckle statistics of optical coherence tomography images of biological tissue have been studied using several historical probability density functions. A recent hypothesis implies that underlying power-law distributions in the medium structure, such as the fractal branching vasculature, will contribute to power-law probability distributions of speckle statistics. Specifically, these are the Burr type XII distribution for speckle amplitude, the Lomax distribution for intensity, and the generalized logistic distribution for log amplitude. In this study, these three distributions are fitted to histogram data from nine optical coherence tomography scans of various biological tissues and samples. The distributions are also compared with conventional distributions such as the Rayleigh, K, and gamma distributions. The results indicate that these newer distributions based on power laws are, in general, more appropriate models and support the plausibility of their use for characterizing biological tissue. Potentially, the governing power-law parameter of these distributions could be used as a biomarker for tissue disease or pathology.


Introduction
Speckle is a granular pattern seen in signals or images that is caused by the interference of coherent waves with random phases and known or random amplitudes. 1 The study of speckle phenomena has a long history dating back to the time of Isaac Newton, with an increasing multitude of recent applications in optics, radar, and ultrasound. Optical coherence tomography (OCT) and ultrasound are two medical imaging modalities with prominent speckle. For some applications such as highresolution imaging, speckle is considered to be undesirable noise and many studies attempt to eliminate its presence. [2][3][4][5][6][7][8][9][10] Other studies choose to utilize speckle for physical modeling or characterization of tissue samples. [11][12][13][14][15][16] Studies of speckle amplitude statistics in acoustics and optics have led to the usage of various probability density functions (PDFs) such as the Rayleigh distribution, the K distribution, the Rice distribution, gamma distributions, and many others. [17][18][19][20][21] Although OCT is an interferometric technique and ultrasound utilizes time-of-flight measurements, the mathematics describing wave propagation and wave phenomena such as speckle can be applicable to both acoustical and optical imaging modalities. Recently, a new model hypothesized that in normal soft tissue, the dominant scattering elements are cylinders from fractal branching vasculature. [22][23][24][25] As a result of this model containing governing power-law relationships, three new and distinct probability distributions were used to characterize ultrasound speckle in biological tissue. These were the Burr type XII distribution, the Lomax distribution, and the generalized logistic distribution.
In this paper, these distributions are extended to OCT scans of various biological tissues.
Metrics for assessing appropriate regions of interest (ROIs) and evaluating the statistical validity of the distributions are also presented. Finally, a comparison of the new distributions with distributions found in the literature are presented based on various samples.

Modeling of Biological Tissue
Parker et al. 25 derived the first order speckle statistics of biological tissue in ultrasound imaging under the assumptions of weak scattering (using the Born approximation) originating from fractal branching of vasculature represented by cylinders. This derivation leads to power-law functions that dictate the PDFs for the echo amplitude and intensity histograms. The framework for the derivation is also directly applicable to OCT and is summarized here.
First, consider a distribution of scattering structures, from large to small, within the volume.
We assume the distribution follows a power-law distribution in size, with fewer larger scatterers.
A power law distribution and spatial correlation function are also consistent with generalized fractal models. 26 Thus, in scanning a volume, the probability of encountering a scatterer of characteristic dimension is given as 25,27 ( ) = − 1 min ( min ) − (1) where is the power-law coefficient representing the multi-scale nature of the tissue structures, and min represents the minimum size or lower limit of dimensions of the scattering structures that are detectable. Previous studies have shown that variations in the index of refraction within tissues obey a power law down to the sub-micron scale. 28 Thus, this model is appropriate for tissues.
Secondly, we assume each scatterer of dimension produces a detected amplitude or intensity according to the theory of backscattered waves. In general, canonical scattering elements such as spheres and cylinders have been characterized by power series solutions. [29][30][31] The dependence of backscatter on frequency and dimension is complicated, but can be characterized by well-known long-wavelength, short-wavelength, and transition or Mie scattering regimes. 32 However, we have employed a linear first-order approximation covering the sub-resolvable region where ( ) = 0 ( − min ) (2) where both 0 and min are dependent on system parameters such as wavelength and gain, and the lower limit of system detectability, including quantization and noise floor. With this linear monotonic function, the probability of occurrence can be simply mapped into the probability of amplitude or intensity using the probability transformation rule. 23,25

Probability Distributions for Amplitude, Intensity, and Log of Amplitude
Given Equations 1 and 2, the PDF for the histogram of amplitudes is determined as which is a special case of the Burr type XII distribution, and is a normalization parameter. When using this PDF for fitting OCT speckle amplitude, it is convenient to normalize by setting = √⟨ 2 ⟩ , where the denominator represents the root mean square (RMS) value. 18,21 Otherwise, the normalization constant can be incorporated into the parameter.
The PDF for the histogram of intensities is given by which is the Lomax distribution or Pareto type II distribution. When using this PDF for fitting OCT speckle intensity = | | 2 , it is convenient to normalize by setting = ⟨ ⟩ , where ⟨ ⟩ is the mean intensity.
Finally, the PDF for the histogram of log amplitude defined by = ln ( ) is given by which is a transformed version of the generalized logistic type I distribution. Typically, in OCT, it is beneficial to display an image of the log of the amplitude or intensity to better visualize the dynamic range. Thus, this distribution is useful in capturing the histograms of most conventional display values. Furthermore, these three PDFs are all well characterized in the statistics and econometrics literature, with known cumulative distribution functions and moments. 33

The Theoretical Importance of the Exponent Parameter
The above three PDFs all contain a power law or exponent parameter . In power law and related functions, the exponent parameter is a valuable parameter of interest in many applications. 27 In this paper's context, the exponent parameter may be important in tissue characterization.
According to Carroll-Nellenback, et al., 34 a simple fractal distribution of vessels within normal tissues would provide a value of approximately = 2.7. However, the number of scatterers per sample volume, and possibly the index of refraction can increase the exponent parameter. Thus, the exponent parameter may provide information about the tissue's scattering properties and structure, and may serve as a biomarker for differentiating disease and pathology from normal.

Historical Probability Distributions for Comparison
In the literature, there are other PDFs used to model OCT speckle amplitudes and intensities based on consideration of random point scatterers or more complex distributions from radar and other fields. The most prevalent of these is the Rayleigh distribution for speckle amplitude and the exponential distribution for speckle intensity, which are given by: 18,20 The Rayleigh distribution is suitable for the case of a large number of scatterers in a homogeneous medium.
Another distribution used by Weatherbee et al. is the K distribution, which has also been explored in ultrasound and radar. 18, 21 The K distribution is modeled for the case of a small number of scatterers, and the PDFs for the amplitudes and intensities are given by where Γ(•) is the gamma function and (•) is a modified Bessel function of the second kind, and is the shape parameter. where and are two shape parameters. The PDF for intensity can be derived and is given by The following tissues were scanned and analyzed with the SS-OCT system: mouse brain and liver, pig brain and cornea, and chicken muscle all ex vivo as well as human hand (skin) in vivo. In addition, two gelatin phantoms (5% with and without milk for optical scattering) were also scanned and analyzed. The number of A-lines for each scan was either 100, 500, or 1000.

OCT Scans of Various Biological Tissue
Variations in the number of A-lines do not change the speckle statistics, as long as an adequate number of pixels are used (e.g. greater than 1,000 pixels, which is easily covered by a 10 × 100 ROI) over an appropriate field of view (e.g. 5-10 mm). Amplitude, intensity, and log amplitude histograms are generated from specific ROIs in these samples.

Fitting Distributions using Maximum Likelihood Estimation
The distributions specified in Sections 2.2 and 2.4 are fitted to respective amplitude and intensity histograms using maximum likelihood estimation (MLE). The iterative maximization algorithm for MLE and all other analysis aspects were conducted in MATLAB 2020b (MathWorks, Natick, Massachusetts, USA). Curve fitting using an alternative least-squares approach or related methods can result in systemic errors, especially when estimating a power-law or exponent parameter. 35 This is due to a combination of factors such as violations of the underlying assumptions when using these curve fitting methods and variability from histogram binning methods. Therefore, MLE is a more accurate approach to this paper's studies.

Metric for Specifying an Appropriate Region of Interest
Relative uniformity across the ROI is an important, yet difficult to quantify, requirement for assessing appropriate speckle statistics. Attenuation along depth and shadowing effects are two examples of phenomena that would reduce the validity of an ROI for appropriate speckle statistics.
In this subsection, a simulation was performed in order to obtain an appropriate metric for evaluating the selection of an ROI. In Figure 1(a), a simulated OCT B-mode image (500 × 500 pixels) using a Burr type XII distribution with parameters = 10 and = 3 is shown with attenuation and shadowing effects. Figure 1(b) shows how spatial integration profiles vary along the axial and lateral directions, which can be visually correlated to the B-mode image. Specifically, integrations of speckle amplitude are performed along the two directions using intervals of 5 pixels and are normalized so that the two profiles can be displayed on the same scale. Using a Monte Carlo method, 10,000 ROIs of the B-mode image are randomly generated with a minimal size of 15 × 15 pixels to ensure adequate statistics. The resulting speckle statistics were fitted to a Burr type XII distribution using MLE. The ROIs were quantified by the maximum percent change (i.e. ratio of maximum difference over minimum value) in their spatial integration profiles, ∆. Figure   1(c) shows how the estimated parameter ̂ varies as a function of the ROI's ∆. Using this simulation, we established an ad hoc requirement that any appropriate ROI's ∆ should not exceed a 50% change in their spatial integration profiles laterally and axially.

Comparing Multiple Distributions
There are numerous methods for direct comparison of models. These methods include the Kolmogorov-Smirnov (KS) test, the Anderson-Darling (AD) test, the likelihood ratio test (LRT), and the Akaike information criterion (AIC). [35][36][37] The KS and AD test statistics provide p-values and indicate which models are best rejected while simultaneously indicating which model is the best fit. The AD test is similar to the KS test but gives more weight to distribution tails. However, it can be too conservative with estimates of power law functions and requires customized tests for the PDFs described in Sections 2.2 and 2.4. Instead, we follow the rapid KS procedure as outlined in Clauset et al. 35 By performing the KS test on the sample data and simulated sample distributions from the MLE fit, a p-value can be estimated by using a Monte Carlo method. In this case, a very large p-value (> 0.9) indicates a good fit, while a small value indicates that the model is not an appropriate one. When comparing models, the one with the largest KS p-value can be considered the best, although the best model may still not be a good fit.
The second test used is the LRT. The LRT generates a ratio of the log likelihoods of two models for comparison, which we denote by . 35,36 In this case, the LRT is used to compare the distributions in Section 2.4 (Rayleigh/Exponential, K, and Gamma distributions) with the Burr/Lomax distributions. The sign of indicates which distribution is the better fit: if < 0, then the Burr/Lomax distribution is the preferred distribution. A p-value is also estimated with to acknowledge that a true = 0 may fluctuate either positively or negatively. If the p-value is small, then the sign of is a good indicator of which distribution is the better fit. Otherwise for a non-small p-value (> 0.1), the LRT is inconclusive.
The final metric used is the AIC, which is defined as where is the number of estimated parameters and ̂ is the model's likelihood. The AIC provides a relative measure of a model when compared to other models, where a smaller value indicates a better model. 37 These three methods (KS test, LRT, and AIC) are used to determine which distribution fits best for speckle amplitude (Burr type XII, Rayleigh, K, or Gamma) and intensity (Lomax, Exponential, K, or Gamma).

Comparing Multiple Distributions
Figure 4 demonstrates multiple distribution fits for a pig cornea sample. Figure 4(a) shows the Bmode frame for the pig cornea with an appropriate ROI as quantified in Figure 4   Tables S1-S9, but a summary table of the comparison outcomes is provided below in Supplemental   Table S10. In all samples except for one, the best model for amplitude was the Burr type XII distribution and the best model for intensity was the Lomax distribution. The one exception is for pig brain, in which the K distribution was the best fit for both amplitude and intensity data.

Estimating the Exponent Parameter of Multiple Types of Biological Tissues
The Burr type XII, Lomax, and generalized logistic distributions all have two parameters: and . can be seen as a system-dependent (and gain-dependent) normalization parameter, but the exponent parameter is of interest since it could be used to characterize tissue. For all samples, ̂ and its 95% confidence interval were obtained via MLE in Supplemental Figures S1-S9. A summary of the exponent values for various samples types are shown in Table 1, in order of increasing ̂. Note that for all samples except the pig brain, the Burr/Lomax distribution was determined to be the best model in the previous section.

Discussion and Conclusion
In Section 4.1, MLE fits to the Burr type XII, Lomax, and generalized logistic distributions show that these three PDFs fit reasonably well to the amplitude, intensity, and log amplitude histogram data for two sample tissues (mouse brain and liver). The estimated exponential parameter ̂ is also universally consistent due to the framework of MLE. Thus, any one of the three PDFs can be used to estimate the exponential parameter with a reasonable level of confidence.
In Section 4.2, we further demonstrate the merits of using these PDFs (Burr/Lomax), as they are statistically compared with conventional PDFs described in the literature (Rayleigh/exponential, K, and gamma). Out of nine total samples, eight of them demonstrated that Burr/Lomax are indeed the best fits for the amplitude and intensity histogram data. In the one exception (pig brain), the differences between the Burr/Lomax fits and the K fits were statistically significant but relatively small. Another trend that was noticed was the fact that if the Burr/Lomax estimated a relatively high ̂ (~ 6), then the K and gamma fits were relatively closer to the Burr/Lomax fits, and they could also be a reasonable fit to the data (this can be verified by checking that the KS p-values are large