Wavelet transform fast inverse light scattering analysis for size determination of spherical scatterers

: We present a fast approach for size determination of spherical scatterers using the continuous wavelet transform of the angular light scattering profile to address the computational limitations of previously developed sizing techniques. The potential accuracy, speed, and robustness of the algorithm were determined in simulated models of scattering by polystyrene beads and cells. The algorithm was tested experimentally on angular light scattering data from polystyrene bead phantoms and MCF-7 breast cancer cells using a 2D a/LCI system. Theoretical sizing of simulated profiles of beads and cells produced strong fits between calculated and actual size ( r 2 = 0.9969 and r 2 = 0.9979 respectively), and experimental size determinations were accurate to within one micron.


Introduction
Nuclear size and morphology have been shown to be markers of tissue dysplasia, a precancerous state [1,2]. Optical biopsy techniques have provided a non-invasive approach for early detection of cancer based on quantitative assessment of nuclear morphology [3][4][5][6]. In addition, recent interest in angle-resolved optical coherence tomography (OCT) has improved the sensitivity and specificity of OCT systems as well as demonstrated its application in retinal diagnostic imaging [7,8]. Angle-resolved low coherence interferometry (a/LCI) combines both techniques to provide depth resolved angular intensity distributions of scattered light and offers a promising approach for assessing nuclear morphology in tissues for the detection of epithelial cancers [9,10]. Previous studies have demonstrated the capability of a/LCI in distinguishing dysplastic tissue from healthy tissue with high sensitivity and specificity, through the detection of changes in nuclear morphology, particularly an increase in average nuclear size in the basal layer of the epithelium [11][12][13][14][15][16][17].
Nuclear size can be determined through inverse light scattering analysis (ILSA). In ILSA, light scattering data are compared to a theoretical model to assess structure. For example, the nuclear size distribution can be determined through the analysis of the periodicity of backscattered light amplitude as a function of wavelength [18]. Here, nuclear size is determined through ILSA of the angular scattering profile collected through a/LCI. Two such models, based on Mie theory and T-matrix calculations, have previously been described where experimental data is compared to a lookup table of theoretical forward calculations [19][20][21][22]. Briefly, for Mie theory based ILSA, experimental a/LCI angular scattering profiles are compared to a database of simulated Mie theory scattering profiles for a range of scatterer sizes, and the best-fit curve is determined using chi-squared fitting. This method has been shown to achieve subwavelength precision and accuracy in predicting scatterer sizes but requires significant computational time to traverse through large databases [19]. T-matrix based fitting, on the other hand, requires even a larger database to include more information on the asymmetry and orientation of the scatterer. This comes at the cost of even greater computational power required to traverse this significantly larger database [20][21][22]. Due to the high computational demands required for database lookup, both of these methods are limited in their clinical capabilities in providing real-time imaging and tissue microstructure analysis. Here we propose a wavelet transform based ILSA that offers a faster approach for nuclear size determination.

Continuous wavelet transform
Wavelets are a family of functions that form a local basis and can be represented as: where a and b represent the dilation and shift factor, respectively, relative to the mother wavelet ψ(t) [23]. The continuous wavelet transform (CWT) of a signal f(t) is then defined as: While classical Fourier transform analysis is localized only in frequency, CWT has been traditionally used for time-frequency analysis on non-stationary signals as depicted in Figs. 1(a)-1(c) [24]. This is particularly advantageous in analyzing the angular scattering profile of spherical scatterers where the oscillatory frequency of the angular scattering profile is dependent on the diameter of the scatterer but may be non-stationary throughout the entire profile. Generally, the frequency of oscillations increases in the Mie scattering profile as the diameter of scatterer increases. However, the rate and uniformity of this frequency increase varies depending on the angular range in the scattering profile, such that frequency of oscillation and scatterer size is not always a simple, direct relationship. Thus, local analysis using CWT can potentially provide more information than traditional Fourier transform analysis for size determination of the scatterer. An alternative to the wavelet transform is the short-time Fourier transform (STFT) which is also localized in time and frequency [25]. However, the STFT is constrained by the Gabor limit, creating a tradeoff between time and frequency resolution [26]. Since the oscillation frequency seen in the scattering profile is the parameter used for sizing the scatterer, loss of frequency resolution is particularly detrimental to sizing accuracy and precision.
The CWT energy spectrum was used to assess the local frequency content in the scattering profile and is defined as: where E(a) is the CWT energy at the wavelet dilation factor a, and b 1 and b 2 define the limits of the shift factors of interest. Strong peaks in the CWT occur when the wavelet is dilated to match the size of the oscillatory period of the scattering spectrum. Therefore, higher frequency content is manifested as higher energy at lower dilation factors in the energy spectrum and vice versa, seen in Fig. 1(d). In addition to this primary energy peak, lower amplitude subharmonic energy peaks of the sinusoid frequency are also observed at higher dilation factors above the primary peak. By identifying characteristic features in the CWT of the angular scattering profile for scatterers of varying sizes, a fast wavelet ILSA method can be developed that does not require a database lookup step. Three distinct patterns in the CWT appear at the times of each sinusoid. (d) The energy spectra of the corresponding shift factor ranges of the CWT above. The dilation factor for the primary peak in the energy spectra decreases as the sinusoid frequency increases. Higher dilation factor sub-peaks correspond to subharmonics of the sinusoid frequency.

CWT size characterization
To assess the capability of determining size using CWTs, simulated Mie backscattering profiles were analyzed. The theoretical Mie profiles, from 0° to 25° at a wavelength of 800 nm, were generated (MiePlot v4305) for two samples: monodisperse polystyrene beads in polydimethylsiloxane (PDMS) (n bead = 1.59, n PDMS = 1.41) and a 2.5% size distribution of cell nuclei (n nucleus = 1.45, n cytoplasm = 1.37). The scatterer diameters were varied from 5 to 18 µm at an interval of 0.1 µm. A typical Mie profile for polystyrene beads is shown in Fig. 2(a).
Second order polynomial subtraction was performed on the scattering profile to isolate the oscillatory component, and the CWT was performed using a Morlet wavelet.
The CWT of a typical profile is shown in Fig. 2(b) along with the corresponding energy spectra in Fig. 2(c). Since the frequency of oscillation increases in the Mie scattering profile as the size of the scatterer increases, the dilation factor of the first peak in the CWT energy spectrum (a peak ) can be used to identify the size of the scatterer. It should be noted that the first CWT energy peak was used rather than the maximum because a second larger peak would occasionally occur, corresponding to the subharmonic peaks of the primary oscillation frequency, seen in Fig. 1(d), or to a low frequency oscillation in the scattering profile which has been attributed to the reflection and refraction component of scattering in our previous studies [10]. This component generally does not provide useful information for ascertaining the size of the scatterer, as it often can be influenced by the presence of small Rayleigh scatterers, and thus, the majority of this component is removed from experimental data prior to fitting by subtracting a second order polynomial, effectively isolating the oscillatory diffractive scattering component. In addition, as later discussed in section 2.3, a minimum value threshold was applied when detecting this first peak to prevent detection of high frequency peaks due to noise. This is a similar signal conditioning by low pass filter as used in previous analysis of scattering data using Mie theory [10]. (c) The CWT energy was calculated for varying wavelet dilation within the optimal angular range (black dotted line). The first energy peak wavelet dilation factor (pink line) was used as the sizing metric.
Once a peak was identified for each scatterer diameter, a linear regression was performed between the inverse of the peak energy dilation factor (1/a peak ) and the scatterer diameter. The coefficient of determination, r 2 , was used as a sizing quality metric, and this analysis was performed with varying wavelet shift ranges to determine the optimal angular range of the scattering profile.
The linear relationship between the 1/a peak and scatterer diameter was found to vary depending on the angular range examined. To characterize these results, we divided the angular range into 5° segments. An overview of the r 2 parameter obtained for these angular ranges are shown in Fig. 3. A weaker correlation was found in the lower shift ranges of the CWT for both beads and cells. This is largely due to a higher backscatter refraction and reflection signal at these low angles that overshadows the oscillatory component in the angular range examined. For beads, the sizing performance and energy spectrum peaks remained relatively constant throughout the entire angular scattering spectrum as shown in Fig. 3(a). For 40 shift windows of 5° spaced evenly across the CWT, the r 2 parameter was found to vary from 0.6128 to 0.9963 with an average regression slope of 212.11 ± 18.38. The maximum r 2 value of 0.9969 was found at a CWT shift range of 6.2-10.2° for polystyrene beads. Aside from this primary oscillatory component, secondary oscillations are also present within the angular scattering profile. For example, the orange curve in Fig. 4(a) shows a small energy peak at a low dilation factor corresponding to a low amplitude, high frequency oscillation. Generally, these oscillatory components do not interfere with the fitting algorithm. However, in some angular ranges and scatterer sizes, these secondary components can become dominant. This effect created outliers in a peak for certain sizes that lowered the fitting accuracy, and produced a lower r 2 value. The optimal range examined here (6.2-10.2°), was selected because scatterer sizing was largely unaffected by both the refraction and reflection component and secondary oscillatory peaks in this angular range.
The optimal a peak value was seen to vary drastically between the CWT of beads and that of cells due to the changes in the index of refraction of the scatterer and surrounding medium. From 0 to 20° two different oscillatory frequencies are present in the cell scattering profile depending on the scatterer diameter. For smaller scatterers, a low frequency (high a peak ) oscillation was more dominant while a high frequency (low a peak ) oscillation was more dominant for larger scatterers, as seen in Fig. 3(b). This resulted in relatively poor fitting by a single linear relationship, for starting angles from 0 to 15°, producing r 2 values ranging from −3.194 to 0.9226 (blue circles in Fig. 3(c)). The implication of a negative r 2 value is that the data are better fit by a horizontal line than the model linear relationship. However, when we consider just the low frequency oscillation by fitting only data for scatterers from 5 to 10 µm, we observed the r 2 steadily increasing from 0.45 to 0.9708 as the angular shift range increased from 0 to 25° (red x's in Fig. 3(c)). This suggests that the low frequency oscillation becomes more dominant for higher angular shift ranges in the CWT. At the CWT angular shift window from 20 to 25° this low frequency oscillation is detected for all scatterer diameters, and the maximum r 2 value of 0.9979 was found. Using the maximum r 2 CWT shift ranges (6.2-10.2° for polystyrene beads, 20-25° for cells), a linear relationship was developed for both polystyrene beads and cells which related the detected a peak of the CWT to the physical size of the scatterer.  We found that theoretical fitting of cell nuclei light scattering data using the linear relationship encountered difficulties for nuclear diameters greater than 16 µm. The expected energy peak in the CWT energy plot for these scatterers is dominated by secondary peaks at high dilation factors that are not completely removed with second order polynomial subtraction, as seen in Fig. 4. In these cases, the true dilation factor peak can be erroneously interpreted as a spurious peak arising from high frequency noise. These high dilation factor peaks can be removed by truncating the top portion of the CWT, essentially applying a high pass filter to the data. The maximum CWT dilation factor was truncated from 180 pixels (18°) to 110 pixels (11°). This ensured that all expected energy peaks were greater than 50% of the maximum energy of the spectra so that the expected peak was above the set noise threshold discussed in section 3.1. However, application of this truncation process inadvertently removed the dilation peaks for scatterers with diameter below 7 µm, shown graphically in Fig. 4(b). Thus, two regions are delineated where (1) truncation is not needed to detect the true a peak , corresponding to the range of 5-16 µm scatterer diameters and (2) truncation is needed to detect the true a peak , corresponding to the range of 7-18 µm scatterer diameters. Although this distinction creates a limitation in the usable range of experimental size determination for a given analysis procedure, some general knowledge of the nuclear diameter is often known a priori, which can be used to determine whether truncation is necessary for accurate size determination. For example, the nuclear diameter of the MCF-7 cells used for experimental sizing in this study falls within both regions so truncation was not needed during experimental cell nucleus size determination.  The detected a peak without truncation () and with truncation (x) for varying scatterer size. The truncation limit is show with the red dotted line. The blue and red regions indicate where no truncation and truncation of the CWT detects the true a peak , respectively.

Robustness characterization
In an experimental setting, the exact relative refractive index (n rel ) of the sample is unknown; however, the actual value must be assumed to apply the CWT algorithm. The parameter n rel is defined by the ratio of the refractive index of the scatterer to the refractive index of the surrounding. Using simulated scattering profiles, the CWT sizing algorithm was characterized for robustness to changes in n rel by testing the algorithms over a range of n rel values. For the polystyrene bead simulations n rel was varied from 1.09 to 1.16, compared to the known value of 1.12, and for the cells simulations n rel was varied from 1.037 to 1.073, with an expected value of 1.058.
In addition, the algorithm's robustness to noise was determined. The simulated scattering profile of 10 µm polystyrene beads in PDMS was degraded with varying degrees of additive Gaussian white noise. The addition of noise creates a peak at low dilation factor values in the CWT energy spectrum that interferes with the automatic detection of the first dilation factor peak. To overcome this, a minimum detection threshold relative to the CWT energy spectrum maximum was used during peak detection. The ratio of the signal first energy peak to the noise energy peak was calculated for varying SNR (section 3.1).

Polystyrene phantom size determination
Experimental size determination was performed using a 2D a/LCI system which has previously been described [27][28][29]. A schematic of the system is shown in Fig. 5. Briefly, a spectral domain interferometer is constructed with a Ti:Sapphire laser (λ = 800 nm, Δλ = 35 nm) where the 2D scattering field is overlapped with the reference field. The conjugate Fourier plane is positioned at the entrance slit of the spectrometer (Princeton Instruments, SP-2150) and detected with a Pike F-032 camera (AVT). An angle scanning mirror is used to overlap different planes of the scattering field on the entrance slit, and the entire 2D scattering field is imaged. Fig. 5. 2D a/LCI system schematic. Light from a Ti:Sapphire laser (λ = 800 nm, Δλ = 35 nm) is split into sample and reference arms (red). Angularly scattered light from the sample (green) is imaged onto the scanning mirror and overlapped with the reference field at the spectrometer slit. The angle scanning mirror is used to measure the entire 2D scattering plane.
For the first sample, polystyrene beads (Thermo Fisher Scientific, Microgenics Corporation, Fremont, CA) with diameters of 6, 8, 10, 12, and 15 µm were evenly distributed in a PDMS phantom (n rel = 1.12) and sized. A second sample was prepared with polystyrene beads in glycerol (n rel = 1.14) to experimentally test the robustness of CWT analysis to refractive index change. The glycerol was diluted with water to match the density of polystyrene in order to preserve an even distribution of beads throughout the sample. As shown in Fig. 6, the 1D scattering profile was extracted from the 0° azimuth angle (φ) relative to the incident beam from 0° to 25° (angular resolution of 0.434°) and oversampled to an angular spacing of 0.1°, to match the angular resolution of the simulated Mie profiles. Each scattering profile was analyzed using the CWT algorithm based on the size lookup line described above. Scattering Profile for 10μm beads in PDMS   Fig. 6. 2D a/LCI scattering profile for a 10 µm polystyrene bead phantom. The 1D scattering profile was extracted and used to size the phantom using CWT analysis.

Experimental cell size determination
MCF-7 breast cancer cells were plated on a chambered cover glass and imaged using the 2D a/LCI system at 8 different points. The scattering profile was extracted from the 2D a/LCI scan and the nuclear size was determined using the CWT algorithm. Validation of nuclear sizing was performed using fluorescence microscopy following staining with 4',6'-diamidino-2-phenylindole (DAPI) (Life Technologies, Carlsbad, CA). Fluorescence image analysis was performed using ImageJ (U.S. National Institutes of Health), and the equivalent diameter was averaged from 100 cells to determine the actual diameter of the nuclei. Furthermore, the angular scattering profiles were analyzed using Mie theory based fitting to compare the accuracy of the algorithms, and statistical analysis was performed using t-tests (α = 0.05).

Simulated Mie profile size determination
As presented in Fig. 7, the simulated Mie scattering profiles showed strong correlation between the inverse peak energy dilation factor and the size of the scatterer for both beads and cells (r 2 = 0.9969 and r 2 = 0.9979 respectively). Figure 8 presents the theoretical size determination of simulated cell and bead profiles both with and without error in the n rel assumption. Theoretical fitting using the lookup lines determined the size of the simulated bead profiles with an average error of ± 0.158 µm, and the cell nuclear profiles with an average error of ± 0.181 µm. Varying n rel lowered the sizing performance of beads to an average error of ± 0.412 µm (r 2 = 0.9652) and lowered nuclear sizing performance to an average error of ± 0.354 µm (r 2 = 0.9841). As expected, error in the n rel assumption decreases the sizing performance of the algorithm. However, the additional error is within reason, demonstrating the robustness of the algorithm to changes in n rel .
Generally, fitting accuracy decreases for larger diameter scatterers due to limited angular resolution of the scattering profile, which limits the resolution of the wavelet dilation factor. Simulated sizing can be improved by increasing the angular resolution of the scattering profiles. However, in practice, the angular resolution is limited by the experimental a/LCI system. In addition, the increased length of the scattering profile increases the CWT computation requirements by a factor of n 2 , where n is the length of the scattering profile.  Though additive Gaussian white noise can significantly degrade the scattering profile, the characteristic peaks are still distinguishable in the CWTs shown in Fig. 9(a) and 9(b). Comparison of the energy spectra in Fig. 9(c) shows a constant peak at the original first energy peak dilation factor despite the addition of noise but introduces a spurious peak at lower dilation factor. To determine a proper threshold for first energy peak detection, the ratio of signal to noise for peak CWT energy was found for varying degrees of the SNR of the original signal. Even in the cases where noise exceeds signal intensity (SNR<1) the signal peak CWT energy still exceeds that of the noise until approximately SNR = 0.5. Thus, with a proper peak detection threshold, the CWT sizing algorithm can be robust even with high degrees of noise. In later experimental size determination, the first peak detection threshold was set to 50% of the maximum of the energy spectrum, which roughly corresponds to noise robustness down to a SNR of 1.

Experimental size determination
Analysis of experimental light scattering distributions from polystyrene beads produced accurate fits for both PDMS and glycerol phantoms (r 2 = 0.9971 and r 2 = 0.9764 respectively), as shown in Fig. 10(b). The simulated CWT in Fig. 9(a) closely matches the experimental CWT in Fig. 10(a). All phantoms analyzed produced size determinations within ± 0.58 µm, with an average error of ± 0.147 µm for PDMS phantoms and ± 0.396 µm for glycerol phantoms. As expected, the glycerol phantoms had inferior sizing performance due to the added error in the incorrect n rel assumption.

Cell nuclear size determination
Quantitative image analysis of micrographs of DAPI stained MCF-7 cells, shown in Fig.  11(a), determined the average equivalent nuclear diameter to be 11.8 ± 1.8 µm. From the 8 a/LCI scans of MCF-7 cells, Mie theory based ILSA obtained an average nuclear diameter of 11.89 ± 0.70 µm. CWT size determination of the same scans determined the average nuclear diameter to be 11.71 ± 0.96 µm. As seen in the graph in Fig. 11(b), this is in close agreement with the results obtained by both fluorescence microscopy and Mie theory based ILSA. No statistical difference was found between the predicted nuclear diameters from the three analysis techniques (p>0.05). Furthermore, comparison of the experimental and simulated CWTs in Fig. 11(c) and 11(d) found close agreement in the locations of the characteristic peaks.

Conclusion
CWT based size analysis of simulated and experimental scattering profiles produced accurate size determinations in a fraction of the time (~50 ms) required for data fitting using current Mie theory (~1 second) and T-matrix (~1-2 hours) based algorithms. Despite limitations in sizing accuracy due to the n rel assumption and limited angular resolution, CWT based sizing can already be used for coarse sizing of scatterers. In practice, the CWT sizing results can be used to limit the database search range for Mie theory or T-matrix based algorithms for fine sizing resulting in significantly reduced computation time.
Future work involves developing methods for improving CWT sizing accuracy including size averaging using multiple wavelets or spectral ranges and utilization of multiple angular shift ranges in the CWT for data analysis. In addition, an extended work may study the effects of changing the refractive index of cell nuclei on the CWT to identify features that may provide a second method for detecting dysplasia [30]. Finally, further computational optimization of the CWT sizing algorithm could provide the benefit of real-time in vivo clinical analysis for coarse identification of the most at-risk areas for harboring tissue dysplasia.