Evaluation of hybrid algorithm for analysis of scattered light using ex vivo nuclear morphology measurements of cervical epithelium

Abstract: We evaluate a new hybrid algorithm for determining nuclear morphology using angle-resolved low coherence interferometry (a/LCI) measurements in ex vivo cervical tissue. The algorithm combines Mie theory based and continuous wavelet transform inverse light scattering analysis. The hybrid algorithm was validated and compared to traditional Mie theory based analysis using an ex vivo tissue data set. The hybrid algorithm achieved 100% agreement with pathology in distinguishing dysplastic and non-dysplastic biopsy sites in the pilot study. Significantly, the new algorithm performed over four times faster than traditional Mie theory based analysis.


Introduction
Angle-resolved low coherence interferometry (a/LCI) is an optical biopsy technique which utilizes depth resolved nuclear morphology measurements to identify dysplasia, a precancerous tissue state [1]. The a/LCI instrument collects elastically scattered light at varying scattering angles from the sample and implements low coherence interferometry to depth gate the measurements. This produces an a/LCI scan which yields the angular scattering distribution for specific depths beneath the tissue surface. Analysis of the angular scattering distributions produces depth resolved quantitative nuclear morphology measurements, which have been shown to be a biomarker for tissue dysplasia [2,3]. A typical a/LCI system samples the tissue with a 100 μm beam and achieves a depth penetration of around 300 μm, thus sampling approximately 100 nuclei for each a/LCI scan. The diagnostic capability of a/LCI has previously been demonstrated for epithelial dysplasia including esophageal and colon cancer [4,5].
To extract nuclear morphology information from the angular scattering distribution, inverse light scattering analysis (ILSA) is performed. Two ILSA methods have previously been developed for a/LCI [6-9]. Both of these methods require a database of simulated angular light scattering distributions computed using either Mie theory or T-matrix calculations. Mie theory models the cell nuclei as spherical scatterers while T-matrix models the cell nuclei as spheroids. The angular scattering distribution collected with a/LCI is compared to each profile within the database until the best fit is found, using chi-squared (χ²) as a comparative metric. With Mie theory based ILSA, the nuclear diameter and nuclear density are extracted from the angular scattering distribution, while with T-matrix based ILSA, the nuclear aspect ratio is also determined. However, this comes at the cost of added computational time due to the search of the larger database required for T-matrix based fitting.
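The database search described above can be illustrated with a minimal sketch. The forward model `toy_profile` is a hypothetical stand-in for a Mie theory calculation (it is not Mie theory), and χ² is taken here as a plain sum of squared residuals; the actual a/LCI processing may normalize or weight the residuals differently.

```python
import math

def toy_profile(diameter_um, angles_deg):
    """Hypothetical stand-in for a Mie profile: oscillation frequency grows with diameter."""
    return [math.cos(10.0 * diameter_um * math.radians(a)) for a in angles_deg]

def chi_squared(measured, model):
    """Sum of squared residuals between two equally sampled profiles (assumed metric)."""
    return sum((m - s) ** 2 for m, s in zip(measured, model))

def best_fit_diameter(measured, angles_deg, diameters_um):
    """Exhaustive search: compare the measurement to every database entry."""
    database = {d: toy_profile(d, angles_deg) for d in diameters_um}
    return min(database, key=lambda d: chi_squared(measured, database[d]))

angles = [0.5 * i for i in range(61)]                      # 0 to 30 degrees
diameters = [round(5.0 + 0.1 * i, 1) for i in range(131)]  # 5.0 to 18.0 µm
measured = toy_profile(11.3, angles)                       # noise-free synthetic "scan"
best = best_fit_diameter(measured, angles, diameters)
```

The cost of this search grows with the number of database entries, which is why the larger T-matrix database takes longer to traverse.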
Although these methods have been successful in characterizing nuclear morphology, both methods are limited in their clinical utility due to the long computational time required to traverse through the database. For the current clinical a/LCI system, Mie theory based ILSA requires approximately 5 seconds to analyze a single biopsy site, while T-matrix based fitting can take upwards of 15 minutes. Ideally, processing should take under 1 second to achieve near real-time analysis and optimal clinical utility. Recently, we have proposed a fast ILSA method using the continuous wavelet transform (CWT) of the angular scattering distribution which is able to predict the nuclear diameter without a database traversal step [10]. In the study, the fitting performance of this algorithm was evaluated using polystyrene bead phantoms and in vitro cell samples.
Here, we present a hybrid ILSA algorithm which combines Mie theory and CWT ILSA to analyze a/LCI scans. In addition, we propose a novel application of a/LCI for the detection of cervical dysplasia. Cervical cancer is the fourth most common female malignancy worldwide, with over 500,000 new cases and 250,000 deaths estimated in 2012. Within the United States in 2014, there were an estimated 12,000 new cases and 4,000 deaths, reflecting an age-adjusted incidence of 7.8 per 100,000 women [11]. In the United States, the implementation of cervical cancer screening programs based on Pap smears and colposcopic assessment with acetowhite staining has led to a decline in incidence and deaths from cervical cancer in recent years [12]. However, these techniques are subjective and limited in sensitivity (50-96%), specificity (43-86%), and interobserver reproducibility [13-15]. To address these limitations, several optical techniques have been proposed as potential screening methods for cervical dysplasia detection [16]. In this study, the feasibility of a/LCI as a cervical cancer screening tool is evaluated on ex vivo cervix tissue using the hybrid ILSA algorithm.

Clinical a/LCI Study
A schematic of the clinical a/LCI system is shown in Fig. 1 and described in detail by Zhu et al. [17]. Briefly, an 830 nm superluminescent diode (Superlum, Moscow, Russia) is used in a Mach-Zehnder interferometer geometry. P-polarized light from the sample arm is propagated via a polarization maintaining fiber to the probe tip, where it is delivered to the sample via a GRIN lens as a collimated but off-axis beam. The scattered light from the sample is collected by the GRIN lens and mapped to a coherent fiber bundle (Schott Inc., Southbridge, MA) by scattering angle. The scattered light propagates to the second beam splitter, where it is interfered with the reference field and detected by an imaging spectrograph (SP-2150i, Princeton Instruments, Acton, MA) and CCD (PIXIS: 100, Trenton, NJ), with each channel in the spectrometer detecting light from an individual scattering angle.
The cervical a/LCI study was conducted on ex vivo tissue obtained from 20 participants undergoing cervical cone biopsy (n = 11) or hysterectomy (n = 9) at Duke University Medical Center and was determined to be exempt from IRB review. Following surgical resection, 2-4 optical biopsy sites were selected based on physician guidance and scanned using a/LCI. Each site was marked with India ink, which enabled co-registration with traditional pathological examination of whole tissue sections performed by a gynecologic pathologist. No physical biopsies of the ex vivo tissue were taken for this study. Several a/LCI scan sites could not be correlated with the biopsies following pathological examination, leaving 23 co-registered optical biopsy sites for analysis. Each a/LCI scan was binned into 50 µm depth segments, and the average nuclear diameter and density were determined for the basal layer of the tissue. In addition, histopathology analysis was performed, and each sample was characterized as either healthy (n = 16), metaplastic (n = 4), or dysplastic (n = 3). The a/LCI measured morphology was then compared retrospectively to the pathological classification at each co-registered biopsy site. In addition, each biopsy site was verified by pathology to be ectocervical tissue.

Mie theory ILSA processing
Each a/LCI scan, consisting of the angular scattering distribution for varying depths in the sample, is processed to extract the nuclear scattering profile. A detailed discussion of this methodology is presented in Brown et al. [18]. First, the a/LCI scan is segmented into depth bins, and the angular scattering distribution is averaged within each depth bin for analysis. For the cervical dysplasia study, the a/LCI scans were segmented into 50 µm depth bins from the tissue surface to 300 µm below the tissue surface.
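As an illustration of the depth segmentation step, the following sketch averages angular scattering distributions within 50 µm depth bins down to 300 µm. The data layout (one angular profile per sampled depth) and the function name `bin_by_depth` are assumptions made for this example, not the layout of the clinical software.

```python
def bin_by_depth(depths_um, profiles, bin_width_um=50.0, max_depth_um=300.0):
    """Average the angular profiles that fall within each depth bin.

    Returns one averaged profile per bin, or None for a bin with no samples.
    """
    n_bins = int(max_depth_um // bin_width_um)
    sums = [None] * n_bins
    counts = [0] * n_bins
    for depth, profile in zip(depths_um, profiles):
        if not 0.0 <= depth < max_depth_um:
            continue  # ignore samples outside the analyzed depth range
        b = int(depth // bin_width_um)
        if sums[b] is None:
            sums[b] = [0.0] * len(profile)
        sums[b] = [s + p for s, p in zip(sums[b], profile)]
        counts[b] += 1
    return [[s / counts[b] for s in sums[b]] if counts[b] else None
            for b in range(n_bins)]

# Toy input: five depth samples, two scattering angles each
depths = [10.0, 30.0, 60.0, 210.0, 240.0]
profiles = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
binned = bin_by_depth(depths, profiles)
```

In this toy input, the fifth bin (index 4) corresponds to the 200-250 µm segment.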
Secondly, each angular scattering distribution is low pass filtered. As described by Pyhtila et al., the Fourier transform of the scattering profile is the two point correlation function of the sample [19]. Thus high frequency components in the signal that arise from long range intercellular correlations are suppressed with the low pass filter, isolating the signal arising from subcellular structures, specifically the cell nuclei. The low pass filter frequency cutoff is typically placed after the first observed peak in the two point correlation function.
Following low pass filtering, a second order polynomial is subtracted to isolate the oscillatory scattering component due to diffraction from the reflection/refraction component. The resulting nuclear scattering distribution is then compared to each similarly processed profile within the Mie theory database. Finally, an uncertainty criterion is used to remove nonunique solutions, and the resulting nuclear morphology measurements, as determined from the remaining scans, are averaged to evaluate each biopsy site.
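The filtering and detrending steps can be sketched as follows. This is a minimal illustration under stated assumptions: the low pass filter is implemented as a hard cutoff on discrete Fourier transform index, and the second order polynomial is fitted by ordinary least squares. The function names and the cutoff convention are hypothetical, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math

def low_pass(signal, cutoff_index):
    """Zero every DFT component whose frequency index exceeds cutoff_index."""
    n = len(signal)
    spectrum = [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if min(k, n - k) > cutoff_index:  # min() handles the mirrored negative frequencies
            spectrum[k] = 0.0
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def subtract_quadratic(x, y):
    """Fit y ~ c0 + c1*x + c2*x^2 by least squares and return the residual."""
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i)
    a = [[sum(xi ** (i + j) for xi in x) for j in range(3)] for i in range(3)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(3)]
    for col in range(3):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            a[r] = [arc - f * acc for arc, acc in zip(a[r], a[col])]
            b[r] -= f * b[col]
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        c[i] = (b[i] - sum(a[i][j] * c[j] for j in range(i + 1, 3))) / a[i][i]
    return [yi - (c[0] + c[1] * xi + c[2] * xi * xi) for xi, yi in zip(x, y)]

# Toy check: a fast oscillation is removed, a slow one is kept
n = 32
slow = [math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
fast = [0.5 * math.cos(2 * math.pi * 10 * t / n) for t in range(n)]
filtered = low_pass([s + f for s, f in zip(slow, fast)], cutoff_index=4)

# Toy check: a pure quadratic background leaves a near-zero residual
x = [i / 10 for i in range(11)]
residual = subtract_quadratic(x, [2 - 3 * xi + 4 * xi ** 2 for xi in x])
```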
The a/LCI scans from 23 biopsy sites were used for initial analysis. The nuclear diameter was determined at each depth bin using traditional Mie theory based ILSA for non-dysplastic (healthy and metaplastic) and dysplastic biopsy sites (Fig. 2). A t-test comparing the mean nuclear diameter between non-dysplastic and dysplastic biopsy sites was performed at each depth, and a significant difference was found only in the depth bin from 200 to 250 µm below the tissue surface. Thus, this layer, corresponding to the basal layer of the ectocervical epithelium, was used for further analysis. The scattering distributions at this depth bin which passed the uncertainty criterion were used for subsequent CWT based ILSA analysis.
Fig. 2. Average nuclear diameter determined from Mie theory ILSA of a/LCI scans for non-dysplastic (healthy and metaplastic) and dysplastic biopsy sites at varying tissue depths. * indicates significant difference (p < 0.05) between group means.

CWT ILSA processing
Prior to CWT fitting of the a/LCI data, a CWT ILSA simulation was performed to validate the approach for tissue analysis. The analysis was performed using simulated Mie scattering profiles with the average nuclear morphology parameters determined for the cervical tissue using Mie theory based ILSA (1.044 nuclear density, 2.5% standard deviation size distribution, and scatterer diameters from 5 to 18 µm in 0.1 µm increments). The CWT energy versus wavelet dilation factor (Fig. 3(c)) was then calculated within a specified angular analysis range, and the dilation factor of the first peak in the CWT energy spectrum (a_peak) was determined for each scatterer diameter. A lookup line was then produced by linear regression between the inverse of the peak energy dilation factor (1/a_peak) and the scatterer diameter (Fig. 3(d)). In experimental analysis, a_peak is found in the CWT of each a/LCI scan, and the corresponding scatterer diameter on the lookup line is used to predict the diameter of the cell nuclei. This simulation was repeated for varying angular analysis ranges until an optimal range (23.63°-30.08°) was found, having the highest coefficient of determination (r² = 0.99) between the true diameter and the CWT predicted diameter (Fig. 4(a)).
Experimental fitting of a/LCI data was then performed using the optimal angular analysis range and regression coefficients determined in simulation. The angular scattering profile is first filtered using the same process as Mie theory ILSA. The CWT of the angular scattering profile is calculated, and the nuclear diameter is determined from a_peak in the CWT. When detecting a_peak, a minimum value threshold, set at 50% of the energy maximum, was applied to prevent detection of high frequency energy peaks due to noise. A more detailed discussion of the CWT analysis is given by Ho et al. [10].
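The lookup line and thresholded peak detection can be sketched as below. The simulated relation between a_peak and diameter (a constant of 40.0) is invented purely to exercise the code, the function names are hypothetical, and the search for the "first" peak is assumed to run in order of increasing dilation factor; the wavelet transform itself is not reproduced here.

```python
def fit_lookup_line(inv_a_peak, diameters_um):
    """Least-squares line: diameter = slope * (1 / a_peak) + intercept."""
    n = len(inv_a_peak)
    mean_x = sum(inv_a_peak) / n
    mean_y = sum(diameters_um) / n
    sxx = sum((x - mean_x) ** 2 for x in inv_a_peak)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inv_a_peak, diameters_um))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def first_peak_dilation(dilations, energy, threshold_fraction=0.5):
    """Return the dilation factor of the first local maximum at or above the threshold."""
    floor = threshold_fraction * max(energy)
    for i in range(1, len(energy) - 1):
        is_local_max = energy[i] > energy[i - 1] and energy[i] >= energy[i + 1]
        if is_local_max and energy[i] >= floor:
            return dilations[i]
    return None  # no interior peak passed the threshold

# Simulation stage (toy data): assume a_peak = 40 / diameter for illustration only
sim_diameters = [5.0 + 0.1 * i for i in range(131)]
sim_inv_a_peak = [d / 40.0 for d in sim_diameters]
slope, intercept = fit_lookup_line(sim_inv_a_peak, sim_diameters)

# Experimental stage (toy energy spectrum): the small peak at dilation 2 is below 50%
dilations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
energy = [0.1, 0.3, 0.2, 0.9, 1.0, 0.4]
a_peak = first_peak_dilation(dilations, energy)
predicted_diameter = slope / a_peak + intercept
```

Fitting the line once in simulation and reusing its coefficients is what lets the diameter prediction avoid a database traversal.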
The a/LCI data for cervical tissues which passed the uncertainty criterion for Mie theory ILSA processing were analyzed using the CWT algorithm. Using this procedure alone, poor fitting agreement was observed between Mie theory and CWT ILSA. This is primarily due to a high frequency component in the angular scattering distribution which arises from long-range intercellular correlations. Despite the low pass filter, a residual high frequency component remains within the a/LCI data. With Mie theory based fitting, the optimal solution can still be found. With CWT ILSA, however, this high frequency, long range correlation produces a low dilation factor peak in the CWT that confounds the sizing algorithm. As a result, the CWT ILSA consistently overpredicts the nuclear diameter.
To compensate for this high frequency component, the filter cutoff should be decreased for the CWT analysis. It was determined that by iteratively shifting the filter cutoff, an optimal filter position can be found where the Mie theory and CWT ILSAs produce good agreement (Fig. 4(b)). However, this method requires a priori knowledge of the nuclear size to find the optimal filter cutoff position. Thus, the CWT ILSA algorithm can only be used retrospectively, i.e. after completing a Mie theory-based analysis, and is not viable as an independent clinical fitting algorithm. To overcome this issue, a hybrid fitting algorithm was developed that combines the speed of the CWT and the accuracy of Mie theory ILSA.

Hybrid fitting algorithm
An overview of Mie theory based ILSA, CWT based ILSA, and the hybrid fitting algorithm is shown in Fig. 5. In all three algorithms, the a/LCI scattering distribution is initially filtered at the low pass filter cutoff frequency used for Mie theory ILSA (f_Mie). With Mie theory based and CWT based ILSA, this filtered distribution is used directly in the fitting algorithm to predict the nuclear diameter. However, with the hybrid algorithm, an iterative approach is used to determine the optimal low pass filter cutoff position during an initial coarse fitting step using the CWT ILSA algorithm. The distribution is initially filtered at f_Mie, and the nuclear diameter is determined using CWT ILSA. The χ² goodness of fit is determined between the a/LCI measured distribution and the Mie theory distribution for the CWT predicted nuclear diameter (D_CWT). This process is repeated with the low pass filter cutoff decreased incrementally to each discrete Fourier transform frequency position below f_Mie to find the best fit filter cutoff position (f_CWT), which produces the lowest χ² value. A low pass filter with this filter cutoff is then applied to the a/LCI scattering data, and fine fitting within a ±1 µm range around D_CWT is performed using Mie theory based ILSA.
Fig. 5. For the hybrid algorithm, the a/LCI scattering distribution is initially filtered at f_Mie. Coarse fitting is performed using CWT ILSA to determine the optimal CWT filter cutoff position f_CWT and a coarse diameter prediction D_CWT. Fine fitting is performed using Mie theory based ILSA within a ±1 µm range around D_CWT.
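The coarse and fine fitting loop described above might be organized as in the following sketch. All four callables passed to `hybrid_fit` are placeholders supplied by the caller, and the toy stand-ins at the bottom are invented solely to exercise the control flow; none of them reproduces the actual filter, wavelet analysis, or Mie calculation. The 0.1 µm fine step is likewise an assumption.

```python
def hybrid_fit(scan, cutoffs, low_pass, cwt_diameter, mie_profile, chi2,
               step_um=0.1, half_range_um=1.0):
    """Coarse CWT fit over filter cutoffs, then fine Mie fit around D_CWT.

    cutoffs should start at f_Mie and continue through each lower DFT position.
    Returns (f_cwt, d_cwt, d_fine).
    """
    best = None
    for cutoff in cutoffs:
        filtered = low_pass(scan, cutoff)
        d_cwt = cwt_diameter(filtered)              # coarse diameter prediction
        score = chi2(filtered, mie_profile(d_cwt))  # goodness of fit at this cutoff
        if best is None or score < best[0]:
            best = (score, cutoff, d_cwt, filtered)
    _, f_cwt, d_cwt, filtered = best

    # Fine fitting: search only within +/- half_range_um of the coarse prediction
    n_steps = int(round(half_range_um / step_um))
    candidates = [d_cwt + k * step_um for k in range(-n_steps, n_steps + 1)]
    d_fine = min(candidates, key=lambda d: chi2(filtered, mie_profile(d)))
    return f_cwt, d_cwt, d_fine

# Toy stand-ins (hypothetical): the coarse estimate overpredicts more at higher cutoffs
def toy_low_pass(scan, cutoff):
    return {"true_d": scan["true_d"], "cutoff": cutoff}

def toy_cwt_diameter(filtered):
    return filtered["true_d"] + 0.25 * (filtered["cutoff"] - 4) + 0.32

def toy_mie_profile(diameter_um):
    return diameter_um

def toy_chi2(filtered, model):
    return (filtered["true_d"] - model) ** 2

f_cwt, d_cwt, d_fine = hybrid_fit({"true_d": 11.3}, [8, 7, 6, 5, 4, 3],
                                  toy_low_pass, toy_cwt_diameter,
                                  toy_mie_profile, toy_chi2)
```

With these settings the fine step evaluates 21 candidate diameters, compared with 131 for a full 5-18 µm search at the same spacing.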

a/LCI diagnostic accuracy
Consistent with previous a/LCI clinical studies [4,5], the basal layer of the epithelium (approximately 200-250 µm beneath the tissue surface) was determined to be the most diagnostically useful layer. Mie theory ILSA was used to determine the nuclear size and density for the 23 biopsy sites at this depth. The clinical course of action is identical for healthy and metaplastic biopsy sites, i.e., no intervention is required. Thus, for statistical analysis, the biopsy sites for these two groups were combined into a single non-dysplastic group (healthy and metaplastic sites). A Student's t-test was performed and no significant difference (p = 0.21) was found between the mean nuclear diameter of healthy and metaplastic sites, justifying their treatment as a single group. The non-dysplastic group was compared to the dysplastic group using Student's t-test, and a significant increase (p = 0.002) in average nuclear diameter was observed, with a mean increase of 2.54 µm (Fig. 6(b)). A classification decision line was set at 10.54 µm, the average of the median nuclear diameters of the dysplastic and non-dysplastic biopsy sites. Using this decision line, the biopsy sites can be classified retrospectively with 100% sensitivity, 90% specificity, and 91.3% accuracy (Fig. 6). Also consistent with previous a/LCI studies, a high negative predictive value was seen (NPV = 100%).
Fig. 6. (a) [...] [20]. (b) Average nuclear diameter for non-dysplastic (healthy + metaplastic) and dysplastic biopsy sites.
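The reported figures can be checked against the site counts. The confusion counts below are inferred rather than stated in the text: with 3 dysplastic and 20 non-dysplastic sites, 100% sensitivity implies 3 true positives, and 90% specificity implies 18 true negatives and 2 false positives.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "npv": tn / (tn + fn),
    }

# Inferred counts for Mie theory ILSA with the 10.54 µm decision line
metrics = diagnostic_metrics(tp=3, fn=0, tn=18, fp=2)
accuracy_percent = round(100 * metrics["accuracy"], 1)
```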

CWT algorithm fitting performance
To develop the hybrid fitting algorithm, the CWT and Mie theory ILSA predicted diameters were compared for the cervix light scattering data based on processing the a/LCI distributions with the optimized low pass filter cutoff, f_CWT. Although there was general agreement between CWT and Mie theory prediction at this cutoff, there were a few scans where CWT drastically overpredicted the diameter compared to Mie theory ILSA. This trend was manifested as a systematic offset, or bias, in the Bland-Altman plots for size predictions on an individual scan and biopsy basis (Fig. 7). However, even with this bias, the CWT maintained a high diagnostic predictive value. Using an 11.85 µm decision line, a value determined by adding the 1.31 µm bias to the decision line for the Mie theory ILSA, all non-dysplastic tissue biopsies can be differentiated from dysplastic biopsies.
Fig. 7. Bland-Altman plots comparing CWT ILSA and Mie theory ILSA nuclear diameter prediction for the a/LCI distribution filtered at f_CWT for (a) individual a/LCI scans and (b) biopsy sites. The 11.85 µm decision line can be drawn for CWT based ILSA diameter prediction for 100% predictive accuracy.
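For reference, the Bland-Altman bias is the mean of the paired differences between the two methods. The sketch below computes the bias, the standard deviation of the differences, and the conventional 1.96 SD limits of agreement. The diameter values are invented for illustration, and the limits-of-agreement convention is an assumption about how the plots were drawn. The last line reproduces the shifted decision line from the text.

```python
import statistics

def bland_altman(method_a, method_b):
    """Return bias, SD of the differences, and 95% limits of agreement (a minus b)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical per-site diameters in µm (not data from the study)
cwt_diameters = [10.0, 12.0, 14.0]
mie_diameters = [9.0, 10.5, 12.5]
bias, sd, limits = bland_altman(cwt_diameters, mie_diameters)

# Decision line for CWT ILSA: Mie decision line plus the reported per-biopsy bias
cwt_decision_line_um = round(10.54 + 1.31, 2)
```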

Hybrid algorithm
The bias in the Bland-Altman plot shows a systematic difference between the Mie theory and CWT ILSA. This may be due to CWT ILSA being more vulnerable to the higher degree of noise in tissue data as compared to simulations. Another contributing factor may be the high frequency signal arising from intercellular correlations in the tissue sample. As noted in section 2.3, high frequency noise peaks were rejected using a minimum threshold criterion for a_peak detection. To further reduce the influence of noise, this detection threshold was increased from 50% to 66% of the CWT energy maximum in the hybrid algorithm. In addition, a rejection criterion similar to that used in Mie theory data processing was implemented. If there are multiple peaks in the CWT energy spectrum within 66% of the maximum energy peak, the CWT size prediction is considered a non-unique solution and the a/LCI scan is rejected. Both of these adjustments were implemented in a modified CWT ILSA, termed the hybrid algorithm, which first determines a rough mean nuclear diameter via CWT and then executes a finer size determination using Mie theory across a narrowed range centered about this rough determination.
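The uniqueness criterion can be sketched as follows. "Within 66% of the maximum energy peak" is interpreted here as having an energy of at least 66% of the maximum, which is an assumption about the intended meaning; the function name and the toy spectra are also hypothetical. Only interior local maxima are counted as peaks.

```python
def unique_peak_or_reject(dilations, energy, threshold_fraction=0.66):
    """Return the dilation of the single peak above threshold, or None to reject the scan."""
    floor = threshold_fraction * max(energy)
    peaks = [i for i in range(1, len(energy) - 1)
             if energy[i] > energy[i - 1]
             and energy[i] >= energy[i + 1]
             and energy[i] >= floor]
    if len(peaks) != 1:
        return None  # zero or several qualifying peaks: non-unique solution
    return dilations[peaks[0]]

dilations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
accepted = unique_peak_or_reject(dilations, [0.1, 0.3, 0.2, 0.9, 1.0, 0.4])
rejected = unique_peak_or_reject(dilations, [0.1, 0.8, 0.2, 0.9, 1.0, 0.4])
```

In the second spectrum the peak at dilation 2 reaches 80% of the maximum, so two candidate sizes compete and the scan is discarded.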
The hybrid algorithm predicted diameters demonstrate very good agreement when compared to the Mie theory predicted diameters for the distributions filtered at f_CWT (Fig. 8(a)). Compared to the original CWT ILSA algorithm (Fig. 7(b)), the Bland-Altman bias in predicted diameter decreased from 1.31 µm to 0.49 µm, along with a decrease in the standard deviation from 1.70 µm to 1.06 µm, showing a reduction in the variability between the two approaches. The hybrid algorithm produced similar results for the per scan distribution, with a reduction in bias and standard deviation from 1.14 µm to 0.25 µm and 2.81 µm to 1.76 µm, respectively.
Student's t-test of the hybrid algorithm fitting of per biopsy data demonstrated stronger separation between the dysplastic and non-dysplastic sites compared to Mie theory based ILSA (p = 4 × 10⁻⁶). With the same 10.54 µm decision line used in Mie theory based ILSA, the hybrid algorithm was able to achieve 100% sensitivity and 100% specificity (Fig. 8(b)). Significantly, the hybrid algorithm performed over four times faster, requiring an average of 110 ms of computation per scan compared to 462 ms per scan for Mie theory ILSA. It should be noted that the algorithms were developed and tested using MATLAB and did not implement other optimization techniques such as parallel computing to improve processing times. Thus, these times do not represent the actual processing times of the clinical a/LCI system, but comparing the relative processing times of the algorithms demonstrates the increase in speed of the hybrid algorithm.

Discussion
The final hybrid algorithm developed here significantly reduced a/LCI processing time, resulting in a 4.2 fold average decrease in the processing time per A-scan. This is a result of the coarse CWT fitting step, which significantly reduced the database search range for the Mie theory based ILSA. However, an important consideration is the total time needed for data acquisition and processing. Due to the stronger rejection criterion needed for the modified CWT algorithm, more scans are required per biopsy site to acquire a sufficient amount of data to make a proper evaluation. From the 230 scans taken for the 23 biopsy sites in this study, 82 were rejected (~36%) from Mie theory ILSA processing. With the CWT ILSA rejection criterion, an additional 33 scans were removed from the remaining 148 scans (~22%), resulting in a total rejection rate of 50%.
We calculated the theoretical amount of time required to acquire and process additional scans to account for those lost through the rejection criteria. To achieve 230 passed scans, an additional 130 scans would be required for Mie theory ILSA (360 scans total; 166.3 seconds), while an additional 230 scans would be required for the hybrid algorithm (460 scans total; 50.6 seconds). This yields the same number of passed scans, 10 per biopsy site, for both algorithms and therefore allows a fair comparison of their processing times. Despite the additional scans, the hybrid algorithm would still result in a 3.3 fold improvement in processing time (Fig. 9). Using the processing times determined above, this would translate to 4.62 seconds of processing time required for Mie theory based ILSA and 1.4 seconds of processing time required using the hybrid algorithm. Regardless of potential improvements that could be implemented with optimized coding or improved processing power, this hybrid algorithm already offers a significant improvement to the current clinical system.
Comparison of the results in Fig. 8(b) suggests that the hybrid ILSA algorithm may also offer an improvement in fitting accuracy and diagnostic value. Using the hybrid algorithm, higher specificity and accuracy were achieved, and greater separation (smaller p-value) was observed between the nuclear diameters of dysplastic and non-dysplastic sites. It should be noted that these results are still preliminary and a larger scale study involving more sample points will be required to validate this claim.
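The rejection rates and timing figures quoted in this section follow from simple arithmetic, reproduced below. All inputs are taken from the text; the only assumption is that the 360 and 460 scan totals are the rounded values used in the original calculation (the unrounded Mie theory requirement is about 357 scans).

```python
MIE_MS_PER_SCAN = 462
HYBRID_MS_PER_SCAN = 110

total_scans = 230
mie_rejected = 82
cwt_rejected = 33

mie_passed = total_scans - mie_rejected                       # 148 scans
mie_reject_rate = mie_rejected / total_scans                  # about 36%
cwt_reject_rate = cwt_rejected / mie_passed                   # about 22%
hybrid_reject_rate = (mie_rejected + cwt_rejected) / total_scans

# Scans needed to end up with 230 accepted scans under each criterion
mie_scans_needed = total_scans / (1 - mie_reject_rate)        # about 357, quoted as 360
hybrid_scans_needed = total_scans / (1 - hybrid_reject_rate)  # 460

mie_time_s = 360 * MIE_MS_PER_SCAN / 1000
hybrid_time_s = 460 * HYBRID_MS_PER_SCAN / 1000

per_scan_speedup = MIE_MS_PER_SCAN / HYBRID_MS_PER_SCAN
overall_speedup = mie_time_s / hybrid_time_s
```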
In this small pilot study, the analysis of the biopsy sites demonstrates the feasibility of using a/LCI to characterize cervical epithelium and shows potential for detecting cervical dysplasia. However, the clinical utility of the approach has not been fully validated due to the limited number of samples. In particular, a greater number of measurements from dysplastic sites is needed to better define the range of dysplastic nuclear diameters, study the varying grades of tissue dysplasia, and produce a more reliable clinical decision line.

Conclusion
We have presented an application of a/LCI to the assessment of cervical epithelium for dysplasia. The results of this pilot study demonstrate the feasibility of using a/LCI to detect cervical dysplasia based on Mie theory ILSA. A new hybrid algorithm, based on CWT coarse fitting and Mie theory ILSA fine fitting of a/LCI data, was shown to offer a significant improvement in computational time compared to the Mie theory ILSA. In addition, we demonstrated that the diagnostic capability of a/LCI was preserved when using the hybrid algorithm.
Future work will focus on in vivo validation of the clinical a/LCI system for the detection of cervical dysplasia and on studying the capability of a/LCI to distinguish grades of cervical intraepithelial neoplasia. A more in-depth study will be performed using this data set to validate the fitting performance of the hybrid fitting algorithm as well as to further define the clinical utility of a/LCI for the assessment of cervical dysplasia. Further computational optimization of the hybrid algorithm could provide a real-time in vivo a/LCI clinical detection system for cervical dysplasia.