Comparison of a physical model and principal component analysis for the diagnosis of epithelial neoplasias in vivo using diffuse reflectance spectroscopy

We explored the use of diffuse reflectance spectroscopy in the ultraviolet-visible (UV-VIS) spectrum for the diagnosis of epithelial precancers and cancers in vivo. A physical model (Monte Carlo inverse model) and an empirical model (principal component analysis, PCA) based approach were compared for extracting diagnostic features from diffuse reflectance spectra measured in vivo from the dimethylbenz[α]anthracene-treated hamster cheek pouch model of oral carcinogenesis. These diagnostic features were input into a support vector machine algorithm to classify each tissue sample as normal (n=10) or neoplastic (dysplasia to carcinoma, n=10), and the classification was cross-validated using a leave-one-out method. There was a statistically significant decrease in the absorption and reduced scattering coefficients at 460 nm in neoplastic compared to normal tissues, and these two features provided 90% classification accuracy. The first two principal components extracted from PCA provided a classification accuracy of 95%. The first principal component was highly correlated with the wavelength-averaged reduced scattering coefficient. Although both methods show similar classification accuracy, the physical model provides insight into the physiological and structural features that discriminate between normal and neoplastic tissues and does not require an a priori representative set of spectral data from which to derive the principal components.

©2007 Optical Society of America

OCIS codes: (170.1470) Blood/tissue constituent monitoring; (170.6510) Spectroscopy, tissue diagnostics; (170.4580) Optical diagnostics for medicine; (160.4760) Optical properties

References and links
1. "Cancer Facts and Figures," (American Cancer Society, 2006).
2. R. J. Nordstrom, L. Burke, J. M. Niloff, and J. F. Myrtle, "Identification of cervical intraepithelial neoplasia (CIN) using UV-excited fluorescence and diffuse-reflectance tissue spectroscopy," Lasers Surg. Med. 29, 118-127 (2001).
3. I. Georgakoudi, E. E.
Sheets, M. G. Muller, V. Backman, C. P. Crum, K. Badizadegan, R. R. Dasari, and M. S. Feld, "Trimodal spectroscopy for the detection and characterization of cervical precancers in vivo," Am. J. Obstet. Gynecol. 186, 374-382 (2002).
4. M. G. Muller, T. A. Valdez, I. Georgakoudi, V. Backman, C. Fuentes, S. Kabani, N. Laver, Z. Wang, C. W. Boone, R. R. Dasari, S. M. Shapshay, and M. S. Feld, "Spectroscopic detection and evaluation of morphologic and biochemical changes in early human oral carcinoma," Cancer 97, 1681-1692 (2003).
#81504 $15.00 USD Received 26 Mar 2007; revised 29 May 2007; accepted 5 Jun 2007; published 8 Jun 2007 (C) 2007 OSA 11 June 2007 / Vol. 15, No. 12 / OPTICS EXPRESS 7863
5. D. J. Parekh, W. C. Lin, and S. D. Herrell, "Optical spectroscopy characteristics can differentiate benign and malignant renal tissues: a potentially useful modality," J. Urol. 174, 1754-1758 (2005).
6. N. Subhash, J. R. Mallia, S. S. Thomas, A. Mathews, P. Sebastian, and J. Madhavan, "Oral cancer detection using diffuse reflectance spectral ratio R540/R575 of oxygenated hemoglobin bands," J. Biomed. Opt. 11, 014018 (2006).
7. D. C. de Veld, M. Skurichina, M. J. Witjes, R. P. Duin, H. J. Sterenborg, and J. L. Roodenburg, "Autofluorescence and diffuse reflectance spectroscopy for oral oncology," Lasers Surg. Med. 36, 356-364 (2005).
8. G. M. Palmer, C. Zhu, T. M. Breslin, F. Xu, K. W. Gilchrist, and N. Ramanujam, "Comparison of multiexcitation fluorescence and diffuse reflectance spectroscopy for the diagnosis of breast cancer (March 2003)," IEEE Trans. Biomed. Eng. 50, 1233-1242 (2003).
9. Y. S. Fawzy, M. Petek, M. Tercelj, and H. Zeng, "In vivo assessment and evaluation of lung tissue morphologic and physiological changes from non-contact endoscopic reflectance spectroscopy for improving lung cancer detection," J. Biomed. Opt. 11, 044003 (2006).
10. A. Amelink, H. J. Sterenborg, M. P. Bard, and S. A.
Burgers, "In vivo measurement of the local optical properties of tissue by use of differential path-length spectroscopy," Opt. Lett. 29, 1087-1089 (2004).
11. P. Thueler, I. Charvet, F. Bevilacqua, M. St Ghislain, G. Ory, P. Marquet, P. Meda, B. Vermeulen, and C. Depeursinge, "In vivo endoscopic tissue diagnostics based on spectroscopic absorption, scattering, and phase function properties," J. Biomed. Opt. 8, 495-503 (2003).
12. G. Zonios, L. Perelman, V. Backman, R. Manoharan, M. Fitzmaurice, J. Van Dam, and M. S. Feld, "Diffuse reflectance spectroscopy of human adenomatous colon polyps in vivo," Appl. Opt. 38, 6628-6637 (1999).
13. J. C. Finlay, and T. H. Foster, "Hemoglobin oxygen saturations in phantoms and in vivo from measurements of steady-state diffuse reflectance at a single, short source-detector separation," Med. Phys. 31, 1949-1959 (2004).
14. T. J. Pfefer, L. S. Matchette, C. L. Bennett, J. A. Gall, J. N. Wilke, A. J. Durkin, and M. N. Ediger, "Reflectance-based determination of optical properties in highly attenuating tissue," J. Biomed. Opt. 8, 206-215 (2003).
15. N. Ghosh, S. K. Mohanty, S. K. Majumder, and P. K. Gupta, "Measurement of optical transport properties of normal and malignant human breast tissue," Appl. Opt. 40, 176-184 (2001).
16. G. M. Palmer, and N. Ramanujam, "Monte Carlo-based inverse model for calculating tissue optical properties. Part I: Theory and validation on synthetic phantoms," Appl. Opt. 45, 1062-1071 (2006).
17. C. T. Chen, H. K. Chiang, S. N. Chow, C. Y. Wang, Y. S. Lee, J. C. Tsai, and C. P. Chiang, "Autofluorescence in normal and malignant human oral tissues and in DMBA-induced hamster buccal pouch carcinogenesis," J. Oral Pathol. Med. 27, 470-474 (1998).
18. S. Andrejevic, J. F. Savary, C. Fontolliet, P. Monnier, and H.
van Den Bergh, "7,12-dimethylbenz[a]anthracene-induced 'early' squamous cell carcinoma in the Golden Syrian hamster: evaluation of an animal model and comparison with 'early' forms of human squamous cell carcinoma in the upper aero-digestive tract," Int. J. Exp. Pathol. 77, 7-14 (1996).
19. F. H. White, K. Gohari, and C. J. Smith, "Histological and ultrastructural morphology of 7,12 dimethylbenz(alpha)-anthracene carcinogenesis in hamster cheek pouch epithelium," Diagn. Histopathol. 4, 307-333 (1981).
20. M. C. Skala, G. M. Palmer, C. Zhu, Q. Liu, K. M. Vrotsos, C. L. Marshek-Stone, A. Gendron-Fitzpatrick, and N. Ramanujam, "Investigation of fiber-optic probe designs for optical spectroscopic diagnosis of epithelial pre-cancers," Lasers Surg. Med. 34, 25-38 (2004).
21. M. C. Skala, K. M. Riching, D. K. Bird, A. Gendron-Fitzpatrick, J. Eickhoff, K. W. Eliceiri, P. J. Keely, and N. Ramanujam, "In vivo multiphoton fluorescence lifetime imaging of protein-bound and free nicotinamide adenine dinucleotide in normal and precancerous epithelia," J. Biomed. Opt. 12, 024014 (2007).
22. M. C. Skala, J. M. Squirrell, K. M. Vrotsos, J. C. Eickhoff, A. Gendron-Fitzpatrick, K. W. Eliceiri, and N. Ramanujam, "Multiphoton microscopy of endogenous fluorescence differentiates normal, precancerous, and cancerous squamous epithelial tissues," Cancer Res. 65, 1180-1186 (2005).
23. C. Zhu, G. M. Palmer, T. M. Breslin, F. Xu, and N. Ramanujam, "Use of a multiseparation fiber optic probe for the optical diagnosis of breast cancer," J. Biomed. Opt. 10, 024032 (2005).
24. C. Zhu, G. M. Palmer, T. M. Breslin, J. Harter, and N. Ramanujam, "Diagnosis of breast cancer using diffuse reflectance spectroscopy: Comparison of a Monte Carlo versus partial least squares analysis based feature extraction technique," Lasers Surg. Med. 38, 714-724 (2006).
25. D. G. MacDonald, and S. M. Saka, Structural Indicators of the High Risk Lesion (Cambridge Univ. Press, Cambridge, 1991).
26. M. L. Ellsworth, R. N.
Pittman, and C. G. Ellis, "Measurement of hemoglobin oxygen saturation in capillaries," Am. J. Physiol. 252, H1031-1040 (1987).
27. S. R. Millon, K. M. Roldan-Perez, K. M. Riching, G. M. Palmer, and N. Ramanujam, "Effect of optical clearing agents on the in vivo optical properties of squamous epithelial tissue," Lasers Surg. Med. 38, 920-927 (2006).
28. J. Mourant, J. Freyer, A. Hielscher, A. Eick, D. Shen, and T. Johnson, "Mechanisms of light scattering from biological cells relevant to noninvasive optical-tissue diagnostics," Appl. Opt. 37, 3586-3593 (1998).
29. G. M. Palmer, C. Zhu, T. M. Breslin, F. Xu, K. W. Gilchrist, and N. Ramanujam, "Monte Carlo-based inverse model for calculating tissue optical properties. Part II: Application to breast cancer diagnosis," Appl. Opt. 45, 1072-1078 (2006).
30. W. R. Dillon, and M. Goldstein, Multivariate analysis: methods and applications (Wiley, New York, 1984).
31. J. Devore, Probability and Statistics for Engineering and the Sciences (Duxbury, Pacific Grove, 2000).
32. N. Cristianini, and J. Shawe-Taylor, An introduction to support vector machines: and other Kernel-based learning methods (Cambridge University Press, Cambridge, 2000).
33. J. Hjorth, Computer intensive statistical methods: Validation, model selection, and bootstrap (Chapman & Hall, London, New York, 1994).
34. R. Drezek, K. Sokolov, U. Utzinger, I. Boiko, A. Malpica, M. Follen, and R. Richards-Kortum, "Understanding the contributions of NADH and collagen to cervical tissue fluorescence spectra: modeling, measurements, and implications," J. Biomed. Opt. 6, 385-396 (2001).
35. U. Sunar, H. Quon, T. Durduran, J. Zhang, J. Du, C. Zhou, G. Yu, R. Choe, A. Kilger, R. Lustig, L. Loevner, S. Nioka, B. Chance, and A. G.
Yodh, "Noninvasive diffuse optical measurement of blood flow and blood oxygenation for monitoring radiation therapy in patients with head and neck tumors: a pilot study," J. Biomed. Opt. 11, 064021 (2006).
36. R. Hornung, T. H. Pham, K. A. Keefe, M. W. Berns, Y. Tadir, and B. J. Tromberg, "Quantitative near-infrared spectroscopy of cervical dysplasia in vivo," Hu


Introduction
Stratified squamous epithelial tissues (for example, the cervix, skin and oral cavity) consist of a cellular epithelium and an underlying stroma that contains structural proteins (collagen) and blood vessels. Cancers that originate in the stratified squamous epithelia accounted for more than 50% of all diagnosed cancers, and more than 300,000 cancer deaths in 2006 [1]. Early detection of these cancers is crucial to minimize cancer morbidity and mortality. Detection of stratified squamous epithelial cancers usually consists of visual inspection of the surface of the organ, followed by tissue biopsy and histological evaluation. A diagnostic method that is more reliable than visual inspection alone could potentially increase the likelihood that diseased tissue is biopsied, and potentially reduce unnecessary tissue biopsies.
Diffuse reflectance spectroscopy is a non-invasive optical technique that could augment the current standard of cancer diagnosis, and provide real-time feedback on the optimal biopsy location before tissue is removed. The diffuse reflectance spectrum reflects the absorption and scattering properties of the tissue. The absorption coefficient (μa) is directly related to the concentration of absorbers in the tissue, and the primary absorbers in epithelial tissues in the ultraviolet-visible (UV-VIS) spectral region are oxygenated and deoxygenated hemoglobin. The reduced scattering coefficient (μs') reflects the size and density of scatterers in tissue, such as collagen fibers, cells and nuclei. Diffuse reflectance spectra can be collected rapidly and remotely from tissue via a fiber-optic probe coupled to a spectrometer. This technology is fast, quantitative and sensitive to alterations in tissue structure and biochemistry. Several studies have demonstrated that diffuse reflectance spectroscopy can diagnose early epithelial cancers with high sensitivity and specificity [2-4].
A number of previous studies have used empirical methods to extract features from diffuse reflectance spectra that are diagnostic of disease. One approach is to subjectively select spectral features such as intensities and intensity ratios [2,5,6]; another is to use chemometric methods such as principal component analysis (PCA) to reduce the entire spectrum, with minimal information loss, to a few orthogonal principal components [7,8]. The extracted features of each sample are then input into a classification scheme for tissue diagnosis. Empirical methods for feature extraction do not relate the measured spectra to the physically meaningful information contained in the diffuse reflectance spectrum. Moreover, a method like PCA requires a representative set of spectra from the different tissue types to be available in order to extract the principal components. Extracting physically meaningful information from the diffuse reflectance measurements using a physically based model will improve our understanding of the physiological and structural features that differentiate normal and neoplastic epithelial tissues. This approach exploits the entire spectral data content to quantify tissue absorption and scattering and enables identification of specific wavelength ranges where the optical properties are most diagnostic. In addition, this approach does not require an a priori set of spectral data for feature extraction.
Several groups have used analytical and numerical models to extract the absorption and scattering coefficients of stratified squamous epithelial tissues from in vivo measurements of diffuse reflectance spectra in the UV-VIS [3,4,9-15]. These approaches include empirical calibration to a set of reference phantoms [10,14], diffusion theory modeling [3,4,9,12,13,15], and Monte Carlo modeling of measured diffuse reflectance spectra [11]. Although there are obvious advantages to using a physical model based rather than an empirically based analysis of diffuse reflectance spectra, there have been no studies that systematically compare the classification accuracy achieved with these two different feature extraction approaches for the diagnosis of epithelial pre-cancers and cancers in vivo.
The goal of the study presented here is to compare an empirical versus a physical model based technique for extracting features from in vivo diffuse reflectance spectra of epithelial tissues. The physical model used was a scalable Monte Carlo model of diffuse reflectance developed by our group [16]. This physical model is valid for high absorption in the UV-VIS, does not require specific fiber-optic probe geometries, and requires only a single phantom study for calibration. The empirical model used was PCA. Features extracted using both approaches were input into a classification algorithm based on support vector machines (SVM), and a leave-one-out method was used to obtain an unbiased estimate of the sensitivity and specificity. The results of this small, preliminary study demonstrate that classification based on the physical model achieves similar classification accuracy (90%) compared to the empirical model (95%) for in vivo diagnosis of neoplasias in a stratified squamous epithelial tissue model. While both approaches are diagnostically similar, the physical model has the advantage of extracting physically meaningful parameters, which offer insight into the underlying physiological processes responsible for the differences in tissue spectra (and thus, wavelengths of interest), and does not require a representative spectral data set from the different tissue types to extract features from each spectrum.

DMBA-treated hamster cheek pouch model of oral cancer
A total of 10 male Golden Syrian hamsters (152 ± 14 g) were examined in this study. This study was approved by the Institutional Animal Use and Care Committee and meets NIH guidelines for animal welfare. The dimethylbenz[α]anthracene (DMBA) treated hamster cheek pouch model was selected for this study because it has been shown to mimic the dysplasia-carcinoma sequence in the human oral cavity [17-19] and different stages of epithelial pre-cancer and cancer can be examined over a relatively short period of time. For each hamster, the right cheek pouch was treated three times per week with 0.5% DMBA in mineral oil (DMBA-treated cheek), and the left cheek pouch was treated with mineral oil only (control cheek) for 16 weeks. The treatment procedures were established from previous studies [20-22].

In vivo diffuse reflectance spectroscopy
Diffuse reflectance spectra of epithelial tissues in the hamster cheek pouch were measured using a fiber-optic probe coupled to a multi-wavelength optical spectrometer. The spectrometer and fiber-optic probe have been described in detail in previously published studies [23,24]. Briefly, the optical spectrometer consists of a 450 W xenon lamp, a scanning double excitation monochromator, a bifurcated fiber-optic probe, a filter wheel, an imaging spectrograph, and a CCD camera. The common end of the fiber-optic probe (that is in contact with the tissue) has a central illumination core of 19 fibers (overall diameter 1,180 μm) and a ring of 4 collection fibers. The distance between the center of the illumination core and the center of each collection fiber is 735 μm. The sensing depth of the fiber-optic probe was evaluated using Monte Carlo simulations in a previous study, which indicated penetration depths of ~0.5 to 2 mm for a wide range of tissue optical properties in the UV-VIS [24]; in this application the probe will be predominantly sensitive to the stromal layer of the tissue [20].
At 18-22 weeks after the commencement of DMBA treatment, diffuse reflectance spectroscopy was performed on the control and DMBA-treated cheek pouch of each animal, as previously described [20]. Each hamster was anesthetized with an intraperitoneal injection of a mixture of 200 mg/kg of ketamine and 5 mg/kg xylazine, the cheek pouch was everted and stretched over a metal post, and diffuse reflectance spectroscopy measurements were made from the control cheek and then the DMBA-treated cheek within each animal.
Diffuse reflectance spectra from each site were recorded over a wavelength range of 350-600 nm with an integration time of 0.01 seconds. The slit widths of the excitation monochromator and imaging spectrograph were chosen to provide bandpasses of 3.5 and 7.9 nm, respectively. Each intensity-wavelength point in the spectrum was sampled at a wavelength increment of 0.26 nm and then binned to result in an increment of 5 nm.
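The binning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say whether points were averaged or summed within each 5 nm bin, nor where the bin edges fall, so the function name `bin_spectrum`, the choice of averaging, and the bin edges starting at 350 nm are all hypothetical.

```python
def bin_spectrum(wavelengths, intensities, start=350.0, stop=600.0, width=5.0):
    """Average intensities into consecutive bins of `width` nm.

    Returns (bin_centers, bin_means). Averaging rather than summing is an
    assumption; the source only states that the points were binned.
    """
    n_bins = int(round((stop - start) / width))
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for wl, value in zip(wavelengths, intensities):
        index = int((wl - start) // width)
        if 0 <= index < n_bins:
            sums[index] += value
            counts[index] += 1
    centers, means = [], []
    for i in range(n_bins):
        if counts[i]:  # skip bins that received no samples
            centers.append(start + (i + 0.5) * width)
            means.append(sums[i] / counts[i])
    return centers, means


# Synthetic flat spectrum sampled every 0.26 nm between 350 and 600 nm.
fine_wl = [350.0 + 0.26 * k for k in range(962)]
fine_int = [1.0 for _ in fine_wl]
centers, means = bin_spectrum(fine_wl, fine_int)
```

With roughly 19 raw samples per 5 nm bin, averaging also reduces noise in each binned point, which may be one reason to bin before fitting.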
The background spectrum, which was measured with the probe immersed in distilled water using the same experimental setup as the tissue measurements, was first subtracted point-by-point from each tissue spectrum prior to further calibration. Each diffuse reflectance spectrum was calibrated for the wavelength-dependent response and throughput of the system by normalizing it to the diffuse reflectance spectrum measured with the fiber-optic probe inserted into an integrating sphere (DRACA-30I, Labsphere, Inc.).
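The two calibration steps described above (background subtraction, then normalization to the sphere measurement) amount to simple point-by-point arithmetic. The sketch below assumes all three spectra share one wavelength grid and that the sphere spectrum is used as measured; whether the sphere spectrum was itself background-subtracted is not stated in the text. The function name `calibrate` is hypothetical.

```python
def calibrate(tissue, background, sphere):
    """Subtract the background from a tissue spectrum, then divide by the
    sphere spectrum, point by point.

    Assumption: the sphere spectrum is used as-is (no background subtraction).
    """
    if not (len(tissue) == len(background) == len(sphere)):
        raise ValueError("all spectra must share the same wavelength grid")
    return [(t - b) / s for t, b, s in zip(tissue, background, sphere)]


# Made-up numbers chosen so the arithmetic is easy to follow.
calibrated = calibrate(tissue=[5.0, 7.0], background=[1.0, 1.0], sphere=[2.0, 3.0])
```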

Histopathology
After the optical spectroscopy measurements were carried out, the biopsy was placed in buffered formalin and submitted for histopathology. The tissue biopsies were prepared and read by a certified pathologist (AGF). Diagnosis was based on established criteria [25]. The diagnosis assigned to each sample was based on the most severe diagnosis of the entire corresponding biopsy sample, consistent with our previous studies [20-22].

Extraction of tissue absorption and scattering properties with a Monte Carlo based physical model
A Monte Carlo based inverse physical model developed by our group [16] was used to extract the absorption and scattering properties of the hamster cheek pouch from the measured diffuse reflectance spectra. In the forward model, a set of absorbers is presumed to be present in the medium, and the scatterer is assumed to be single-sized, spherically shaped and uniformly distributed. The wavelength-dependent absorption coefficients of the medium are calculated from the concentration of each absorber and the corresponding wavelength-dependent extinction coefficients. The wavelength-dependent scattering coefficients and anisotropy factor are calculated from the scatterer size, the scatterer density and the refractive indices of the scatterer and surrounding medium using Mie theory for spherical particles. The absorption and scattering coefficients are then input into a scalable Monte Carlo physical model of light transport to obtain a modeled diffuse reflectance spectrum. In the inverse model, the modeled diffuse reflectance is adaptively fitted to the measured tissue diffuse reflectance. When the sum of squares error between the modeled and measured diffuse reflectance is minimized, the concentrations of the absorbers and the scatterer size and density are extracted. A detailed description of this physical model is provided elsewhere [16].
The free parameters in the inversion process are the scatterer size, the scatterer density and the concentrations of the absorbers. The fixed parameters in the model were the refractive index mismatch between the scatterers and the surrounding medium, the type of absorbers (in this study, the absorbers were assumed to be oxygenated and deoxygenated hemoglobin) and the extinction coefficients of the absorbers. The refractive indices were assumed to be 1.4 for the scatterers and 1.36 for the surrounding medium, which is a reasonable assumption [16]. The extinction coefficients for oxygenated (oxy) hemoglobin and deoxygenated (deoxy) hemoglobin for the hamster are available for the 400-460 nm wavelength range [26]. It was previously shown that hamster hemoglobin extinction coefficients cannot be replaced with those of human hemoglobin [27]. Thus, the wavelength range used in the analysis was dictated by the available extinction coefficients of hamster hemoglobin (400-460 nm), consistent with a previous hamster spectroscopy study from our group [27]. The fits for the inversion were run 20 times, with random initial guesses for the free parameters. The converged values with the lowest sum of squared errors were output. This provides better assurance that the global minimum has been reached. The range of initial guesses was 0.35 to 1.5 μm diameter for the scatterer size, 3 to 25 cm⁻¹ for the mean (wavelength-averaged) μs', and 0 to 20 cm⁻¹ for the maximum μa [20,28,29].
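The multi-start strategy described above (20 fits from random initial guesses, keeping the run with the lowest sum of squared errors) can be illustrated as follows. This is a sketch, not the authors' implementation: `toy_reflectance` is a hypothetical, non-physical stand-in for the Monte Carlo forward model, and `local_fit` is a crude coordinate search standing in for whatever nonlinear least-squares solver was actually used. Only the ranges in `BOUNDS` come from the text.

```python
import random

WAVELENGTHS = [400 + 5 * k for k in range(13)]  # 400-460 nm in 5 nm steps

# Initial-guess ranges quoted in the text: scatterer diameter (um),
# mean reduced scattering (cm^-1), maximum absorption (cm^-1).
BOUNDS = [(0.35, 1.5), (3.0, 25.0), (0.0, 20.0)]


def toy_reflectance(params, wavelength):
    """Hypothetical, non-physical stand-in for the Monte Carlo forward model."""
    size, mus, mua = params
    tilt = 1.0 + 0.1 * size * (wavelength - 430.0) / 30.0
    return tilt * mus / (mus + mua + 1.0)


def sum_squared_error(params, measured):
    return sum((toy_reflectance(params, wl) - m) ** 2
               for wl, m in zip(WAVELENGTHS, measured))


def local_fit(start, measured, step=0.1, sweeps=100):
    """Crude coordinate search standing in for a real least-squares solver."""
    current = list(start)
    best = sum_squared_error(current, measured)
    for _ in range(sweeps):
        improved = False
        for i, (low, high) in enumerate(BOUNDS):
            for sign in (1.0, -1.0):
                trial = list(current)
                moved = trial[i] + sign * step * (high - low)
                trial[i] = min(high, max(low, moved))  # stay inside the bounds
                error = sum_squared_error(trial, measured)
                if error < best:
                    current, best, improved = trial, error, True
        if not improved:
            step *= 0.5  # refine the search once no move helps
    return current, best


def multistart_fit(measured, n_starts=20, seed=0):
    """Run the local fit from random initial guesses; keep the lowest SSE."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_starts):
        start = [rng.uniform(low, high) for low, high in BOUNDS]
        runs.append(local_fit(start, measured))
    best = min(runs, key=lambda run: run[1])
    return best, runs


# Synthetic "measurement" generated from made-up parameters.
true_params = [0.8, 12.0, 5.0]
measured = [toy_reflectance(true_params, wl) for wl in WAVELENGTHS]
(best_params, best_error), all_runs = multistart_fit(measured)
```

Keeping every run in `all_runs` makes it possible to check how many starts converge to the same minimum, which is a simple way to judge whether the reported solution is likely to be the global one.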
A series of experiments was conducted to ensure that the hamster cheek pouch, which is approximately 3 mm thick, could be approximated as semi-infinite over the 400-460 nm range [27]. Diffuse reflectance measurements were made from the cheek pouch for two different cases. In the first case, aluminum was placed beneath the cheek pouch. In the second case, black felt was placed beneath the cheek pouch. It is expected that aluminum will reflect any transmitted light, while the felt will absorb it. The percent difference between the integrated diffuse reflectance intensity (over the 400-460 nm wavelength range) of the felt and aluminum covered base was 7% [(Integrated intensity (felt) - Integrated intensity (aluminum)) / Integrated intensity (felt) × 100]. When the diffuse reflectance spectra from the aluminum and felt trials were input into the physical model, the percent difference between the modeled aluminum spectra and the modeled felt spectra [(Integrated intensity (modeled felt) - Integrated intensity (modeled aluminum)) / Integrated intensity (modeled felt) × 100] was also 7%. This was repeated for two more trials to give a similar result. This indicates that the hamster cheek pouch can be assumed semi-infinite.
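The percent-difference formula quoted above can be written out directly. In this sketch the integration rule (trapezoidal) is an assumption, since the text does not say how the intensity was integrated, and the spectra are made-up flat values chosen only so the arithmetic is easy to check; they are not measured data.

```python
def integrate(wavelengths, intensities):
    """Trapezoidal integral of a spectrum over wavelength (an assumed rule)."""
    total = 0.0
    for i in range(1, len(wavelengths)):
        mean_height = 0.5 * (intensities[i] + intensities[i - 1])
        total += mean_height * (wavelengths[i] - wavelengths[i - 1])
    return total


def percent_difference(wavelengths, felt, aluminum):
    """(I_felt - I_aluminum) / I_felt * 100, as written in the text."""
    i_felt = integrate(wavelengths, felt)
    i_aluminum = integrate(wavelengths, aluminum)
    return (i_felt - i_aluminum) / i_felt * 100.0


# Synthetic flat spectra over 400-460 nm; values are illustrative only.
grid = [400.0, 430.0, 460.0]
difference = percent_difference(grid, felt=[1.0, 1.0, 1.0], aluminum=[0.93, 0.93, 0.93])
```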
The accuracy of the model used has been previously verified using phantom studies for the 350-850 nm wavelength range [16] and the 400-460 nm wavelength range [27]. Briefly, the phantoms consisted of polystyrene spheres (07310-15, Polysciences, Inc., Warrington, PA) and human hemoglobin (H0267, Sigma Co., St. Louis, MO) dissolved in water. Five phantoms were prepared, having reduced scattering coefficients ranging from 10.9 to 16.4 cm⁻¹ and absorption coefficients ranging from 0 to 17.5 cm⁻¹. The optical properties were extracted from each of these phantoms using the model described above, and compared to known values to determine the error in extracting the optical properties. The model was used to extract the optical properties from the diffuse reflectance spectrum of each phantom using every other phantom as the reference measurement (i.e., all possible combinations of target-reference phantoms were considered). The root mean square (RMS) error for extracting the optical properties of the phantoms, averaged across wavelengths and phantoms, was then calculated to evaluate the accuracy of this model. It was found that the errors for the limited wavelength range (mean RMS error in μs' and μa of 1.5 ± 0.7% and 1.4 ± 0.7%, respectively) were comparable to those of the full wavelength range (mean RMS error in μs' and μa of 2.5 ± 0.8% and 3.1 ± 1.5%, respectively).
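A rough sketch of this validation bookkeeping is given below. It assumes the RMS error is computed from point-by-point percent errors across wavelengths and that "all possible combinations" means every ordered pair of distinct phantoms; neither detail is spelled out in the text, and the handling of points where the known value is zero (skipping them) is likewise an assumption. The function names are hypothetical.

```python
import itertools
import math
import statistics


def rms_percent_error(extracted, known):
    """RMS of point-by-point percent errors, skipping points where known == 0."""
    errors = [(e - k) / k * 100.0 for e, k in zip(extracted, known) if k != 0]
    return math.sqrt(sum(err ** 2 for err in errors) / len(errors))


def summarize(per_pair_errors):
    """Mean and sample standard deviation across target-reference pairs."""
    return statistics.mean(per_pair_errors), statistics.stdev(per_pair_errors)


# Every ordered (target, reference) pairing of five phantoms.
pairs = list(itertools.permutations(range(5), 2))

# Made-up example: extracted values are 10% high and 10% low at two wavelengths.
example_error = rms_percent_error([11.0, 9.0], [10.0, 10.0])
example_summary = summarize([1.0, 2.0, 3.0])
```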
Prior to fitting to the inverse physical model, each tissue diffuse reflectance spectrum was divided point by point by the diffuse reflectance spectrum of a reference tissue phantom with known optical properties (absorption coefficient of 1.3-9.4 cm⁻¹ and reduced scattering coefficient of 13.9-14.6 cm⁻¹ over the wavelength range of 400-460 nm; reference phantoms with a lower concentration of hemoglobin or no hemoglobin added as an absorber resulted in similar extracted tissue optical properties). Each modeled diffuse reflectance spectrum was calibrated in a similar manner (using a simulated reflectance spectrum with the same optical properties).
The outputs from the inverse Monte Carlo physical model were the concentrations of oxygenated and deoxygenated hemoglobin, and the scatterer size and density (note there was no interaction between the free parameters). The parameters used for further analysis were the absorption and reduced scattering coefficients over the wavelength range 400-460 nm, the total hemoglobin concentration (oxygenated plus deoxygenated hemoglobin), and the hemoglobin saturation (oxygenated hemoglobin concentration divided by total hemoglobin concentration). The reduced scattering coefficient was used rather than the scatterer size or density because different values of scatterer diameter and density can yield similar values for the reduced scattering coefficient using Mie theory, and the reduced scattering coefficient is the value typically reported in the literature.
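The derived hemoglobin parameters follow directly from the two fitted concentrations. The short sketch below uses made-up concentrations in arbitrary units; the function names are hypothetical, and the wavelength average is taken as a plain arithmetic mean over the sampled wavelengths, which is an assumption.

```python
def hemoglobin_summary(oxy, deoxy):
    """Return (total hemoglobin, saturation) from the two concentrations."""
    total = oxy + deoxy
    if total <= 0:
        raise ValueError("total hemoglobin must be positive")
    return total, oxy / total


def wavelength_average(values):
    """Plain mean over the sampled wavelengths (assumed definition)."""
    return sum(values) / len(values)


# Made-up concentrations in arbitrary units.
total_hb, saturation = hemoglobin_summary(oxy=30.0, deoxy=10.0)
mean_mus = wavelength_average([12.0, 11.0, 10.0])
```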

Extraction of principal components using an empirical model (PCA)
A second feature extraction method, principal component analysis (PCA) [30], was used to represent the measured diffuse reflectance spectra with a few principal components (PCs), which account for most of the variance of the original spectral data set while significantly reducing the data dimension. Prior to this analysis, the measured diffuse reflectance spectra were pre-processed by normalizing each spectrum to the same reference phantom that was used for normalization in the Monte Carlo analysis (see above). The "phantom-normalization" method was chosen so that the physical model and empirical (PCA) based analysis methods could be compared on spectra that were pre-processed the same way. All of the principal components that accounted for 100% of the spectral variance were extracted (13 PCs). The details of this analysis are described in a previous publication [20].
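For readers unfamiliar with PCA, the following NumPy sketch shows one common way to compute principal components, scores, and the fraction of variance explained, and to re-project spectra from a subset of PCs. It is an illustration under assumptions: the data are random numbers shaped like this study (20 spectra at 13 wavelengths from 400 to 460 nm in 5 nm steps), mean-centering before the decomposition is assumed, and the function names `pca` and `reproject` are hypothetical.

```python
import numpy as np


def pca(spectra):
    """PCA via singular value decomposition of the mean-centered data.

    Returns (mean spectrum, components as rows, scores, explained variance
    fractions). Mean-centering is an assumption about the preprocessing.
    """
    data = np.asarray(spectra, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s  # projection of each spectrum onto each component
    explained = s ** 2 / np.sum(s ** 2)
    return mean, vt, scores, explained


def reproject(mean, components, scores, n_components):
    """Rebuild spectra from the first `n_components` PCs and their scores."""
    return mean + scores[:, :n_components] @ components[:n_components]


# Synthetic stand-in data: 20 spectra sampled at 13 wavelengths.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(20, 13))
mean, components, scores, explained = pca(spectra)
two_pc_spectra = reproject(mean, components, scores, n_components=2)
```

With 13 wavelength points there can be at most 13 principal components, which is consistent with the 13 PCs reported above; using all of them reproduces the input spectra exactly, while a two-component re-projection keeps only the dominant spectral variation.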

Classification
A Wilcoxon rank-sum test [31] was employed to identify the parameters extracted from the physical model that showed the most statistically significant differences (p<0.05) between normal and neoplastic epithelial tissues. The parameters obtained from the physical model that were identified as diagnostically most significant were incorporated into a non-parametric support vector machine (SVM) [32] algorithm to classify each tissue sample as normal or neoplastic. A "leave-one-out" cross-validation [33] was then carried out to obtain an unbiased estimate of the classification accuracy of the algorithm. This was repeated for the most statistically significant principal component scores (p<0.05) obtained from the empirical (PCA) analysis.
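As an illustration of the feature-screening step, the sketch below implements a two-sided Wilcoxon rank-sum test using the normal approximation. This is an assumption-laden example: the text does not say whether an exact or approximate p-value was used, the variance here is not corrected for ties, and the function name `rank_sum_test` and the example values are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (W, z, p), where W is the sum of the ranks of the `x` sample.
    """
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share their mean rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w = sum(rank for rank, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, z, p


# Made-up feature values: two fully separated groups of 10 samples each.
group_a = list(range(1, 11))
group_b = list(range(11, 21))
w_sep, z_sep, p_sep = rank_sum_test(group_a, group_b)

# Made-up interleaved groups, which should not differ significantly.
w_mix, z_mix, p_mix = rank_sum_test([1, 3, 5, 7], [2, 4, 6, 8])
```

Features passing the p<0.05 screen would then be passed to the classifier; the SVM training and leave-one-out loop themselves are not reproduced here.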

Results
Of the 20 cheek pouches evaluated in this study, 10 were diagnosed as normal (the 10 control cheeks), 4 with dysplasia, 2 with carcinoma in situ (CIS) and 4 with squamous cell carcinoma (SCC). For the purposes of statistical analyses, all tissues diagnosed with dysplasia, CIS and SCC were combined into one "neoplastic" tissue category.

Results from physical model analysis
Figure 1 shows the measured diffuse reflectance spectra of a normal and a neoplastic (SCC) tissue sample measured in vivo from within the same animal and the corresponding fits to the physical model. There is excellent agreement between the measured and fitted spectra and this is representative of the quality of fits obtained in this study.
Fig. 1. Measured diffuse reflectance spectra of one normal and one neoplastic (squamous cell carcinoma) tissue site within the same animal, and the corresponding fits to the physical model (c.u. = calibrated units). The measured and fitted data were multiplied by the reference phantom after the fitting procedure, so that the original diffuse reflectance spectra and their corresponding fits are shown in this figure. The residuals for the normal and neoplastic sample are also shown.
Table 1 shows the values for the reduced scattering coefficient (μs') and the absorption coefficient (μa) at 20 nm increments between 400 nm and 460 nm, averaged across all normal and all neoplastic samples. The neoplastic μs' is significantly lower than the normal μs' at all wavelengths (p<0.05). The neoplastic μa is significantly lower than the normal μa at 400, 420 and 460 nm (p<0.05), and the normal and neoplastic μa are approximately equal at 440 nm (p>0.05). The mean (wavelength-averaged) reduced scattering coefficient (mean μs') decreases with neoplasia compared to normal (p<0.05). There is no significant change in the mean (wavelength-averaged) absorption coefficient of normal and neoplastic epithelial tissues (p=0.089). The lack of significance in the wavelength-averaged absorption coefficient is due to the lack of significant differences between the normal and neoplastic μa in the 425-450 nm range (p>0.05); significant differences between normal and neoplastic μa exist between 400-420 nm and 455-460 nm (p<0.05). The results of paired comparisons are similar (data not shown).
Table 1. Mean and standard deviation of the reduced scattering coefficient (μs') and absorption coefficient (μa) calculated across all normal (n=10) and neoplastic (dysplasia/CIS/SCC, n=10) tissues. Significant differences between normal vs. neoplastic tissues were found for the variables marked with an asterisk (*), based on unpaired Wilcoxon tests.
Figure 3 shows a scatter plot of the values for the absorption and reduced scattering coefficient at 460 nm extracted from the measured diffuse reflectance spectra using the physical model based analysis. These two variables provided equal or better classification accuracy, sensitivity and specificity in the cross-validated set than any other two variables included in Table 1, the hemoglobin saturation or the total hemoglobin concentration (data not shown). The physical model based analysis provides good discrimination between the normal and neoplastic samples, with only one normal and one neoplastic (CIS) sample misclassified in the cross-validated set (Table 2). The addition of a third feature did not significantly improve the classification accuracy of the cross-validated set, consistent with a previous study [24].
Table 2. Results from the SVM classifier for the physical model based analysis. The two variables shown in Fig. 3 were used to classify normal (n=10) and neoplastic (dysplasia/CIS/SCC, n=10) samples. The mean and standard deviation of the overall classification rate, sensitivity and specificity are shown for the training sets, and the "leave one out" cross validation result is also shown.

Results from empirical (PCA) based analysis
Only the first two PCs extracted from the phantom-normalized diffuse reflectance spectra, PC1 and PC2, showed statistically significant differences between normal and neoplastic samples (p<0.05). These PCs accounted for a total of 96.3% of the spectral variance in the measured normal and neoplastic diffuse reflectance spectra (Table 3). Figure 4 shows the average phantom-normalized spectra for normal (n=10) and neoplastic (dysplasia/CIS/SCC, n=10) tissues, and spectral re-projections of PC1 + PC2 for normal (n=10) and neoplastic (n=10) tissues. The differences between the re-projected spectra and the phantom-normalized spectra are shown as residuals for normal and neoplastic tissues. The spectral re-projections from the PC subspace onto the normalized spectral data space were calculated by linearly combining the principal components weighted by their corresponding PC scores. The re-projected and phantom-normalized spectra are similar to each other (there are no significant differences between the phantom-normalized and re-projected spectra at any wavelength for normal or neoplastic tissues, p>0.05), with residuals similar to those of the physical model (Fig. 1). There are statistically significant differences between the phantom-normalized spectra of neoplastic and normal epithelial tissues from 430-460 nm (p<0.05) and between the re-projected spectra from 425-460 nm (p<0.05).
Fig. 4. Average phantom-normalized spectra for normal (n=10, -) and neoplastic (dysplasia/CIS/SCC, n=10, --) tissues, and spectral re-projections of PC1 + PC2 for normal (•) and neoplastic (*) tissues. The differences between the re-projected spectra and the phantom-normalized spectra are shown as residuals for normal and neoplastic tissues. Axes: emission wavelength (nm) and diffuse reflectance (c.u.).
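The re-projection onto the first two PCs can be written compactly. The notation is introduced here for illustration and is not taken from the study: x_i is the phantom-normalized spectrum of sample i, p_k is the k-th principal component, t_ik is the score of sample i on that component, and r_i is the residual. The mean spectrum term assumes the spectra were mean-centered before PCA (it is dropped otherwise), and the sign convention of the residual is also an assumption.

```latex
\hat{\mathbf{x}}_i = \bar{\mathbf{x}} + \sum_{k=1}^{2} t_{ik}\,\mathbf{p}_k ,
\qquad
\mathbf{r}_i = \mathbf{x}_i - \hat{\mathbf{x}}_i
```

Because PC1 and PC2 together account for 96.3% of the spectral variance, the residual r_i carries only the remaining few percent.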
Figure 5 shows a scatter plot of the scores for PC1 and PC2 extracted from phantom-normalized diffuse reflectance spectra using PCA. The empirical model based analysis showed a similar cross-validated accuracy (Table 4) to the physical model based analysis (Table 2). In the empirical model based classification, one normal sample was misclassified in cross-validation (different normal samples were misclassified in the empirical and physical model based classifications). The addition of a third PC did not significantly improve the classification accuracy of the cross-validated set, consistent with a previous study [24].
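The cross-validated rates implied by the misclassification counts in the text can be checked with a short calculation. The sketch assumes the usual definitions (sensitivity is the fraction of neoplastic samples correctly identified, specificity the fraction of normal samples) and, for the empirical model, that the one misclassified normal sample is the only error, which matches the 95% accuracy quoted in the abstract. The function name `diagnostic_rates` is hypothetical.

```python
def diagnostic_rates(n_normal, n_neoplastic, normal_missed, neoplastic_missed):
    """Return (accuracy, sensitivity, specificity) from misclassification counts."""
    sensitivity = (n_neoplastic - neoplastic_missed) / n_neoplastic
    specificity = (n_normal - normal_missed) / n_normal
    correct = (n_normal - normal_missed) + (n_neoplastic - neoplastic_missed)
    accuracy = correct / (n_normal + n_neoplastic)
    return accuracy, sensitivity, specificity


# Physical model: one normal and one neoplastic (CIS) sample misclassified.
physical = diagnostic_rates(10, 10, normal_missed=1, neoplastic_missed=1)

# Empirical model: one normal sample misclassified (assumed to be the only error).
empirical = diagnostic_rates(10, 10, normal_missed=1, neoplastic_missed=0)
```

With 20 samples in total, a single misclassification moves the accuracy by 5 percentage points, which is why the two approaches are described as similar rather than ranked.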

Discussion
The results of this study indicate that a physical model based method (Monte Carlo model) achieves similar classification accuracy, sensitivity and specificity compared to an empirical method (PCA) for the analysis of diffuse reflectance spectra for in vivo detection of neoplasias in stratified squamous epithelial tissue. The physical model provides biologically relevant endpoints that may reveal the physiological and structural features underlying the diagnosis. However, the empirical model provides little insight into tissue physiology or structure. Unlike the empirical model, the physical model does not require, a priori, a representative set of spectral data from normal and neoplastic tissues in order to extract diagnostic features.
The values reported for the absorption and reduced scattering coefficient of normal tissues in the current study are consistent with a previous study from our group, which used the same physical model to extract the optical properties of the hamster cheek pouch from in vivo diffuse reflectance spectra measured from 400-460 nm [27]. The absorption and reduced scattering coefficients reported here for the hamster cheek pouch are also in agreement with those reported for the human cervix and oral cavity in vivo in the same wavelength region [3,4,34] using different extraction algorithms. A previous study of the human oral cavity in vivo found that the absorption coefficient over the UV-VIS wavelength range demonstrated large variations in hemoglobin concentration, which were uncorrelated with disease state [4]. This lack of significant changes in the hemoglobin concentration is consistent with the results reported in the current study. The absolute values of the hemoglobin concentrations reported in the current study are also consistent with previous in vivo studies in human head and neck cancers, measured using diffuse optical techniques [35].
The decrease in observed hemoglobin saturation is consistent with previous in vivo measurements of normal and neoplastic cervical tissues in the NIR [36]. It is also consistent with the known behavior of neoplasias: neoplastic lesions extract more oxygen from the local vasculature due to the enhanced metabolic demands of the neoplastic cells relative to normal [37]. The saturation values reported here for normal and neoplastic tissues in the oral cavity in vivo are lower than those reported by Hornung et al. in the human cervix in vivo (85 ± 3% and 77 ± 15%, for normal and high grade squamous intra-epithelial lesions, respectively). However, tissue oxygenation is highly dependent on anesthesia [38,39] (no anesthesia was used in the Hornung study, and ketamine/xylazine was used for full sedation in the current study) and organ site [39]. Baudelet et al. found that the pO2 of both tumors and normal muscle tissue decreased by approximately the same percentage with ketamine/xylazine anesthesia in mice [38]. This suggests that the decrease in hemoglobin saturation with neoplasia measured under ketamine/xylazine anesthesia in the current study represents a trend similar to that expected under non-anesthetized conditions.
The decrease in the reduced scattering coefficient with neoplasia observed in this study (~40% decrease) is consistent with that observed in the human cervix and oral cavity in vivo in the ultraviolet-visible (UV-VIS) wavelength region [3,4]. Previous confocal microscopy experiments indicate a threefold increase in the epithelial scattering coefficient in highly dysplastic tissues compared to normal tissue [40]. Microscopy experiments have also indicated a decrease in collagen fiber density by a factor of 2 in severe dysplasia [41], and degradation of the collagen matrix leads to decreased scattering [42]. The probe geometry used in this study is most sensitive to changes in the stroma [20], and the scattering properties of epithelial tissues are dominated by the highly scattering stroma [3,4,34], so the decrease in the reduced scattering coefficient likely reflects degradation of collagen with neoplastic development.
There were significant differences between normal and neoplastic tissues at all wavelengths between 400-460 nm for the reduced scattering coefficient, 400-420 nm and 455-460 nm for the absorption coefficient, and 425-460 nm for the principal components. The significant wavelengths derived from the principal components overlap with those of the reduced scattering coefficient and minimally overlap with those of the absorption coefficient. PC2 is not highly correlated with either the mean (wavelength-averaged) reduced scattering coefficient or the mean (wavelength-averaged) absorption coefficient (correlation coefficient < 0.5 for normal and neoplastic tissues). However, PC1 is highly correlated with the mean reduced scattering coefficient (correlation coefficient ~ 0.9 for normal and neoplastic tissues), and is not correlated with the mean absorption coefficient (correlation coefficient < 0.2 for normal and neoplastic tissues). This indicates that changes in PC1 with neoplasia are mostly due to changes in the reduced scattering coefficient. One advantage of the physical model based approach is that the sources of contrast can be directly determined from the absorption and reduced scattering coefficients.
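A correlation of this kind, between the PC1 score of each sample and its mean reduced scattering coefficient, can be computed as in the sketch below. The text does not name the type of correlation coefficient, so the Pearson (linear) coefficient is an assumption, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

Here `x` would hold the PC scores and `y` the wavelength-averaged optical property for the same samples, evaluated separately for the normal and neoplastic groups as in the text.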
In conclusion, the study presented here demonstrates that diffuse reflectance spectroscopy can be useful for the diagnosis of epithelial neoplasias in vivo. The analysis of diffuse reflectance spectra was carried out using a physical model based and an empirical model based approach. The physical model based approach appears to provide classification accuracy, sensitivity and specificity similar to those of the empirical approach for the diagnosis of epithelial neoplasias in vivo with diffuse reflectance spectroscopy. Unlike the empirical model, the physical model based approach provided insight into the physiological and structural features that discriminate between normal and neoplastic epithelial tissues. The empirical model based approach also has the disadvantage of requiring, a priori, a representative set of spectra from the different tissue types in order to extract the principal components. In the future, larger clinical studies are needed to determine if the conclusions of this small, preliminary study are valid for clinical applications.
Fig. 1. Measured diffuse reflectance spectra of one normal and one neoplastic (squamous cell carcinoma) tissue site within the same animal, and the corresponding fits to the physical model (c.u. = calibrated units). The measured and fitted data were multiplied by the reference phantom after the fitting procedure, so that the original diffuse reflectance spectra and their corresponding fits are shown in this figure. The residuals for the normal and neoplastic samples are also shown.

Fig. 3. Scatter plot of two of the most diagnostic tissue parameters obtained from the physical model based analysis (mean reduced scattering coefficient and absorption coefficient at 460 nm). The samples include normal (•), dysplasia and CIS (+), and SCC (*). The decision line (-) was obtained from the SVM classifier. (SCC = squamous cell carcinoma, CIS = carcinoma in situ).

Fig. 5. Scatter plot of two diagnostic parameters (PC1 and PC2) extracted from the diffuse reflectance spectra using the empirical model (PCA) based analysis for phantom-normalized spectra. The samples include normal (•), dysplasia and CIS (+), and SCC (*). The decision line (-) was obtained from the SVM classifier.

Table 3. Principal components obtained from the empirical (PCA) analysis, the percent variance accounted for by each PC, and the p-value from unpaired Wilcoxon tests of normal (n=10) vs. neoplastic (dysplasia/CIS/SCC, n=10) tissues. These PCs were derived from phantom-normalized diffuse reflectance spectra.

Table 4. Results from the SVM classifier for the empirical model based analysis. PC1 and PC2 from phantom-normalized spectra were used to classify normal (n=10) and neoplastic (dysplasia/CIS/SCC, n=10) samples. The mean and standard deviation of the overall classification rate, sensitivity and specificity are shown for the training sets, and the "leave one out" cross validation result is also shown.