Combining high wavenumber and fingerprint Raman spectroscopy for the detection of prostate cancer during radical prostatectomy

For prostate cancer (PCa) patients, radical prostatectomy (complete removal of the prostate) is the only curative surgical option. To date, there is no clinical technique allowing for real-time assessment of surgical margins to minimize the extent of residual cancer. Here, we present a tissue interrogation technique using a dual excitation wavelength Raman spectroscopy system capable of sequentially acquiring fingerprint (FP) and high wavenumber (HWN) Raman spectra. Results demonstrate the ability of the system to detect PCa in post-prostatectomy specimens. In total, 477 Raman spectra were collected from 18 human prostate slices. Each area measured with Raman spectroscopy was characterized as either normal or cancer based on histopathological analyses, and each spectrum was classified based on supervised learning using support vector machines (SVMs). Based on receiver operating characteristic (ROC) analysis, FP (area under the curve [AUC] = 0.89) had slightly superior cancer detection capabilities compared with HWN (AUC = 0.86). Optimal performance resulted from combining the spectral information from FP and HWN (AUC = 0.91), suggesting that the use of these two spectral regions may provide complementary molecular information for PCa detection. The use of leave-one-(spectrum)-out (LOO) or leave-one-patient-out (LOPO) cross-validation produced similar classification results when combining FP with HWN. Our findings suggest that the application of machine learning using multiple data points from the same patient does not result in biases necessarily impacting the reliability of the classification models.


Introduction
Raman spectroscopy (RS) is a non-destructive tissue interrogation technique that provides vibrational molecular information based on the inelastic light scattering following laser excitation. The resulting spectra lend quantitative information relating to molecular species including amino acids and proteins, lipids, carbohydrates and nucleic acids (DNA, RNA). This provides a global overview of the tissue associated with structural, metabolic, immunological and genetic phenomena. RS is able to discriminate between healthy and diseased tissue, as in cases of infectious (viral, bacterial, fungal) diseases [1][2][3][4] or in the field of oncology for lung, breast, brain and skin cancers, among others [5][6][7][8]. The development and improvement of RS systems for clinical applications (e.g., surgical guidance, tissue characterization during needle biopsy procedures) is confronted with many challenges, including the requirement to find an appropriate balance between practical acquisition times (of the order of ~1 s for surgical applications) and high spectral quality. During the last decades, several groups have developed in vivo RS systems using hand-held probes, mostly for single-point interrogation of areas <1 mm. Most biomolecular information from RS is contained in the fingerprint (FP) spectral region [9][10][11], but the high wavenumber (HWN) region also contains information that has been used for cancer tissue characterization [12][13][14]. In an effort to improve the diagnostic detection of pathological tissue, studies have recently presented the use of multimodal optical systems using RS in combination with other optical modalities [15][16][17], including intrinsic fluorescence spectroscopy or RS that uses more than one spectral region [5][18][19][20][21][22][23].
Prostate cancer (PCa) is the most frequently diagnosed cancer amongst men. As with many other solid tumors, surgery is one of the main treatment options. However, residual cancer can be difficult to localize during surgery and may negatively affect patient outcome, as positive surgical margins generally correlate with PCa recurrence and mortality [24]. In situ cancer detection during radical prostatectomy, which involves the resection of the whole prostate, remains a clinical challenge because the interface between cancerous and benign tissue can be difficult to visually discern. The cancer can also spread to extra-prostatic tissue (extra-prostatic extension), complicating the complete removal of the cancer. Extra-prostatic extension often correlates with the presence of positive surgical margins, a testament to the fact that some cancer tissue remains after surgery [25]. Extra-prostatic extension and positive surgical margins are present in 21-28% [26,27] and 21-42% [24,28] of patients undergoing radical prostatectomy, respectively, and they are both correlated with poor prognosis [26][28][29][30]. For some cancers, histopathology of frozen sections is used to evaluate the optimal extent of resection [31][32][33]; however, this technique cannot be used to test surgical margins during radical prostatectomy without disrupting the workflow. Typically, it is only a few days after radical prostatectomy, once the prostate has been completely removed and the specimen has been processed (formalin-fixed and paraffin-embedded, FFPE), that the presence of cancer at the surgical margins can be assessed.
There is, therefore, an unmet clinical need for real-time characterization of surgical margins during radical prostatectomy. In the context of PCa, RS was already identified as a potential tool for the discrimination of cancer in FFPE tissue [34,35] and PCa cell lines [36,37] using Raman micro-spectroscopy, and by our group using FP Raman macrospectroscopy on ex vivo human prostates [38]. Here, we present the development of a dual excitation wavelength RS tissue imaging system combining FP and HWN. We describe a proof-of-principle study based on the interrogation of fresh human prostates, demonstrating the potential of the technique for intraoperative PCa detection.

Patients and specimen handling
Men with different cancer grades underwent first-line radical prostatectomy (no hormonal therapy, chemotherapy or radiotherapy prior to the procedure) at the Centre hospitalier de l'Université de Montréal (CHUM). Informed consent was obtained from all patients who were recruited and monitored by the CHUM Ethics Committee (approval number 15.102). Tissue handling methods have been extensively described elsewhere [38]. Briefly, the dimensions and weights of the prostates were measured, and each specimen was inked for surgical margin identification before any other manipulation. For each patient, and in order to minimize the impact on the clinical workflow, a slice of fresh prostate with 3-5 mm thickness was cut and kept in cold saline (0.9% NaCl) prior to RS measurements. Between 10 and 50 tissue areas of 0.5 mm in diameter were measured with RS on each of the 18 prostate slices (673 measurements in total). The prostate slice was then immersed in formalin for fixation. The specimens were then reintegrated into the regular pathology workflow for tissue processing (FFPE) and histopathological analyses.

Tissue analyses
Each tissue area measured with the FP + HWN Raman system (see description below in Section 2.3) was identified on a corresponding FFPE microscope slide using a spatial correlation methodology described elsewhere [38]. Briefly, the method consists of 4 stages: 1) identifying each measured area using a drop of India ink manually deposited onto the fresh specimen (Fig. 1(b)), 2) using a visible image of the prostate slice to draw a mask of its contour and an overlay composed of circles indicating all areas where RS measurements were made, 3) reconstructing the prostate slice map based on all FFPE tissue scans, and 4) superimposing the mask on the FFPE reconstruction, providing histopathology information spatially registered with Raman spectra. Each spectrum was then assigned the label normal or cancer based on histopathological analyses performed by two pathologists: 393/477 tissue areas were identified as normal, as they contained only benign tissue; 84/477 were identified as cancer, as they contained at least one cancerous gland. Data were excluded (196/673) in cases of uncertain diagnosis or when India ink contamination was detected in the Raman spectra.
The system can detect signals in the HWN range 2800-3550 cm⁻¹, but only the <3050 cm⁻¹ range was considered in this study because of the presence of a sharp Raman peak around 3240 cm⁻¹ from the sapphire constituting one of the probe lenses. The same standard material was used for normalization of FP and HWN, although it was developed by NIST for signals associated with excitation at 785 nm. For the HWN data presented here, we have shown this method to be equivalent to normalizing the HWN signals using a method commonly utilized in the Raman field based on measurements associated with a calibration gas lamp [39]. The latter technique was also implemented by our group in another context for the normalization of in vivo HWN brain data [40].
Although the precise manner with which the spectra are normalized is of qualitative importance for spectral quality assessment, it does not impact classification results using machine learning where only the spectral features relevant to discriminate normal and cancer tissue are extracted and used to produce a classification model.
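As a minimal sketch of how the HWN range restriction described above could be applied before FP and HWN spectra are joined into a single feature vector, the following Python snippet crops an HWN spectrum to 2800-3050 cm⁻¹ and appends it to an FP spectrum. The function names, the toy wavenumber axis and the toy intensities are hypothetical and serve only as an illustration; they are not the processing code used in this study.

```python
def crop_range(wavenumbers, intensities, lo, hi):
    """Keep only the samples whose wavenumber w satisfies lo <= w < hi."""
    return [v for w, v in zip(wavenumbers, intensities) if lo <= w < hi]


def concatenate_fp_hwn(fp_intensities, hwn_wavenumbers, hwn_intensities,
                       hwn_lo=2800.0, hwn_hi=3050.0):
    """Build one feature vector: FP intensities followed by the cropped HWN intensities.

    The default limits follow the text (detection from 2800 cm^-1, analysis
    restricted to < 3050 cm^-1 to stay clear of the sapphire peak near 3240 cm^-1).
    """
    hwn_kept = crop_range(hwn_wavenumbers, hwn_intensities, hwn_lo, hwn_hi)
    return list(fp_intensities) + hwn_kept


# Hypothetical toy spectra (real spectra contain many more points).
fp = [0.1, 0.2, 0.3]
hwn_axis = [2800.0, 2900.0, 3000.0, 3100.0, 3240.0, 3500.0]
hwn = [1.0, 2.0, 3.0, 4.0, 50.0, 5.0]  # 50.0 mimics the sapphire peak

features = concatenate_fp_hwn(fp, hwn_axis, hwn)
print(features)
```

In this toy case the sample mimicking the sapphire peak is discarded, and the classifier would receive three FP values followed by three HWN values.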

Supervised learning technique and statistics
Each spectrum was labeled as normal or cancer according to histopathological analyses, here considered as the gold standard for tissue classification when using supervised learning approaches. The data set was then submitted to machine learning classification using support vector machines (SVMs) with leave-one-out cross-validation (leave-one-(spectrum)-out, LOOCV) or leave-one-patient-out cross-validation (LOPOCV). LOOCV is typically used for preliminary proof-of-principle studies based on small size cohorts. For larger data sets, LOPOCV can help avoid bias from patient-specific spectral characteristics by ensuring that the spectra from a given patient never appear in both the training and validation data sets. Feature selection based on the minimum redundancy maximum relevance (mRMR) method was used to limit the dimensionality of the spectral information while keeping the most important spectral data. The number of features was optimized using a grid search.
For each classification test, the corresponding receiver operating characteristic (ROC) curve was computed. Sensitivity and specificity values were obtained corresponding to the point closest to the upper-left corner of the sensitivity vs (1 - specificity) curve. When using both HWN and FP data sets, spectra were simply concatenated prior to being fed to the classifier. Reported classification results were achieved with two-sided 95% normal-based confidence intervals of less than ± 5%, which were estimated using bootstrapping cross-validation. Univariate statistical analyses were performed using Matlab (MathWorks, USA) on prominent peaks selected on average Raman spectra. For each analyzed peak, a Kolmogorov-Smirnov test was applied in order to test whether or not the data were associated with a normal distribution. In case of normality, a two-sample F-test was performed in order to evaluate whether or not the standard deviations could be considered as equal. Then, a two-sample t-test was performed on the selected peaks. In case of non-normality, a Wilcoxon rank-sum test was performed. Corresponding p-values were computed and spectral bands classified within the following categories: p<0.001, p<0.01 or p<0.05.
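The difference between the two cross-validation schemes can be illustrated with a minimal Python sketch that only generates the train/validation index splits. The helper names and the toy patient identifiers are hypothetical, and the SVM training and mRMR feature selection steps are deliberately left out; this is not the code used in this study.

```python
from collections import defaultdict


def leave_one_out_splits(n_spectra):
    """Yield (train_idx, test_idx) pairs with exactly one spectrum held out."""
    for i in range(n_spectra):
        yield [j for j in range(n_spectra) if j != i], [i]


def leave_one_patient_out_splits(patient_ids):
    """Yield (train_idx, test_idx) pairs with all spectra of one patient held out."""
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)
    for pid in sorted(by_patient):
        held_out = set(by_patient[pid])
        train = [i for i in range(len(patient_ids)) if i not in held_out]
        yield train, by_patient[pid]


# Hypothetical toy data set: 6 spectra acquired on 3 patients.
patient_ids = ["P1", "P1", "P2", "P2", "P2", "P3"]

loo = list(leave_one_out_splits(len(patient_ids)))
lopo = list(leave_one_patient_out_splits(patient_ids))

# With LOOCV, the other spectrum from patient P1 stays in the training set
# while the first P1 spectrum is being validated.
train_patients_fold0 = {patient_ids[i] for i in loo[0][0]}
print(len(loo), len(lopo), "P1" in train_patients_fold0)
```

LOOCV produces one fold per spectrum and lets spectra from the validated patient remain in the training set, whereas LOPOCV produces one fold per patient and keeps the two sets patient-disjoint.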
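The ROC-based selection of sensitivity and specificity described above can be sketched as follows in Python. The labels and classifier scores are hypothetical toy values, the function names are illustrative, and the AUC is computed here with the trapezoidal rule, which is an assumption since the integration method is not specified in the text.

```python
import math


def roc_points(labels, scores):
    """Return (1 - specificity, sensitivity) points; label 1 = cancer, 0 = normal."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Process tied scores together so each threshold gives one point.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


def closest_to_upper_left(points):
    """Operating point minimizing the distance to (0, 1)."""
    return min(points, key=lambda p: math.hypot(p[0], 1.0 - p[1]))


# Hypothetical toy labels and classifier scores.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

points = roc_points(labels, scores)
auc = auc_trapezoid(points)
fpr, sensitivity = closest_to_upper_left(points)
specificity = 1.0 - fpr
print(auc, sensitivity, specificity)
```

For these toy values the sketch gives an AUC of 0.75 and an operating point with sensitivity and specificity both equal to 0.75.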

Tissue classification
Multivariate statistical analyses were performed using supervised machine learning classification. The results using SVMs with LOOCV or LOPOCV are detailed in Table 2 and Fig. 3.

Discussion and conclusion
We have presented the first study of human PCa detection using both FP and HWN RS. We have developed and validated the use of multi-excitation wavelength RS for PCa detection on fresh ex vivo human post-prostatectomy specimens. The acquisition time per interrogation point for the dual excitation wavelength technique was 1 s compared with 0.5 s for FP RS alone, and the processing time was below 0.3 s in both cases. This demonstrates that the technique would effectively remain real-time despite the acquisition of the additional HWN spectrum. Although the gains in classification accuracy, specificity and sensitivity are of the same order as the confidence interval associated with the statistical analysis, our results suggest that reliability and robustness assessed through ROC analysis would increase when combining FP and HWN. This can potentially be linked with the increased, and potentially non-redundant, molecular informational content conferred by the combination of both FP and HWN. By design, in the current study the imaging time was limited to 1 s and tissue illumination was kept below the maximum exposure limits for skin, in order to demonstrate that the technique could be integrated with current prostatectomy procedures with minimal disruption of the surgical workflow. Because Raman SNR increases with integration time and laser power, the classification results would be impacted by an increased SNR, potentially even negating the gains associated with combining FP and HWN. When comparing LOOCV and LOPOCV, the results were comparable for FP and HWN and almost identical for FP + HWN, suggesting that there is little patient-specific bias in the spectra. Given the size of our cohort (18 patients), we expected to have improved results for LOOCV compared with LOPOCV.
The close agreement between the ROC curves obtained with both techniques suggests that 18 different specimens (with on average 26 ± 9 spectra per specimen) are sufficient to offer a representative selection of the different tissue patterns existing in cancerous and normal prostate tissues, and that the correlation of spectral information within the same patient does not seem to unduly affect classification results. If confirmed on a larger cohort of patients, this finding could facilitate the clinical translation of RS in prostate surgery by limiting the number of patients required to train robust classification models especially in light of the highly complex nature of normal and cancer prostate tissue. In fact, PCa is a multi-focal disease where tumor tissue can be disseminated within the whole prostate [46][47][48]. Benign tissues are mixed with malignant cells, resulting in prostatic tissue that is highly heterogeneous and almost undetectable with current imaging techniques [49]. Despite this, RS was shown here to be an appropriate and a powerful tool for real-time detection of PCa.
Results for FP RS (accuracy = 87% for LOOCV) given in this study are comparable with those obtained by our group on a larger cohort of 32 fresh prostate slices (accuracy = 86% for LOOCV using neural network) [38] and by another group on a cohort of 38 snap-frozen prostate samples (accuracy = 85% using principal component analysis / linear discriminant analysis) [35]. Regarding the potential of combining FP and HWN, improvements were observed when comparing the diagnostic yields of HWN + FP (AUC = 1) with HWN (AUC = 0.93) or FP (AUC = 0.97) in the case of esophageal squamous cell carcinoma [23].
In conclusion, we have presented a proof-of-principle study illustrating the use of a dual excitation wavelength RS for PCa diagnosis, an approach that could be used for surgical guidance during radical prostatectomy procedures to ensure no cancer tissue remains at the time of surgery.