Identification of Salivary Biomarkers for Oral Cancer Detection with Untargeted and Targeted Quantitative Proteomics Approaches*

Oral cancer is one of the most common cancers worldwide. To identify biomarkers for oral cancer diagnosis, salivary proteins differentially expressed in oral cancer patients were identified with iTRAQ-based MS analyses. The candidates were further selected using MRM-MS and validated with immunoassays. The results suggest that three proteins (CFH, FGA, and SERPINA1) have potential as diagnostic and prognostic biomarkers of oral cancer, and that analysis of the salivary proteome is a feasible strategy for biomarker discovery.

Highlights
iTRAQ-based analysis of saliva samples from oral cancer patients.
Proteome profiling of saliva samples from patients with oral premalignant lesions.
Verification of salivary biomarker candidates with MRM-MS and immunoassays.
Identification of salivary proteins as potential biomarkers of oral cancer.

Oral cavity squamous cell carcinoma (OSCC) is one of the most common cancers worldwide. In Taiwan, OSCC is the fifth leading cause of cancer-related mortality and leads to 2800 deaths per year. The poor outcome of OSCC patients is principally ascribed to the fact that the disease is often advanced at the time of diagnosis, suggesting that early detection of OSCC is urgently needed. Analysis of cancer-related body fluids is one promising approach to identifying biomarker candidates for cancers. To identify OSCC biomarkers, the salivary proteomes of OSCC patients, individuals with oral potentially malignant disorders (OPMDs), and healthy volunteers were comparatively profiled with isobaric tags for relative and absolute quantitation (iTRAQ)-based mass spectrometry (MS). The salivary levels of 67 and 18 proteins in the OSCC group were elevated and decreased, respectively, compared with those in the noncancerous groups (OPMD and healthy groups). The candidate biomarkers were further selected using multiple reaction monitoring (MRM)-MS and validated with immunoassays. More importantly, higher salivary levels of three proteins, complement factor H (CFH), fibrinogen alpha chain (FGA), and alpha-1-antitrypsin (SERPINA1), were correlated with advanced stages of OSCC. Our results indicate that analysis of the salivary proteome is a feasible strategy for biomarker discovery, and that the three proteins are potential salivary markers for OSCC diagnosis.



Hao-Wei Chu, Kai-Ping Chang, Chia-Wei Hsu, Ian Yi-Feng Chang, Hao-Ping Liu, Yi-Ting Chen, and Chih-Ching Wu
Oral cancer is one of the leading causes of cancer-related mortality. In 2018, there were approximately 300,000 cases of oral cancer worldwide, and around 150,000 patients died from the disease (1). In Taiwan, oral cancer ranks fifth in terms of deaths for males and leads to 2800 deaths per year (2). Oral cavity squamous cell carcinoma (OSCC) accounts for more than 90% of oral cancers and is associated with chronic irritating habits such as betel quid chewing, alcohol drinking, and smoking (3,4). Although diagnosis and treatment have improved, overall survival rates of OSCC patients remain poor in Taiwan (2). The major cause of the high mortality is that more than 50% of OSCC patients are first diagnosed at advanced stages, suggesting that early detection of the disease is needed to improve treatment outcomes and reduce the growing burden of OSCC (5).
Conventional oral examination followed by biopsy of the suspected site is the current approach for OSCC screening (6,7). Although the approach has been widely used for decades, its efficacy for OSCC detection remains controversial. Some patients are unable to fully open their mouths for examination. The biopsy is usually taken from a single site, potentially missing other cancer sites, especially in patients with multiple types of lesions. In addition, OSCC and several types of oral potentially malignant disorders (OPMDs) have similar appearances, making it difficult to discriminate OPMDs from OSCC (8). The use of biomarkers in body fluids may represent an effective tool for the early detection of OSCC (9). To date, however, no effective biomarker has been approved for the diagnosis and/or prognosis of OSCC.
Saliva is an emerging specimen for disease diagnosis because it can be harvested easily and non-invasively. OSCC cells are surrounded by the salivary milieu, and thus it is practicable to detect salivary markers for OSCC screening (10). More than 100 salivary molecules have been reported as potential biomarkers of OSCC, including proteins, nucleotides (DNA, mRNA, and microRNA), and metabolites (11). Previously, we profiled the saliva proteomes of OSCC patients by means of SDS-PAGE coupled with liquid chromatography (LC)-tandem mass spectrometry (MS). With spectral counting-based label-free quantification, 22 proteins were identified as potential salivary biomarkers of OSCC. Among them, resistin was subjected to further validation using ELISA. The data confirmed that the salivary levels of resistin in the OSCC patients were significantly higher than those in the healthy group (12), indicating that analysis of the saliva proteome is feasible for the discovery of OSCC biomarkers.
Although numerous studies have searched for OSCC biomarkers, few reported protein biomarkers have moved into clinical practice (10). This failure reflects an insufficient effort to select biomarker candidates from proteome profiling and to validate potential biomarkers in adequate samples with suitable methods (13). In this study, we aimed to discover useful salivary biomarkers of OSCC. To this end, the saliva proteomes of healthy volunteers, OPMD individuals, and OSCC patients were comprehensively analyzed with isobaric tags for relative and absolute quantitation (iTRAQ)-based MS. The iTRAQ analyses identified 1838 proteins, among which 67 and 18 were elevated and decreased, respectively, in the OSCC patients compared with those in the noncancerous groups. We then performed multiple reaction monitoring (MRM)-MS to verify 24 candidates using a small cohort of saliva samples. Three candidate biomarkers (CFH, FGA, and SERPINA1) were further evaluated as OSCC biomarkers using an independent cohort of saliva samples with sandwich ELISAs. Finally, we identified a marker panel that shows high sensitivity and specificity for OSCC diagnosis.

EXPERIMENTAL PROCEDURES
Patient Populations and Clinical Specimens-All saliva specimens were collected at the Chang Gung Memorial Hospital (CGMH), Linkou, Taoyuan, Taiwan from 2012 to 2014 (supplemental Table S1). This research followed the tenets of the Declaration of Helsinki. All subjects signed an informed consent form approved by the Institutional Review Board of CGMH before participation and/or the use of previously collected saliva samples. All volunteers were examined by an oral mucosal screening test. The cases of OPMD and OSCC were biopsy-proven, and patients underwent routine check-ups according to standard protocols. All subjects were asked to avoid drinking, eating, and smoking for at least 2 h before the unstimulated saliva collection. To inhibit the activity of endogenous salivary enzymes, 4 ml of saliva was treated immediately with protease inhibitor cocktails (1:20, v/v; Roche, Basel, Switzerland). After centrifugation at 3000 × g for 15 min at 4°C, the supernatants were harvested and stored at -80°C until use. The protein concentrations were measured with a Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific, San Jose, CA). The total protein levels of the salivary samples used in the proteomic analyses are shown in supplemental Fig. S1.
Tryptic Digestion for iTRAQ Labeling-For iTRAQ labeling, equal amounts of protein from each saliva sample in a given group were pooled. Proteins of the pooled saliva were reduced with 5 mM tris-(2-carboxyethyl) phosphine hydrochloride (TCEP; Sigma-Aldrich, St. Louis, MO) at 56°C for 1 h and alkylated with 10 mM S-methyl methanethiosulfonate (MMTS; Sigma-Aldrich) at room temperature for 30 min. The protein mixtures were digested with modified, sequencing-grade trypsin (1:10, enzyme/protein; Promega, Madison, WI) at 37°C for 16 h.
The iTRAQ experiments were performed with the iTRAQ™ Reagent Multiplex Kit 4-plex (AB Sciex, Foster City, CA) according to the manufacturer's protocol. Briefly, the iTRAQ reagent was reconstituted in ethanol, mixed with the peptide mixture, and incubated with shaking at room temperature for 1.5 h. We used iTRAQ 115, 116, and 117 for the peptide mixtures from the healthy, OPMD, and OSCC groups, respectively (Table I and supplemental Table S1). The labeled samples were mixed, desalted with a ziptip filled with C18 resin (SOURCE™ 5RPC, GE Healthcare, UK), and then dried in a SpeedVac concentrator (Thermo Fisher Scientific). The peptide mixtures were resuspended in SCX-HPLC buffer A containing 0.1% formic acid (FA; Sigma-Aldrich) and 30% acetonitrile (ACN; Mallinckrodt Baker, Center Valley, PA).
The LC equipment was connected to the mass spectrometer, an LTQ-Orbitrap Elite (Thermo Fisher Scientific), operated by Xcalibur software (version 2.2 SP1.48, Thermo Fisher Scientific). Intact peptides were detected in the Orbitrap at a resolution of 60,000, and the ion of (Si(CH3)2O)6H+ at m/z 445.120025 was used as a lock mass for internal calibration. Each MS scan was followed by 12 data-dependent MS/MS scan events, comprising 6 collision-induced dissociations (CID) and 6 higher-energy collision-induced dissociations (HCD), for the six most abundant ions in the preview MS scan. The m/z values selected for the MS/MS analyses were dynamically excluded for 180 s. The electrospray voltage of the source was 1.8 kV. Microscans were acquired with maximum fill times of 1000 and 100 ms for MS and MS/MS spectra, respectively. The minimum signal intensity required for MS/MS spectra for both CID and HCD was 10,000, and the normalized collision energy was 35% for CID and 30% for HCD. The m/z range for the MS scan was 350-2000 Da.
Protein Database Searching and iTRAQ Data Analysis-Data analysis for the iTRAQ experiments was performed using Proteome Discoverer software (ver. 1.4.1.14; Thermo Fisher Scientific). The MS/MS spectra were searched against the Swiss-Prot human sequence database (release 201803, selected for Homo sapiens, 20,198 entries) using the Mascot search engine (version 2.2.0; Matrix Science, London, UK). The MS precursor ion mass range was set to 350-5,000 Da. The search parameters for CID spectra were set to the following: ESI-TRAP instrumentation, precursor mass tolerance 10 ppm, fragment mass tolerance 0.5 Da. The search parameters for HCD spectra were set to the following: ESI-FTICR instrumentation, precursor mass tolerance 10 ppm, fragment mass tolerance 0.05 Da. Peptides with one missed cleavage from the trypsin digestion were allowed. The evaluated modifications included the static methylthio modification on cysteine (+46.0916 Da), the variable oxidation on methionine (+15.9994 Da), and iTRAQ on the N terminus and lysine (+144.1544 Da) for both CID and HCD spectra. To determine the false discovery rate (FDR) of protein identification, the spectra were searched against a decoy database in which the sequences had been reversed. The FDR was estimated as the number of matches in the decoy database divided by the number of matches in the target database. To ensure a low overall FDR for protein identification, the peptide confidence settings were as follows: p value of peptide confidence < 0.01, peptide length > 7 amino acids, and ≥ 2 peptides identified per protein. Epithelial keratins were excluded from the identifications as likely contaminants introduced during the experimental process.

The abbreviations used are: OSCC, oral cavity squamous cell carcinoma; APOA1, apolipoprotein A-I; APOA2, apolipoprotein A-II; AUC, area under the ROC curve; CFH, complement factor H; CID, collision-induced dissociation; DAVID, Database for Annotation, Visualization and Integrated Discovery; FGA, fibrinogen alpha chain; HCD, higher-energy collision-induced dissociation; iTRAQ, isobaric tags for relative and absolute quantitation; KEGG, Kyoto Encyclopedia of Genes and Genomes; LOD, limit of detection; LLOQ, lower limit of quantitation; MRM, multiple reaction monitoring; OPMD, oral potentially malignant disorders; ROC, receiver operator characteristic; SERPINA1, alpha-1-antitrypsin; SIS, stable isotope-labeled standard.
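The target-decoy estimate described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Proteome Discoverer or Mascot implementation; the function name and the counts are hypothetical.

```python
def decoy_fdr(decoy_matches: int, target_matches: int) -> float:
    """Estimate the FDR as decoy-database matches divided by target-database matches."""
    if target_matches <= 0:
        raise ValueError("target_matches must be positive")
    return decoy_matches / target_matches


# Hypothetical counts, for illustration only (not taken from the study).
print(f"{decoy_fdr(12, 1500):.4f}")  # prints 0.0080
```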
The protein quantitative data were exported from Proteome Discoverer. For protein quantification, only proteins with more than two quantifiable spectra were accepted. The iTRAQ ratio of each protein was log2-transformed. The iTRAQ ratios (116/115 and 117/115) of each protein were manually normalized such that the log2 ratios displayed a median value of zero for all peptides in a given protein. This was performed across an entire labeling experiment to correct for variation in protein abundance.
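As a minimal sketch of the median-centering step, assuming it amounts to subtracting the median log2 ratio of one channel pair within one labeling experiment, the normalization could look as follows. The function name and the example ratios are hypothetical.

```python
import math
from statistics import median


def median_center_log2(ratios):
    """Log2-transform iTRAQ ratios and shift them so that their median is zero."""
    log2_ratios = [math.log2(r) for r in ratios]
    center = median(log2_ratios)
    return [value - center for value in log2_ratios]


# Hypothetical 117/115 ratios from one labeling experiment.
print(median_center_log2([1.0, 2.0, 4.0]))  # prints [-1.0, 0.0, 1.0]
```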
Bioinformatics Analysis-The enrichment analysis of the saliva proteome was performed with the online tools of the Database for Annotation, Visualization and Integrated Discovery (DAVID, v6.8, https://david.ncifcrf.gov/home.jsp). Enrichment was assessed for both biological processes and pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/pathway.html).
Tryptic Digestion for LC-MRM-MS and Addition of Stable Isotope-labeled Standard (SIS) Peptides-Tier 2-level MRM assays were developed and applied. The dried sample (5 µg protein) of each saliva sample was dissolved in 5 µl of 25 mM NH4HCO3 (Mallinckrodt Baker) and treated with 10 µl of 10% sodium deoxycholate (DOC; Sigma-Aldrich). The proteins were reduced with 4 µl of 50 mM TCEP and 25 mM NH4HCO3 at 62°C for 30 min and alkylated with 4.6 µl of 55 mM iodoacetamide (Amersham Biosciences, Buckinghamshire, UK) and 25 mM NH4HCO3 at 37°C for 30 min. Modified sequencing-grade trypsin was added to the samples at a 20:1 protein/enzyme ratio and incubated at 37°C for 9 h. Each protein digest was spiked with a constant amount of an SIS mixture containing 24 [13C6]Lys-, [13C6,15N2]Lys-, or [13C6,15N4]Arg-coded SIS peptides, purchased from UVic-Genome BC Proteomics Centre (supplemental Table S10). The SIS peptides were selected using the following criteria: (1) unique peptides containing 8-20 residues without any known post-translational modification sites; (2) peptides without chemically reactive amino acids, such as C, M, and W; (3) peptides without unstable sequences, such as NG, DG, and QG; and (4) peptides without sequences potentially leading to missed cleavage, such as RP and KP (13). To remove DOC, the digested samples were centrifuged for 10 min at 16,000 × g and 4°C. The resulting supernatant was desalted and concentrated by solid-phase extraction on a Waters Oasis HLB Elution Plate (Waters, Milford, MA) using a modified version of the manufacturer's protocol. Briefly, the resin in each well was rinsed with ACN and equilibrated with 200 µl of equilibration buffer (0.1% trifluoroacetic acid and 0.1% FA). The salivary protein digest was loaded onto the plate, washed with water, and eluted twice with 25 µl of 70% ACN. Eluted samples were frozen and lyophilized to dryness. The number of freeze-thaw cycles was kept consistent for each sample (14).
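The sequence-based part of the SIS peptide selection criteria can be expressed as a simple filter. The sketch below is illustrative only: the function name and test sequences are hypothetical, and peptide uniqueness and known post-translational modification sites (part of criterion 1) are not checked because they require a protein database lookup.

```python
def passes_sis_criteria(peptide: str) -> bool:
    """Apply the sequence-only selection rules to a candidate peptide."""
    seq = peptide.upper()
    # (1) Length of 8-20 residues (uniqueness and PTM sites are not checked here).
    if not 8 <= len(seq) <= 20:
        return False
    # (2) No chemically reactive residues: C, M, or W.
    if any(residue in seq for residue in "CMW"):
        return False
    # (3) No unstable sequences: NG, DG, or QG.
    if any(motif in seq for motif in ("NG", "DG", "QG")):
        return False
    # (4) No sequences prone to missed cleavage: RP or KP.
    if any(motif in seq for motif in ("RP", "KP")):
        return False
    return True


# Hypothetical sequences, for illustration only.
for candidate in ("AVLTIDEK", "AVLTIDGK", "ACLTIDEK", "AVLTK"):
    print(candidate, passes_sis_criteria(candidate))
```

Only the first hypothetical sequence passes; the others fail on the DG motif, the cysteine residue, and the length rule, respectively.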
LC-MRM-MS Analysis-For the LC-MRM-MS assay, peptides were injected and separated with a nanoACQUITY UPLC System (Waters) equipped with a reversed-phase column (nanoACQUITY C18 column, 100 µm × 100 mm, 1.7 µm particle size, Waters). The dried peptide samples were rehydrated with buffer C to produce a 0.25 µg/µl concentration and subjected to LC separation using buffer E (0.1% FA and 98% ACN). A linear gradient of buffer E (3-28% for 48 min, 28-38% for 5 min, 38-95% for 1 min, 95% for 5 min, 95-3% for 1 min, and 3% for 10 min) was applied at a flow rate of 400 nL/min. To prevent peptide carryover on the UPLC column, a blank solvent injection (25-min analysis at 400 nL/min) was run between samples (14). An AB/MDS Sciex 5500 QTRAP (AB Sciex, Framingham, MA, USA) with a nano-electrospray ionization source controlled by the Analyst software (version 1.6.2; AB Sciex) was used for MRM-MS analyses. The parameters were set as follows: ion spray voltage 2,200 V, curtain gas setting 20 psi (ultrahigh-purity nitrogen), interface heater temperature 150°C, MS operating pressure 2.1 × 10⁻⁵ Torr, and collision cell exit potential in the range of 16-44 V for each target.
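As a quick consistency check, the gradient steps listed above can be tabulated and summed; the variable names are illustrative. The total matches a 70-min LC-MRM-MS run.

```python
# Gradient steps of buffer E as (description, duration in minutes), taken from the text.
gradient_steps = [
    ("3-28% E", 48),
    ("28-38% E", 5),
    ("38-95% E", 1),
    ("95% E", 5),
    ("95-3% E", 1),
    ("3% E", 10),
]

total_minutes = sum(minutes for _, minutes in gradient_steps)
print(total_minutes)  # prints 70
```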
MRM Data Analysis and Generation of Calibration Curves-All MRM-MS data were exported from Analyst software and processed with Skyline (version 2.5; https://skyline.ms/project/home/software/Skyline/begin.view), a widely used and freely available tool developed by the MacCoss group (15). All integrated peaks/transitions were manually inspected and corrected so that the peaks from the sample and its corresponding SIS peptide shared the same retention times, and peaks derived from noise were removed.
A response curve was generated for each peptide by using different amounts of a tryptic digest from a control saliva sample, which was pooled from three OSCC patients, three OPMD individuals, and three healthy volunteers. The digestion protocol for the pooled sample was the same as that for the clinical samples. The SIS peptide mixture contained a different level of SIS peptide for each of the 24 candidates, chosen to be appropriate for quantitative analysis. The composition of the SIS peptide mixture was adjusted based on the peptide concentrations and signal intensities of the endogenous peptides in the control saliva sample to ensure quantitation accuracy. The 9-point dilution curve (including a blank point) was generated with various concentrations of an appropriately diluted tryptic digest that was spiked with a constant level of SIS peptide (supplemental Table S10). With a known concentration of each SIS peptide, the concentration of the protein in the unknown samples can be quantified based on the observed peak-area ratio. To improve the experimental precision, we performed three technical replicates, independently from tryptic digestion through LC-MRM-MS, for each clinical sample and for each dilution point of the response curves. The average peptide concentration of the technical triplicate for each sample and concentration point was used in subsequent analysis. Additionally, of the three MS ion pairs monitored for each peptide, one was used as the quantifier, and the other two were used to confirm the retention times in the LC system and to rule out signal interference. All MRM peaks were inspected manually in Skyline to avoid incorrect peak detection and ensure proper peak integration.
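The peak-area-ratio quantification can be illustrated with a short sketch. It assumes a simple single-point calculation in which the endogenous (light) to SIS (heavy) peak-area ratio is multiplied by the known SIS concentration, followed by averaging over the technical triplicate; the function names and peak areas are hypothetical, and conversion from peptide to protein concentration is omitted.

```python
from statistics import mean


def endogenous_conc(light_area: float, heavy_area: float, sis_conc: float) -> float:
    """Concentration of the endogenous peptide from the light/heavy peak-area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy_area must be positive")
    return light_area / heavy_area * sis_conc


def triplicate_mean(area_pairs, sis_conc: float) -> float:
    """Average the concentrations obtained from the technical replicates."""
    return mean([endogenous_conc(light, heavy, sis_conc) for light, heavy in area_pairs])


# Hypothetical (light, heavy) quantifier peak areas for one sample, SIS spiked at 5.0 ng/ml.
replicates = [(2000.0, 1000.0), (1000.0, 1000.0), (3000.0, 1000.0)]
print(triplicate_mean(replicates, sis_conc=5.0))  # prints 10.0
```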
The limit of detection (LOD) was defined as the lowest level of observed signal for the endogenous target peptide for which the signal-to-noise (S/N) ratio was more than 10 in all three replicates. The S/N ratio was calculated as the intensity of the signal peak divided by that of the highest background peak appearing within 2 min of the signal peak. The lower limit of quantitation (LLOQ) was defined as the lowest concentration of endogenous target peptide that could be quantified with a coefficient of variation (CV) < 20% and for which the distance between the LLOQ and the response curve was less than 20% in all triplicate experiments.
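A minimal sketch of the LLOQ rule follows, under two stated assumptions: the CV is computed from the sample standard deviation of the triplicate, and the "distance" from the response curve is interpreted as the relative deviation of each replicate from the curve-predicted concentration. All names and values are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values) -> float:
    """Coefficient of variation (sample SD / mean) in percent."""
    return stdev(values) / mean(values) * 100


def find_lloq(points, max_cv: float = 20.0, max_dev: float = 20.0):
    """Return the lowest expected concentration that meets both criteria, or None.

    points: list of (expected_conc, [measured replicate concentrations]).
    """
    passing = []
    for expected, measured in points:
        cv_ok = cv_percent(measured) < max_cv
        dev_ok = all(abs(m - expected) / expected * 100 < max_dev for m in measured)
        if cv_ok and dev_ok:
            passing.append(expected)
    return min(passing) if passing else None


# Hypothetical dilution points (ng/ml) with triplicate measurements.
curve = [
    (1.0, [0.5, 1.0, 1.6]),   # fails: too variable and too far from the curve
    (2.0, [1.9, 2.0, 2.1]),
    (4.0, [3.9, 4.0, 4.2]),
]
print(find_lloq(curve))  # prints 2.0
```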
Experimental Design and Statistical Rationale-To profile the salivary proteome of OSCC with iTRAQ analysis, salivary samples from 10 healthy individuals, 9 OPMD patients, and 10 OSCC patients were used (Table I and supplemental Table S1). Equal amounts of protein from each sample in the same group were pooled to minimize between-individual variation. Three replicates of iTRAQ analyses were performed, each of which included the pooled samples of the healthy control, OPMD, and OSCC groups, i.e. three technically replicated samples harvested and labeled in parallel. The mean and standard deviation (S.D.) of the log2 iTRAQ ratios of all proteins for each comparison were obtained. A protein with a log2 ratio above the mean plus one S.D. in all triplicate experiments was considered overexpressed. A protein with a log2 ratio below the mean minus one S.D. in all triplicate experiments was considered underexpressed.
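The mean plus or minus one S.D. rule, applied across all three replicates, can be sketched as follows. This is an illustration under the assumption that the sample standard deviation is used and that only proteins quantified in every replicate are considered; the function name and the log2 ratios are hypothetical.

```python
from statistics import mean, stdev


def call_proteins(replicates):
    """Label proteins that exceed mean + 1 S.D. ("over") or fall below
    mean - 1 S.D. ("under") in every replicate.

    replicates: list of dicts mapping protein name -> log2 iTRAQ ratio.
    """
    cutoffs = []
    for rep in replicates:
        values = list(rep.values())
        center, spread = mean(values), stdev(values)
        cutoffs.append((center - spread, center + spread))

    shared = set.intersection(*(set(rep) for rep in replicates))
    calls = {}
    for protein in shared:
        if all(rep[protein] > high for rep, (_, high) in zip(replicates, cutoffs)):
            calls[protein] = "over"
        elif all(rep[protein] < low for rep, (low, _) in zip(replicates, cutoffs)):
            calls[protein] = "under"
    return calls


# Hypothetical log2 ratios (OSCC/control), reused for three replicates for brevity.
rep = {"A": 2.0, "B": 0.0, "C": 0.1, "D": -0.1, "E": -2.0}
print(call_proteins([rep, rep, rep]))
```

With these hypothetical values, protein A is labeled "over" and protein E "under"; the remaining proteins fall inside the cutoffs and are not labeled.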
For biomarker verification with LC-MRM-MS, salivary samples from 30 healthy individuals, 28 OPMD patients, and 30 OSCC patients were used (Table I and supplemental Table S1). Three technical replicates, performed independently from tryptic digestion through LC-MRM-MS, were conducted for each clinical sample and for each dilution point of the response curves. Sandwich ELISAs were used to verify the salivary levels of CFH, FGA, and SERPINA1 as OSCC biomarkers in a larger cohort (100 healthy individuals, 55 OPMD patients, and 77 OSCC patients). Each sample was analyzed in duplicate, and the protein concentration of each sample was determined as the average of the duplicates.
The between-group differences were determined using a nonparametric Mann-Whitney U test. All statistical tests were two-sided, and a p value < 0.05 was considered statistically significant. Receiver operator characteristic (ROC) curves and the area under the ROC curve (AUC) were generated to illustrate the decision value of various cut-off points for the candidates. The point with the largest sum of specificity and sensitivity was selected as the threshold. The marker panel was constructed and evaluated using logistic regression with the backward LR method to exclude proteins with low significance from the model (significance threshold: p value < 0.05). All statistical analyses were carried out using SPSS software version 12.0 (SPSS Inc., Chicago, IL).
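The threshold rule (largest sum of sensitivity and specificity) can be sketched in plain Python. This is not the SPSS procedure used in the study; it assumes that a sample is called positive when its value is at or above the cutoff, and the function name and concentrations are hypothetical.

```python
def best_cutoff(cases, controls):
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity.

    A sample is called positive when its value is >= cutoff.
    Ties keep the lowest cutoff.
    """
    best = None
    for cutoff in sorted(set(cases) | set(controls)):
        sensitivity = sum(value >= cutoff for value in cases) / len(cases)
        specificity = sum(value < cutoff for value in controls) / len(controls)
        if best is None or sensitivity + specificity > best[1] + best[2]:
            best = (cutoff, sensitivity, specificity)
    return best


# Hypothetical salivary concentrations (arbitrary units).
oscc = [5, 6, 7, 8]
healthy = [1, 2, 3, 6]
print(best_cutoff(oscc, healthy))  # prints (5, 1.0, 0.75)
```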

RESULTS
Profiling of Saliva Proteomes with iTRAQ-based MS Analysis-To identify novel salivary biomarkers of OSCC, the proteomes of saliva samples from 10 healthy individuals, 9 volunteers with OPMD, and 10 OSCC patients were quantitatively profiled using the iTRAQ reagent combined with a 2D LC-MS/MS analysis (Fig. 1A). The OPMD group was used to evaluate whether chronic inflammatory diseases in the oral cavity might lead to elevated levels of proteins in saliva. To reduce the effects of between-individual variation, equal amounts of protein from each saliva specimen in a given group were combined into a pooled sample (Table I and supplemental Table S1). We conducted three replicates of iTRAQ-based analyses (Rep 1, Rep 2, and Rep 3), each of which contained the pooled samples of the healthy control, OPMD, and OSCC groups, that is, three technically replicated samples harvested and labeled in parallel (supplemental Tables S2, S3, and S4 for Rep 1, 2, and 3, respectively). Using this strategy, we identified 1838 proteins. To determine the reproducibility of the iTRAQ analyses, the proteins detected in the three replicates (supplemental Tables S2-S4) were analyzed for overlapping members. As shown in Fig. 1B, 63.2% of the proteins were identified in all replicates, and 80.4% were found in at least two replicates, whereas 19.6% were exclusive to one replicate. For the quantified proteins, 59.1% and 77.8% could be compared in all and at least two replicates, respectively (Fig. 1C), suggesting that the iTRAQ-based proteome profiling was performed adequately.
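The replicate-overlap percentages reported above are of the kind produced by a simple set comparison. The sketch below shows one way to compute them; the function name and the protein identifiers are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def replicate_overlap(*replicates):
    """Fractions of proteins found in all, at least two, or exactly one replicate."""
    counts = Counter(protein for rep in replicates for protein in set(rep))
    total = len(counts)
    n_reps = len(replicates)
    return {
        "all": sum(c == n_reps for c in counts.values()) / total,
        "at_least_two": sum(c >= 2 for c in counts.values()) / total,
        "one_only": sum(c == 1 for c in counts.values()) / total,
    }


# Hypothetical protein identifications in three replicates.
rep1 = {"A", "B", "C"}
rep2 = {"A", "B", "D"}
rep3 = {"A", "E"}
print(replicate_overlap(rep1, rep2, rep3))
```

The "at_least_two" and "one_only" fractions always sum to one, which mirrors the 80.4% and 19.6% figures in the text.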
To outline the differences between the healthy control, OPMD, and OSCC groups, the ratios of each protein between the three groups were compared. As shown in supplemental Fig. S2, the protein ratios of healthy control/OSCC are highly correlated with those of OPMD/OSCC (r = 0.720-0.730) with statistical significance (p < 0.001), whereas the comparisons between OPMD/healthy control and OSCC/healthy control show lower correlations (r = 0.233-0.269). The comparisons collectively suggest that, in terms of protein levels, the OPMD group resembles the control group, whereas the OSCC group is distinct from the OPMD and control groups.
Biological Process Networks of the Proteins with Altered Salivary Levels in the OSCC Patients-To discover differentially expressed proteins in OSCC saliva, the mean and S.D. of the ratios of all proteins in each replicate were calculated. A protein with a value larger than the mean plus one S.D. was deemed overexpressed, whereas a protein with a value less than the mean minus one S.D. was considered underexpressed. Based on these cutoffs, 335 (Fig. 2A) and 312 (Fig. 2C) proteins were overexpressed and underexpressed, respectively, in the OSCC group compared with the control. Among these differentially expressed proteins, 147 proteins were common to all three replicates, including 102 overexpressed and 45 underexpressed proteins (Fig. 2A and 2C). Moreover, compared with the OPMD group, 173 proteins (89 and 84 with elevated and reduced levels, respectively) were differentially expressed in the OSCC group in all three replicates (Fig. 2B and 2D).
To reveal the biological processes likely involved in changes in the OSCC microenvironment, the proteins with altered levels in the OSCC group were subjected to group-wise analyses with DAVID. As shown in supplemental Table S5, the 147 proteins differentially expressed in the OSCC group compared with the control group were highly associated with the biological processes of protein, cholesterol, and glucose metabolism, acute-phase response, complement activation, blood coagulation, and cell-cell adhesion. Moreover, those proteins were subjected to pathway-wise analysis with the KEGG database. The results revealed that those proteins possibly participate in the metabolism of antibiotics, protein, and glucose, complement activation, Staphylococcus aureus infection, and systemic lupus erythematosus (supplemental Table S6).
In line with the observation that the OPMD saliva is similar to the control saliva (supplemental Fig. S2), the enriched biological processes and KEGG pathways of the proteins differentially expressed in the OSCC group compared with the control group (supplemental Tables S5 and S6) mostly overlap with those of the 173 proteins with altered levels in the OSCC group compared with the OPMD group (supplemental Tables S7 and S8).
Establishment of an LC-MRM-MS Assay to Quantify the Biomarker Candidates-The iTRAQ-based analyses revealed that the levels of 67 and 18 proteins were elevated and decreased, respectively, in the OSCC saliva compared with either the control or the OPMD saliva (supplemental Table S9), suggesting that these proteins have the potential to be relevant markers for OSCC screening. To ascertain the clinical relevance of the iTRAQ results, a multiplexed LC-MRM-MS assay was performed using SIS peptides as internal standards to verify multiple biomarkers in saliva samples (Fig. 1A). Among the 67 candidate biomarkers, 24 proteins were detected in a 70-min LC-MRM-MS run. We spiked each tryptic saliva sample with a pooled standard consisting of the 24 SIS peptides. Supplemental Table S10 lists the detailed sequences, Q1/Q3 transitions, abundance of each SIS peptide, and collision energies used in this study.
LOD and LLOQ values of the 24 targets were determined with establishment of response curves. The LLOQ was defined as the lowest concentration of endogenous peptide that could be measured in triplicate with a CV less than 20%, and the LOD was defined as the lowest concentration at which a signal was observed for the endogenous target peptide with a signal/noise ratio greater than 10 in all three replicates. The determined LOD values for endogenous salivary proteins ranged from 4.44 pg/ml for heparin cofactor 2 (SERPIND1) to 2.50 ng/ml for CFH, whereas LLOQ values ranged from 0.13 ng/ml for apolipoprotein A-II (APOA2) to 78.49 ng/ml for SERPINA1. The LOD and LLOQ for each peptide in the pooled saliva sample are summarized in supplemental Table S10.
Verification of the Biomarker Candidates with LC-MRM-MS-Using the established LC-MRM-MS assay, the levels of the 24 proteins were determined in the individual saliva specimens of 30 controls, 28 OPMD patients, and 30 OSCC patients (Table I and supplemental Table S1). The mean, S.D., and distribution of the protein concentrations in each group are shown in Table II and supplemental Fig. S3. Among the 24 targets, the levels of 20 proteins were significantly elevated in the OSCC group compared with either the OPMD or the control group (p < 0.01). Moreover, compared with the noncancerous group (the control and OPMD individuals), the levels of 7 proteins were increased 3-fold (p < 0.001) in the OSCC group (Table II). These proteins include apolipoprotein A-I (APOA1), APOA2, FGA, haptoglobin (HP), hemopexin (HPX), inter-alpha-trypsin inhibitor heavy chain H1 (ITIH1), and SERPIND1.
The capability of the proteins to detect OSCC was further evaluated by ROC curve analysis using the estimated concentrations of the 24 candidates in saliva (ng/ml). For discriminating the OSCC group from the healthy control group, 22 proteins had AUC values greater than 0.7, and 20 had AUC values greater than 0.7 for distinguishing between the OSCC and OPMD groups (Table II). Further, 6 proteins (APOA1, APOA4, CFH, FGA, SERPINA1, and SERPIND1) were able to effectively differentiate OSCC from healthy controls (AUC ≥ 0.8). The LC-MRM-MS results demonstrate that the iTRAQ-based profiling of the saliva proteome effectively reveals appropriate biomarker candidates for OSCC detection.
Verification of Biomarker Candidates in Saliva Using Sandwich ELISA-In the MRM-MS analyses, six proteins (APOA1, APOA4, CFH, FGA, SERPINA1, and SERPIND1) were significantly overexpressed in the OSCC patients compared with the healthy individuals (p < 0.001, fold change > 2, and AUC ≥ 0.8), and thus are potential biomarkers of OSCC (Table II and supplemental Fig. S3). Among them, CFH, FGA, and SERPINA1 were subjected to further verification with sandwich ELISAs (Fig. 1A) in a larger cohort containing saliva specimens from 100 controls, 55 OPMD patients, and 77 OSCC patients (Table I and supplemental Table S1). As shown in Fig. 3A and Table III, the salivary levels of CFH, FGA, and SERPINA1 in the OSCC group were significantly elevated compared with those in the healthy controls (p < 0.001) as well as in the OPMD group (p < 0.01). At a cutoff value of 535.8 ng/ml, the salivary CFH level was able to discriminate the OSCC patients from healthy controls with a sensitivity and specificity of 37.7% and 95.0%, respectively. With a cutoff of 3.4 µg/ml for salivary FGA, the sensitivity and specificity for OSCC screening were 51.9% and 87.0%, respectively. At a cutoff value of 593.3 ng/ml, SERPINA1 was able to detect OSCC with a sensitivity and specificity of 64.9% and 79.0%, respectively.
The quantitative data acquired with the ELISA and LC-MRM-MS analyses were further compared. As shown in supplemental Fig. S4, all comparisons between the MRM-MS and ELISA assays showed moderate to high correlations (r = 0.418-0.723) with statistical significance (p < 0.0001, n = 66), suggesting that the combination of untargeted and targeted quantitative proteomics is a feasible way to identify salivary biomarker candidates of OSCC.
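A correlation coefficient between two assays can be computed from paired measurements of the same specimens. The following sketch assumes a Pearson correlation; this section does not state whether a Pearson or a rank-based coefficient was used, and the paired values and the function name `pearson_r` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired concentrations (ng/ml) for the same specimens,
# measured by MRM-MS and by ELISA. Illustrative only, NOT study data.
mrm = [120.0, 340.0, 560.0, 210.0, 880.0, 450.0]
elisa = [150.0, 300.0, 610.0, 260.0, 790.0, 520.0]

r = pearson_r(mrm, elisa)
print(f"r = {r:.3f}")
```

If the concentrations are strongly skewed, a rank-based coefficient or a log transformation before computing r may be more appropriate; that choice is left open here.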
The Potential of CFH, FGA, and SERPINA1 for Early Detection of OSCC-Next, we evaluated the suitability of the three proteins as early detection markers of OSCC by testing their salivary levels in patients with early-stage primary tumors (pT status T1/T2), no lymph node metastasis (pN negative), and early overall tumor stages (stage I-II). Salivary levels of FGA and SERPINA1 were significantly elevated in patients with early-stage primary tumors, no lymph node metastasis, and early overall tumor stages compared with the healthy controls (supplemental Table S11). Salivary CFH levels in the patients with early-stage primary tumors were higher than those in the healthy controls (supplemental Table S11). These results support the potential of CFH, FGA, and SERPINA1 as useful salivary markers for early detection of OSCC.
The Utility of CFH, FGA, and SERPINA1 as Biomarkers for OSCC Prognosis-We further investigated whether the salivary levels of CFH, FGA, and SERPINA1 were correlated with the clinical manifestations of OSCC. As shown in Table IV, the salivary levels of FGA and SERPINA1 in the OSCC patients with late-stage primary OSCC (pT-T3/T4) were higher than those in patients with early-stage primary OSCC (pT-T1/T2; p < 0.01). Importantly, the salivary levels of all three proteins were significantly elevated in the OSCC patients with lymph node metastasis (pN-N > 0) compared with those without lymphatic metastasis (pN-N = 0; p < 0.01). Consistently, a similar difference was observed between the patients with OSCC at overall tumor stages III-IV and those at overall tumor stages I-II (p < 0.01; Table IV). The results imply that the salivary levels of CFH, FGA, and SERPINA1 could be useful for OSCC prognosis. In contrast, no obvious correlation was observed between the salivary levels of the three proteins and habitual behaviors (alcohol consumption, betel nut chewing, and smoking), cell differentiation, or patient age (Table IV).
DISCUSSION
Early diagnosis of OSCC can save numerous lives, reduce the burden of morbidity arising from treatment of the disease at an advanced stage, and lighten the economic load of disease management (10). However, the current approach to OSCC diagnosis, which consists of visual examination of the oral cavity followed by biopsy, is sometimes inefficient. Patients with OSCC are sometimes unable to fully open their mouths for examination. Moreover, the biopsy customarily depends on sampling a single site, which may lead to a false-negative diagnosis, particularly in individuals with multiple potentially malignant lesions. Discovery of salivary biomarkers with high efficacy for OSCC detection could greatly improve early diagnosis of OSCC (12,17).
To this end, the salivary proteome of OSCC patients was quantitatively profiled in this study. Individuals with OPMD were also included to assess whether chronic inflammatory disease in the oral cavity could alter the salivary levels of the candidate proteins.
Salivary biomarkers are probably more practical than blood biomarkers for OSCC detection because oral cancer cells are partly immersed in the salivary milieu and saliva specimens can be readily acquired in clinical practice. In line with this speculation, we previously found that the salivary levels of resistin and thrombospondin-2 (THBS2) in OSCC patients are significantly higher than those in healthy individuals, whereas their serum levels did not differ between the OSCC patients and healthy controls (12,17). To improve OSCC diagnosis, we herein identified 24 proteins with potential as salivary biomarkers for OSCC diagnosis using untargeted iTRAQ-based MS followed by targeted MRM-based MS analyses (Fig. 1 and Table II). Moreover, the salivary levels of FGA and SERPINA1 were shown to be useful for detecting early-stage OSCC in the case-control study (Fig. 3 and supplemental Table S11). To improve OSCC diagnosis, the salivary levels of FGA, resistin, SERPINA1, and THBS2 could be determined simultaneously in combination with the oral examination. Further investigations with the same cohorts would be worthwhile to evaluate whether OSCC detection could be improved by a panel consisting of the four salivary proteins.
Based on the abundances of salivary proteins obtained with the iTRAQ analyses (supplemental Fig. S2), the OSCC patients could be differentiated from the noncancerous group (healthy controls and OPMD individuals). Further, the proteins differentially expressed in the OSCC group highlighted the biological processes of macromolecule metabolism, blood coagulation, complement activation, acute-phase response, keratinocyte differentiation, and cell-cell adhesion (supplemental Tables S5 and S7). The enriched processes agree with recent findings that the metabolism program (18,19) and immune response (20,21) are dysregulated in OSCC tissues, and that the expression of proteins related to blood coagulation is altered in head and neck cancers (22). In line with the results of the process enrichment analysis, the KEGG pathways of complement cascades, ribosome, carbon metabolism, and glycolysis were associated with the proteins with altered levels in the OSCC group (supplemental Tables S6 and S8).
In this study, iTRAQ-based profiling was used to select the target proteins for further MRM-based verification. With the use of SIS peptides, the salivary proteins could be absolutely quantified using the MRM-MS analysis. However, some of the salivary proteins could not be quantified in the SIS peptide-based MRM assay. To verify more salivary proteins as potential OSCC biomarkers, mTRAQ labeling combined with the MRM assay will be applied for comprehensive relative quantification of the target proteins selected by the iTRAQ analysis (23).
Complement is generally considered a protective mechanism against the formation of cancers. However, recent studies also indicate a pro-tumorigenic potential of complement in certain cancers (24). CFH acts as a cofactor for factor I-mediated C3b cleavage on cell surfaces to regulate complement activation (24,25). The serum level of CFH has been reported to be a diagnostic marker for lung adenocarcinoma (26). FGA is the alpha component of fibrinogen, which is cleaved by thrombin to form fibrin upon vascular injury (27). A few studies have suggested that FGA could act as a cancer biomarker. For example, Shi et al. showed that the plasma level of FGA could be a prognostic marker of HER2-positive breast cancer (28). SERPINA1 inhibits a wide variety of proteases and can thereby protect cells and tissues from proteases secreted by neutrophils (29). SERPINA1 has been reported as a potential biomarker in colorectal (30) and lung (31) cancers. Moreover, Kawahara et al. showed with a targeted proteomic strategy that SERPINA1 could act as a salivary biomarker for oral cancer in a small saliva cohort (32). These results collectively suggest that CFH, FGA, and SERPINA1 warrant further evaluation as biomarkers.
In conclusion, to improve OSCC diagnosis, the saliva proteomes of healthy controls, OPMD individuals, and OSCC patients were quantitatively profiled with untargeted iTRAQ-based MS. The 24 proteins with elevated salivary levels in the OSCC group were selected for further evaluation with the targeted MRM-MS assay in an independent cohort. Three proteins (CFH, FGA, and SERPINA1) were then shown to have potential as biomarker candidates for early detection and/or prognosis of OSCC. To the best of our knowledge, this work is one of the few investigations into OSCC biomarkers in which iTRAQ and MRM analyses of the salivary proteome are used in conjunction with antibody-based assays to prioritize candidates that merit further evaluation.