Potentially Novel Candidate Biomarkers for Head and Neck Squamous Cell Carcinoma Identified Using an Integrated Cell Line-based Discovery Strategy*

Head and neck squamous cell carcinoma (HNSCC) can arise from the oral cavity, oropharynx, larynx or hypopharynx, and is the sixth leading cancer by incidence worldwide. The 5-year survival rate of HNSCC patients remains static at 40–60%. Hence, biomarkers that improve detection of HNSCC or early recurrences should improve clinical outcome. Mass spectrometry-based proteomics methods have emerged as promising approaches for biomarker discovery. As one approach, mass-spectrometric identification of proteins shed or secreted from cancer cells can contribute to the identification of potential biomarkers for HNSCC and our understanding of tumor behavior. In the current study, mass spectrometry-based proteomic profiling was performed on the conditioned media (i.e. secretome) of head and neck cancer (HNC) cell lines (FaDu, UTSCC8 and UTSCC42a), together with gene expression microarrays, to identify transcripts over-expressed in the HNSCC cells in comparison to a normal control cell line. This integrated data set was systematically mined using publicly available resources (Human Protein Atlas and published proteomic/transcriptomic data) to prioritize putative candidates for validation. Subsequently, quantitative real-time PCR (qRT-PCR), Western blotting, immunohistochemistry (IHC), and ELISAs were performed to verify selected markers. Our integrated analyses identified 90 putative protein biomarkers that were secreted or shed to the extracellular space and over-expressed in HNSCC cell lines, relative to controls. Subsequently, the over-expression of five markers was verified in vitro at the transcriptional and translational levels using qRT-PCR and Western blotting, respectively. IHC-based validation conducted in two independent cohorts comprising 40 and 39 HNSCC biopsies revealed that high tumor expression of PLAU, IGFBP7, MMP14 and THBS1 was associated with inferior disease-free survival, and increased risk of disease progression or relapse.
Furthermore, as demonstrated using ELISAs, circulating levels of PLAU and IGFBP7 were significantly higher in the plasma of HNSCC patients compared with healthy individuals.


Head and Neck Squamous Cell Carcinoma (HNSCC) is the sixth most common cancer worldwide, with ~600,000 new cases diagnosed every year (1). HNSCCs include squamous cell carcinomas (SCCs) of the oral cavity, pharynx (nasopharynx, oropharynx, and hypopharynx), and larynx. Despite improvements in therapeutic approaches, the overall 5-year survival rate of HNSCC patients has not improved over the past three decades (1,2).
Clinical management of HNSCC is faced by several challenges, including early detection of the primary tumor, and site-specific control of the disease (3,4). Early diagnosis is often hampered by the asymptomatic nature of HNSCC development and the inadequacy of current diagnostic methods to identify HNSCCs with sufficient specificity and sensitivity. As a result, over 60% of HNSCC patients present with advanced stages III or IV, harboring lymph node metastases (3).
The outcome of HNSCC patients is reduced by the development of local recurrences, which can develop in ~10-30% of cases, even when surgical margins are pathologically tumor-free (5). Current protocols fail to adequately stratify patients with high risk of developing tumor recurrence in order to allow for appropriate modifications in treatment.
Several potentially prognostic HNSCC biomarkers have been described including TP53 mutations, EGFR over-expression, and presence of human papillomavirus (HPV) or its surrogate marker, p16 (1, 6-8). Targeting EGFR has influenced HNSCC therapy; however, its survival benefit for locally advanced tumors remains modest (9). No other biomarkers have influenced the management of HNSCC.
Considerable efforts in the identification of secreted biomarkers for various cancers in serum or plasma have met with limited success, primarily because of the immense sample complexity, and large dynamic range, which greatly hamper the discovery of biomarkers in such fluids (10). Cancer cell line secretomes offer a valuable alternative to serum or plasma for the discovery of potential biomarkers, and have been used as a source for discovery of secreted biomarkers for prostate, ovarian, lung, and other tumors (3, 11-16). The cancer cell secretome is composed of proteins secreted or shed by tumor cells in culture and detected in the serum-free conditioned media (CM). Proteins secreted or shed by cancer cells have a chance of being released into circulation and thus have the potential to be detected in patient-derived blood in a clinical setting for disease diagnosis or prognosis (11). In vitro secretome studies typically involve growing cells to semiconfluence in normal growth media followed by incubation in serum-free media for periods of 24-48 h. The CM is then harvested for proteomic analysis, and biomarker candidates are selected, and then validated using techniques such as enzyme-linked immunosorbent assays (ELISAs) or immunohistochemistry (IHC). One hurdle associated with biomarker discovery using this approach is the selection of truly secreted proteins for validation, because intracellular contaminants can be present as a result of cell lysis. Hence, cell culture conditions must be carefully optimized to reduce cell death under serum starvation. Furthermore, parallel proteomic profiling of whole cell lysates (WCLs) can be used as a background to filter out intracellular contaminants and enrich for secreted proteins, with the assumption that the relative protein abundance of a secreted protein would be higher in the CM compared with the WCL (17-20).
In the current study, we focused on multidimensional protein identification technology (MudPIT) analyses of secretomes and whole cell lysate proteomes of three head and neck cancer (HNC) cell lines: FaDu, UTSCC8, and UTSCC42a, coupled with parallel gene expression analyses, as a platform for discovery of putative secreted HNSCC markers (21,22). These cell lines were derived from two anatomically related subsites of the head and neck: the hypopharynx (FaDu cells) and the larynx (UTSCC8, and UTSCC42a cells). Based on protein and transcript expression data, candidates with increased abundance in the CM versus the WCL, and increased mRNA expression levels, as compared with the normal oral epithelial (NOE) cells, were selected and prioritized for validation using publicly available resources. Five candidates were selected for in vitro validation at the protein level through Western blotting against the CM of cancer cell lines and control NOE cells, and at the mRNA level using quantitative real-time PCR (qRT-PCR). In addition, using immunohistochemistry, we demonstrated that four of these candidates have higher protein expression in recurrent compared with nonrecurrent HNSCCs, and that their elevated expressions were associated with lower disease-free survival (DFS), and increased risk of disease progression. Finally, two of the candidate proteins, PLAU (urokinase) and IGFBP7 (insulin-like growth factor binding protein 7), were significantly elevated in the plasma of HNSCC patients compared with healthy controls, which in turn appeared to associate with an increased risk of death in the cancer patients.

EXPERIMENTAL PROCEDURES
Cell Culture-FaDu (human hypopharyngeal squamous cell cancer cell line), UTSCC8 and UTSCC42a (human laryngeal squamous cell cancer cell lines, kind gifts from R. Grénman, Turku University Hospital, Turku, Finland) and NOE cells were cultured under conditions described previously (23). FaDu cells were obtained from ATCC (Manassas, VA), and NOEs from Celprogen (San Pedro, CA). For secretome analyses, cell lines were cultured under serum-free, phenol-red free conditions. The serum-free media were purchased from Invitrogen and included Minimum Essential Medium (GIBCO 51200) for FaDu and NOE cells, and Dulbecco's modified Eagle's medium (GIBCO 31053) for UTSCC8 and UTSCC42a cells. Cell viability was assessed with trypan blue staining (Invitrogen), and exceeded 95% following serum starvation. The cells were authenticated at the Centre for Applied Genomics (Hospital for Sick Children, Toronto, Canada) using the AmpFlSTR Identifiler PCR Amplification Kit (Applied Biosystems), and routinely tested for mycoplasma contamination using the MycoAlert detection kit (Lonza Group Ltd).
Protein Sample Preparation and Trypsin Digestion-Materials Ultrapure grade urea, ammonium bicarbonate, ammonium acetate, calcium chloride, HEPES and TRIS were obtained from BioShop Canada, Inc. (Burlington, ON). Ultrapure grade iodoacetamide, dithiothreitol and formic acid were obtained from Sigma. HPLC grade solvents (methanol, acetonitrile and water) were obtained from Fisher Scientific, Canada.
HNSCC cell lines were grown in normal growth media until ~80% confluence in T175 cm² culture flasks. Then, the cells were washed with phosphate buffered saline (PBS) and incubated in 30 ml of serum-free media for 48 h. Subsequently, the cells and the media were harvested. The cells were lysed by incubation in hypotonic lysis buffer (10 mM HEPES, pH 7.4) for 30 min on ice, then briefly sonicated. Triton-X-100 was then added to a final concentration of 1.5%; the cell lysates were incubated on ice for 30 min, then centrifuged at 14,000 rpm for 30 min at 4°C. The supernatant was subsequently collected for proteomic analysis. The conditioned serum-free media was concentrated to ~500 µl using Amicon Ultra-15 centrifugal filter tubes with a 3 kDa membrane filter (Millipore, Billerica, MA). Protein concentration was determined by Bradford assay, and 150 µg of total protein from each sample was precipitated overnight at -20°C with five volumes of ice-cold acetone, followed by centrifugation at 21,000 × g for 15 min. The protein pellet was solubilized in 8 M urea, 2 mM dithiothreitol, 50 mM Tris-HCl, pH 8.5 at 37°C for 1 h, followed by carboxyamidomethylation with 10 mM iodoacetamide for 1 h at 37°C in the dark. The samples were then diluted with 50 mM ammonium bicarbonate, pH 8.5 to 1.5 M urea. Calcium chloride was added to a final concentration of 1 mM and samples were digested with 3 µg of recombinant, proteomics-grade trypsin (Promega, Madison, WI) at 37°C overnight. The resulting peptide mixtures were solid phase-extracted with C18 spin-columns (The Nest Group Inc., Southborough, MA) according to the manufacturer's instructions, and stored at -80°C until further use.
MudPIT Analysis of HNSCC Cell Lines-A fully automated 5-cycle two-dimensional chromatography sequence was set up as previously described (22). Peptides were loaded on a 7 cm precolumn (150 µm i.d.) containing a Kasil frit packed with 3.5 cm 5 µm Magic C18 100 Å reversed-phase material (Michrom Bioresources Inc., Auburn, CA) followed by 3.5 cm Luna 5 µm SCX 100 Å strong cation exchange resin (Phenomenex, Torrance, CA). Samples were automatically loaded from a 96-well microplate auto-sampler using the EASY-nLC system (Proxeon Biosystems, Odense, Denmark) at 3 µl/min. The pre-column was connected to an 8 cm fused silica analytical column (75 µm i.d.) via a microsplitter tee (Proxeon Biosystems) to which a distal 2.2 kV spray voltage was applied. The analytical column was pulled to a fine electrospray emitter using a laser puller. For peptide separation on the analytical column, a water/acetonitrile gradient was applied at an effective flow rate of 400 nl/min, controlled by the EASY-nLC (Proxeon Biosystems). Ammonium acetate salt bumps (8 µl) were applied at concentrations of 100, 150, 200, and 500 mM followed by a water/acetonitrile gradient as described previously (22). Sample analysis was performed on an LTQ Orbitrap XL (Thermo Fisher Scientific, San Jose, CA) using previously described instrument settings (22). The MS functions and the HPLC solvent gradients were controlled by the Xcalibur data system (Thermo Fisher Scientific).
Protein Identification and Data Analysis-Raw data were converted to mzXML using ReAdW (version 1.1) and searched against the Human IPI database (version 3.54, 75426 human sequences, http://www.ebi.ac.uk/IPI) using the X!Tandem search algorithm (version 2008.02.01.1). X!Tandem was searched with a fragment ion mass tolerance of 0.40 Da and a parent ion tolerance of ±10 ppm. Complete tryptic digestion was assumed; one missed cleavage site was allowed. Cysteine carbamidomethylation and methionine oxidation were specified as fixed and variable modifications, respectively. To minimize protein inference, we developed a database grouping scheme and only reported proteins with substantial peptide information, as described previously (24). Target/decoy search was performed to experimentally estimate the peptide false discovery rate (FDR), and 12 decoy sequences were identified (FDR < 1%). Protein identifications with at least two unique tryptic peptides were considered, as reported previously (24,25).
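As a rough illustration of the target/decoy estimate, the sketch below computes an FDR as the number of accepted decoy matches divided by the number of accepted target matches. This simple ratio, the function name, and the target count used in the example are assumptions for illustration only; the text does not state the exact estimator or the number of target matches.

```python
def estimate_fdr(n_target: int, n_decoy: int) -> float:
    """Simple target/decoy FDR estimate: decoy matches / target matches.

    Assumption: this ratio is one common estimator; the study may have
    used a different formulation.
    """
    if n_target <= 0:
        raise ValueError("n_target must be positive")
    return n_decoy / n_target


# Hypothetical example: 12 decoy matches among 5000 accepted target matches.
fdr = estimate_fdr(n_target=5000, n_decoy=12)
print(f"estimated FDR = {fdr:.4f}")
```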
Protein Quantification by MS and Normalization-Spectral counting (SpC) was used as a measure of protein abundance. The SpCs for peptides corresponding to a protein were normalized against the total number of spectra for a given MudPIT sequence, averaged over the triplicates. To avoid division by 0, SpC values of 0 were replaced by 0.01. The relative abundance of each protein in the conditioned media versus the cell lysate was calculated for each cell line by calculating the ratio of averaged normalized SpCs in the conditioned medium versus the cell lysate.
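The normalization and ratio described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and example counts are hypothetical, and it assumes that the zero replacement (0.01) is applied to raw spectral counts before normalization, which the text does not specify.

```python
from statistics import mean

ZERO_REPLACEMENT = 0.01  # value substituted for spectral counts of 0


def normalized_spc(spc_per_run, total_spectra_per_run):
    """Average, over replicate runs, of SpC divided by total spectra in the run."""
    fractions = [
        (spc if spc > 0 else ZERO_REPLACEMENT) / total
        for spc, total in zip(spc_per_run, total_spectra_per_run)
    ]
    return mean(fractions)


def cm_to_wcl_ratio(cm_spc, cm_totals, wcl_spc, wcl_totals):
    """Relative abundance of a protein in conditioned media versus cell lysate."""
    return normalized_spc(cm_spc, cm_totals) / normalized_spc(wcl_spc, wcl_totals)


# Hypothetical triplicate counts for one protein in one cell line.
ratio = cm_to_wcl_ratio(
    cm_spc=[40, 50, 60], cm_totals=[10000, 10000, 10000],
    wcl_spc=[5, 5, 5], wcl_totals=[10000, 10000, 10000],
)
print(f"CM/WCL ratio = {ratio:.1f}")
```

A protein absent from the lysate still yields a finite (large) ratio because of the zero replacement, which is what allows it to pass a fold-enrichment filter.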
Gene Expression Microarray-Gene expression profiling was conducted on the HNSCC cell lines cultured under normal growth conditions using a Whole Human Genome Oligo Microarray (Agilent Technologies) at the UHN Microarray Centre. The NOE cell line was used as a normal control. The fold change (FC) was calculated for each probe as compared with the expression in the NOE cells. In addition, standard statistical methods were used to calculate the minimum and maximum FC for each probe in order to provide error estimates in fold changes.
Data Mining-Human Protein Atlas (HPA) (proteinatlas.org) was used for qualitative comparison of IHC staining of HNC tissue with normal oral mucosa. To achieve this, the strongest-staining cancer image and the strongest-staining normal image were each given a value of 3, 2, 1, or 0 representing strong, moderate, weak, and negative staining, respectively, as reported by the HPA; the difference between the cancer and normal scores was then calculated. For proteins with multiple antibodies, the scores were averaged. SignalP 3.0 and SecretomeP 2.0 (cbs.dtu.dk/services/) were used for prediction of classically and nonclassically secreted proteins, respectively (26). The gene products that were not predicted to be secreted by either of these mechanisms were searched against Exocarta (exocarta.ludwig.edu.au/) to identify whether they are present in exosome fractions (27). Comparison against published HNSCC gene expression microarrays was accomplished by linking both data sets by gene accessions using our in-house database.
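The HPA staining comparison amounts to a small scoring rule, sketched below. The function name and the example calls are hypothetical; the numeric mapping follows the text, and averaging the per-antibody differences is an assumption about how scores from multiple antibodies were combined.

```python
# Staining categories mapped to the numeric values given in the text.
STAIN_SCORE = {"strong": 3, "moderate": 2, "weak": 1, "negative": 0}


def hpa_score_difference(antibody_calls):
    """Mean (cancer - normal) staining score across antibodies.

    antibody_calls: list of (cancer_staining, normal_staining) pairs,
    one pair per antibody, each using the strongest-staining image.
    """
    diffs = [STAIN_SCORE[cancer] - STAIN_SCORE[normal]
             for cancer, normal in antibody_calls]
    return sum(diffs) / len(diffs)


# Hypothetical protein with two antibodies in the HPA.
print(hpa_score_difference([("strong", "weak"), ("moderate", "weak")]))
```

A positive value indicates stronger staining in cancer than in normal tissue, zero indicates no difference, and a negative value indicates stronger staining in normal tissue.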
Quantitative Reverse Transcriptase Real-time PCR-Primer pairs were designed for several genes (supplemental Table S1) using the Primer3 software (primer3.sourceforge.net). RNA was extracted using the Total RNA Extraction kit (Norgen, Thorold, ON) and reverse transcribed using SuperScript III reverse transcriptase (Invitrogen) as specified by the manufacturer. Quantitative real-time PCR was performed using SYBR Green (Applied Biosystems) and an ABI Prism 7900 HT Sequence Detection System (PE Biosystems). The mean fold change in mRNA expression was calculated using the 2^(-ΔΔCt) method, as described previously (28). The mean fold change from three independent experiments was then averaged.
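For readers unfamiliar with the 2^(-ΔΔCt) calculation, a minimal sketch follows. The function name, the Ct values, and the use of a single reference (housekeeping) gene are assumptions for illustration; the reference gene used in the study is not named in the text.

```python
from statistics import mean


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    d_sample = ct_target_sample - ct_ref_sample
    d_control = ct_target_control - ct_ref_control
    return 2 ** -(d_sample - d_control)


# Hypothetical Ct values from three independent experiments
# (cancer cell line versus NOE control).
experiments = [
    fold_change_ddct(22.0, 18.0, 25.0, 18.0),
    fold_change_ddct(22.5, 18.0, 25.0, 18.0),
    fold_change_ddct(21.5, 18.0, 25.0, 18.0),
]
print(f"mean fold change = {mean(experiments):.2f}")
```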
Western Blotting-Proteins (20 µg) were resolved on an 8-12% SDS-PAGE gel and electro-transferred onto a PVDF membrane (Bio-Rad). Membranes were blocked for 1 h in 4% skim milk TBST (0.5% Tween 20 in TBS) buffer, and probed with the primary antibody overnight at 4°C. Antibodies (Abcam) were used at the following concentrations: MMP14 1:2000, PLAU, TGFBI, IGFBP7, and THBS1 at 1 µg/ml. After incubation with the primary antibody, membranes were washed and incubated in the appropriate secondary HRP-conjugated antibody (anti-rabbit IgG A9169 or anti-mouse IgG A9917, both 1:30,000 dilution, Sigma) for 2 h. Following an additional round of washing in TBST, immunoreactive protein bands were visualized with ECL Western blotting Substrate (Pierce, Rockford, IL) on an x-ray film (Bio-Flex). Membranes were stained with 0.1% (w/v) Ponceau S in 5% acetic acid to ensure equal sample loading.
Clinical Specimens-Institutional Research Ethics Board approval was obtained for this study. Primary diagnostic formalin-fixed and paraffin-embedded (FFPE) laryngeal and hypopharyngeal carcinoma biopsies from 79 patients were evaluated in 5 µm sections. The clinical characteristics of the 79 patients are shown in Table I. The specimens were divided into two cohorts: training (n = 40) and validation (n = 39) (supplemental Tables S2 and S3). The specimens were selected such that half of the patients experienced recurrences and half did not. The median follow-up times were 3 years for the training cohort and 1.85 years for the validation cohort.
Plasma samples were collected at diagnosis from 27 HNC patients, and 14 healthy volunteers in heparin-containing phlebotomy tubes. The clinical characteristics of these patients are shown in Tables IIA and IIB. The median follow-up time for this group of 27 patients was 2.1 years. Plasma was obtained following the phlebotomy by centrifugation at 3000 × g for 30 min at 4°C. The clear plasma supernatant was stored at -80°C until use. Fifteen of the plasma samples were obtained from the same patients as in the IHC validation cohort; thus these samples had matched tumor tissues for IHC analyses.
Statistical Analysis-The Wilcoxon rank-sum test was used to compare IHC staining intensities and plasma protein concentrations between different groups of patients. Overall survival (OS) was calculated from the date of diagnosis to death or the last follow-up time. Disease-free survival (DFS) was calculated from the date of diagnosis to the date of death, relapse, or last follow-up. OS and DFS were estimated by the Kaplan-Meier (KM) method. The comparison of the KM curves between the high (above-median) and low (below-median) protein expression groups was based on the log-rank test. Cox proportional-hazards models were used to estimate hazard ratios (HRs) and corresponding 95% confidence intervals (CIs). All tests were two-sided. For the training set, we compared the IHC expression between the recurrence and nonrecurrence groups. The IHC biomarkers with p ≤ 0.05 were considered of nominal significance and were further assessed in an independent validation set. Assuming an effect size similar to that in the IHC training set, a power analysis was performed for the validation set. Based on 20 recurrent and 19 nonrecurrent patients in the validation set, and a significance level α of 0.01 to adjust for multiple testing, the validation set had at least 0.86 power to detect statistically significant differences of biomarker expression between the recurrent versus nonrecurrent groups. Power calculations were based on a two-sided test using the Power and Sample Size 2008 (PASS) software package (NCSS, Kaysville, Utah). SAS 9.2 software was used for statistical analysis.
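To make the median split and the Kaplan-Meier estimate concrete, a minimal pure-Python sketch is given below. It is not the SAS procedure used in the study: the function names and example data are hypothetical, and assigning scores equal to the median to the low group is an assumption, as the text only defines above- and below-median groups.

```python
from statistics import median


def split_by_median(scores):
    """Label each score 'high' if above the median, otherwise 'low'."""
    m = median(scores)
    return ["high" if s > m else "low" for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time per patient
    events: True if the event (e.g. relapse or death) was observed,
            False if the patient was censored at that time
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        n_events = sum(1 for tt, e in data if tt == t and e)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_at_t  # events and censored patients leave the risk set
        i += n_at_t
    return curve


# Hypothetical follow-up data (years) for four patients.
print(split_by_median([1, 2, 3, 4]))
print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
```

In practice the curves for the high and low groups would each be estimated this way and then compared with a log-rank test, which is not implemented here.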
ELISA-The concentrations of four candidate proteins were measured by ELISA in plasma samples of patients with or without HNSCC. The concentrations of PLAU (ELISA kit from MyBiosource), IGFBP7 (ELISA kit from Antibodies-Online), MMP14 (ELISA kit from MyBiosource), and THBS1 (ELISA kit from R&D Systems) were determined according to the manufacturers' instructions.
Data Sets-Microarray data of HNSCC and NOE cells have been submitted to Gene Expression Omnibus with the accession number GSE40185. The proteomic data associated with this manuscript may be downloaded from the Proteome Commons (http://www.proteomecommons.org) Tranche network using the following hash code: 2by4ZjbUxu+7AQhgVoTOHgFQ5kjXqZweYzUq6ySwJ8dUZ7CLK3TVP6FF9u4ukxMDbyy9vivP8kkMxDwQfOrbPu7ruOoAAAAAAAAoeg==.

RESULTS
Secretome and Transcriptome Analysis of HNSCC Cell Lines-The workflow and experimental design of this study are shown in Fig. 1A. As a result of MudPIT-based proteomic analyses of conditioned media and the whole cell lysates of the three HNC cell lines, 1850 protein groups (i.e. gene products) were identified with high confidence. The complete lists of protein and peptide identifications are presented in supplemental Tables S4 and S5. In total, 809 gene products were identified in the CM of the HNC cell lines, with 626, 464, and 626 gene products identified in the media of FaDu, UTSCC8 and UTSCC42a cells, respectively (Fig. 1B). Furthermore, 1632 gene products were identified in the whole cell lysates of HNC cell lines, with 1093, 1182, and 1267 gene products identified in FaDu, UTSCC8, and UTSCC42a cells, respectively (Fig. 1B).
To identify genes that were potentially de-regulated in HNSCC, gene expression profiling was conducted on the HNC cell lines cultured under normal growth conditions. The integration of the protein expression data obtained by MudPIT with the gene expression microarray profiles resulted in 1616 (87%) gene products, which could be linked to probes on the chip (i.e. protein-mRNA pairs), and which were used for further evaluation (Fig. 2).
Integrative Data Mining for Marker Prioritization-To obtain a list of promising candidates for potential validation, gene products with at least four-fold increased mean spectral counts in the CM versus the WCL, and a minimum 2-fold up-regulation in HNSCC cell lines according to mRNA expression relative to the NOE cells, were included for further consideration. To be included in the shortlist, these criteria had to be fulfilled in at least two of the three cell lines at both the protein and mRNA level. Ninety putative HNSCC biomarkers fulfilled these criteria (Fig. 2). These two filters were chosen primarily to enrich for markers that were preferentially secreted, and therefore, presumably more abundant in the CM as opposed to the WCL, and up-regulated in the HNSCC cell lines, compared with the NOEs, using the gene expression profiles generated in parallel. In addition, selection for gene products with higher abundance in the CM versus the WCL was used to deplete for potentially contaminating intracellular proteins present because of cell lysis.
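The two selection filters can be expressed as a simple rule, sketched below. The function name, data layout, and example values are hypothetical. The sketch also assumes one reading of the criteria, namely that a cell line counts only when it meets both the CM/WCL threshold and the mRNA threshold; the text could also be read as requiring each criterion separately in at least two cell lines.

```python
def passes_filters(cm_wcl_ratio, mrna_fc,
                   min_ratio=4.0, min_fc=2.0, min_lines=2):
    """Return True if a gene product meets both filters in enough cell lines.

    cm_wcl_ratio: dict of cell line -> CM/WCL spectral count ratio
    mrna_fc:      dict of cell line -> mRNA fold change versus NOE cells
    """
    hits = sum(
        1 for line, ratio in cm_wcl_ratio.items()
        if ratio >= min_ratio and mrna_fc.get(line, 0.0) >= min_fc
    )
    return hits >= min_lines


# Hypothetical candidate: enriched in CM and up-regulated in two of three lines.
candidate_ratio = {"FaDu": 5.0, "UTSCC8": 4.5, "UTSCC42a": 1.0}
candidate_fc = {"FaDu": 3.0, "UTSCC8": 2.2, "UTSCC42a": 10.0}
print(passes_filters(candidate_ratio, candidate_fc))
```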
As validation of all potential candidates was not feasible, several publicly available resources were used to prioritize these 90 proteins, and then systematically select candidates for downstream verification (Fig. 2 and supplemental Table S6). First, the candidates were prioritized based on differences in IHC staining intensity of HNC versus normal oral mucosal tissues, as per the Human Protein Atlas database. Thirty-nine candidates appeared to have stronger staining in HNC tissue as compared with normal tissues and were thus prioritized for validation. Nineteen had no difference in staining between tumor and normal tissue, 25 had no IHC information available, and the remaining seven proteins had stronger staining in normal as compared with HNC tissues. The IHC staining data from HPA provided evidence for differential protein expression between tumor and normal tissues from the human head and neck region, thereby corroborating the protein and gene expression patterns observed in the HNSCC in vitro models. Second, expression of the 90 markers was mapped to two previously published microarray expression data sets for primary human HNC samples (29,30). The former study determined the expression profiles of 22 paired HNSCC and adjacent nontumor tissue samples from the same patients (including four laryngeal and one hypopharyngeal squamous cell carcinomas) (29). The latter report analyzed the expression profiles of 34 hypopharyngeal squamous cell carcinomas and four normal tissues (30). Thirty candidates mapped to at least one of these microarrays, 25 of which were up-regulated at least two-fold in HNSCCs versus controls and were thus prioritized for validation. The gene expression profiles provided further evidence for enrichment of several candidates in HNSCCs as compared with nontumor tissue.
Third, we investigated whether these proteins were secreted via the canonical secretion pathway, noncanonical secretion pathway, or released via exosomes using SignalP 3.0, SecretomeP 2.0 and Exocarta, respectively (26,27). Using these tools, 78 of the 90 candidates were highly likely to be secreted by one of these mechanisms, and prioritized for validation by giving preference to candidates that target to the extracellular space. Fourth, a literature search was performed to identify biomarker candidates that may have biologically relevant roles in cancer development, particularly in HNSCC. Lastly, candidates with available antibodies for IHC and ELISAs were prioritized for validation.
In Vitro Marker Validation-The top 5 ranked candidates were selected and validated in vitro using qRT-PCR and Western blotting against the CM of HNSCC and NOE cells (Fig. 3). PLAU, IGFBP7, MMP14 (matrix metalloproteinase 14), THBS1 (thrombospondin-1), and TGFBI (transforming growth factor, beta-induced) were verified as up-regulated at both the mRNA (Fig. 3A) and protein levels (Fig. 3B) in HNSCC cells and their respective CMs, as compared with the NOEs. Equal protein loading was verified through Ponceau S membrane staining (supplemental Fig. S1). Although THBS1 was confirmed to be up-regulated in the HNSCC cell lines at the transcript level, THBS1 protein was not detected in the CM of FaDu cells possibly because of a modification at the epitope site (Fig. 3B); this protein was nonetheless readily detectable by MS with high confidence. In general, the relative transcript and protein expression levels of these markers determined by qRT-PCR and Western blotting exhibited a similar trend to the levels identified through global gene expression profiling and semiquantitative spectral counting, respectively. As the next step, we attempted to validate the expression of these markers in a three-dimensional tumor model of HNSCC using IHC in FaDu xenograft tumor tissue sections, and corroborated that all five proteins were indeed expressed in these tumors (supplemental Fig. S2).

FIG. 3. In vitro verification of selected markers.
A, Quantitative real-time PCR results show up-regulation of the selected genes in HNSCC cell lines in comparison to a normal oral epithelial cell line. B, Immunoblotting against the selected markers in serum-free conditioned media following 48-hour incubation. Membranes were stained with 0.1% (w/v) Ponceau S in 5% acetic acid to ensure equal sample loading (supplemental Fig. S1). The total number of spectra identified for selected candidates by mass spectrometry is shown at the bottom of each blot. The upper band for THBS1 may be a protein isoform.

Clinical Validation of Five Proteins in Primary HNSCC Tissues-We next sought to determine whether these proteins were also expressed in primary human HNSCC tissues. IHC evaluations confirmed that all five proteins were indeed expressed in primary HNSCC biopsies (Fig. 4), albeit TGFBI expression appeared to be limited to the surrounding stroma, and not the cancer tissues. Nonetheless, the potential prognostic value of these five proteins was assessed. To that end, IHC for these five proteins was evaluated first on a training set (n = 40), followed by a validation cohort (n = 39), wherein half of these selected patients had relapsed, and were group matched for known clinical parameters including age, gender, stage at presentation, and treatment (Table I, supplemental Tables S2 and S3). For a few samples, the IHC staining could not be evaluated because of poor quality of the tissue or staining morphology, which accounts for the variations in the number of samples. IHC expression levels of PLAU, IGFBP7, MMP14, and THBS1 were significantly higher in recurrent versus nonrecurrent HNSCCs in the training set (p = 0.004, p < 0.0001, p = 0.005, p = 0.002, respectively) (Fig. 5A). Only TGFBI failed to be differentially expressed between relapsed versus nonrelapsed samples; hence it was not evaluated further in the validation cohort. Nonetheless, it was gratifying to observe that a similar pattern of IHC expression was corroborated in the second independent validation set (n = 39) (p = 0.02 for PLAU; p = 0.04 for IGFBP7; p = 0.004 for MMP14; and p = 0.09 for THBS1) (Fig. 5B).
Next, the prognostic value of PLAU, IGFBP7, MMP14 and THBS1 was evaluated through estimation of OS and DFS using the Kaplan-Meier method (Fig. 6, supplemental Table S7). The IHC expression data were combined for both the training and validation cohorts in order to better gauge the potential prognostic value of these biomarkers. The patients were divided into two categories based on the median IHC staining intensity scores for each marker: high versus low expression. On multivariate analysis, patients whose tumors had a high expression level of any of these four proteins experienced a significantly worse DFS compared with those with low expression (p = 0.04, p = 0.002, p = 0.0001, p = 0.0003 for PLAU, IGFBP7, MMP14, and THBS1, respectively). Furthermore, tumors with high IGFBP7 and THBS1 expression were also associated with lower OS rates (p = 0.005, p = 0.002 for IGFBP7 and THBS1, respectively) (supplemental Fig. S3). Accordingly, the respective hazard ratios for disease progression or death were also significantly higher for patients whose tumors had high expression of these proteins (supplemental Table S7). Specifically, elevated IGFBP7 (HR = 3.17, p = 0.0007), MMP14 (HR = 3.95, p < 0.0001), or THBS1 (HR = 3.07, p = 0.0011) was associated with increased risk of disease progression. Similar associations were observed for increased risk of death for IGFBP7 (HR = 3.46; p = 0.004), and THBS1 (HR = 3.37; p = 0.01).
Preliminary Validation in Patient-derived Plasma Samples-Given that these biomarkers were identified from the "secretome," we next asked if any of these proteins could be measured in the plasma of HNSCC patients. Neither MMP14 nor THBS1 differed in plasma concentration between cancer patients and healthy volunteers (supplemental Fig. S4). However, PLAU and IGFBP7 were both significantly elevated in HNC patients, compared with healthy volunteers (p = 0.01 and p = 0.0002, respectively) (Fig. 7, Table IIA, supplemental Table S8). In particular, the median level of IGFBP7 was ~35-fold higher in HNC patients, compared with almost undetectable levels in healthy volunteers (absent in eight out of nine controls), raising the intriguing possibility of IGFBP7 being a potential diagnostic marker for HNC.
Among the 23 HNC patients tested for plasma IGFBP7 levels, 11 had already experienced a recurrence and 12 had not; there was a trend toward a higher plasma level of IGFBP7 in relapsed patients, although this did not reach statistical significance. When patients were stratified into high and low plasma PLAU or IGFBP7 expression based on the median concentration, patients with higher PLAU or IGFBP7 levels had an approximately two-fold higher risk of death, although this also did not reach statistical significance (HR = 2.03, p = 0.3 for PLAU and HR = 2.00, p = 0.33 for IGFBP7) (supplemental Table S9).

DISCUSSION
Using a shotgun proteomics approach, starting with three HNSCC cell lines of laryngeal and hypopharyngeal origin, PLAU, IGFBP7, MMP14, and THBS1 were identified as four potential prognostic tumor tissue markers for clinical outcome. Furthermore, for the first time, PLAU and IGFBP7 were determined to be elevated in the plasma of HNC patients, with a trend toward higher risk of death from disease for those with higher circulating levels at time of initial diagnosis, validating the utility of such approaches in identifying potentially novel and useful biomarkers for HNC.
Analogous shotgun proteomics-driven approaches for identification and verification of secreted cancer biomarkers in various in vitro tumor models have been previously employed by others, with numerous variations across groups in experimental design, including the mode of sample preparation, mass spectrometry techniques, and biomarker candidate selection and validation methods (15). Large-scale validations examining the potential diagnostic or prognostic value of protein candidates in clinically annotated patient samples have been limited and are one of the primary bottlenecks of secretome-based biomarker discovery, partly owing to limited access to patient samples and a lack of commercially available ELISAs. Nonetheless, some groups have shown that circulating biomarkers discovered with this approach hold promising clinical value (31, 32). Taken together, these secretome studies demonstrated in principle that the cancer cell line secretome could be a valuable source of biomarkers; hence, we extended this approach to the discovery of laryngeal and hypopharyngeal carcinoma biomarkers.
The following protein candidates were verified to potentially harbor prognostic or diagnostic significance: PLAU, MMP14, IGFBP7, and THBS1. PLAU and MMP14 are proteases that promote tumor invasiveness and metastasis through their involvement in the degradation of the extracellular matrix. They have been previously implicated in HNC as well as other tumor types, and their inhibition has been explored for therapeutic purposes (17, 33–47). IGFBP7 is a tumor suppressor protein shown to be down-regulated in solid tumors of the liver, lung, breast, prostate, colon, and skin (48–50). There is evidence that it can also be up-regulated during specific stages of tumor development, e.g. in early-stage thyroid cancers; furthermore, its expression appears to be tissue-specific (51). To the best of our knowledge, the role or expression of IGFBP7 has not been previously investigated in HNC; however, in other solid tumor models, it has been shown to mediate cell growth and adhesion (50). Last, THBS1 is a secreted glycoprotein involved in tumor progression through regulation of extracellular matrix remodeling and angiogenesis. Its role in HNC remains to be fully elucidated, although THBS1 has been shown to promote tumor invasion via the urokinase receptor (52). These proteins appear to have highly diverse yet inter-related functions, as they are all part of a complex extracellular matrix remodeling network. Furthermore, as suggested by the significant correlation found between the tumor immunoexpression of these markers (data not shown), they are likely promoting tumor progression co-operatively through extracellular matrix degradation, cell growth and angiogenesis. These are key biological processes occurring at the cell surface or in the extracellular vicinity of tumor cells; thus, it is no surprise that proteins involved in these diverse processes were identified in this study.
Ralhan et al. reported the only secretome profiling of a similar HNC subsite, examining the CM of the human laryngeal SCC cell line (SCC38) in addition to oral SCC cell lines, which resulted in fewer protein identifications, likely because of a different proteomics analysis strategy (53). We further extended our results by systematic validation in well-annotated patient tissue and plasma samples. Notably, nine proteins were identified in both the study by Ralhan et al. and the current study, including PLAU, IGFBP7, MMP14, and TGFBI, thereby supporting the findings presented here.
A significant effort was undertaken to validate our secretome-based results through: (1) confirmation of candidate up-regulation in vitro; (2) assessment of protein expression in FaDu xenografts; (3) IHC verification in two independent sets of primary HNSCC tissues of the same tumor types (laryngeal and hypopharyngeal SCCs); (4) evaluation of the prognostic value of these biomarkers; and (5) measurement of the secreted markers in plasma of HNSCC patients and controls. As a result, in a multivariate survival analysis, we observed that high tumor PLAU, IGFBP7, MMP14, and THBS1 expression correlated with inferior DFS, and increased risk of death and disease progression. Furthermore, plasma levels of PLAU and IGFBP7 may be used to distinguish subjects with and without cancer. Taken together, these data indicate that PLAU and IGFBP7 could potentially be utilized as diagnostic or prognostic biomarkers for squamous cell carcinomas of the larynx and the hypopharynx. Elevated plasma PLAU and IGFBP7 levels in HNSCC patients could also be an indication of increased risk of death; however, this requires validation in a larger cohort of HNSCC patient plasma samples. No association was observed between plasma IGFBP7 or PLAU levels and disease progression or recurrence, possibly because of the small sample size examined in each group. In addition, although all proteins were detected in plasma (pg/ml to ng/ml range), only PLAU and IGFBP7 were present at significantly higher levels in the HNSCC patients. A similar differential pattern was not observed for MMP14 and THBS1, possibly because they are also secreted by additional cell types (e.g. stromal cells), thereby potentially masking the signal emanating from the tumor itself (37, 40).
Nevertheless, there remain limitations to this study. Despite group-matching the clinical parameters to the best of our ability, there was a slight age bias, with a younger population in the group of patients with recurrent HNSCC in the IHC training cohort. Similarly, although the training cohort included only stage III/IV patient tissues, the validation set also included stage I and II patient samples. The primary reason for this imperfect matching is limited sample availability. In addition, the plasma samples obtained from healthy subjects and HNSCC patients were not age- or gender-matched, with the latter being significantly older than the former and composed of a higher proportion of males. These study limitations, along with inherent patient-to-patient and intra-tumoral heterogeneity and modest patient sample sizes, are potential caveats to the current study; hence, further validation using larger cohorts will be required before these biomarkers can be advanced to guide clinical management (54). Nonetheless, considering the small cohort of plasma samples and the initial discovery of putative markers emanating from an in vitro cell culture model, these results are highly encouraging for future evaluations.
In conclusion, the cancer cell line secretome provides a rich source of proteins for biomarker discovery using a simplified model system. In the current study, this system enabled the identification of several novel biomarker candidates for HNSCC. Further validation efforts would be required to confirm the diagnostic and prognostic value of these proposed markers in plasma, and to evaluate their specificity and sensitivity in relation to additional clinical parameters. Finally, through a prospective longitudinal study, it would be important to determine whether PLAU and IGFBP7 might serve as biomarkers to monitor disease response, or serve as early indicators of disease relapse.