Integrated Omic Analysis of Oropharyngeal Carcinomas Reveals Human Papillomavirus (HPV)–dependent Regulation of the Activator Protein 1 (AP-1) Pathway*

HPV-positive oropharyngeal carcinoma (OPC) patients have superior outcomes relative to HPV-negative patients, but the underlying mechanisms remain poorly understood. We conducted a proteomic investigation of HPV-positive (n = 27) and HPV-negative (n = 26) formalin-fixed paraffin-embedded OPC biopsies to acquire insights into the biological pathways that correlate with clinical behavior. Among the 2,633 proteins identified, 174 were differentially abundant. These were enriched for proteins related to cell cycle, DNA replication, apoptosis, and immune response. The differential abundances of cortactin and methylthioadenosine phosphorylase were validated by immunohistochemistry in an independent cohort of 29 OPC samples (p = 0.023 and p = 0.009, respectively). An additional 1,124 proteins were independently corroborated through comparison to a published proteomic dataset of OPC. Furthermore, utilizing the Cancer Genome Atlas, we conducted an integrated investigation of OPC, attributing mechanisms underlying differential protein abundances to alterations in mutation, copy number, methylation, and mRNA profiles. A key finding of this integration was the identification of elevated cortactin oncoprotein levels in HPV-negative OPCs. These proteins might contribute to reduced survival in these patients via their established role in radiation resistance. Through interrogation of Cancer Genome Atlas data, we demonstrated that activation of the β1-integrin/FAK/cortactin/JNK1 signaling axis and associated differential regulation of activator protein 1 transcription factor target genes are plausible consequences of elevated cortactin protein levels.

In the Western world, the overall decline in the incidence of head and neck squamous cell carcinoma (HNSCC) 1 has been accompanied by an increase in oropharyngeal carcinoma (OPC), the most common HNSCC subtype (1,2). This trend has been attributed to a concomitant decline in smoking and increase in oral human papillomavirus (HPV) infection rates (2). HPV has been reported to be associated with more than 60% of all OPCs in North America, with HPV16 being the predominant high-risk strain in HNSCC (2)(3)(4)(5).
The process of tumorigenesis by oncogenic HPV is well characterized. It is driven by the viral oncoproteins E6 and E7, which in turn inactivate the tumor suppressor proteins p53 and pRb, respectively (6). E7 binds pRb, thereby releasing the E2F transcription factor from pRb, leading to subsequent overexpression of p16 (i.e. cyclin-dependent kinase inhibitor 2A (CDKN2A)) (3). p16 expression is lost in the majority of HPV-negative (HPVϪ) tumors; thus it has been well accepted as a surrogate marker for HPV positivity in OPC. Although in situ hybridization remains the gold standard for HPV detection because it allows direct visualization of HPV in tumor nuclei, p16 immunoexpression is an equivalent method for HPV status determination and is commonly utilized in clinical settings (5,7).
HPV-positive (HPVϩ) and HPVϪ OPCs are distinct tumor entities with different clinical characteristics and molecular profiles (8). Unlike HPVϪ cancers, HPVϩ OPCs are associated with a more favorable outcome and treatment response, younger age at diagnosis, and wild-type p53 status (3,9,10). This has been corroborated by our group in a Canadian cohort, where we demonstrated that the 3-year overall survival (OS) rates were 67% for HPVϪ OPC and 88% for HPVϩ OPC (4,5). Despite this differential outcome, OPCs are typically treated with radiation therapy with or without concurrent chemotherapy irrespective of tumor HPV status (4). Clinical trials are ongoing to determine the efficacy of treatment deescalation for HPVϩ OPC, with the expectation that this will reduce treatment-related toxicities without adversely affecting patient survival (11). However, further studies are required in order to unravel the biological mechanisms accounting for differences in outcome and therapeutic response between HPVϩ and HPVϪ OPCs. An enhanced mechanistic understanding of these two distinct malignancies could aid in the development of tailored therapeutics, as well as biomarkers for treatment stratification.
Analytical protocols for the proteomic characterization of formalin-fixed paraffin-embedded (FFPE) patient materials have been recently developed, enabling utilization of this valuable resource (12,13). When coupled with high-resolution mass spectrometry (MS) techniques, these approaches make it possible to obtain a quantitative snapshot of protein abundances across increasingly large sample cohorts, thereby providing novel insights into mechanisms of disease progression (14 -16). In parallel, the efforts of the Cancer Genome Atlas (TCGA) have produced broad genomic characterization of ϳ300 head and neck cancer (HNC) specimens, including OPCs, highlighting potential genomic alterations responsible for cancer initiation and progression. 2 What is currently missing is an integration of TCGA genomic alterations with indepth proteomic analyses (17). Because proteins and their functions are the ultimate mediators of biological activity, such integrative efforts could highlight important biological regulatory mechanisms unlikely to be discovered otherwise (17)(18)(19)(20). Capitalizing on these developments, we undertook a comparative global proteomics investigation complemented by genomic analysis in order to identify differentially abundant proteins and, using these resources, explore the mechanisms underlying the distinct clinical behavior of HPVϩ versus HPVϪ (HPVϮ) OPCs.

MATERIALS AND METHODS
Clinical Specimens-Institutional Research Ethics Board approval was obtained for this study (REB# 07-0521-CE). Primary diagnostic (i.e. pre-treatment) fully clinically annotated, non-metastatic FFPE OPC biopsies from a total of 82 patients were utilized (5-m sections). The clinical characteristics of the patients are shown in Table I. The discovery and validation cohorts were group-matched for age, sex, stage, and treatment. Samples from the first cohort were utilized for MS-based proteomic discovery (n ϭ 53; median follow-up time of 2.6 years; supplemental Table S1), and those from the second cohort were used for immunohistochemistry (IHC) validation (n ϭ 29; median follow-up time of 2.4 years; supplemental Table S2). Both MS and IHC experiments were performed blinded, without knowledge of HPV status or clinical outcome.
Extraction and Digestion of Proteins from FFPE Tissue-Following macro-dissection of tissue sections to obtain Ͼ70% tumor cell content, tissues were de-paraffinized and rehydrated using sequential washes in xylene; 100%, 98%, and 70% ethanol; and water. The tumor sections were then incubated in a 100 mM ammonium bicarbonate, 20% acetonitrile, pH 8.5, buffer at 95°C for 2 h followed by 65°C for 1 h. The tissue was digested overnight at 37°C with recombinant, proteomics-grade trypsin (Promega, Madison, WI). Undigested tissues were removed via centrifugation, and peptides were solid-phase extracted with C18 spin-columns (The Nest Group Inc., Southbrough, MA) based on the manufacturer's instructions, vacuumconcentrated, and reconstituted in Buffer A (0.1% formic acid, 5% acetonitrile). A BCA assay was performed to determine the peptide concentration, and 5-g peptide aliquots were kept at Ϫ80°C until analyzed via multi-dimensional protein identification technology (MudPIT).
MudPIT Analysis of FFPE Tissue Peptides-A fully automated fivecycle two-dimensional high-performance liquid chromatography sequence was set up as previously described (21). Peptides were loaded onto a 7-cm Kasil fritted pre-column (150-m inner diameter) packed with 3.5 cm of 5-m Magic C18 100-Å reversed-phase material (Michrom Bioresources Inc., Auburn, CA) followed by 3.5 cm of Luna 5-m SCX 100-Å strong cation exchange resin (Phenomenex, Torrance, CA). Samples were loaded automatically from a 96-well microplate autosampler at 3 l/min using the EASY-nLC system (Thermo Scientific). The pre-column was connected to a fused silica analytical column (8 cm long, 75-m inner diameter) via a microsplitter tee (Thermo Scientific) to which a distal 2.0-kV spray voltage was applied. The analytical column was pulled to a fine electrospray emitter using a laser puller. For peptide separation on the analytical column, a water/acetonitrile gradient, controlled by the EASY-nLC (Thermo Scientific), was applied at an effective flow rate of 400 nL/min. Ammonium acetate salt bumps (8 l) at concentrations of 100, 150, 200, and 500 mM were sequentially loaded, and peptides were eluted by a water/acetonitrile gradient as described previously (22). Sample analysis was performed on an LTQ Orbitrap XL (Thermo Scientific) using previously described instrument parameters (22).
Protein Identification and Data Analysis-Raw data were converted to m/z XML files using ReAdW (v1.1) and searched against the Human UniProt database (version 2011/03; 20,227 sequences) using X! Tandem (CYCLONE v2011.12.01.1), OMSSA (v2.1.8), and MyriMatch (v2.1.97) search algorithms. The search was conducted with a fragment ion mass tolerance of Ϯ0.40 Da and a parent ion tolerance of Ϯ10 ppm. Complete tryptic digestion was assumed with one allowed missed cleavage site. Methionine oxidation was specified as a variable modification. For protein inference minimization, an in-house 2 The Cancer Genome Atlas, accepted for publication. grouping scheme was applied, reporting only proteins with substantial peptide information (23). Target/decoy searches were performed to experimentally estimate the protein false discovery rate, which was determined to be 1.3%. Protein identifications with at least two unique tryptic peptides were considered (23,24).
Protein Quantification via MS and Normalization-Adjusted spectral counts were determined in a manner previously described and used as a measure of relative protein abundance (25). To avoid division by 0, adjusted spectral count values of 0 were replaced by 0.1. The fold change (FC) for each protein was determined by calculating the ratio of averaged normalized adjusted spectral counts in HPVϮ samples.
Mass Spectrometry Raw Data-The raw mass spectrometry data associated with this manuscript have been submitted to a public repository (the Mass Spectrometry Interactive Virtual Environment, http://massive.ucsd.edu) for others to download. These data are associated with the identifier MassIVE ID MSV000078610 at the FTP site ftp://MSV000078610@massive.ucsd.edu.
Functional Enrichment Analysis-Protein Center Software (v3.12. 10008) was used for Gene Ontology (GO) and KEGG Pathway enrichment analyses. The DAVID UCSB_TFBS annotation tool (v6.7) was used for transcription factor target enrichment analysis (26). Enrichment analyses were performed against all genes or identified proteins in a given study as the background. Significance was set at FDR Ͻ 0.05.
Data Integration with Published Proteomic Data and the Cancer Genome Atlas-For proteomic data integration, IPI accessions from Slebos et al. were mapped to their corresponding gene names and compared by linking both datasets by gene accessions using an in-house database (27). In cases where multiple proteins mapped to the same gene, the proteomic results were averaged.
For integrated -omics analysis, anonymized Level 3 processed TCGA results for the Head and Neck Provisional Data Release (2013_05_23 Firehose Analysis Run) were downloaded from the cBio-Portal using the R (v3.0.1) API (cgds v1.1.29) on October 8, 2013 (28,29). There were 310 total samples (295 complete), of which 306 had somatic point mutation and copy number alternation (CNA) data, 303 had RNA sequencing data, and 310 had methylation data. Types of data included functional mutation counts (missense, nonsense), copy number calls, RNA-sequencing normalized counts, Infinium 450 methylation values, and clinical attributes. Data processing protocols are described in the MAGE-TAB archives associated with the original data 2 and in the TCGA bladder cancer marker paper (30).
Samples were limited to the oropharynx subsite (tonsil, oropharynx, or base of tongue) with information on HPV status (n ϭ 32; 10 HPVϩ, 22 HPVϪ). HPV status was defined as any sample with evidence for reads aligned to the HPV viral sequence in DNA sequencing (exome, low-pass whole-genome sequencing) or the presence of E6/E7 viral transcripts in RNA sequencing. Normalized mRNA counts were calculated using RSEM (31). Copy number was based on the discretized log 2 ratio of tumor to normal pre-processed probe intensity averaged across each gene (32). Methylation was based on the ratio of background corrected intensities for methylated versus unmethylated probes (␤ values) determined by interrogating the tumor independently. If more than one probe was assigned to a gene, the one with the strongest negative correlation with mRNA expression was selected.
The primary analysis was performed for selected genes that were identified as differentially abundant for proteins (n ϭ 174). Analysis of AP-1 target genes (n ϭ 23) was limited to mRNA counts. Data visualization used the lattice (v0.20 -24) package for the R statistical environment. For qualitative assessment of correlation between proteomic and genomic profiles, concordance was defined as similar significance levels (p Ͻ 0.05 or p Ͼ 0.05) and equivalent directionality of relative changes of molecular profiles in HPVϮ OPCs.
Tumor cytoplasmic staining levels were scored according to intensity of immunoexpression, with 0, 1, 2, or 3 indicating negative, weak, moderate, or strong staining in tumor cells, respectively. Tumor cell plasma membrane staining was scored as a percentage of cells with staining, and the final score was calculated as a product of cytoplasmic and plasma membrane staining intensities. IHC scoring was performed blinded to knowledge of any clinico-pathological parameters or HPV status. Each tissue section was scored twice, and the scores were averaged.
Statistics-Patient OS was calculated from the date of diagnosis to death or the last follow-up time. Disease-free survival (DFS) was calculated from the date of diagnosis to death, recurrence, or the last follow-up time. Patient survival was estimated via the Kaplan-Meier method, and the log-rank test was used for comparison of OS and DFS rates among patient groups. The Cox proportional-hazard (PH) model was used to estimate hazard ratios and corresponding 95% confidence intervals. The non-parametric Kruskal-Wallis test was applied for group comparisons on continuous clinico-pathological variables, and Pearson's 2 test or Fisher's exact test was used for categorical variables. Pearson's 2 test was used to compare GO term classifications based on HPV status.
The Wilcoxon rank-sum test and log 2 (FC) were used to compare continuous genomic features (protein, mRNA, and methylation), whereas a proportion test and frequency difference were used to compare the significance and magnitude of discrete genomic features (CNAs and mutations). For these comparisons, significance was set at p Ͻ 0.05. A protein was considered to be significantly differentially abundant if it also had a ͉log 2 (FC)͉ value greater than 1. Correction for multiple testing was performed using the FDR method in order to determine the top differentially abundant proteins (͉log 2 (FC)͉ Ն 1; FDR Ͻ 0.05) and proteins for functional enrichment analysis (FDR Ͻ 0.05) (33).
Unsupervised Hierarchical Clustering-The globally normalized OPC protein expression dataset was filtered to include the top 10% of proteins with the highest variance. Unsupervised hierarchical clustering was performed with the hclust function (R software v3.0.2) using average linkage as the agglomeration method and Pearson's correlation as a measure of similarity.

Global Proteomic Profiling of HPVϩ and HPVϪ OPCs-In
order to obtain global protein expression profiles for OPCs, we conducted five-step MudPIT analyses on the 53 FFPE biopsy specimens serving as the discovery cohort, including 26 HPVϪ and 27 HPVϩ OPCs. These specimens were deliberately selected to reflect the clinical behavior exemplified by the superior outcome of the HPVϩ patient group while balancing for other clinico-pathological variables such as age, sex, tumor stage, and treatment (Table I; supplemental Table S1). The 3-year OS rates for HPVϩ and HPVϪ patients were 81% and 46%, respectively ( Fig 1A; p ϭ 0.003). The workflow and experimental design of this study are shown in Fig. 1B. As a result of the MudPIT analyses, 2,633 protein groups were identified with high confidence (FDR ϭ 1.3%), with an average of ϳ1,500 protein groups identified in HPVϩ and HPVϪ OPCs (Fig. 1C). The complete lists of protein identifications, normalized protein abundances, and peptide identifications are presented in supplemental Tables S3, S4, and S5. The identified proteins span all the major subcellular components; however, membrane proteins were underrepresented (Fig. 1D). There was no significant difference in relative protein distributions across subcellular components in HPVϮ OPCs (p value Ͼ 0.999, Pearson's 2 test).
Among the 2,633 protein groups identified, 159 were significantly up-regulated and 15 were significantly down-regulated in HPVϮ OPCs ( Fig. 2A). The surrogate marker for HPV positivity, CDKN2A, was the most significantly up-regulated protein in HPVϮ OPCs (FC ϭ 10.17, p Ͻ 0.0001). CDKN2A was detected in 12 of 27 (44%) HPVϩ tumors, but it was not detected in any of the HPVϪ samples (Fig. 2B).
Functional Term Enrichment Analysis-In order to determine the biological functions that were enriched among the differentially abundant proteins, KEGG Pathway and GO term enrichment analyses were performed. Significantly enriched categories included proteins involved in cell cycle, DNA replication, execution of apoptosis, and lymphocyte activation, among others (Fig. 2C). Cell cycle/DNA replication proteins included those induced by the E2F transcription factors: minichromosome maintenance proteins (MCM2/4/5/6), proliferating cell nuclear antigen, flap structure specific endonuclease 1, and p16, as well as the 14 -3-3 family protein (YWHAB) and structural maintenance chromosome protein (SMC1A), among others (34). Examples of proteins involved in apoptosis included p16, DNA fragmentation factor subunit ␣, and high-mobility group protein 2. Lastly, lymphocyte activation proteins were enriched in HPVϩ OPC, consisting of MHC class II proteins (HLA-DRA and HLA-DRB1/3/5), c-SRC kinase, and ICAM1, among others. It should be noted that numerous proteins belonged to more than one category (supplemental Table S6).
Comparative Analysis with a Published OPC Proteome-As a means of verifying the protein abundance patterns observed in the current study, we compared our protein profiles to results from a study of two small pools of frozen HPVϩ and HPVϪ OPC tissues (10 samples per pool) (27). A total of 1,626 protein groups were identified in both studies; Ͼ1,000 proteins were uniquely identified in our work, and ϳ600 were observed in the pooled analysis (27) (Fig. 3A; supplemental Table S7). Because a different analytical method was utilized by Slebos et al. (27) and raw data were unavailable, it was not possible to quantify the correlation between protein ratios across the two studies (supplemental Fig. S1A). However, by comparing the overlap of proteins present at significantly higher or lower abundance in HPVϮ OPC and those with unchanged abundance, we found that 1,124/1,626 (ϳ70%) of the co-detected proteins demonstrated the same trend in expression in both studies. Specifically, the relative expression patterns of 1,124 (from 1,626) commonly identified proteins were similarly unchanged, up-or down-regulated, based on p Ͻ 0.05 (supplemental Figs. S1B-S1D). Furthermore, of our 34 most differentially abundant proteins after multipletesting adjustment (see "Materials and Methods"), 29 were also detected by Slebos et al. (27), of which 12 (41%) proteins were concordantly expressed and 12 additional proteins (ϳ41%) exhibited similar trends in relative abundance, corroborating our observations (Fig. 3B) (27).
Verification of Select Markers via Immunohistochemistry-Four proteins, cortactin (CTTN), methylthioadenosine phosphorylase (MTAP), cytokeratin 17, and ICAM1, were selected for IHC verification in an independent cohort of pre-treatment OPC biopsies (n ϭ 29, Fig. 4; supplemental Table S2). This selection was based on relative abundance in HPVϮ OPCs from the discovery cohort (Fig. 4A), concordance with the published proteome or genomic profiles from TCGA, and reported biological function. The validation cohort was age-, Notes: RT, radiotherapy alone; CRT, chemo-radiotherapy.
sex-, stage-, and treatment-matched to the discovery cohort, as well as between HPVϩ and HPVϪ OPCs (Table I). Two proteins, CTTN and MTAP, were confirmed to be differentially abundant (p ϭ 0.023 and p ϭ 0.009, respectively; Wilcoxon rank-sum test), whereas ICAM1 and cytokeratin 17 exhibited similar relative expression (i.e. higher in HPVϩ or HPVϪ) in comparison with the discovery cohort (p ϭ 0.22 and p ϭ 0.29, respectively; Wilcoxon rank-sum test) (Fig. 4B). Representative IHC images are shown in Fig. 4C.
Examining mRNA Abundance Downstream of Cortactin-Building on the observation that cortactin is down-regulated in HPVϩ versus HPVϪ OPC and its reported biological role in tumor radiation resistance (35), we sought evidence of differential regulation of cortactin-mediated radiation resistance based on HPV status. We hypothesized that selective activation of the ␤1-integrin/FAK/cortactin/JNK1 (MAPK8) signaling axis in HPVϪ OPC would be associated with induction of AP-1 transcription factor target genes, thereby leading to potential radiation resistance (Fig. 5A) (35,36). Thus, relative mRNA abundance of established AP-1 target genes was examined in HPVϮ OPCs from the TCGA HNC dataset. Upon phosphorylation by activated JNK1, c-JUN acts in concert with c-FOS to stimulate the transcription of target genes; thus we specifically examined the target genes regulated by the oncogenic c-JUN and c-FOS proteins of the AP-1 complex (reviewed in Ref. 36). These reported AP-1 target genes were significantly enriched among the differentially abundant mRNAs in HPVϮ OPCs (qq-plot slope ϭ 1.43; p ϭ 9. inent examples of the former included CDKN2A, TP53, and apoptosis-mediating surface antigen FAS, whereas examples of the latter included CD44 antigen, cathepsin L, matrix metalloproteinases (MMP1 and MMP3), and cyclin D1. In a second and unbiased attempt to examine AP-1 target gene enrichment, transcription factor enrichment analysis was performed using the DAVID UCSC_TFBS enrichment tool on two sets of differentially abundant mRNA transcripts from the TCGA HNC dataset: (i) in HPVϩ versus HPVϪ OPCs and (ii) in OPCs with high versus low CTTN mRNA abundance. AP-1 target genes were significantly enriched (FDR Ͻ 0.0001) among these differentially abundant mRNAs, whether they were stratified according to HPV status or CTTN mRNA abundance. Taken together, these results point toward differential regulation of AP-1 target genes based on HPV status.
Prognostic Significance of Cortactin in OPC-The prognostic significance of cortactin was evaluated through estimation of OS and DFS using the Kaplan-Meier method and Cox PH regression analyses. High cortactin protein levels were associated with significantly lower OS and DFS rates in OPC patients (supplemental Figs. S2A and S2B, supplemental Table S9). In addition, patients with high tumor cortactin protein levels had a 5-fold increase in the risk of disease progression or death (hazard ratio ϭ 5.05 (95% confidence interval, 1.80 -14.15), p ϭ 0.002) (supplemental Figs. S2A and S2B, supplemental Table S9). Cortactin levels were associated with several clinico-pathological variables including smoking and drinking, sex, T-stage, and HPV status (supplemental Table S10). It was not possible to assess the independent prognostic utility of cortactin in a multivariate model for OS or DFS given the small number of events in the present cohort. Nonetheless, our findings validate the reported association of cortactin with OPC patient outcome in a univariate analysis model (37).
Integrative -Omics Analysis of OPC-In order to identify potential mechanisms of protein regulation, we conducted a central dogma analysis based on all available -omics data by integrating our differentially abundant proteins with mutation, CNA, methylation, and mRNA profiles from the TCGA HNC study ( Fig. 6; supplemental Table S8). Significant differences in gene mutations correlating with differential protein abundance were not observed, likely because of insufficient power, given the low number of patient samples examined. However, CDKN2A and nuclear mitotic apparatus protein 1

FIG. 3. Comparison of relative protein abundance and protein functions in the current study with the published pooled OPC proteome (27).
A, overlap of protein group identifications between the two OPC proteome studies. B, protein ratios of the top most significantly differentially abundant proteins in the current study and in the study by Slebos et al. (27). Arrows denote the 24 proteins with similar relative abundance trends in both studies, with red arrows highlighting the 12 proteins that were significantly differentially abundant. appeared to exhibit a trend toward increased mutation frequencies in HPVϪ OPC (p ϭ 0.17, proportion test) (Fig. 6). Despite being derived from different cohorts of patients, 44/ 174 of the differentially abundant proteins were associated with corresponding changes in CNAs. In addition, 29 of 38 genes with statistically significant changes in methylation between HPVϩ and HPVϪ OPC showed inverse changes in relative protein abundance. Additionally, 65/174 of the differentially abundant proteins were associated with corresponding changes in mRNA abundance. Importantly, a number of genes displayed corresponding changes across their entire molecular profiles, with CDKN2A, CTTN, cytokeratin 17, MCM2, and proliferating cell nuclear antigen serving as noteworthy examples. In contrast, relative abundance levels of numerous potentially functionally relevant proteins would not have been predicted or prioritized for follow-up investigations based on genomic data alone. As one prominent example, MTAP was identified and verified as expressed at significantly higher levels in HPVϮ OPC at the protein level using MS and IHC, respectively; however, mRNA, methylation, and mutation profiles did not serve as reliable predictors for relative MTAP protein levels. Copy number alterations revealed that deletion in HPVϪ OPC and, to a lesser degree, amplification in HPVϩ OPC might serve as putative mechanisms for differential MTAP regulation. As a second example, significantly higher nuclear mitotic apparatus protein 1 levels in HPVϩ versus HPVϪ OPCs were associated with a complex genomic profile: truncating mutations in two HPVϪ OPC specimens, significant amplifications in HPVϪ OPCs, and no significant changes in methylation status yet significantly higher mRNA levels in HPVϩ OPC. Even though relative mRNA abundance served as a reliable predictor for relative protein abundance for nuclear mitotic apparatus protein 1, because it was not among the top differentially expressed mRNAs, it would not  have been prioritized for further follow-up in a genomicsdriven study, despite potential functional significance (38). As a third example, MCM7 protein was significantly more abundant in HPVϮ OPC (Fig. 3B); however, at the genomic level it was significantly amplified in HPVϪ OPC and showed no significant changes in mRNA, methylation, or mutation status, which would have rendered the genomic profiling unhelpful in predicting relative MCM7 protein abundance.
Proteomic Subtypes of OPC-In order to examine whether proteomic profiles can be used to identify newly prognostic subtypes of OPC, we performed unsupervised hierarchical clustering on the 53 OPCs of the proteomic discovery cohort, revealing two main subgroups (C1, n ϭ 25; C2, n ϭ 28) (Fig.  7A). The subgroups differed significantly in HPV status (p ϭ 0.0024, Fisher's exact test) but were otherwise balanced for clinico-pathological variables such as age, sex, and clinical stage (supplemental Table S10). However, C1 was marginally significantly associated with a higher proportion of smokers (p ϭ 0.051, Fisher's exact test) and higher median smoking pack years (p ϭ 0.061, Kruskal-Wallis test) (supplemental Table S10). The top most variable proteins used to delineate OPC subgroups were enriched for various functional categories, such as the alcoholism pathway (KEGG), cell differenti-ation, DNA metabolism, and cell adhesion (GO biological process categories), among others (Fig. 7B).
Next, the prognostic significance of the proteomic subgroups was evaluated through estimation of OS and DFS using the Kaplan-Meier method and Cox PH regression analyses. C2 patients were found to experience significantly better OS and DFS than C1 patients; the corresponding 3-year rates of OS were 89% versus 26% (p Ͻ 0.001, Kaplan-Meier log-rank test), and the 3-year DFS rates were 89% versus 27% (p Ͻ 0.001, Kaplan-Meier log-rank test) (Figs. 7C and 7D, supplemental Table S11). Accordingly, C2 patients had a ϳ7-fold reduction in the risk of disease progression or death relative to C1 patients (hazard ratio ϭ 0.14 (95% confidence interval, 0.05-0.42), p Ͻ 0.001, Cox PH model) (supplemental Table S11). These associations remained significant in a multivariate model adjusted for HPV status, suggesting that proteomic subgroups might provide independent prognostic value in OPC (supplemental Table S11).
Further reinforcing the results above, C2 OPCs remained associated with superior OS and DFS upon stratification of HPVϩ and HPVϪ OPCs according to proteomic subgroups (supplemental Figs. S2C-S2F, supplemental Table S12). Specifically among HPVϩ OPCs, C2 tumors were associated with

FIG. 5. Model for radioresistance mediated by cortactin in HPV؊ OPC.
A, potential signaling pathway downstream of ␤1-integrin that results in cortactin protein activation and subsequent induction of radiation resistance in HPVϪ OPC. Alternatively, CTTN signaling may occur via receptor tyrosine kinases (RTKs) (35,52). CTTN (red) and c-SRC kinase (CSK, green) were identified among the top differentially abundant proteins during our proteomic discovery. Increased relative CTTN protein expression in HPVϪ OPCs via 11q13 amplification along with a corresponding decrease in the negative regulator CSK potentially leads to increased radioresistance in HPVϪ OPC. We hypothesize that this occurs via induction of downstream signaling (e.g. JNK1 phosphorylation) leading to induction of AP-1 transcription factor target genes. B, known AP-1 target transcripts are enriched among the differentially abundant mRNAs in HPVϮ OPCs (36). The table summarizes the p values based on target genes that are reported to be regulated by c-JUN, c-FOS, or both. C, relative mRNA abundance levels of reported AP-1 target genes in HPVϮ OPCs. Target gene activation or repression by AP-1 (c-FOS or c-JUN) is indicated. Genes with expected relative mRNA abundance levels are denoted by an asterisk. significantly higher DFS rates than C1 OPCs, with corresponding 3-year DFS estimates of 90% versus 43% (p ϭ 0.034, Kaplan-Meier log-rank test) (supplemental Fig. S2F). Concordant associations were observed using Cox PH models, where C2 OPCs were associated with significantly lower risks of disease progression or death than C1 OPCs (supplemental Table S12). Cortactin protein levels were significantly lower in C2 than in C1 OPCs (supplemental Figs. S2G and S2H). Collectively, these findings point toward the presence of two proteomic subtypes of OPC that are strongly linked to HPV status, with distinct prognoses and expression of disease-aggressiveness markers. DISCUSSION We conducted integrative proteomic profiling on a richly annotated cohort of OPCs in order to acquire insight into OPC progression in relation to HPV status. MudPIT analysis of 53 FFPE OPCs yielded 2,633 protein identifications, establishing this as one of the largest FFPE tumor studies employing MS-based proteomics to date. We observed that similar to genomic and transcriptomic profiles, the proteomes of HPVϩ and HPVϪ OPCs harbor distinct differences, with p16 demonstrating the most significant change. Because of its relatively low abundance and small molecular size, p16 protein was identified in only about half of the HPVϩ OPCs; however, with the latest generation of MS instrumentation, this detection bias will likely be overcome in the future. 3 HPV proteins were not detected, perhaps because some were lost during viral integration and were no longer expressed (6). Alternatively, they might have been of low relative abundance or undetectable as a result of the tissue fixation process.
The proteomic analysis of primary pre-treatment OPCs herein provides further evidence supporting numerous key hypotheses that have been explored previously to elucidate biological mechanisms associated with differential HPV status (8, 39 -41). First, functional pathway enrichment analyses corroborated previous reports that differences in the cell cycle regulatory system, the process most well documented in the literature, differentially drive HPVϩ versus HPVϪ OPC (39). Among the cell cycle proteins, several E2F transcription factor targets were enriched in HPVϩ OPC, consequent to E7induced loss of pRB.
As a second notable mechanism, post-therapy enhanced immune response in HPV-associated OPC has been postulated to play a potentially critical role in the superior outcomes 3 L. Sepiashvili   of these patients (42,43). Consistent with this hypothesis, we identified significant differences in expression profiles of immune response proteins in HPVϮ OPCs. Of particular interest was the apparent enrichment of nine proteins involved in lymphocyte activation in HPVϩ OPC (supplemental Table S6), many of which participate in antigen processing and presen-tation for immune recognition. A recent report described an association between higher levels of tumor infiltrating lymphocytes and HPV positivity and improved outcome (40). Our study does suggest the influence of the tumor microenvironment, particularly related to the tumor infiltrating lymphocytes in HPVϮ OPC, which secrete cytokines and can stimulate the expression of several identified proteins (e.g. ICAM1 and MHC class II antigens HLA-DRA, HLA-DRB1, and HLA-DRB3). The specific biological roles of these proteins in HPVϩ OPC progression and modulation of treatment response clearly require further investigation.
Third, the activation of residual p53 in HPVϩ tumor models has been shown to induce apoptosis following radiation, thereby potentially leading to a favorable therapeutic response (41). This is in contrast to HPVϪ tumors, which were largely associated with inactivating p53 mutations. We and others have previously demonstrated that a greater proportion of HPVϩ OPC samples harbor p53 immunoexpression (3,5,10). In the current study, we also observed elevated TP53 levels in HPVϩ OPC, which might be inversely associated with the activation of the cortactin-mediated radiation resistance pathway (Figs. 5A and 5C), and we propose that the relationship between TP53 and CTTN should be further explored in this context.
Cortactin is an oncoprotein located on chromosome 11q13 along with cyclin D1, a site that is frequently amplified in HNSCC (44). In accordance with our findings, high tumor CTTN expression has been shown to correlate with increased aggressiveness and poor outcome in many cancers, but most frequently in HNSCC (37,45). A recent pre-clinical study reported tumor radiosensitization via abrogation of the ␤1integrin/FAK/cortactin/JNK1 signaling axis by administration of an anti-␤1-integrin antibody in vitro and in vivo, making this pathway a therapeutically attractive target for investigation in OPC (35). To the best of our knowledge, we have identified and verified for the first time that cortactin protein expression is significantly higher in HPVϪ than in HPVϩ OPC. Only one other study with parallel findings described higher CTTN mRNA levels in HPVϪ than in transcriptionally active (i.e. positive for E6 and E7 mRNA) HPVϩ OPC (46).
The association between CTTN amplification and differential regulation of AP-1 target genes suggests that this pathway has increased activity in HPVϪ OPC; thus, we propose that differences in CTTN signaling might partially account for the increased radioresistance of HPVϪ OPC (Fig. 5A). Along the same pathway, higher expression of an SRC signaling inhibitor (c-SRC kinase) in HPVϩ OPC, observed in the current proteomic study, might also contribute to enhanced radioresistance (47), though this remains to be validated in mechanistic studies. An anticipated challenge for future investigation of the described pathways is the identification of appropriate HPVϩ and HPVϪ OPC model systems (cell lines and animal models) that faithfully recapitulate the molecular biology and treatment response profiles observed clinically.
We have also shown that tumor MTAP levels were significantly higher in HPVϮ OPC. MTAP, along with p16, is located at 9p21, a commonly deleted chromosomal region in HNSCC (48,49). Tumor MTAP loss or decreased levels are associated with poor patient survival (48). Deficient MTAP levels could be exploited therapeutically in HPVϪ OPC, as these deficient cells are more chemosensitive and can be selectively targeted with purine synthesis inhibitors or toxic purine and pyrimidine analogues (e.g. 5-fluorouracil and 6-thioguanine) when combined with methylthioadenosine (the cleavage substrate for MTAP) (48,50).
As biology is ultimately driven by the expression of functional proteins, the integration of protein expression profiles identified herein with TCGA genomics data might provide a vital link between cancer genomics and clinical phenotypes. Observed concordance between CNA, methylation, and mRNA profiles and relative protein abundance strengthened the confidence in protein level changes and provided multidimensional insight into the regulation of these proteins. In addition, although numerous proteins had parallel evidence of dysregulation across the central dogma, relative protein abundances were frequently difficult to predict based on mutation, CNA, methylation, or mRNA data alone, underscoring the importance of proteomic studies. There are numerous causes for discordance across various molecular profiles, including inter-and intrapatient tumor heterogeneity, different assays used for HPV status determination, and true differences in molecular regulation.
Lastly, two prognostic subgroups of OPC were identified in our analyses. The observation of significant differences in risk of disease progression following multivariate analysis suggests that these subgroups capture biological variation that cannot be accounted for by differences in HPV status alone. Furthermore, the identification of a subset of HPVϩ OPCs with "atypical" behavior characterized by relatively worse outcome and potentially increased aggressiveness, as indicated by higher tumor cortactin levels, presents the possibility of identifying patients at greater risk for disease progression. This is an intriguing finding in light of the currently unmet need to establish criteria (e.g. clinico-pathological or molecular biomarkers) for the selection of patients with HPV-associated OPC toward de-escalated treatment (4). Previous investigation of HNC microarray gene expression data supports the presence of two HPVϩ subgroups that were classified as mesenchymal and classical types. The former was associated with worse patient outcomes and featured an increased abundance of epithelial-to-mesenchymal transition markers (51). Although we did not observe significant differences in the expression of epithelial-to-mesenchymal markers between the subgroups (data not shown), the enrichment of cell differentiation and cell adhesion protein groups supports the presence of two biologically distinct phenotypes with varying disease aggressiveness.
In conclusion, our study demonstrates complex interplay among several key pathways that might be associated with the differences in the clinical phenotypes of HPVϩ and HPVϪ OPC and reveals additional biological variation in OPC to HPV, rendering the interpretation of the molecular biology of this disease challenging. In addition, because of the intricacy of genomic, epigenetic, and transcriptomic alterations in HPVϮ cancers, comparative proteomics will play an important role in clarifying OPC biology, as proteins are the functional cellular units and represent the final products of such alterations. Our study points toward pathways and highlights proteins that differ as a function of HPV status that would not have been detected through genomic investigations alone, allowing for the identification of potential therapeutic targets that could improve outcomes for HPVϪ OPC. Further evaluations will be needed to investigate these proteins, and the differential regulation of these signaling pathways, both in the presence or absence of HPV and in the context of the relevant treatment modalities.