Meta analysis: HPV and p16 pattern determines survival in patients with HNSCC and identifies potential new biologic subtype

Consistent discrepancies in p16/HPV positivity have been observed in head and neck squamous cell carcinoma (HNSCC). It is therefore questionable whether all cancers that test HPV+ and/or p16+ are HPV-driven. Patients down-staged according to the HPV-dependent TNM are at risk of undertreatment, and data in clinical trials may be skewed by incorrect patient inclusion. We performed a meta-analysis to classify clinical outcomes of the distinct subgroups with combined p16 and HPV detection. Of 1677 publications, 25 fulfilled the inclusion criteria. The proportions of the subgroups were 35.6% for HPV+/p16+, 50.4% for HPV−/p16−, 6.7% for HPV−/p16+ and 7.3% for HPV+/p16−. The HPV+/p16+ subgroup had significantly improved 5-year overall survival (OS) and disease-free survival (DFS) compared with the other subgroups, both for HNSCC overall and for oropharyngeal cancers. The 5-year OS of HPV−/p16+ HNSCC was intermediate, while HPV+/p16− and HPV−/p16− had the shortest survival. The clearly distinct survival of HPV−/p16+ cancers may indicate a new, relevant HPV-independent subtype that has yet to be biologically characterized. It is also possible that in some HPV+/p16+ cancers HPV is an innocent bystander and p16 is independently positive. HPV testing should therefore distinguish between bystander HPV and truly HPV-driven cancers to avoid potential undertreatment of HPV+ but non-HPV-driven HNSCC.

Preclinical studies have also demonstrated that biological features that depend on tumor HPV status influence the effectiveness of treatment. HPV-driven tumors respond better to chemotherapy and to X-ray or proton therapy than HPV− tumors [18][19][20]. Because of these differences in biological features and prognosis, individually optimized therapy for patients with HPV-driven tumors could minimize treatment-related toxicity and improve outcomes. Consequently, de-escalated treatment protocols are under investigation.
Most importantly, correct classification requires a reliable distinction between the different entities. HPV-DNA testing and p16INK4a (cyclin-dependent kinase inhibitor 2A) immunohistochemical staining are both well-established methods for identifying HPV+ tumors and are often regarded as delivering equivalent information on HPV positivity. This is because high-risk HPV E7 expression correlates with, and leads to, upregulation of p16. Because p16 immunohistochemical staining is inexpensive and convenient, and the interpretation of its results is established, it is widely used to detect HPV-related HNSCC. Accordingly, in the modified 8th AJCC/UICC, p16 was recommended as a surrogate marker for HPV, with a cutoff of diffuse (≥75%) overexpression in a histological section and at least moderate (+2/3) staining intensity 17. Although both p16 overexpression and HPV-DNA positivity have shown independent prognostic value, as we summarized before 21, there are p16+/HPV− and p16−/HPV+ subgroups in which a surprisingly different prognosis relative to p16 and HPV status was observed. Survival of patients with HNSCC was better if the tumor was HPV+/p16+ or HPV−/p16+. Therefore, in addition to the HPV-related prognostic feature, the biological relevance of p16 independent of HPV infection is currently of interest and under investigation, and may define another subgroup of HNSCC in which p16 plays a role in HPV-independent disease.
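The p16 cutoff quoted above can be written as a simple rule. The following is a minimal sketch only: the function name, the 0-100 percentage scale and the 0-3 integer intensity score are assumptions made for illustration and are not taken from the staging manual or from the included studies.

```python
def is_p16_positive(percent_stained: float, intensity: int) -> bool:
    """Apply the cutoff described in the text: diffuse staining in at least
    75% of the section and at least moderate (2 on an assumed 0-3 scale)
    staining intensity. The input encoding is hypothetical."""
    if not 0 <= percent_stained <= 100:
        raise ValueError("percent_stained must lie between 0 and 100")
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be an integer score from 0 to 3")
    return percent_stained >= 75 and intensity >= 2
```

Both conditions must hold, so a section with strong but focal staining, or with diffuse but weak staining, is scored as p16−.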
In this meta-analysis, we included all current clinical studies and evaluated the clinical relevance of HPV-DNA positivity and p16 overexpression in HNSCC. We also discuss current observations on the biological role of p16 in HPV+ and HPV− tumors.

Methods
Selection criteria and literature search strategy. We searched four databases, PubMed (http://www.ncbi.nlm.nih.gov/pubmed), OVID (www.ovid.com), EMBASE (www.embase.com), and Wanfang (www.wanfangdata.com.cn), for publications that statistically analysed subgroup survival after detection of both HPV and p16 markers. The search covered publication dates up to April 20, 2017, adding two years to the previously performed literature search 21. We searched for the terms "HPV, p16, head neck". We also included references quoted in original or review articles that may not have been found during the initial literature search. We screened the articles and included all studies of HNSCC patients that investigated survival rates by the p16 and HPV status of the tumor. Our search strategy was performed in accordance with PRISMA criteria and registered in the PROSPERO register (CRD42017062330). We excluded studies that met any of the following criteria: missing patient survival information, evaluation of only one marker (HPV or p16), non-HNSCC primary cancer (e.g., nasopharyngeal carcinoma, skin cancer, pre-cancer), cell culture or animal models, and reviews or case reports. In a second round of selection, we also excluded studies with duplicate patient data from the same or similar populations (based on the authors' names and institutions). Where this occurred, we selected the study that was either more recent or had larger patient numbers. We also excluded studies with insufficient survival data. Finally, we included studies that reported: (1) the numerical proportion of the subgroups HPV+/p16+ versus HPV−/p16− versus HPV+/p16− versus HPV−/p16+ in HNSCC patients; and (2) numerical survival data for these subgroups (hazard ratio (HR); overall survival (OS); disease-free survival (DFS)) or Kaplan-Meier curves of OS or DFS for the subgroups.
Data extraction. Two authors (A.C. and A.E.A.) extracted the relevant data from the selected publications according to the aforementioned inclusion criteria 21. In the case of any discrepancy, we re-analysed the study and the two authors reached a consensus decision. We extracted all relevant information from the studies, including: author, publication date, study timeframe, country, tumor stage and localisation, number of patients, study design, data on alcohol and tobacco consumption, number of HPV-positive and HPV-negative patients, number of patients in the subgroups HPV+/p16+, HPV−/p16−, HPV+/p16−, and HPV−/p16+, HPV subtypes, HR-status, 5-year OS or DFS of the subgroups, and the p16 and HPV detection methods. For studies in which OS or DFS was displayed as a Kaplan-Meier plot, we used GraphClick (Version 3.0.2, Arizona Software 2010, www.arizona-software.ch/graphclick) to extract the data.
Statistical analysis. The OS and DFS of all subgroups were evaluated using relative risk (RR) 21, calculating summary RR estimates and 95% confidence intervals (CI) using maximum-likelihood methods for linear mixed models. We assessed study heterogeneity using a chi-squared based Q test. A p-value greater than 0.05 indicated an absence of heterogeneity between the studies. Existing heterogeneity was quantified using the I² index, expressed as a percentage between 0 and 100. We initially applied a fixed-effects model (Mantel-Haenszel method and chi-squared test) to the data. Where there was significant heterogeneity, we used the random-effects model (DerSimonian-Laird method). We examined the RR of the 5-year OS and DFS of all subgroups in relation to the HPV+/p16+ and HPV−/p16− groups, depending on the data we extracted from each publication. Where the HR was reported in the studies, we performed the same analysis. We compared all of the studies using a forest plot.
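The pooling logic described above (Q test, I² index, fixed-effects model first, DerSimonian-Laird random-effects model when heterogeneity is significant) can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of this study, which was run in R: the fixed-effects step uses inverse-variance weighting on the log RR rather than the Mantel-Haenszel method, the function names are hypothetical, and the study counts are invented for the example.

```python
import math


def log_rr(events_a, total_a, events_b, total_b):
    """Log relative risk of group A versus group B, with its variance."""
    rr = (events_a / total_a) / (events_b / total_b)
    var = 1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    return math.log(rr), var


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{r < df/2} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Closed form for odd df: two-sided normal tail plus a finite series.
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df + 1) // 2):
        total += term
        term *= x / (2 * r + 1)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total


def pool(effects, variances, alpha=0.05):
    """Pool log effects; use random effects if the Q test is significant."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    p_het = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    if p_het > alpha:
        model, estimate, se = "fixed", fixed, math.sqrt(1 / sum_w)
    else:
        # DerSimonian-Laird moment estimate of the between-study variance.
        c = sum_w - sum(w ** 2 for w in weights) / sum_w
        tau2 = max(0.0, (q - df) / c)
        w_star = [1 / (v + tau2) for v in variances]
        estimate = sum(w * y for w, y in zip(w_star, effects)) / sum(w_star)
        se = math.sqrt(1 / sum(w_star))
        model = "random"
    return {
        "model": model,
        "rr": math.exp(estimate),
        "ci": (math.exp(estimate - 1.96 * se), math.exp(estimate + 1.96 * se)),
        "q": q,
        "p_het": p_het,
        "i2": i2,
    }


# Hypothetical counts per study: (events_a, total_a, events_b, total_b).
studies = [(30, 100, 12, 100), (45, 150, 20, 160), (18, 60, 8, 70)]
effects, variances = zip(*(log_rr(*s) for s in studies))
result = pool(effects, variances)
print(result["model"], round(result["rr"], 2), round(result["i2"], 1))
```

Pooling is done on the log scale because the sampling distribution of log RR is closer to normal; the summary estimate and its confidence limits are exponentiated only at the end.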
Publication bias was examined using a funnel plot. We used the R Version 3.1.0 (R Core Team 2014) computing environment for all statistical analyses 22.
Data Availability. The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
HR for OS of the HPV/p16 subgroups. We could determine the OS-HR from 12 studies. Five studies used HPV+/p16+ and 7 studies used HPV−/p16− as the reference group. We summarize the results of the individual meta-sub-analyses in Table 4. The HR for the OS of the HPV+/p16+ subgroup was significantly increased compared with the HPV−/p16− subgroup, irrespective of whether HPV+/p16+ or HPV−/p16− was used as the reference. The HR for the better OS of the HPV+/p16+ subgroup was also significantly increased compared with the HPV−/p16+ and the HPV+/p16− subgroups, respectively. The meta-analyses of the HRs for the OS of the subgroup HPV−/p16−, the HPV−/p16+ and the HPV+/p16− included only 2 and 4 studies, respectively. Because the HRs of these sub-analyses did not differ significantly and the data are limited, we cannot draw a general conclusion.
(Fig. 2a). The fixed-effect model was used in all sub meta-analyses (data not shown) because heterogeneity was non-significant, indicating statistically robust results. The RR and CI were essentially unchanged in comparison with the whole meta-analyses. We also performed a sensitivity analysis to assess possible bias resulting from the HPV detection methods. We divided the meta-analysis of the 5-year OS of the HPV+/p16+ and HPV−/p16− subgroups into two groups according to HPV detection method: one group using PCR [25][26][27][28]32,34,35,38,43,46,47 and one group using ISH without PCR 36,39,40,44. These two meta-analyses showed results comparable to those of the complete-data meta-analyses (Fig. 2a). Again, the fixed-effect model was used (data not shown) because heterogeneity was non-significant (p > 0.05). The RR and CI were essentially unchanged compared with the whole-data meta-analyses.
To test for bias introduced by systematic error, arising either from the low sensitivity inherent to some HPV detection methods (e.g., ISH, giving false HPV− results) or from testing for only individual HPV types (only HPV16 or 18, missing other types), we investigated the HPV−/p16+ subgroup further by excluding the following studies 26 (Fig. 2a).
We next performed sensitivity analyses to investigate the effect of the additional studies. The HPV+/p16+ subgroup was associated with better survival compared with the HPV−/p16− subgroup (fixed-effects model; RR of 3.08; 95% CI 2.69-3.51; P = 0.49) 38 38,44,46,47. Therefore, the sub meta-analyses confirmed the results of the whole meta-analyses with the complete international data set (Figs 2-3).
Publication bias. The funnel plot shapes did not reveal obvious evidence of asymmetry.

Discussion
The incidence of high-risk HPV+ tumors exceeds 25% in HNSCC and 70% in oropharyngeal HNSCC 48. This etiologically distinct HNSCC subtype has been associated with improved clinical outcome. Positive HPV status is therefore a recommended biomarker for stratifying patients towards de-escalated treatment regimens, and a new staging algorithm for OPSCC was recently recommended in the 8th AJCC/UICC guideline. Nevertheless, the strategy for a new staging paradigm covering all head and neck regions is still under investigation and discussion. Because HPV can be found as an innocent bystander and p16 can be positive independently of HPV, it should be verified before inclusion in such trials that the tumor is truly HPV-driven, in order not to skew data and to avoid undertreatment of the patient's cancer. This discrepancy between HPV markers (p16+ by IHC and HPV-DNA+) has been found consistently in a number of investigations. Moreover, the subtypes based on HPV/p16 status have shown different clinical outcomes 21. In this meta-analysis, we confirm and extend recent investigations to increase knowledge about (a) the clinical relevance of HPV+ HNSCC and OPSCC and (b) the incidence and clinical course of subtypes of HPV+ tumors.
There is a discordant group of p16− cases in the HPV+ compartment and of HPV− cases in the p16+ compartment. In this meta-analysis, the number of studies included for evaluation of the discrepant HPV+/p16− and HPV−/p16+ cases was increased by eight, adding 428 new patients. The relative incidences of HPV+/p16− and HPV−/p16+ HNSCC were 7.3% and 6.7%, respectively. The survival data showed that HNSCC patients with HPV+ status defined as HPV+/p16+ have a better 5-year OS and DFS than the HPV−/p16−, HPV+/p16− and HPV−/p16+ subgroups. These significant observations were made in all head and neck regions, and in OPSCC in particular, and were consistent with previous studies. The sensitivity analysis confirmed the consistency of the 8 additional studies. Thus, the survival benefit of the HPV+/p16+ subgroup is obvious. HPV−/p16+ HNSCC have a better 5-year OS than the HPV+/p16− subtype after excluding the study by Ramshankar et al. 42, who found, in patients with early-stage oral tongue squamous cell carcinoma, that p16 overexpression was associated with lower survival and increased risk of disease recurrence irrespective of HPV16 DNA status. Compared with the HPV−/p16− subgroup, the 5-year OS of HPV−/p16+ HNSCC is better, while that of HPV+/p16− is not. HPV in p16− HNSCC may be an innocent bystander with no functional involvement. Likewise, careful investigation is required of why HPV is negative, either to exclude false-negative HPV test results or to prove truly HPV-independent development. Better survival in p16+ subgroups raises the question of its cause if it is independent of HPV. Consequently, these subtypes should be investigated separately in clinical trials to clarify whether cancers displaying the HPV−/p16+ phenotype also qualify for de-escalation protocols.
Table 4. Meta-analyses of the hazard ratio of the overall survival of the HPV+/− and p16+/− subgroups. Abbreviations: HR, hazard ratio.
p16 is a member of the INK4 class of cell-cycle inhibitors (INK4a) and functions as a tumor suppressor. It binds to cyclin-dependent kinases (CDK) 4 and 6 and prevents their association with cyclin D1 and, consequently, the phosphorylation and inactivation of the retinoblastoma protein (Rb) 49. In HPV-driven tumors, the high-risk HPV E7 protein inactivates the Rb pathway and triggers a cellular defense response mediated by p16. p16 overexpression is therefore an excellent biomarker for high-risk HPV-associated malignancies, including cervical cancer 50 and HNSCC 21. However, p16 overexpression is also present in a number of premalignant lesions and non-HPV-driven tumors 21. The mechanisms underlying p16 overexpression in these non-HPV-driven tumors are currently undetermined. These tumors often harbor mutations in genes such as RAS and BRAF 51, but a study in p16+/HPV− head and neck and anogenital SCCs has shown that overexpression of p16 in these tumors is not attributable to KRAS mutations 52. It has also been suggested that deregulation or loss of Rb results in increased p16 expression in tumor cells, which is associated with uncontrolled cell proliferation in malignant tumors 53,54. A recent study identified overexpression of p16 in HPV− high-grade neuroendocrine carcinomas of the head and neck 55. Most of these tumors had Rb loss and low or absent cyclin D1 expression. We therefore hypothesize that mechanisms other than HPV infection may affect the p16-Rb-cyclin D1 pathway and induce cell cycle activation in HPV− HNSCC. However, future studies are required to clarify the pathogenic mechanisms in these subgroups.

p53 is key to transformation and is not directly associated with p16; wild-type p53 has been shown in HPV+ tumors and mutation-type p53 staining in HPV− tumors. However, in cervical adenocarcinoma, diffuse p16 immunoreactivity is not necessarily indicative of a high-risk HPV-associated tumor 56. p16INK4A enhances the transcriptional and apoptotic functions of p53 through a DNA-dependent interaction 57. We therefore hypothesize that in the HPV−/p16+ subgroup, which is E6-negative, p53 is presumably mutated and that this subgroup represents the HPV-independent cases.
The prevalence of HPV-associated HNSCC potentially depends on the sensitivity and specificity of the detection method. The HPV detection technique may therefore be another cause of discordant results in HPV/p16 testing, and we analysed the data separately according to the HPV detection method used. PCR and ISH alone (without PCR) were used for HPV detection in 20 and 5 studies, respectively. Sensitivity analysis showed that the results for the 5-year OS were comparable with both HPV detection methods. Evans et al. 32 59 . Prigge et al. demonstrated in a meta-analysis the high sensitivity but only moderate specificity of p16INK4a and HPV DNA PCR when used as single tests to detect a transforming HPV infection in OPSCC. However, combining the two tests significantly improved specificity without altering sensitivity 6.
During HPV infection, the HPV E6/E7 oncogenes are expressed at low levels. Sensitive techniques such as qPCR detect HPV E6/E7 transcripts despite very limited expression [60][61][62][63][64]. Immunohistochemical evaluation of E6/E7 oncoprotein expression is another method of HPV detection, and one that is independent of RNA or DNA degradation 65. False-negative results in HPV PCR testing may be caused by gene losses when L1 targets are used (15 of 20 studies testing for HPV DNA). These gene losses can cause the PCR to be negative even though HPV is present (false HPV−/p16+). This is not the case for E6/E7 targets. However, studies using E6/E7 targets also detected patients belonging to the HPV−/p16+ subgroup 26,33,34,45,46. Therefore, the HPV−/p16+ subgroup should be tested using a uniform high-sensitivity method for E6/E7 expression.
Detection of E6 and E7 mRNA expression is highly associated with p16 expression [60][61][62][63][64]. For identifying truly HPV-driven OPSCC, the RNAscope HPV-test showed results comparable to those of p16-based algorithms combined with HPV PCR or HPV ISH, and it performed better than p16 alone 66. The quality of results obtained with mRNA detection is still controversial 67. As a first step in analyzing causal oncogenic HPV involvement, detection of viral DNA is more practicable because viral RNA is sensitive to degradation 68.
In about 90% of HPV-associated OPSCC, high-risk HPV type 16 is found 48,[69][70][71]. HPV types other than genotype 16 may explain the identified subgroups if HPV tests cover a restricted genotype spectrum. Most studies investigated multiple HPV types, although the number of types investigated varied. Some studies investigated only HPV16 34,37. After excluding studies that tested only individual HPV types or used HPV detection methods with low sensitivity, we confirmed that the 5-year OS of the HPV−/p16+ subgroup is distinct: inferior to that of the HPV+/p16+ subgroup and superior to that of the HPV−/p16− subgroup. Thus, the HPV−/p16+ subgroup should be tested for all possibly involved HPV genotypes. In addition, the HPV−/p16+ subgroup may be caused by HPV-independent mechanisms.
HPV+ OPSCCs have similar survival benefits in Brazil (GENCAPO study), the US (CHANCE study), and Europe (ARCAGE study) 72. The present sensitivity analysis showed comparable results for the 5-year OS in Europe, the US, and Asia. Therefore, a geographic differentiation of the study origin is not necessary. Because anti-smoking campaigns and their success differ between these countries, the proportion of HPV-positive HNSCC differs overall. However, HPV+ patients with tobacco consumption have to be distinguished from HPV+ non-smokers [73][74][75], because the true etiology may be drug-associated and HPV an innocent bystander infection.
The prognostic utility of HPV among non-oropharyngeal HNSCC is limited. The effect of HPV16/p16 was significantly different in non-OPSCC compared with OPSCC 72. Chung et al. 40 found that patients with p16+ non-OPSCC have better outcomes than the corresponding patients with p16− non-OPSCC. Salazar et al. 37 found no survival benefit for p16+ patients with non-OPSCC. However, when both p16 and HPV DNA were considered, concordantly positive non-OPSCC had significantly better survival. There were insufficient data to perform a meta-analysis for non-OPSCC in the present study; a definitive conclusion therefore cannot be drawn at this time.
Following the recent introduction of a specific TNM system for HPV+ cancers and of trials evaluating de-escalation protocols, HPV detection that includes detection of activity, by measuring mRNA and protein of the HPV oncogenes, is an important step towards interpreting data correctly: HPV oncogene expression is prognostically relevant, while HPV DNA as a bystander is not. One example of a misinterpretation that is possible under current circumstances is p16+ HNSCC with a bystander HPV infection, which results in an HPV+/p16+ phenotype that could be undertreated in a de-escalation protocol. As an ideal detection method for HPV-associated/driven cancers, we therefore propose the following algorithm: perform IHC to determine the p16 status and, if the staining is positive, determine positivity for HPV-oncogene mRNA or protein.
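The two-step algorithm proposed above can be sketched as a small decision function. This is a minimal illustration: the function name and the returned labels are hypothetical, and the source does not specify how p16− tumors should be handled beyond not proceeding to the oncogene test, so that branch is an assumption.

```python
from typing import Optional


def classify_tumor(p16_positive: bool,
                   oncogene_positive: Optional[bool] = None) -> str:
    """Step 1: p16 IHC. Step 2 (only if p16 is positive): HPV-oncogene
    mRNA or protein. Labels are illustrative, not clinical categories."""
    if not p16_positive:
        # Assumption: a p16-negative tumor is not tested further here.
        return "p16_negative"
    if oncogene_positive is None:
        # p16 is positive but the second test has not been done yet.
        return "needs_oncogene_test"
    if oncogene_positive:
        return "hpv_driven"
    # p16 overexpression without HPV oncogene activity.
    return "p16_positive_hpv_inactive"
```

Under this scheme, a tumor that is HPV-DNA+ and p16+ only because of a bystander infection would fall into the last category rather than being counted as HPV-driven.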
In conclusion, our meta-analysis has revealed a potential new biologic subtype of HNSCC. From a research and clinical perspective, recruiting patients with truly HPV-driven tumors is critical for the success of clinical trials aimed at redirecting the treatment of those patients. Furthermore, undertreatment in cases of bystander HPV and HPV-independent p16+ needs to be avoided. It remains important to elucidate the risk of progression and therapy failure, especially for the HPV−/p16+ subgroup and for potentially false-negative HPV patients, to prevent erroneous classification of patients for de-escalation therapy. Additionally, re-examining patients' specimens in a cluster by testing for HPV-oncogene mRNA or protein can identify additional epidemiologic links between HPV-driven and non-HPV-driven tumors.