Is High-Risk Sexual Behavior a Risk Factor for Oropharyngeal Cancer?

Simple Summary
Several lines of evidence established a link between high-risk (HR) sexual behavior (SB), the persistence of human papillomavirus (HPV) DNA in saliva, and the presence of oncogenic HR-HPV subtypes in oropharyngeal squamous cell carcinoma (OPSCC). Especially one influential case-control study by D’Souza et al. was responsible for the definitive acceptance of “high risk sexual behavior” as being causatively involved in the etiology of HPV-driven OPSCC. Utilizing case-control studies can be problematic in respect to achieving reliable statistical inference. For generalizability and drawing conclusions for the general population, the selection of cases and controls studied is critical. Substantial bias can be introduced. Therefore, the aim of our study was to replicate these former findings in a nested case-control study of OPSCC patients and propensity score (PS)-matched unaffected controls from a large population-based German cohort study. Here we demonstrate discrepant findings regarding HR-SB being a risk factor for OPSCC.

Abstract
(1) Background: Several lines of evidence established a link between high-risk (HR) sexual behavior (SB), the persistence of human papillomavirus (HPV) DNA in saliva, and the presence of oncogenic HR-HPV subtypes in oropharyngeal squamous cell carcinoma (OPSCC). A highly influential case-control study by D’Souza et al. comparing OPSCC patients and ENT patients with benign diseases (hospital controls) established HR-SB as a putative etiological risk factor for OPSCC. Aiming to replicate their findings in a nested case-control study of OPSCC patients and propensity score (PS)-matched unaffected controls from a large population-based German cohort study, we here demonstrate discrepant findings regarding HR-SB in OPSCC.
(2) Methods: According to the main risk factors for HNSCC (age, sex, tobacco smoking, and alcohol consumption) PS-matched healthy controls invited from the population-based cohort study LIFE and HNSCC (including OPSCC) patients underwent interviews, using AUDIT and Fagerström, as well as questionnaires asking for SB categories as published. Afterwards, by newly calculating PSs for the same four risk factors, we matched each OPSCC patient with two healthy controls and compared responses utilizing chi-squared tests and logistic regression.
(3) Results: The HNSCC patients and controls showed significant differences in sex distribution, chronologic age, tobacco-smoking history (pack years), and alcohol dependence (based on AUDIT score). However, PS-matching decreased the differences between OPSCC patients and controls substantially. Despite confirming that OPSCC patients were more likely to self-report their first sexual intercourse before age 18, we found no association between OPSCC and HR-SB, neither for practicing oral-sex, having an increased number of oral- or vaginal-sex partners, nor for having casual sex or having any sexually transmitted disease.
(4) Conclusions: Our data, by showing a low prevalence of HR-SB in OPSCC patients, confirm findings from other European studies that differ substantially from North American case-control studies. HR-SB alone may not add excess risk for developing OPSCC.


Introduction
Human papillomavirus (HPV) is a driver of a subset of head and neck squamous cell carcinoma (HNSCC), in particular of HPV-driven oropharyngeal squamous cell carcinoma (OPSCC) emerging from the epithelia lining Waldeyer's ring. As high-risk oncogenic HPV subtypes such as HPV16 are transmitted via body fluids containing virus particles, and other HPV-driven cancers emerge from the epithelia of the uterine cervix and the anogenital region, HPV-driven OPSCC is considered a sexually transmitted disease (STD). A very influential case-control study by D'Souza et al. [1] compared 100 OPSCC patients with 200 age- and sex-matched patients from the same ENT hospital who were accrued as controls. This study established higher frequencies of antibodies to HPV16 proteins, in particular to the HPV16 early proteins E6 and E7, as markers for HPV-driven OPSCC characterized by HPV-DNA positivity. To be more precise, 72% of tumors in their OPSCC sample were positive for HPV DNA, and the prevalence of antibodies to either E6 or E7 among the 100 OPSCC was 64% [1]. After earlier hints from other studies [2][3][4][5][6], this study was responsible for the definitive acceptance of "high risk sexual behavior" (HR-SB) as being causatively involved in the etiology of HPV-driven OPSCC, as 88% of OPSCC patients reported a lifetime number of ≥1 oral sex partners, accompanied by an increased prevalence of HPV16 or any HPV infection in the oral cavity of OPSCC patients (32% and 37% compared to 4% and 6% in controls), reflected by odds ratios (ORs) of 11.3 (5.0-25.7) and 10.0 (4.8-20.7) for OPSCC [1]. This study, however, investigated patients from a single tertiary American hospital and utilized statistical models and adjustment for the factors age, sex, tobacco, and alcohol, which are among those known to be causatively involved in the development of HNSCC (including OPSCC), to elucidate the significance of HR-SB in this regard.
Case-control studies and adjustment for confounders can be problematic with respect to achieving reliable statistical inference. When relying solely on a single case-control study from another part of the world, the transferability of its findings might be limited by a deviating prevalence and distribution of clinical characteristics, a deviating socio-cultural environment, differing covariates, and other unknowns, including the genetic background. For generalizability and drawing conclusions for the general population, the selection of cases and controls studied is critical. Substantial bias can be introduced whenever the cases and controls are not randomly chosen or are selected from a sample that is not representative of the cases, the controls, or both. Such results may reflect the special properties of the particular sample rather than be representative of the general population, thus potentially leading to misinterpretations. Hence, even well-conducted case-control studies are at an increased risk for far-fetched extrapolation of their findings to unrelated populations. Population-based cohort studies are preferable to case-control studies that adjust for already known risk factors. When executing a nested case-control study, utilizing healthy participants of a randomly drawn sample from the same cohort should yield truly healthy controls that are superior to any kind of patients from the same hospital. The reason is that hospital "controls not affected by the disease" came to the hospital with a substantial medical need for treatment of another disease. Such controls inherently introduce substantial selection bias despite any statement that they had "benign disease not related to HNSCC", as in the aforementioned study [1]. It remains questionable how such patients can be seen as "healthy controls", as they are acceptable only in the absence of any other "control".
However, most epidemiologic investigations in HNSCC report substantial differences between the general population and HNSCC patients, who are predominantly older males and show a substantially higher prevalence of high-level alcohol consumption and tobacco-smoking history, and of simultaneous exposure to both in particular. This extends to a multitude of other occupational and environmental exposures. The impact of the most dominant risk factors could be responsible for underestimating the impact of any other causative factor, as they are confounders and introduce confounding bias. Adjusting for these confounders within a case-control study lowers the power, and matching in case-control studies introduces additional sources of bias, e.g., colliding bias [7]. However, matching within a cohort study removes both types of bias [7,8].
Drawing a propensity score (PS)-matched sample of participants attending the same cohort study based on the major risk factors for HNSCC (tobacco-smoking history, alcohol-consumption level, age, and sex) before assessing the distribution of covariates between OPSCC and PS-matched controls has the potential to reduce analytical bias further. As we had the unique opportunity to perform such an analysis in the framework of the LIFE study [9][10][11], we here demonstrate the discrepant findings with respect to self-reported sexual behavior in a PS-matched nested case-control study of OPSCC patients and unaffected controls from a large population-based German cohort study.

Study Population and Patient Samples
The LIFE A1 Adult Study of the Leipzig Research Centre for Civilization Diseases (LIFE [9,10]) is a large population-based cohort study. Within LIFE, a total of 10,000 adults were scheduled to be recruited randomly from the City of Leipzig (cohort A1) to serve as a control sample for various diseases, including HNSCC. The sub-project LIFE B7 HNSCC [10,11] was a cohort study in the framework of the same population-based cohort study LIFE, and it was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of the University Leipzig (votes 201-10-12072010 and 202-10-12072010). The LIFE study provided a rationale to identify a nested sample of n = 300 volunteers from the LIFE A1 Adult study to serve as controls for our cohort of HNSCC patients. This design was intended to allow for the identification of risk factors for HNSCC and to gain information about gene-environment interaction, not only for the well-known risk factors (alcohol and smoking) but also for other lifestyle factors potentially involved in HNSCC etiology, including sexual behavior. From 4 August 2010 to 18 July 2012, we enrolled a total of n = 450 patients who were suspected of having head and neck cancer. A cross-sectional comparison of HNSCC patients (cases from B7) and controls without HNSCC (from A1) was designed as a nested case-control study with 1:1 matching. We used the main risk factors for HNSCC (male sex, chronologic age, alcohol consumption, and tobacco-smoking history in pack years smoked) of the first 147 patients accrued in B7 and of all 6798 A1 participants at that time to calculate PSs. With a scheduled sample size of 300, we consecutively invited A1 participants according to their PS, in descending order, to have the same interview used in B7 (Figure 1). In response to weekly sent invitations, 303 out of the 698 highest-scoring A1 controls responded (response rate 43.4%), provided informed consent, and completed the interview to serve as a reference.

Matching Process and Demographic Variables
The first idea to match 147 HNSCC cases and 147 controls out of 6798 volunteers as "twins" via an exact matching due to the specified risk factors (tobacco smoking, alcohol consumption, age, and sex) failed. We found only 57 pairs among 147 HNSCC cases and 6798 LIFE A1 Adult volunteers with an identical characteristic according to sex, age, stratum of pack years (10 PY increments), and belonging to the same out of the four daily alcohol-consumption categories (0 or <1 g/d, 1-30 g/d, 31-60 g/d, or >60 g/d) of which the controls could have been invited for the second visit to record information about sexual behavior, etc. Moreover, these exact matching pairs were HNSCC patients and controls with predominantly low exposure to both tobacco and alcohol. These are risk-factor characteristics that are very common in A1 Adult participants but rarely found in HNSCC patients; hence, only 57 HNSCC cases not representative for HNSCC could be matched.

As shown in Table 1, in our sample of HNSCC cases, 36% had >38 PY and were current smokers at the time of diagnosis. This exposure is mostly not reached by "normal" volunteers and is very seldom reported by healthy adults, including the random sample from our population-based cohort study LIFE A1 Adult. High-level daily alcohol consumption is the second most common risk factor for HNSCC and is consistently identified in a multitude of epidemiological studies; moreover, a simultaneously high exposure to alcohol and tobacco smoking is found in HNSCC patients. This was the rationale to match HNSCC cases and controls according to these risk factors, as well as to the other major risk factors for HNSCC, age and (male) sex.
Therefore, we decided to perform PS-matching and used the list generated and sorted according to the highest PS, which was predominantly linked to the highest tobacco smoke category, to invite potential A1 Adult participants to participate in the nested case-control study and come for a second visit. As inviting n = 698 A1 Adult volunteers was required to accrue 303 controls who would attend the second visit and answer our questionnaires, the presence of high-level tobacco smoking accompanied by a high level of alcohol exposure remained lower in responding and interviewed controls. In the sample of 303 controls, a higher exposure to alcohol (but often without smoking) could be observed that had (at least partially) compensated for the often-lower level of smoking of the individual and increased its PS and allowed for inviting the A1 Adult volunteer to serve as a control.
Table 1 notes: b The p-value is from the heteroscedastic t test. c Odds ratio adjusted according to Cox and Moses by adding 0.5 to each cell to reduce bias and prevent division by zero caused by empty cells [12,13]. d Neck squamous cell carcinoma from an unknown primary tumor. Please note that due to rounding errors, the percentages shown may not sum up to 100 percent, as the distribution is shown only for a percentage of the available data.
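The adjustment of the odds ratio by adding 0.5 to each cell [12,13] can be illustrated with a short calculation. The following Python sketch is not the authors' SPSS workflow. The function name `adjusted_odds_ratio` and the Wald-type 95% confidence interval on the log scale are assumptions made for illustration. The example counts are taken from the Results (none of the 18 female OPSCC patients and 4 of the 14 female controls reported ≥6 oral-sex partners), assuming complete answers in both groups.

```python
import math

def adjusted_odds_ratio(exposed_cases, unexposed_cases,
                        exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio of a 2x2 table after adding 0.5 to every cell.

    The 0.5 correction keeps the ratio finite when a cell is empty.
    The confidence interval is a Wald interval on the log odds ratio
    (an assumption of this sketch, not taken from the paper).
    """
    a, b, c, d = (n + 0.5 for n in (exposed_cases, unexposed_cases,
                                    exposed_controls, unexposed_controls))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Female participants reporting >=6 lifetime oral-sex partners:
# 0 of 18 OPSCC cases versus 4 of 14 controls.
or_, lo, hi = adjusted_odds_ratio(0, 18, 4, 10)
print(f"adjusted OR = {or_:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

Without the correction the empty cell would yield an odds ratio of exactly zero and an undefined standard error; with it, the adjusted odds ratio is about 0.063.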
Among the final sample of n = 317 HNSCC patients who answered >50% of the questionnaire items and n = 303 A1 Adult controls accepting the invitation to participate as controls for the nested cohort study, we applied a second round of PS matching of all n = 112 OPSCC patients (patients with the primary tumor in the base of the tongue = ICD-10-C01, the uvula = C05, the palatine tonsils = C09, or other parts of the oropharynx = C10) and the n = 303 A1 Adult volunteers. Finally, we obtained 94 OPSCC patients matched with 188 PS-matched controls (Table 2). According to this enrollment strategy, only 14 women were suitable as controls, but obviously some male controls were equivalent or allowed data-driven PS-matching to female OPSCC patients. Overall, making use of PS-matching was able to reduce the otherwise larger distance between OPSCC cases and controls that can be described as being of a small effect. This can be concluded from the PS in n [...].
Table 2 notes: d A casual-sex partner was defined as a partner in a "one-night stand" or a partner who was a stranger. e Squamous cell carcinoma. # OPSCC patients who could not be matched with two appropriate controls having a propensity score within the individual's PS ± 0.1 were excluded.
Clinical characteristics and demographic variables of cases and controls are provided for all participants in Table 1 and for the PS-matched subsample in Table 2.

Questionnaires
We used the well-established questionnaires AUDIT [14] and Fagerström [15] to assess alcohol dependency and nicotine dependency, respectively. Besides interpretation of the obtained answers, score points handled as numerical values were also analyzed. Referring to D'Souza's study [1], and with the aim of confirming their findings in a German cohort, we used the same cutoff values reported in their study to ask for lifetime numbers of sex partners. Due to low numbers of patients and controls reporting anal sex and same-sex partners in their study, we omitted asking the respective questions in our study.
The items of the questionnaires were provided by a trained interviewer (a certified study nurse) who read out questions to the probands during a structured face-to-face interview. The probands were asked if they belonged to one of the predefined answer categories then recorded by the interviewer.

Statistical Analyses and Calculation of Propensity Scores
The statistical analyses were performed using SPSS version 27 (IBM Corporation, Armonk, NY, USA) and included Pearson's Chi-square (χ 2 ) tests to assess differences between categorical variables, as well as logistic regression for multivariate analyses and the calculation of PS for PS-matching.
The PS calculation was performed based on the covariates smoking (in pack years, continuous), alcohol per day (categorical), chronologic age (continuous), and sex (categorical) automatically. This procedure runs a logistic regression on the group indicator and the covariates (predictors) and then uses the resulting PS (a value between 0 and 1) with the defined matching tolerance (caliper width of 0.1 was chosen, as this is recommended as the optimum compromise by most investigators) to select controls for cases. We used (by checking the respective checkboxes) the option to give priority to exact matches and to randomize the case order when drawing matches without resampling.
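The matching step described above can be sketched in a few lines. The following Python sketch is not the SPSS procedure used in the study. It assumes that propensity scores have already been estimated (for example, by a logistic regression on the four covariates) and illustrates greedy 1:2 matching without replacement within a caliper of 0.1, with a randomized case order. The function name `match_two_controls`, the nearest-first selection rule, and the toy scores are assumptions for illustration only.

```python
import random

def match_two_controls(case_ps, control_ps, caliper=0.1, seed=0):
    """Greedy 1:2 propensity-score matching without replacement.

    case_ps and control_ps map participant ids to propensity scores.
    Returns {case_id: [control_id, control_id]}. Cases without two
    unused controls inside the caliper are left out, mirroring the
    exclusion of unmatched OPSCC patients.
    """
    rng = random.Random(seed)
    case_ids = list(case_ps)
    rng.shuffle(case_ids)              # randomize the case order
    available = dict(control_ps)       # controls not yet used
    matches = {}
    for case_id in case_ids:
        score = case_ps[case_id]
        # Unused controls within the caliper, closest score first.
        candidates = sorted(
            (cid for cid, ps in available.items()
             if abs(ps - score) <= caliper),
            key=lambda cid: (abs(available[cid] - score), cid),
        )
        if len(candidates) < 2:
            continue                   # no two compatible partners
        chosen = candidates[:2]
        for cid in chosen:
            del available[cid]         # sampling without replacement
        matches[case_id] = chosen
    return matches

# Hypothetical propensity scores for three cases and five controls.
cases = {"case1": 0.80, "case2": 0.30, "case3": 0.99}
controls = {"ctrl1": 0.78, "ctrl2": 0.85, "ctrl3": 0.33,
            "ctrl4": 0.25, "ctrl5": 0.50}
print(match_two_controls(cases, controls))
```

In this toy example, `case3` has no control within ±0.1 of its score and is therefore dropped, while `ctrl5` remains unused. A greedy scheme depends on the order in which cases are processed, which is why the order is randomized with a fixed seed here.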

Molecular Analyses of HPV
The HPV DNA status and genotype were determined in 100 ng DNA of each sample as previously described [11]. RNA samples of HNSCC positive for the subtype HPV16 underwent analysis of E6*I transcripts by RT-PCR and were concluded to be positive for HPV16 RNA whenever HPV16 E6*I transcripts were detected. The CINtec kit (Roche) was used for the detection of p16 in formalin-fixed, paraffin-embedded primary tumor samples from OPSCC only. The detection of at least 20% stained tumor cells was used to conclude p16 positivity. HPV-related OPSCC was defined as p16 positivity, whereas the status HPV-driven was concluded only if OPSCC had simultaneous positivity for high-risk HPV DNA and/or RNA plus p16-positivity above the cutoff level of ≥70% OPSCC cells [16].

Results
From 4 August 2010 to 18 July 2012, n = 450 patients suspected of having head and neck cancer provided written informed consent and were accrued for LIFE B7. Patients without any sign of malignancy (n = 61) and those with synchronous or metachronous malignancy of other histology (n = 35) were excluded (Figure 1). Out of n = 354 potentially eligible patients with tumor sites in the head and neck region and pathologically confirmed squamous cell carcinoma without any other synchronous malignancy, 329 (93%) were enrolled and agreed to participate in the interview. Of these 329 HNSCC patients, 317 provided answers to more than 50% of questionnaire items (Figure 1). Among these n = 317 HNSCC were n = 112 OPSCC (ICD-10 codes C01, C05, C09, or C10). The characteristics of 317 HNSCC cases and 303 controls are shown in Table 1.
Despite the PS-matched invitation of potential controls based on the main risk factors age, sex, tobacco smoking (expressed in pack years), and alcohol consumption, substantial differences were noticed. A history of heavy tobacco use and greater nicotine dependence (Fagerström questionnaire [15]) were more prevalent in those with HNSCC, whereas the controls had greater daily alcohol consumption, were older, and were more frequently of the male sex. The AUDIT [14] scores for alcohol dependence did not differ significantly between cases and controls. Probably related to the selection process of controls from the LIFE A1 Adult study, a history of former drinking was found only among HNSCC patients, leading to significant differences in this respect.
The self-reported employment status did not differ significantly. Nevertheless, controls more often reported living in larger apartments with more rooms. HNSCC patients more often reported complete tooth loss and various aspects of impaired oral hygiene.
Due to highly different characteristics in HNSCC and controls potentially hindering reliable comparisons, we performed PS matching of 112 OPSCC patients and 303 controls. Applying a caliper width of 0.1, we randomly assigned two controls to each OPSCC patient. This provided a total sample of 188 controls for 94 PS-matched OPSCC patients. The remaining 18 of the 112 OPSCC patients (16%) and 115 controls (38%) without compatible matching partners were excluded from further analysis (Figure 1 and Table 2).
Out of the 94 OPSCC, 66 (70.2%) were deemed HPV-related, as they expressed p16. According to Table 2, the members of the PS-matched OPSCC subgroup and their PS-matched controls demonstrated a comparable distribution of chronologic age, average quantity of pack years, and alcohol dependence (based on AUDIT score). However, related to the selection process of controls, and even when using the specified caliper width of 0.1, some significant differences in the distribution of predictors used in the propensity-score-based automatic matching persisted. Comparing 94 OPSCC cases and 188 PS-matched controls (Table 2) revealed more closely aligned characteristics but also a higher exposure to tobacco (pack years smoked) and greater nicotine dependence (Fagerström questionnaire [15]) in OPSCC cases, whereas the controls had greater daily alcohol consumption, were older, and were more frequently of the male sex. Regarding the propensity scores, a high level of one risk factor could obviously (partially) compensate for the absence of the other and resulted in a comparable PS despite deviating pack-years tobacco-smoking history and/or daily alcohol consumption (Table 2).
Within the PS-matched analysis, we found OPSCC patients to be more likely to self-report their first sexual intercourse before age 18. There were no differences in the frequency of having ever had casual sex or of using a condom usually or always. Sexually transmitted diseases (STDs) were more frequent among controls (not significant). However, the frequencies of oral or genital warts and of a positive family history of tumor disease or SCC were low but numerically slightly higher in OPSCC patients. We found no association between OPSCC and sexual behavior, neither for the number of oral-sex partners nor for the number of vaginal-sex partners, as the lifetime numbers were significantly lower in OPSCC patients. This observation relates to all categories and not only to the most extreme numbers. However, there were no differences after the Bonferroni correction (Table 2).
Venn diagrams for the distribution of ≥6 oral-sex partners and >25 vaginal-sex partners in 188 controls vs. either 94 OPSCC or 66 p16+ OPSCC demonstrate a rather reduced lifetime prevalence. Any unusual clustering or double-positive patterns were absent (Figure 2). The frequency of females among OPSCC and p16+ OPSCC was 18/94 (19.1%) and 12/66 (18.2%), respectively, and hence nearly identical. There was no significant difference between cases and controls regarding ≥6 oral sex partners; none of the 18 female OPSCC patients but 4 of 14 (28.6%) of the female controls reported a lifetime prevalence of ≥6 oral-sex partners (p = 0.015). Within the subgroup of 33 patients with HPV-driven (i.e., ≥70% p16+ HR-HPV DNA+) OPSCC, a comparable distribution regarding the number of oral- or vaginal-sex partners and a low frequency of HR-SB were observed. This held especially among the 30 patients with p16+ HPV16 DNA+ RNA+ OPSCC, as only three male patients (one patient each (3.3%)) were in the HR-SB groups reporting either a lifetime number of ≥26 vaginal-sex partners, ≥6 oral-sex partners, or both. Overall, HR-SB was lower in OPSCC patients than in controls, and this frequency was not increased in patients with HPV16-driven OPSCC.
Having had any oral sex, in contrast, did not differ in females (p = 0.265) but did differ in males, with a proportion about 19 percentage points higher in controls (35.5% vs. 54.6%, p = 0.005). The logistic regression failed to demonstrate any link between the self-reported number of vaginal-sex partners, casual sex, or sexually transmitted disease and being diagnosed with OPSCC or p16+ OPSCC, in particular, whereas oral sex appeared to be significantly protective against OPSCC, with an OR of 0.384 (95% CI, 0.206-0.716), p = 0.003, for 1-5 and an OR of 0.296 (95% CI, 0.113-0.770), p = 0.013, for ≥6 lifetime oral-sex partners. However, the logistic regression demonstrated an earlier sexual debut (age < 18 compared to ≥ 18 years) in OPSCC patients, with an OR of 1.994 (95% CI, 1.054-3.773), p = 0.034, confirming the link between earlier sexual debut (age < 18 years) and OPSCC.
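The odds ratios, confidence intervals, and p-values reported for the logistic regression are linked by simple relations, which the following Python sketch uses as a plausibility check. It assumes Wald-type 95% intervals (OR = exp(β), CI = exp(β ± 1.96·SE)) and a two-sided normal test; these assumptions, as well as the function name `wald_from_ci`, are not stated in the paper and are introduced here for illustration.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover coefficient, standard error and two-sided Wald p-value.

    Assumes the interval is symmetric on the log scale, i.e.
    ci = exp(beta -/+ z_crit * se), with beta = log(odds_ratio).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p
    return beta, se, p_value

# Odds ratios and 95% CIs as reported in the text.
reported = {
    "1-5 oral-sex partners": (0.384, 0.206, 0.716),
    ">=6 oral-sex partners": (0.296, 0.113, 0.770),
    "sexual debut < 18 years": (1.994, 1.054, 3.773),
}
for label, (or_, low, high) in reported.items():
    beta, se, p = wald_from_ci(or_, low, high)
    print(f"{label}: beta = {beta:.3f}, SE = {se:.3f}, p = {p:.3f}")
```

Under these assumptions, the recomputed p-values agree with the reported values of 0.003, 0.013, and 0.034 to within rounding, which suggests that the reported intervals and p-values are internally consistent.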

Discussion
Circumventing some potential sources of bias related to case-control studies, we executed a nested case-control study utilizing consecutively accrued OPSCC cases and PS-matched controls from the German population-based cohort study LIFE [9,10] to answer the question of whether HR-SB (oral sex, in particular) is a relevant etiologic factor in the development of HPV-related OPSCC. Our study did not show huge differences in sexual behavior between OPSCC patients and the controls or a comparable prevalence of self-reported characteristics, including HR-SB, thus demonstrating a replication failure of the findings from the American case-control studies, especially the most often cited study of D'Souza et al. [1].
Generally, in case-control studies, newly diseased individuals (cases) are compared with non-diseased individuals (controls) regarding various risk factors (exposure, e.g., age, sex, tobacco smoking, and alcohol drinking), preferably those with an already-known impact. Such controls can either be from the same population or so-called hospital controls, i.e., control patients attending the same hospital as the cases but due to another disease. Superior regarding representativeness is a random sample of the same population (population-based controls, such as LIFE A1 Adult).
In either case, the selection of controls must be performed with great care to avoid selection bias [17][18][19]. The controls should always be recruited from the same reference population from which the studied case group is derived. In other words, considering time and place, an individual should be included as a control in the study only if, assuming that he or she would have developed the disease, he or she would also have been eligible for the case group ("study base principle"). Furthermore, controls should be selected randomly in a way to minimize the risk of uncontrolled confounding ("deconfounding principle").
Referring to these principles, we note that D'Souza et al. [1] selected controls from a set of patients with benign diseases who were accrued in the same hospital during the recruitment period, after the patients were referred to the hospital and accrued from a larger OPSCC sample. Therefore, neither are the cases representative of all OPSCC nor do the controls represent a randomly recruited healthy population. These controls were matched for age and sex, thus reducing the confounding bias at the cost of introducing colliding bias [7,8]. However, the cases differed substantially from the controls in other risk factors for developing OPSCC, such as heavier tobacco smoking, alcohol consumption, and the use of marijuana [1], and it remained unclear if these differences are the same in European samples with a deviating distribution of a number of risk factors. Indeed, one central question in case-control studies is how the control group is defined and how reliable controls should be selected and eventually matched to cases regarding known confounders and their distribution, to increase the chance of elucidating risk factors other than the well-established ones. In practice, this is achieved by different forms of matching [19]; in our study, it was achieved by inviting truly healthy controls who were unaffected by the disease (HNSCC) out of a population-based cohort study [9,10], according to their PSs, and thereafter matching each OPSCC patient with two controls according to the four main risk factors for HNSCC, using newly calculated PSs. Notably, our patients are truly representative of all OPSCC cases treated in our hospital, which treats the majority of HNSCC patients living in the Leipzig region, congruent with the LIFE study area [9,10].
Patients who are included in case-control studies need to be truly representative for all cases of a given population. Therefore, specialized centers or tertiary hospitals with selective referral may not be that representative for both OPSCC cases and healthy controls, as intended. Indeed, Maura Gillison [20], in response to a letter questioning the representativeness of their findings [1], stated that the authors "cannot exclude the possibility that subjects who did not have traditional risk factors were more likely to participate in [their] study as it was performed in a hospital and was not population-based". Unfortunately, the scientific community mostly neglected this information, thus limiting the transferability of the findings to the general population and the interpretation of the correlation between HR-SB and OPSCC in their study [1]; instead, they interpreted the data from their study as evidence for a causative involvement of HR-SB in the etiology of OPSCC in general. However, there might be numerous differences between OPSCC patients in various regions of the world. Moreover, even the best study performed in a single hospital might not result in findings that are applicable to all other populations in the world or provide evidence for a causative involvement of HR-SB in general. Their well-conducted study showed a correlation between antibodies to HPV early proteins and HPV-positive OPSCC patients who had also had a higher prevalence of HR-SB. The latter statistical association observed unfortunately cannot explain how differences in HR-SB translate into HPV-related OPSCC or if other etiologic factors are more important. Focusing on comparisons of cases and controls derived from the same population and largely identical distribution of known major risk factors other than the covariate analyzed would have been required to draw such conclusions.
Indeed, in our study, PS-matching decreased the differences between HNSCC patients (including OPSCC patients) and controls, although it appeared unable to eliminate all confounding bias: more men than women and a higher level of alcohol consumption were found among the controls, whereas the cases included more smokers in the higher tobacco-smoking categories, and some other differences were also observed (Tables 1 and 2).
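One common way to quantify how much matching reduces such differences is the standardized mean difference (SMD) of a covariate before and after matching. The sketch below is illustrative only: the pack-year values are hypothetical, and the source does not state that this diagnostic was used in the study.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(cases, controls):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(cases) + variance(controls)) / 2)
    return (mean(cases) - mean(controls)) / pooled_sd


# Hypothetical pack-year values, for illustration only.
cases_pack_years = [30, 40, 50]
controls_before = [0, 10, 20]    # unmatched controls
controls_after = [25, 40, 55]    # PS-matched controls

smd_before = standardized_mean_difference(cases_pack_years, controls_before)
smd_after = standardized_mean_difference(cases_pack_years, controls_after)
```

An absolute SMD below roughly 0.1 is often taken as a rule of thumb for adequate balance, although this threshold is a convention rather than a formal test.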
However, there were no differences in the self-reported numbers of lifetime vaginal-sex and oral-sex partners between our PS-matched sample of OPSCC patients and controls. Comparing our findings to the study of D'Souza et al. [1], we observed a substantially lower prevalence of HR-SB in controls and an even lower prevalence of HR-SB in OPSCC cases (Tables 3 and 4), arguing against a substantial impact of HR-SB and a large fraction of OPSCC that is attributable to HR-SB. The consistent absence of HR-SB in the overwhelming majority of HPV-driven OPSCCs stands against the argument of a lowered frequency of HPV-driven OPSCC in our cohort (35.1% vs. 64% in their sample, according to seropositivity for HPV16 E6 and/or E7 antibodies) that would have lowered the chance to detect an impact of HR-SB on the development of HPV-driven OPSCC. Only 37.2% of LIFE B7 OPSCC patients stated ever having practiced oral sex compared to 88% of their cases [1]. Likewise, the lifetime numbers of oral- and vaginal-sex partners they reported were much higher. Differences might be related to deviating sexual behavior norms in Germany versus North America [1], as practicing oral-genital sex is generally reported more often in American [1,2,3,4,5,6,21] than in European studies [22,23,24,25,26,27].
A number of studies highlight either a lower age at sexual debut/first intercourse or a higher proportion of OPSCC patients reporting an age below 18 years at first intercourse, and our study confirms these reports [1,3,27,28]. However, this behavioral aspect is often discussed as being linked to the rather low income of parents [29]. Nevertheless, we cannot exclude that a lower age at first intercourse increases the risk for OPSCC via the increased vulnerability of the epithelia of younger persons, making them more susceptible to becoming infected with HPV.
Regarding other European studies, Tachezy et al. [23] collected data on demographics, behavioral risk factors, and risks related to HPV exposure from 86 patients with a primary cancer of the oral cavity or oropharynx and 124 controls in the Czech Republic. Regarding sexual behavior, an increased risk for HNSCC after adjusting for age and the consumption of alcohol and tobacco could be shown only for practicing oral-anal sex (OR 4.3, 95% CI 1.3-14.8, p = 0.02). Neither the number of sex partners (<6 vs. >6; OR 1.2, 95% CI 0.6-2.7, p = 0.59) nor practicing oral-genital sex (OR 0.5, 95% CI 0.2-1.2, p = 0.14) could be confirmed as a risk factor for HNSCC or for OPSCC [23]. Other European studies also failed to demonstrate any significant correlation between HR-SB and oral or oropharyngeal cancer [22,24,25].
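For readers less familiar with how such odds ratios and confidence intervals arise, the following sketch computes an unadjusted OR with a Woolf (log-based) 95% CI from a 2x2 table. The counts are hypothetical and are not taken from any cited study; the ORs quoted above were additionally adjusted for covariates, which this simple calculation does not do.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    a, b: exposed and unexposed cases; c, d: exposed and unexposed controls.
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only:
# 20 of 100 cases and 10 of 100 controls report the exposure.
odds_ratio, lower, upper = odds_ratio_ci(20, 80, 10, 90)
```

In this example the point estimate is 2.25, yet the interval includes 1, so the association would not be considered statistically significant at the 5% level.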
Farsi et al. [30] published a meta-analysis of 20 case-control studies of sexual behavior in patients with HNSCC. Among all included studies, nine were from North America, five from Europe, four from Latin America, one from Asia, and one from Oceania. Only three of them adjusted their analyses for HPV status. According to ORs in random-effect models including all studies, an increased risk of HNSCC was found for the number of sexual partners (19 studies; OR 1.29, 95% CI 1.02-1.63) and the number of oral-sex partners (5 studies; OR 1.69, 95% CI 1.00-2.84). After excluding studies contributing the most to heterogeneity (e.g., D'Souza et al. [1]) or not adjusting for age, sex, smoking, and alcohol consumption, neither the number of sexual partners nor practicing oral sex (OR 0.95, 95% CI 0.75-1.20 and 1.03, 95% CI 0.84-1.26, respectively) was associated with oral or oropharyngeal SCC. This is in line with our findings. They concluded that observed associations might be partly attributed to confounding [30], and our findings support their interpretation.
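The random-effects pooling mentioned above can be sketched with the DerSimonian-Laird estimator, one widely used method; whether Farsi et al. [30] used exactly this estimator is an assumption here. Standard errors are recovered from the reported 95% CIs on the log scale. The three input studies are hypothetical and serve only to show the mechanics.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR), recovered from a reported confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def pool_random_effects(studies, z=1.96):
    """DerSimonian-Laird pooling of (OR, ci_lower, ci_upper) tuples.

    Returns (pooled OR, lower, upper, tau2), where tau2 is the estimated
    between-study variance on the log scale.
    """
    if len(studies) < 2:
        raise ValueError("at least two studies are required")
    y = [math.log(or_) for or_, _, _ in studies]
    v = [se_from_ci(lo, hi, z) ** 2 for _, lo, hi in studies]
    w = [1 / vi for vi in v]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c)
    w_re = [1 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return math.exp(pooled), math.exp(pooled - z * se), math.exp(pooled + z * se), tau2


# Hypothetical study results (OR, 95% CI lower, 95% CI upper), for illustration only.
studies = [(2.0, 1.0, 4.0), (0.9, 0.6, 1.35), (1.1, 0.8, 1.5125)]
pooled_or, ci_low, ci_high, tau2 = pool_random_effects(studies)
```

Removing a study that contributes strongly to heterogeneity lowers tau2 and shifts weight toward the larger remaining studies, which is how excluding single influential studies can change the pooled estimate.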
However, the prevalence of HPV-related OPSCC per se was found to be significantly higher in the geographic region of North America than Europe [31,32]. Similar to the HPV prevalence of 72% in OPSCC in D'Souza's study [1], prevailing estimates indicate that, in the United States, approximately 60% to 70% of OPSCCs are caused by (or at least are related to) HPV infection [21,33,34]. Similarly, in a comprehensive review of the global burden of infection-associated cancers, systematic reviews estimated that HPV infection accounted for 56% to 60% of OPSCCs in North America, compared with 17% to 41% of OPSCCs in European countries [31,35]. As HPV infection itself is associated with HR-SB, it serves as a confounder in many of the studies presented. HPV-induced oropharyngeal carcinomas have steadily increased in recent decades, possibly due to changing sexual behavior of the population [36,37]. However, not every person with oropharyngeal HPV infection or a risky lifestyle develops HNSCC. Most HPV infections are cleared by the immune system, eventually leading to humoral and cellular immunity; the presence of antibodies to HPV capsid proteins and L1, in particular, may be indicators for prior infection. Indeed, the frequency of antibodies to L1, but not to the oncogenic E6 and E7 proteins, demonstrates an association with self-reported sexual behavior [38]. Within HPV subtypes, e.g., HPV16, there are variants with varying distributions in geographical regions. While the European HPV16 prototype (E) is dominant in Germany, the HPV16 Asian-American (AA) variant is prevalent in the United States of America. This could be important, as HPV16 AA is more oncogenic [39,40,41]. Regional differences in genetic or immunological predisposition for a non-cured HPV infection may also lead to an increased frequency of HPV-positive OPSCC in different geographic regions.
The risk for cancer evolvement is increased by genetic variants in genes encoding enzymes involved in DNA repair or the metabolism of alcohol [42,43]. They also increase the risk independent of lifestyle-associated risk factors [43]. Genetic instability and carcinogen exposure lead to somatic mutations and an increased tumor mutational burden if they cannot be controlled by the immune system. Higher cancer incidences are linked to immune defects in the natural killer cell (NK cell) and T-cell system. As T-cell responses to peptides of oncogenes critically depend on binding and proper presentation [44] by human leukocyte antigen (HLA) antigens and their combinations (haplotypes), and the distribution of these differs according to ancestry, the genetic background of patients could also be involved in deviating findings by comparing diverse populations.
HPV-driven cancers such as cancer of the uterine cervix and vulvar, penile, anal, and oropharyngeal cancer are found to be increased in immune-suppressed populations [5], whether the suppression is iatrogenically induced after organ transplantation or related to immune deficiency. Immune deficiency can be inherited or acquired [45,46]. It is well-known that increased age, marijuana exposure, or HIV infection are linked to immune deficiency, contributing to an elevated frequency of opportunistic infections and particular cancers in affected individuals. Marijuana use and Human Immunodeficiency Virus (HIV) seropositivity are linked to reduced clearance of HPV and persistence of oral HPV infection [5]. Thus, regional variation in the co-incidence of undetected or untreated HIV infection also affects the likelihood of HPV-induced cancer, OPSCC in particular. In the absence of (uncontrolled) HIV infection or persistent marijuana use, age-related immune suppression and age-related loss in immune surveillance remain the essential contributors to increased cancer risk even in previously immune-competent subjects. This might be the case in our cohort of HNSCC patients, which included no HIV-positive case and no marijuana use and had a higher age compared to other studies from the U.S. [1,6]. This might also be reflected by the lower prevalence of HPV-driven OPSCC in Germany and especially in our cohort.
We could not confirm the numbers of oral- and vaginal-sex partners, as well as ever having practiced oral-genital sex, as risk factors for HNSCC or even p16+ OPSCC (Table 2) and, among them, the HPV-driven cases (Figure 2). The significantly increased proportion of OPSCC patients who were aged <18 years at the time of first intercourse was the only finding that was consistent with many studies, including those from the U.S. [1,3,27,28]. This may argue for a so-far mechanistically unexplained risk for oncogenic infections linked to increased vulnerability at a younger age. However, an earlier sexual debut correlates only with an increased prevalence of antibodies to L1 but not E6 or other early proteins [38,47]. Therefore, vaccinating girls and boys as early as possible would probably be advantageous [42,48]. In Germany, the Standing Committee on Vaccination (STIKO) began to recommend HPV vaccination for young females aged 12 to 17 in 2007. Only since 2018 has the HPV vaccination been recommended for young men (and women) aged 9 to 14. Considering that our study was conducted during the years 2010 to 2012 and regarding the age distribution in our study with only one HNSCC case at age 25, one case at age 35, one control at age 36 but no control in the age category of 18 to 30, we expect no impact of HPV vaccination on HR-SB, immune competence towards HPV, or occurrence of neoplastic transformation of epithelia by HPV in our cohort. Further studies in years to come should explore the potential effects of HPV vaccination on the occurrence of OPSCC.
The limitations of our study arise from using a modified questionnaire that explicitly asks for categories according to the cutoff values for the number of sexual partners, as in D'Souza et al. [1], and not allowing free responses. Moreover, we are aware of the potential for recall bias or misreporting because the interview, as opposed to a self-administered questionnaire, could also have led to inaccurate responses when providing such sensitive information, due to social desirability or shame in a face-to-face setting. This could have contributed to fewer subjects reporting HR-SB in both the cases and controls. In addition, we had sufficiently high case numbers only according to p16 positivity or negativity of the tumor material but not according to HPV seropositivity of probands. Moreover, we detected only 35.2% of HPV-driven OPSCC among the 94 PS-matched OPSCC cases. Given the low prevalence of HPV-driven OPSCC, and even more of HR-SB in patients with OPSCC, p16-positive OPSCC, and even HPV-driven OPSCC, it is reasonable to believe that the low numbers might have led to a substantially underestimated effect, or that we may have missed an impact of HR-SB on the development of a larger proportion of OPSCC.
Our study contributes controversial data to the discussion about the association between HR-SB and OPSCC, and further studies are needed as substantial differences between populations exist, e.g., European and American. HPV-driven or p16+ OPSCC in Leipzig, Germany, and Baltimore, Maryland, appear to have not only different prevalences but also at least partially deviating characteristics, thus altogether impairing the transferability of the findings between geographical regions. We hereby suggest paying attention to an appropriate interpretation of data from various regions and obtaining data from studies performed in comparable populations that avoid bias and the misinterpretation of results. The appropriate selection of cases and matched healthy controls from population-based cohort studies appears to be a requirement for conducting insightful case-control studies.

Conclusions
The data we collected on the sexual behavior of OPSCC patients and controls are similar to those of other European studies but differ substantially from those of North American studies. The differences may be due to different regional-cultural sexual behaviors or linked to particular HPV variants. A varying genetic or immunological predisposition may also cause a selective immune incompetence or more general immune deficiency, allowing an oral HPV infection to evolve into cancer. Consequently, and according to the low prevalence in our OPSCC patients, sexual behavior alone may not be the sole or most responsible contributor to the development of OPSCC.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The complete datasets presented in this article are not readily available because of patient confidentiality and participant privacy terms. Requests to access the datasets should be directed to G.W., gunnar.wichmann@medizin.uni-leipzig.de.