Large-scale Analysis Demonstrates Familial Testicular Cancer to have Polygenic Aetiology

Testicular germ cell tumour (TGCT) is the most common cancer in young men. Multiplex TGCT families have been well reported and analyses of population cancer registries have demonstrated a four- to eightfold risk to male relatives of TGCT patients. Early linkage analysis and recent large-scale germline exome analysis in TGCT cases demonstrate absence of major high-penetrance TGCT susceptibility gene(s). Serial genome-wide association study analyses in sporadic TGCT have in total reported 49 independent risk loci. To date, it has not been demonstrated whether familial TGCT arises due to enrichment of the same common variants underpinning susceptibility to sporadic TGCT or is due to shared environmental/lifestyle factors or disparate rare genetic TGCT susceptibility factors. Here we present polygenic risk score analysis of 37 TGCT susceptibility single-nucleotide polymorphisms in 236 familial and 3931 sporadic TGCT cases, and 12 368 controls, which demonstrates clear enrichment for TGCT susceptibility alleles in familial compared to sporadic cases (p = 0.0001), with the majority of familial cases (84–100%) being attributable to polygenic enrichment. These analyses reveal TGCT as the first rare malignancy of early adulthood in which familial clustering is driven by the aggregate effects of polygenic variation in the absence of a major high-penetrance susceptibility gene.
Patient summary: To date, it has been unclear whether familial clusters of testicular germ cell tumour (TGCT) arise due to genetics or shared environmental or lifestyle factors. We present large-scale genetic analyses comparing 236 familial TGCT cases, 3931 isolated TGCT cases, and 12 368 controls. We show that familial TGCT is caused, at least in part, by presence of a higher dose of the same common genetic variants that cause susceptibility to TGCT in general.

Testicular germ cell tumour (TGCT) is the most common cancer in young men, with over 18 000 new cases of TGCT diagnosed annually in Europe [1]. Over the past 40 yr, several authors have reported families with multiple cases of TGCT. Such observations, coupled with the higher concordance of TGCT in monozygotic twins than in dizygotic twins, have suggested a heritable basis to TGCT [2]. In the 2000s, systematic family studies, including in population-based registries, confirmed that first-degree relatives of patients with TGCT have a four- to eightfold higher risk of TGCT. Based on data from the Swedish nationwide registry, around 2% of TGCT cases have a first-degree relative with TGCT [3]. Whilst the clustering of TGCT in families has raised the possibility of Mendelian susceptibility, linkage analyses and large-scale exome sequencing of familial TGCT have not provided evidence for high-penetrance susceptibility genes [4][5][6].
Meanwhile, recent genome-wide association studies (GWAS) have identified single-nucleotide polymorphisms (SNPs) at 49 independent loci associated with TGCT risk [7,8]. While the identification of such risk alleles proves the existence of inherited susceptibility, the genetic basis of familial TGCT is unclear. The identified risk SNPs are common, have modest effects, and have been discovered by comparing unselected cases with controls. Although statistical predictions suggest that common susceptibility may account for around 37% of the familial risk [7,8], thus far no direct evidence for such a polygenic aetiology has been reported. Whilst the rapid doubling of TGCT incidence over the past 40 yr has been taken as evidence for significant environmental influences on TGCT aetiology, no specific environmental risk factors have been robustly established [1,9]. Therefore, it is an open question as to whether familial TGCT is a consequence of the co-existence of unusually high numbers of common risk alleles or arises due to other rarer genetic factors, shared environmental exposure, or common lifestyle factors.
To explore the role of polygenic susceptibility in the aetiology of familial TGCT, we studied 236 familial and 3931 sporadic TGCT cases, and 12 368 healthy population controls derived from two previously published GWAS (see Supplementary materials). Briefly, cases and controls were genotyped using Illumina arrays, with recovery of untyped genotypes by imputation. Both cases and controls were of European ancestry. We extracted the genotypes of tag SNPs for 37 risk loci (Supplementary Table 2) that have been robustly associated with TGCT and for which high-quality direct or imputed genotypes were available for both GWAS datasets. We quantified risk allele burden using two approaches. First, we calculated the total number of risk alleles; second, we calculated a polygenic risk score (PRS) combining the number of risk alleles weighted by their effect size. To avoid statistical inflation resulting from the analysis of multiple related family members, we included only a single affected proband from each of the 236 multiplex TGCT families (Supplementary Table 1).
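The two burden measures described above can be illustrated with a minimal sketch. It assumes that each SNP's weight is the natural logarithm of its per-allele odds ratio, a common convention that the text does not spell out; the function names, dosages, and odds ratios are hypothetical and are not study data.

```python
import math

def risk_allele_count(dosages):
    """Unweighted burden: total number of risk alleles carried across loci."""
    return sum(dosages)

def polygenic_risk_score(dosages, odds_ratios):
    """Weighted burden: risk-allele dosage times log odds ratio, summed over loci.

    Using log(OR) as the effect-size weight is an assumption of this sketch.
    """
    if len(dosages) != len(odds_ratios):
        raise ValueError("one odds ratio is required per locus")
    return sum(d * math.log(o) for d, o in zip(dosages, odds_ratios))

# Hypothetical individual genotyped at three loci (0, 1 or 2 risk alleles each).
example_dosages = [2, 1, 0]
# Illustrative per-allele odds ratios, not values from the study.
example_odds_ratios = [1.5, 1.2, 1.3]

print(risk_allele_count(example_dosages))
print(polygenic_risk_score(example_dosages, example_odds_ratios))
```

For imputed genotypes, the dosages could be fractional expected allele counts rather than integers; the same two functions would apply unchanged.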
Comparing the different groups, we observed a significant enrichment of risk alleles in familial cases compared with sporadic cases (average number of risk alleles, p = 0.0001; average PRS, p = 0.0001; Fig. 1; Supplementary Table 3). Notably, these differences were seen in each of the contributing GWAS datasets (Supplementary Figs. 1 and 2). The most overrepresented risk allele in familial compared to sporadic cases was SNP rs3751673 (chromosome 16q24.2; risk allele frequency, 0.75 vs 0.65, respectively; Cochran-Armitage trend test, p = 0.006; Bonferroni-corrected, p = 0.22; Supplementary Table 4).
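The Bonferroni-corrected value quoted above can be checked by hand. The sketch below assumes the correction is applied across the 37 SNPs tested; the text does not state the number of tests explicitly, but 37 reproduces the reported value. The function name is hypothetical.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)

# Nominal trend-test p value for rs3751673, corrected for 37 SNPs (assumed).
adjusted = bonferroni(0.006, 37)
print(round(adjusted, 2))  # 0.22
```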
We next examined the relationship between PRS and familial TGCT in more detail. Although not statistically significant, likely on account of the limited number of "large" families available for analysis, PRS was found to be greater in probands from families with three or more cases of TGCT compared with two-case TGCT families (Supplementary Fig. 3). Familial relative risks for TGCT have been reported in epidemiological studies to be higher for brothers than in father-son TGCT families [3]; however, we found no significant difference in PRS between familial cases analyzed by pedigree structure (Supplementary Fig. 3). To better quantify the extent of disease within each family and take account of bilateral disease, we generated a family history score that incorporates the number of affected individuals, degree of relatedness, and the occurrence of bilateral disease (see Supplementary materials). PRS was positively correlated with this family history score, albeit again not reaching statistical significance (p = 0.09, Fig. 2). Collectively, these results provide strong evidence for enrichment for known common TGCT risk alleles underpinning familial clustering of TGCT.
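The rank correlation between PRS and family history score can be sketched as follows. This is an illustrative pure-Python Spearman coefficient with average ranks for ties; the helper names and the example values are hypothetical, and the construction of the family history score itself (described in the Supplementary materials) is not reproduced here.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranked data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical PRS values and family history scores for five probands.
example_prs = [0.4, 1.1, 0.9, 1.6, 0.7]
example_family_score = [1.0, 1.5, 1.0, 2.0, 1.5]
print(spearman_rho(example_prs, example_family_score))
```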
Fig. 2 - Relationship between ranked polygenic risk score and family history score. Spearman rank correlation showing the relationship between ranked polygenic risk score (y-axis) and ranked family history score (x-axis) for the 228 familial testicular germ cell tumour cases with a known pedigree structure. The underlying distributions of those data are shown, respectively, on the opposite axes. Rho = 0.09; p = 0.09.
We next estimated the proportion of familial TGCT that is attributable to enrichment of common risk alleles adopting the strategy of Halvarsson et al [9], which is based on the premise that, theoretically, a PRS can be drawn from one of two distributions depending on aetiology. Cases caused by enrichment of common TGCT risk alleles will follow a right-shifted risk score distribution, whereas cases caused by other factors (not reflected in PRS) will follow the same distribution as the population. To estimate the relative proportion of the two underlying distributions, we fitted a two-component Gaussian mixture model to the observed PRS for familial TGCT cases, restricting one component of the model to the distribution parameters defined by controls. The proportion of familial TGCT following a right-shifted distribution (ie, enriched for TGCT susceptibility alleles) in our dataset was 100%, with the lower bound of this estimate at 84% (Supplementary Table 5). Similar results were obtained when using the number of risk alleles, unweighted by effect size (98%, 95% confidence interval [CI]: 54-100%; Supplementary Table 5). Together, these results indicate attribution to enrichment for common TGCT risk alleles for the majority of TGCT families analyzed.
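The constrained mixture model can be illustrated with a short sketch. It assumes fitting by expectation-maximisation, with the control component held at a fixed mean and standard deviation and the shifted component's mean, standard deviation, and mixing proportion left free; the fitting procedure actually used is not described in the text, and the function names, starting values, and simulated scores below are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_constrained_mixture(scores, control_mean, control_sd, n_iter=200):
    """EM for a two-component Gaussian mixture with one component held fixed.

    Returns (proportion, mean, sd) of the free, shifted component.
    """
    n = len(scores)
    sample_mean = sum(scores) / n
    sample_sd = math.sqrt(sum((x - sample_mean) ** 2 for x in scores) / n)
    # Assumed starting values: the free component starts above the sample mean.
    pi, mean, sd = 0.5, sample_mean + sample_sd, control_sd
    for _ in range(n_iter):
        # E-step: probability that each score belongs to the shifted component.
        resp = []
        for x in scores:
            shifted = pi * normal_pdf(x, mean, sd)
            fixed = (1.0 - pi) * normal_pdf(x, control_mean, control_sd)
            total = shifted + fixed
            resp.append(shifted / total if total > 0.0 else 0.5)
        weight = sum(resp)
        if weight < 1e-9:
            return 0.0, mean, sd
        # M-step: update the mixing proportion and the free component only.
        pi = weight / n
        mean = sum(r * x for r, x in zip(resp, scores)) / weight
        var = sum(r * (x - mean) ** 2 for r, x in zip(resp, scores)) / weight
        sd = max(math.sqrt(var), 1e-6)
    return pi, mean, sd

# Simulated scores: 700 from a shifted distribution, 300 from the control one.
rng = random.Random(0)
simulated = [rng.gauss(4.0, 1.0) for _ in range(700)]
simulated += [rng.gauss(0.0, 1.0) for _ in range(300)]

proportion, shifted_mean, shifted_sd = fit_constrained_mixture(simulated, 0.0, 1.0)
print(round(proportion, 2), round(shifted_mean, 2), round(shifted_sd, 2))
```

The simulated components are deliberately well separated so that the mixing proportion is easy to recover. With heavily overlapping distributions, as would be expected for real PRS data, the estimate is far less precise, which is consistent with the wide intervals reported above.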
Survival from TGCT is high; however, the success of treatment is accompanied by long-term consequences associated with survivorship. TGCT is a model of disease in which prediction and early intervention could be impactful, as the precursor carcinoma-in-situ (CIS) lesion is reliably present from adolescence, with CIS cells exfoliated in the semen [10]. Early intervention could reduce the occurrence of invasive cancer in young men, reducing the burden of chemotherapy-related survivorship issues and reducing mortality in the minority with treatment-refractory disease. Therefore, whilst understanding the inherited basis of TGCT is useful for counseling of individuals concerned about a family history of the disease, such information may also be important for TGCT-screening programs targeting those at elevated a priori risk.
In conclusion, we present the first evidence of clear enrichment in familial TGCT for common TGCT susceptibility alleles, demonstrating that familial clustering of TGCT is at least in part due to enrichment for the same genetic factors that confer susceptibility to sporadic TGCT. We demonstrate attribution to polygenic susceptibility in the majority of TGCT families. The enrichment of common TGCT risk alleles among familial cases is sufficiently modest in magnitude that existence of additional genetic and environmental drivers of familial TGCT remains possible. Future studies will provide further elucidation of the genetic basis of familial TGCT and empower clinical application.