Elucidation of the Genomic-Epigenomic Interaction Landscape of Aggressive Prostate Cancer

Background The majority of prostate cancer (PCa) deaths are attributed to localized high-grade aggressive tumours which progress rapidly to metastatic disease. A critical unmet need in the clinical management of PCa is the discovery and characterization of the molecular drivers of aggressive tumours. The development and progression of aggressive PCa involve genetic and epigenetic alterations occurring in the germline, somatic (tumour), and epigenomes. To date, interactions between genes containing germline, somatic, and epigenetic mutations in aggressive PCa have not been characterized. The objective of this investigation was to elucidate the genomic-epigenomic interaction landscape in aggressive PCa to identify potential drivers of aggressive PCa and the pathways they control. We hypothesized that aggressive PCa originates from a complex interplay between genomic (both germline and somatic mutations) and epigenomic alterations. We further hypothesized that these complex arrays of interacting genomic and epigenomic factors affect gene expression, molecular networks, and signalling pathways which in turn drive aggressive PCa.
Methods We addressed these hypotheses by performing integrative data analysis combining information on germline mutations from genome-wide association studies with somatic and epigenetic mutations from The Cancer Genome Atlas using gene expression as the intermediate phenotype.
Results The investigation revealed signatures of genes containing germline, somatic, and epigenetic mutations associated with aggressive PCa. Aberrant DNA methylation had an effect on gene expression. In addition, the investigation revealed molecular networks and signalling pathways enriched for germline, somatic, and epigenetic mutations including the STAT3, PTEN, PCa, ATM, AR, and P53 signalling pathways implicated in aggressive PCa.
Conclusions The study demonstrated that integrative analysis combining diverse omics data is a powerful approach for the discovery of potential clinically actionable biomarkers, therapeutic targets, and elucidation of oncogenic interactions between genomic and epigenomic alterations in aggressive PCa.


Introduction
Prostate cancer (PCa) is the second most diagnosed cancer and the second leading cause of cancer deaths among men in the United States [1]. In 2019, an estimated 174,650 men were diagnosed with PCa and 31,620 men died from the disease [1]. The majority of PCa deaths are attributed to localized high-grade aggressive tumours which progress rapidly to metastatic disease [2,3]. These tumours are characterized by poor prognosis, high recurrence rates, and poor survival rates [2,3]. The development and progression of aggressive PCa involve three separate, but related, genomes: the germline, the somatic (tumour) genome, and the epigenome [2][3][4][5][6][7][8][9]. Traditionally, the analysis of germline, somatic, and epigenetic mutations in aggressive PCa has been conducted as separate research endeavours [4]. Increasingly, germline and tumour genomes are being explored jointly to understand how genetic risk variants contribute to PCa [4]. However, to date, no study has reported integrating information on germline, somatic, and epigenetic mutations to gain insights into how genetic and epigenetic mechanisms interact and cooperate to drive aggressive PCa.
Genome-wide association studies (GWAS) have enabled discovery of germline mutations associated with an increased risk of developing PCa [4,10]. Genetic susceptibility variants from GWAS are being incorporated in risk prediction algorithms such as polygenic risk scores (PRSs) [11,12] to identify individual patients at high risk of developing aggressive PCa [12][13][14]. PRSs are poised to improve clinical outcomes via precision medicine and precision prevention. However, one of the limitations for clinical implementation of PRSs is that the causal association between the germline genetic risk variants used for calculating PRSs and aggressive PCa has not been established. Moreover, the genetic susceptibility variants reported to date explain only a small proportion of the phenotypic variation. Thus, integrating GWAS information with other omics data has the promise of not only associating genetic risk variants with tumourigenesis but also explaining the missing variation.
Advances in the next-generation sequencing technologies have enabled sequencing of the PCa or tumour and epigenomes [15,16]. The Cancer Genome Atlas (TCGA) [15] and the International Cancer Genome Consortium (ICGC) [16] have performed large-scale sequencing of tumour and epigenomes generating vast amounts of information on somatic, epigenetic, and gene expression profiles for many cancers including PCa. However, despite the large amounts of multiomics data generated by these large cancer genome sequencing projects, genomic and epigenomic data from these projects have not been leveraged and optimally integrated with germline mutation information to elucidate the genetic-epigenetic interaction landscape in aggressive PCa. With the availability of germline, somatic, and epigenetic mutation information on PCa, we are now well-positioned to integrate these pieces of information to identify the genomic and epigenomic drivers of aggressive PCa. The objective of this investigation was to elucidate the genomic and epigenomic interaction landscape of aggressive PCa. Our working hypothesis was that aggressive PCa originates from a complex interplay between genetic (both germline and somatic mutations) and epigenomic alterations. We further hypothesized that these complex arrays of interacting genomic and epigenomic factors affect gene expression, network states, and signalling pathways which in turn drive aggressive PCa. We addressed these hypotheses using integrative data analysis combining information on germline, somatic, and epigenomic alterations using gene expression data as the intermediate phenotype. We leveraged this integrative analysis approach with network and pathway analysis to elucidate the genomic-epigenomic interaction landscape in aggressive PCa.

Study Design and Sources of Genomics and Epigenomics Data.
The development and progression of aggressive PCa involve three separate, but interrelated, genomes: the germline, the somatic (tumour) genome, and the epigenome. Alterations in these genomes lead to measurable changes affecting therapeutic decision-making in the management of PCa. Therefore, the discovery of molecular drivers of aggressive PCa should take a comprehensive approach that combines pieces of information from all three genomes. Here, we used an integrative genomics approach that combines germline mutations from GWAS with somatic mutations and DNA methylation from TCGA using gene expression data as the intermediate phenotype and unifying parameter. The integrative analysis approach was leveraged with network and pathway analysis to elucidate possible oncogenic interactions between genes containing germline, somatic, and epigenetic mutations. The overall project design, including the sources of data and the analysis workflow integrating multiomics data, is shown in Figure 1.
Germline mutation data was obtained from a well-curated and annotated catalogue of genetic variants associated with an increased risk of developing PCa that we have developed and published [4,17]. Details pertaining to data collection, curation, and annotation have been published elsewhere [4,17] and were based on international guidelines for assessing cumulative evidence on GWAS associations [18][19][20][21][22]. This data was supplemented with data from the updated GWAS catalogue [10,23,24]. Overall, the GWAS data set included 401 genes containing 631 germline mutations (single-nucleotide polymorphisms (SNPs)) associated with an increased risk of developing PCa, linked with SNP identification numbers (rs-IDs), evidence of association as determined by the GWAS P value, gene name, and associated chromosome position. Information on SNP-IDs and gene names was further verified using the Single Nucleotide Polymorphism database (dbSNP) (https://www.ncbi.nlm.nih.gov/snp/) [25] and the HUGO Gene Nomenclature Committee (HGNC) database (https://www.genenames.org/), which houses approved gene names and their aliases [26]. Information on genes and germline mutations, including the original reports from which the information was derived, is presented in Supplementary Table S1.
Somatic mutation information, DNA methylation, and gene expression along with clinical variables on aggressive PCa were obtained from The Cancer Genome Atlas (TCGA) [27]. The data were downloaded from the Genomic Data Commons portal (https://portal.gdc.cancer.gov/) using the data transfer tool [28]. Somatic mutation, DNA methylation, and gene expression data were all generated on the same 188 individual patients diagnosed with aggressive PCa and 52 control samples. All the samples were linked with clinical information. Aggressive tumours were defined as tumours with Gleason grade 8-10 and/or Gleason grade 7 with a pathological score of 4 + 3 (primary + secondary) and were authenticated using clinical information and the American Urological Association (AUA) protocol [29]. Gene expression data was checked for quality by removing the genes (rows) with missing data, such that each retained row had data in at least 30% of samples, using a counts per million (CPM) filter (>0.5) implemented in R [30]. The resulting data set with 18,428 probes was normalized using the trimmed mean of M values (TMM) normalization method and transformed using the Voom module in the Limma package implemented in R [30]. Probe IDs were replaced by annotated gene symbols and names using the Ensembl database. Somatic mutation data was processed to identify the number of genes containing somatic mutations and the number of somatic mutations per gene across samples. This processing step generated a catalogue of 4,779 somatic mutated genes and 6,658 somatic mutation events used in the analysis. A complete list of somatic mutated genes and the number of somatic mutation events per gene is presented in Supplementary Table S2. As noted, DNA methylation data was generated from the same 188 tumour and 52 control samples as the gene expression and somatic mutation data using the Illumina HumanMethylation450 BeadChip [31].
The data was processed using the Illumina DNA methylation data processing and analysis protocols [32,33] implemented in our pipeline [34]. The data was corrected for batch effects and normalized using quantile normalization implemented in the R Package consistent with Illumina DNA methylation data analysis protocol [31][32][33][34][35].
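The expression filter described above can be illustrated with a minimal sketch. It assumes one reading of the stated criterion, namely that a gene is retained when its CPM exceeds 0.5 in at least 30% of samples; the function names and the toy count matrix are hypothetical, and the sketch is written in Python with NumPy rather than the R packages used in the study.

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each sample (column) by its library size."""
    counts = np.asarray(counts, dtype=float)
    library_sizes = counts.sum(axis=0)
    return counts / library_sizes * 1e6

def keep_expressed_genes(counts, cpm_threshold=0.5, min_fraction=0.30):
    """Boolean mask of genes (rows) whose CPM exceeds the threshold
    in at least `min_fraction` of the samples (assumed criterion)."""
    above = cpm(counts) > cpm_threshold
    return above.mean(axis=1) >= min_fraction

# Toy matrix (hypothetical): 3 genes x 4 samples, library size 1,000,000 each.
toy_counts = np.array([
    [500_000, 400_000, 600_000, 500_000],
    [0, 0, 0, 1],                          # CPM > 0.5 in only 1 of 4 samples
    [500_000, 600_000, 400_000, 499_999],
])
mask = keep_expressed_genes(toy_counts)
filtered = toy_counts[mask]
```

In this toy case the second gene is dropped because it passes the CPM threshold in only 25% of samples. TMM normalization and the Voom transformation are not reproduced here.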

Bioinformatics Analysis.
We performed gene expression and DNA methylation data analysis using the pipelines we have developed and implemented in R Bioconductor packages [34]. We performed whole transcriptome analysis comparing gene expression levels between tumour and control samples using the Limma package implemented in R [30] to identify all significant differentially expressed genes distinguishing aggressive tumours from control samples. We used the false discovery rate (FDR) procedure to control for multiple hypothesis testing [36]. Genes were ranked on P values, log2 fold change (LogFC), and FDR. Likewise, we performed whole methylome analysis comparing DNA methylation profiles between tumour and control samples to discover a signature of significantly differentially methylated genes and CpG sites using the Limma package implemented in R [30]. We employed the FDR in the analysis to correct for multiple hypothesis testing [36]. The discovered CpG sites were annotated with gene symbols using the Ensembl BioMart database [37]. We computed the number of CpG sites per gene for significantly differentially methylated genes to get a quantitative assessment of DNA methylation sites per gene. The methylation sites were classified as either hypomethylated (down) or hypermethylated (up) based on the direction of regulation using the Limma package [30]. The genes and CpG sites were then ranked on P values, LogFC, FDR, and number of significantly (P < 0.05) differentially methylated sites. Differentially expressed genes and differentially methylated genes were merged and sorted by gene symbols, expression, and methylation P values to discover a signature of differentially expressed genes which were also differentially methylated.
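As an illustration of FDR control, the following minimal sketch computes Benjamini-Hochberg adjusted P values. It assumes that the FDR procedure cited above is of the Benjamini-Hochberg type; the function name and the example P values are hypothetical and are not taken from the study.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest P value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest P value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted

# Hypothetical P values for four genes.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = adjusted < 0.05
```

Genes would then be ranked on the raw P value, LogFC, and the adjusted value, as described in the text.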
We investigated the impact of DNA methylation on gene expression by plotting expression LogFC against DNA methylation LogFC in a two-way Starburst plot [38], restricted to differentially expressed genes which were also differentially methylated. Genes associated with the disease were further evaluated for the presence of germline and somatic mutations to identify a signature of genes containing germline, somatic, and epigenetic mutations transcriptionally associated with aggressive PCa. Genes containing germline, somatic, and epigenetic alterations associated with aggressive PCa were subjected to network and pathway analysis using the Ingenuity Pathway Analysis (IPA) software package [39] to identify gene regulatory networks and signalling pathways enriched for the three types of mutations. We used gene ontology (GO) [40] analysis implemented in IPA to characterize the genes according to the molecular functions, biological processes, and cellular components in which they are involved.
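The quadrant logic behind a two-way plot of expression LogFC against methylation LogFC can be sketched as follows. This is a simplified illustration that classifies genes by the sign of each LogFC only; the gene names and values are hypothetical, and any magnitude or significance cutoffs applied in the actual Starburst plot are not reproduced.

```python
from collections import Counter

def classify_gene(expr_logfc, meth_logfc):
    """Label a gene by the direction of its expression and methylation change."""
    if expr_logfc == 0 or meth_logfc == 0:
        raise ValueError("only differentially expressed and methylated genes are expected")
    expression = "upregulated" if expr_logfc > 0 else "downregulated"
    methylation = "hypermethylated" if meth_logfc > 0 else "hypomethylated"
    return f"{methylation} and {expression}"

# Hypothetical genes: (expression LogFC, methylation LogFC).
toy_genes = {
    "GENE_A": (1.8, -0.4),
    "GENE_B": (-1.2, -0.3),
    "GENE_C": (2.1, 0.5),
}
categories = {gene: classify_gene(*values) for gene, values in toy_genes.items()}
quadrant_counts = Counter(categories.values())
```

Counting genes per quadrant in this way gives the kind of tallies reported for hypomethylated or hypermethylated genes that are upregulated or downregulated.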

Discovery of Gene Expression and DNA Methylation Signatures.
To discover gene expression and DNA methylation signatures associated with aggressive PCa, we performed whole methylome and whole transcriptome analyses, each comparing tumour to control samples. The results of this investigation are summarized in Figure 2(a). The comparison of DNA methylation profiles between tumour and control samples revealed a signature of 12,426 significantly (P < 0.05) differentially methylated genes associated with aggressive PCa (Figure 2(a)). There was significant variation in patterns of DNA methylation profiles and the number of CpG sites associated with aggressive PCa.
The number of CpG sites per gene ranged from 1 to 480 in tumour samples. The most highly significantly differentially methylated genes were PTPRN2, PRDM16, PCDHGA1, PCDHGA2, PCDHGA3, PCDHGB1, MAD1L1, PCDHGA4, PCDHGB2, PCDHGA5, PCDHGB3, and PCDHGA6 with ≥200 significantly (P < 0.05) differentially methylated CpG sites per gene. A complete list of all significantly differentially methylated genes distinguishing tumour samples from controls along with the number of differentially methylated CpG sites per gene is presented in Supplementary Table S3. The comparison of gene expression levels between tumour and control samples produced a signature of 12,100 significantly (P < 0.05) differentially expressed genes (Figure 2(a)). The most highly significantly differentially expressed genes were SIM2, HOXC6, NKX2-3, DLX1, EPHA10, PCAT7, ARHGEF38, PRR36, and EZH2 (P < 10^-11). A complete list of significantly differentially expressed genes associated with aggressive PCa is presented in Supplementary Table S4.
To address the hypothesis that aberrantly methylated genes associated with aggressive PCa are also aberrantly expressed, we combined the 12,426 significantly (P < 0.05) differentially methylated genes with the 12,100 significantly (P < 0.05) differentially expressed genes and ranked the genes based on expression and CpG site P values. The analysis produced a signature of 6,486 genes containing both alterations (Figure 2(a), intersection). In addition, the investigation produced a signature of 5,614 genes altered in the transcriptome only and a signature of 5,940 genes with only epigenetic alterations associated with aggressive PCa (Figure 2(a)). The discovery of a signature of genes altered in both the transcriptome and the methylome, together with signatures of different sets of genes altered in only one of them, demonstrates the power of integrative analysis using complementary technologies.
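The counts reported above are internally consistent, which can be checked with a short sketch. The set-partition helper and the toy gene sets are hypothetical; only the three totals (12,426, 12,100, and 6,486) come from the text.

```python
def two_set_partition(a, b):
    """Sizes of the three regions of a two-set Venn diagram."""
    a, b = set(a), set(b)
    return {"both": len(a & b), "a_only": len(a - b), "b_only": len(b - a)}

# Counts reported in the text.
n_methylated = 12_426   # differentially methylated genes
n_expressed = 12_100    # differentially expressed genes
n_both = 6_486          # genes with both alterations

expression_only = n_expressed - n_both     # 5,614 genes
methylation_only = n_methylated - n_both   # 5,940 genes

# The same logic on toy (hypothetical) gene sets.
toy = two_set_partition({"A", "B", "C"}, {"B", "C", "D", "E"})
```

Subtracting the intersection from each total reproduces the 5,614 transcriptome-only and 5,940 methylome-only genes shown in Figure 2(a).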
Having discovered the 6,486 aberrantly methylated genes transcriptionally associated with aggressive PCa (Figure 2(a)), we conducted additional investigation on these genes to determine whether DNA methylation affects gene expression. The results showing the effect of aberrant DNA methylation on gene expression are presented in a two-way Starburst plot in Figure 2(b). The investigation revealed that aberrant DNA methylation affects gene expression (Figure 2(b)). We discovered 206 upregulated, 77 downregulated, 152 hypomethylated, and 30 hypermethylated genes (Figure 2(b)). Three genes, HOXC4, HOXC6, and NOX4, were hypomethylated and downregulated, whereas 14 genes, CYP27A1, NRK, EMX2OS, C2orf88, PRKCB, WFDC2, NRG2, MCF2, COL4A6, PROM1, AOX1, HIF3A, CYP11A1, and GATA3, were hypomethylated and upregulated. The gene SLC2A9 was hypermethylated and upregulated. The results confirmed our hypothesis that aberrant DNA methylation affects gene expression at varying levels.
To determine the extent of epigenomic alterations for the 6,486 genes containing both alterations, we computed the P values for the most variable CpG sites and the number of CpG sites across tumour samples for each gene. Genes were ranked according to the number of CpG sites in the gene. The results showing the top 23 most highly significantly differentially methylated genes with ≥100 CpG sites per gene are presented in Table 1. Also presented in the table are probes showing the most highly significant CpG sites, their estimates of P values, number of CpG sites per gene, and estimates of gene expression P values.
The analysis revealed significant variation in patterns of DNA methylation profiles among the genes (Table 1). The number of CpG sites per gene ranged from 1 to 480. The genes PTPRN2, PRDM16, PCDHGA1, PCDHGA2, PCDHGB1, MAD1L1, PCDHGA4, PCDHGB2, PCDHGA5, PCDHGB3, PCDHGA6, RPTOR, COL11A2, KCNQ1, PCDHA1, PCDHGA9, PCDHGB6, AGAP1, ATP11A, PCDHGA10, PCDHGB7, MCF2L, and CACNA1HA had the most highly significantly differentially methylated CpG sites and the highest number of CpG sites per gene (≥100 CpG sites) (Table 1). The 23 genes in Table 1 included PTPRN2, PCDHGB1, ATP11A, and CACNA1HA, which have been experimentally confirmed to be associated with aggressive PCa [41][42][43][44]. A complete list of all the 6,486 genes containing both genomic and epigenomic alterations along with the number of methylation sites per gene is presented in Supplementary Table S5. Taken together, the results of these investigations show that a subset of genes that are transcriptionally associated with tumours is aberrantly methylated and that aberrant methylation affects gene expression in aggressive PCa.

Discovery of Somatic Mutation and DNA Methylation Signatures.
Although development and progression of aggressive PCa tumours are driven by acquired somatic driver mutations [3], enduring epigenetic landmarks define the tumour microenvironment [45]. Therefore, our next step in this investigation was to test the hypothesis that aberrantly methylated genes transcriptionally associated with aggressive PCa are somatic mutated. We addressed this hypothesis by integrating somatic mutation information with epigenomic and gene expression data. Specifically, we evaluated aberrantly methylated genes transcriptionally associated with aggressive PCa for the presence of somatic mutations using the 4,779 genes containing somatic mutations. The results of this investigation are presented in a three-way Venn diagram shown in Figure 3. The analysis revealed a signature of 1,702 genes containing all three alterations (Figure 3). In addition, the analysis produced a signature of 796 somatic mutated genes transcriptionally associated with the disease and a signature of 1,264 somatic mutated genes aberrantly methylated in aggressive PCa (Figure 3). A total of 1,017 somatic mutated genes were neither aberrantly methylated nor transcriptionally associated with the disease (Figure 3). A complete list of all the 1,702 somatic mutated genes aberrantly methylated and transcriptionally associated with aggressive tumours is presented in Supplementary Table S6. A complete list of the 796 somatic mutated genes transcriptionally associated with the disease and a complete list of the 1,264 somatic mutated genes aberrantly methylated in aggressive PCa are presented in Supplementary Table S7.
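The way the somatic mutated genes split across the regions of the three-way Venn diagram can be sketched as below. The helper function and toy gene sets are hypothetical. The reported counts are read here as mutually exclusive regions, an interpretation supported by the fact that they sum to the 4,779 somatic mutated genes.

```python
def somatic_partition(somatic, methylated, expressed):
    """Split somatic mutated genes into mutually exclusive Venn regions."""
    somatic, methylated, expressed = set(somatic), set(methylated), set(expressed)
    return {
        "all_three": len(somatic & methylated & expressed),
        "somatic_and_expressed_only": len((somatic & expressed) - methylated),
        "somatic_and_methylated_only": len((somatic & methylated) - expressed),
        "somatic_only": len(somatic - methylated - expressed),
    }

# Toy (hypothetical) gene sets.
toy = somatic_partition(
    somatic={"G1", "G2", "G3", "G4"},
    methylated={"G1", "G3", "G5"},
    expressed={"G1", "G2", "G6"},
)

# Counts reported in the text for the somatic mutated genes.
reported = {
    "all_three": 1_702,
    "somatic_and_expressed_only": 796,
    "somatic_and_methylated_only": 1_264,
    "somatic_only": 1_017,
}
total_somatic = sum(reported.values())
```

The four reported regions add up to 4,779, matching the size of the somatic mutation catalogue described in the data section.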
To determine the extent of somatic and epigenetic alterations and whether the most highly mutated genes are also the most highly epigenetically altered, or vice versa, we evaluated the 1,702 genes containing all three alterations (Figure 3). The results showing the top 45 most highly somatic mutated genes (>3 somatic events per gene) are presented in Table 2. Also presented in Table 2 are the most highly significant CpG sites and associated P values along with the number of CpG sites per gene and gene expression P values.
There was significant variation in the distribution of somatic mutations and methylation sites per gene. The most highly somatic mutated genes were SPOP, FOXA1, LRP1B, OBSCN, CSMD3, FREM2, AHNAK, PLCB4, SYNE1, PCDH18, CDH23, DCHS2, VPS13D, MACF1, PTPRD, HFM1, AHNAK2, CTNNB1, and SACS (Table 2). Further evaluation of the results revealed that not all highly somatic mutated genes were highly differentially methylated (Table 2). The most highly differentially methylated genes were SPOP, OBSCN, CSMD3, AHNAK, SYNE1, CDH23, DCHS2, VPS13D, MACF1, PTPRD, TACC2, GRIN2A, PCDHGA9, SALL1, NPAT, DST, CACNA1C, ZFHX3, PCDHA1, EPHA3, and PTEN (Table 2). Conversely, not all of the most highly differentially methylated genes were highly somatic mutated. The observed significant variation in DNA methylation can be explained in part by the phenotypic heterogeneity inherent in aggressive PCa [8]. Overall, the investigation revealed that a subset of aberrantly methylated genes is somatic mutated and that the distribution of somatic and epigenetic alterations in these genes varies significantly. The discovery of somatic mutated genes which were also epigenetically altered suggests that some of the genes driving tumourigenesis may be under genetic and epigenetic control.

Discovery of Germline, Somatic, and Epigenetic Mutation Signatures.
As noted earlier in this report and consistent with other reports [2][3][4][5][6][7][8][9], the development and progression of aggressive PCa involve three separate, but related, genomes: the germline, the somatic (tumour) genome, and the epigenome. Therefore, optimal integration of omics data should include all three genomes and the phenotype they regulate. Thus, to address the hypothesis that somatic and epigenetic mutated genes associated with aggressive PCa harbour germline mutations and to infer the potential causal association between genetic susceptibility and aggressive PCa, we evaluated the 401 genes containing germline mutations for their association with aggressive PCa using gene expression information and for the presence of somatic mutations and epigenetic alterations.
The results of this investigation are presented in a four-way Venn diagram in Figure 4. Out of the 401 genes containing germline mutations evaluated, 41 genes contained germline, somatic, and epigenetic alterations and were transcriptionally associated with aggressive tumours. In addition, we discovered 202 genes transcriptionally associated with aggressive PCa, 223 genes aberrantly methylated, 122 genes somatic mutated, and 97 aberrantly methylated genes transcriptionally associated with the disease (Figure 4). A subset of 92 genes was altered only in the germline (Figure 4). Overall, the investigation confirmed our hypothesis that genes containing germline mutations are associated with aggressive PCa and harbour both somatic and epigenetic alterations. The discovery of genes altered only in the germline can be explained partially by the differences in population cohorts from which GWAS and sequence data were derived. GWAS discoveries are inherently heterogeneous and derived from heterogeneous populations, while gene expression can be population and time specific. Under such conditions, the observed outcome is expected.
Figure 3: Three-way Venn diagram showing the results of somatic mutated, aberrantly DNA methylated, differentially expressed genes associated with aggressive PCa discovered through analysis and integration of somatic mutation, DNA methylation, and gene expression data.

BioMed Research International
In addition to evaluating the distribution of genes containing germline, somatic, and epigenetic mutations, we performed a quantitative assessment on the discovered gene signatures to evaluate the frequency distribution and extent of germline, somatic, and epigenetic mutation events among the 41 genes containing all three alterations. The results of this investigation are presented in Table 3.
There was significant variation in the distribution of germline, somatic, and epigenomic alterations (Table 3). The number of somatic and germline mutations was lower than the number of CpG sites in each gene (Table 3). Interestingly, the 41-gene signature included the genes BRCA1, KLK3, KLK2, PDLIM5, and ITGA6, containing genetic variants reported to be directly associated with aggressive PCa [4][46][47][48], and the genes AMIGO2, ATF71P, BRCA1, KLK2, KLK3, MDM4, and PDLIM5 used in gene panels for PCa screening and assessing disease prognosis [46][47][48]. Overall, the investigation confirmed our hypothesis that somatic and epigenetic mutated genes harbour germline mutations and provides some foundational knowledge about the potential link between genetic susceptibility variants and tumourigenesis. The discovery of epigenetic mutated genes without germline mutations suggests that part of the missing variation not explained by GWAS may be explained by DNA methylation.

Discovery of Altered Molecular Networks and Signalling Pathways.
The objective of this investigation was to elucidate the genomic and epigenomic interaction landscape of aggressive PCa. The results in preceding sections have shown that genes genetically altered in the tumour genome are aberrantly methylated and that somatic and epigenetic mutated genes harbour germline mutations. To gain insights into the possible oncogenic interactions between genetic and epigenetic changes, we performed network and pathway analysis. Our working hypothesis was that aggressive PCa originates from a complex interplay between genomic (involving both germline and somatic mutations) and epigenomic alterations. We further hypothesized that these complex arrays of interacting genomic and epigenomic factors affect gene expression, molecular networks, and signalling pathways which in turn drive aggressive PCa. We addressed these hypotheses using network and pathway analyses to identify molecular networks and signalling pathways enriched for genetic and epigenetic alterations and to characterize their functional connectivity. For this analysis, we used the 41 genes containing germline, somatic, and epigenetic mutations. Because genes containing germline mutations explain only a small proportion of the phenotypic variation and their causal association with the disease has not been established, we also included the most highly somatic and epigenetic mutated genes without germline mutations.
The results of network analysis are presented in Figure 5. Network analysis produced 19 molecular networks with Z-scores ranging from 2 to 51. The analysis revealed functionally related genes containing germline, somatic, and epigenomic alterations interacting in gene regulatory networks (Figure 5).
The discovered networks contained genes predicted to be involved in cancer, cell-to-cell signalling and interaction, organismal injury and abnormalities, reproductive system disease, cellular assembly and organization, amino acid metabolism, posttranslational modification, immunological disease, DNA damage and repair, and hereditary disorder. The analysis also produced molecular networks containing genes predicted to be involved in cell cycle, cell death and survival, cellular development, organ development, and reproductive system development and function. Among the genes revealed by network analysis were [...] (Figure 5) [4]. Overall, the investigation revealed molecular networks enriched for germline, somatic, and epigenetic mutations involved in aggressive PCa. The investigation confirmed our working hypothesis that aggressive PCa is an emergent property of molecular networks of functionally related genes containing germline mutations, somatic mutations, and epigenetic alterations.
Pathway analysis revealed 96 signalling pathways enriched for germline, somatic, and epigenetic mutations. The most highly significant signalling pathways are presented in Figure 6. Also presented in the figure is the threshold P value, marked by the yellow line, above which the pathways were declared significant following correction for multiple hypothesis testing. The investigation revealed the STAT3, IL-15, PTEN, axonal guidance, cancer, FAT10 cancer, RAR activation, EGF, androgen, NF-κB, ATM, PI3K, and P53 signalling pathways (Figure 6). In addition, the investigation revealed the cell cycle: G1/S checkpoint regulation; IL-8; cell cycle: G2/M DNA damage checkpoint regulation; PI3K/AKT; and PCa signalling pathways (Figure 6). Overall, the results of the investigation confirmed our working hypothesis that oncogenic interactions among genes containing genetic and epigenetic mutations affect signalling pathways which in turn drive aggressive PCa.
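Pathway enrichment of the kind ranked by -log(P value) in Figure 6 can be illustrated with a one-sided overlap test. The sketch below uses the hypergeometric tail probability, which is equivalent to a right-tailed Fisher's exact test; that IPA computes its P values this way is an assumption here, and the function name and all numbers in the example are hypothetical.

```python
from math import comb, log10

def enrichment_p_value(universe, pathway_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn at random from a
    universe containing `pathway_size` pathway genes (hypergeometric tail)."""
    upper = min(pathway_size, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 20,000 background genes, a 100-gene pathway,
# 300 input genes, 8 of which fall in the pathway.
p = enrichment_p_value(universe=20_000, pathway_size=100, selected=300, overlap=8)
score = -log10(p)  # pathways are ranked on this scale
```

A larger overlap than expected by chance gives a smaller P value and therefore a longer bar on the -log(P value) axis.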
In summary, this integrative approach combining multiomics data revealed that genomic and epigenomic alterations in the germline and tumour genomes can lead to measurable changes that could guide elucidation of the genomic-epigenomic landscape in aggressive PCa. This interdisciplinary integrated approach establishes putative functional bridges between germline, somatic (tumour), and epigenetic alterations and the pathways they control. These observations suggest that genes and pathways driving aggressive PCa are under genetic and epigenetic control and that integrative analysis combining data from complementary technologies provides a unified and optimal approach to the discovery of potential clinically actionable biomarkers and targets for the development of novel therapeutics in aggressive PCa.

Discussion
The last decade has witnessed remarkable progress in the discovery and development of comprehensive catalogues of germline genetic susceptibility variants associated with an increased risk of developing PCa using GWAS [4,5,10,17]. In parallel to large-scale genotyping, next-generation sequencing has generated massive amounts of genomic and epigenomic data from tumour genomes [15,16]. Traditionally, genotyping and sequencing have been conducted as separate research endeavours. Here, we combined information on germline, somatic, and epigenetic alterations using gene expression data as the intermediate phenotype to elucidate the genomic-epigenomic interaction landscape of aggressive PCa. The investigation revealed functionally related germline, somatic, and epigenetic mutated genes associated with aggressive tumours. The investigation further revealed molecular networks and signalling pathways enriched for genetic and epigenetic mutations and that DNA methylation affects gene expression. To the best of our knowledge, this is the first study to comprehensively integrate information on germline, somatic, and epigenetic mutations at the gene, net-work, and pathway levels using gene expression as the intermediate phenotype. We summarize the clinical significance and translational aspects of this investigation as follows. First, the discovery of genes such as KLK3 and AR altered in germline, somatic (tumour), epigenome, and the transcriptome, coupled with the findings that aberrant DNA methylation affects gene expression demonstrates that integrative analysis combining information from complimentary technologies provides a unified approach for the discovery of potential clinically actionable biomarkers in aggressive PCa. Indeed, aberrant DNA methylation in PCa has been reported [49][50][51]. 
The novel and innovative aspects of our investigation are that it combines diverse omics data, assesses the impact of DNA methylation on gene expression, and establishes putative functional bridges between germline, somatic, and epigenetic alterations and the pathways they control in aggressive PCa.
Figure 6: Signalling pathways enriched for germline, somatic, and epigenetic mutations in aggressive PCa. The y-axis shows the pathway names, and the x-axis shows the -log(P-values) on which pathways were ranked and selected. The yellow line indicates the threshold level, expressed as the -log(P-value), above which a signalling pathway was declared significant.
Second, the discovery of genes such as BRCA1, AR, ATM, and KLK3 containing germline, somatic, and epigenetic mutations is of particular interest. This reveals a potential link between genetic susceptibility and tumourigenesis. Importantly, while tumour development and progression may be driven by acquired somatic driver mutations in these genes, the actions of somatic mutations may be primed by germline mutations, and enduring epigenetic landmarks may be defining the tumour microenvironment [45]. Moreover, epigenetic alterations in DNA repair genes such as BRCA1 and ATM discovered in this investigation could cause genome instability and silencing of tumour suppressor genes, such as P53, leading to carcinogenesis [52][53][54]. Third, the discovery of a signature of 41 genes containing germline, somatic, and epigenetic alterations is of particular interest. To date, risk prediction algorithms such as PRSs use germline mutations mapped to genes used in this investigation [11][12][13][14]. However, the causal association between genetic susceptibility variants used in computing PRSs and aggressive PCa has not been established. Moreover, the genetic susceptibility variants reported thus far explain only a small proportion of the phenotypic variation, which raises the question of "where is the missing heritability?". Incorporation of somatic mutation, epigenetic, and gene expression data as demonstrated here has the potential to address some of the limitations of current risk prediction models and could address the question of missing variation not accounted for by risk variants [55,56].
This could be achieved by leveraging germline mutation information and integrating it with somatic and epigenetic mutation data, using gene expression data as demonstrated here, to develop more robust and accurate genetic risk prediction models that enhance precision medicine and precision prevention [57]. This is an attractive approach because both germline and epigenomic variations are heritable and affect gene expression variation [58][59][60].
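For readers unfamiliar with how a PRS is typically computed, the following is a minimal sketch of the conventional weighted-sum form, in which each variant's risk-allele dosage is multiplied by an effect-size weight. It is illustrative only and is not the model used or proposed in this study; the function name, variant identifiers, and weights are hypothetical placeholders.

```python
def polygenic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele dosages over the variants in `weights`.

    dosages: dict mapping variant ID -> number of risk alleles carried (0, 1, or 2)
    weights: dict mapping variant ID -> effect-size weight (e.g. a log odds ratio)
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"missing genotypes for: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)


# Hypothetical variants and weights, invented for illustration (not study data).
weights = {"variant_A": 0.10, "variant_B": 0.25, "variant_C": 0.05}
dosages = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

score = polygenic_risk_score(dosages, weights)
print(round(score, 2))
```

Extending such a score with somatic, epigenetic, or expression features would require additional modelling choices that this sketch does not attempt to represent.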
Fourth, the discovery of key signalling pathways implicated in aggressive PCa, including the STAT3, PTEN, molecular mechanisms of cancer, AR, ATM, PI3K/AKT, PCa, and P53 signalling pathways [61,62], was intriguing. It demonstrates that the signalling pathways driving aggressive PCa are likely under genetic and epigenetic control. Perhaps more importantly, these findings provide a rational basis for the discovery of potential targets critical to the development of novel therapeutics for aggressive PCa. This is noteworthy because, currently, the AR and PI3K signalling pathways are used as therapeutic targets in aggressive PCa, as androgen-deprivation therapy (ADT) is one of the most effective therapeutic modalities [61,62]. Overall, this comprehensive multidisciplinary approach to elucidating the genomic-epigenomic interaction landscape of aggressive PCa provides novel insights into the power of integrative analysis combining diverse omics data for the discovery of genetic and epigenetic drivers of aggressive PCa and into how they interact and cooperate to drive the clinical phenotypes.
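The pathway ranking summarized in Figure 6 can be illustrated with a small sketch that converts enrichment P-values to -log(P) scores, keeps those above a significance threshold, and sorts them. The base-10 logarithm, the 0.05 cutoff, the function name, and the pathway labels and P-values below are all assumptions made for illustration; they are not values reported in this study.

```python
import math


def rank_pathways(p_values, alpha=0.05):
    """Rank pathways by -log10(P) and keep those above the -log10(alpha) threshold.

    p_values: dict mapping pathway name -> enrichment P-value (0 < P <= 1)
    Returns (ranked, threshold), where ranked is a list of (name, score) tuples
    sorted from most to least significant.
    """
    threshold = -math.log10(alpha)
    scored = [(name, -math.log10(p)) for name, p in p_values.items()]
    significant = [(name, score) for name, score in scored if score > threshold]
    ranked = sorted(significant, key=lambda item: item[1], reverse=True)
    return ranked, threshold


# Placeholder pathway labels and P-values, invented for illustration only.
example_p_values = {"pathway_A": 1e-6, "pathway_B": 0.2, "pathway_C": 0.001}

ranked, threshold = rank_pathways(example_p_values)
for name, score in ranked:
    print(f"{name}\t{score:.2f}")
```

In this toy input, `pathway_B` falls below the threshold and is dropped, which mirrors how the threshold line in an enrichment bar plot separates significant from non-significant pathways.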

Conclusions
The investigation revealed DNA methylation and gene expression signatures associated with aggressive PCa and showed that aberrant DNA methylation affects gene expression. It further revealed that genes containing germline and somatic mutations are aberrantly methylated and transcriptionally associated with aggressive PCa, and that aggressive PCa is an emergent property of gene regulatory networks and signalling pathways under genetic and epigenetic control. Integrative analysis combining genomic and epigenomic data using gene expression as the intermediate phenotype is a powerful approach for elucidating the genomic-epigenomic interaction landscape in aggressive PCa and for discovering potential clinically actionable biomarkers and targets for the development of novel therapeutics.

Data Availability
Original clinical information, mutation, gene expression, and DNA methylation data used in this study were downloaded from The Cancer Genome Atlas (TCGA), described at https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga, via the Genomic Data Commons (GDC) at https://gdc.cancer.gov/. Germline mutations were derived from the literature (Supplementary Table SA) and from the NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog) at https://www.ebi.ac.uk/gwas/downloads/summary-statistics. Additional data are shared through the supplementary tables referenced in the manuscript and provided as supplementary material to this report.

Disclosure
The views expressed in this manuscript are those of the authors and do not represent those of the funding sources or agencies.