An integrative multi-omics analysis to identify candidate DNA methylation biomarkers related to prostate cancer risk

Wu, Lang; Yang, Yaohua; Guo, Xingyi; Shu, Xiao-Ou; Cai, Qiuyin; Shu, Xiang; Li, Bingshan; Tao, Ran; Wu, Chong; Nikas, Jason B.; Sun, Yanfa; Zhu, Jingjing; Roobol, Monique J.; Giles, Graham G.; Brenner, Hermann; John, Esther M.; Clements, Judith; Grindedal, Eli Marie; Park, Jong Y.; Stanford, Janet L.; Kote-Jarai, Zsofia; Haiman, Christopher A.; Eeles, Rosalind A.; Zheng, Wei; Long, Jirong

doi:10.1038/s41467-020-17673-9

Article
Open access
Published: 06 August 2020

Lang Wu¹^na1,
Yaohua Yang²^na1,
Xingyi Guo²,
Xiao-Ou Shu²,
Qiuyin Cai²,
Xiang Shu²,
Bingshan Li^3,4 (ORCID: orcid.org/0000-0003-2129-168X),
Ran Tao^4,5,
Chong Wu⁶ (ORCID: orcid.org/0000-0002-8400-1785),
Jason B. Nikas⁷ (ORCID: orcid.org/0000-0001-9703-0422),
Yanfa Sun^1,8,
Jingjing Zhu¹,
Monique J. Roobol⁹ (ORCID: orcid.org/0000-0001-6967-1708),
Graham G. Giles^10,11 (ORCID: orcid.org/0000-0003-4946-9099),
Hermann Brenner^12,13,14,
Esther M. John¹⁵,
Judith Clements^16,17,
Eli Marie Grindedal¹⁸,
Jong Y. Park¹⁹ (ORCID: orcid.org/0000-0002-6384-6447),
Janet L. Stanford^20,21,
Zsofia Kote-Jarai²²,
Christopher A. Haiman²³,
Rosalind A. Eeles²² (ORCID: orcid.org/0000-0002-3698-6241),
Wei Zheng² (ORCID: orcid.org/0000-0003-1226-070X),
Jirong Long²,
The PRACTICAL consortium,
CRUK Consortium,
BPC3 Consortium,
CAPS Consortium &
PEGASUS Consortium

Nature Communications volume 11, Article number: 3905 (2020)

Abstract

It remains elusive whether some of the associations identified in genome-wide association studies of prostate cancer (PrCa) may be due to regulatory effects of genetic variants on CpG sites, which may further influence expression of PrCa target genes. To search for CpG sites associated with PrCa risk, here we establish genetic models to predict methylation (N = 1,595) and conduct association analyses with PrCa risk (79,194 cases and 61,112 controls). We identify 759 CpG sites showing an association, including 15 located at novel loci. Among those 759 CpG sites, methylation of 42 is associated with expression of 28 adjacent genes. Among 22 genes, 18 show an association with PrCa risk. Overall, 25 CpG sites show consistent association directions for the methylation-gene expression-PrCa pathway. We identify DNA methylation biomarkers associated with PrCa, and our findings suggest that specific CpG sites may influence PrCa via regulating expression of candidate PrCa target genes.

Introduction

Prostate cancer (PrCa) is the second most frequently diagnosed malignancy among men and the fifth leading cause of cancer death worldwide¹. Its survival rate is relatively high for localized stage disease, but decreases substantially for metastatic disease². Effective strategies are critical for risk assessment, screening, and early detection of PrCa, aimed at decreasing its public health burden. Although prostate-specific antigen (PSA) has demonstrated efficacy for detecting PrCa early^3,4, no clear PSA cutoff point offers both high sensitivity and high specificity^5,6,7. The benefit of PSA screening for reducing PrCa mortality remains controversial^8,9,10. Furthermore, PSA screening has adverse effects, such as overdiagnosis¹¹. Therefore, additional effective biomarkers are needed for risk assessment and early detection of PrCa.

Aligned with findings of a crucial role for DNA methylation in PrCa development¹², research has identified several methylation markers to be potentially associated with PrCa risk, such as methylation at GSTP1, CDKN2A, DNMT3B, SCGB3A1, and HIF3A^{12,13,14,15,16}. However, most prior studies have assessed only a couple of candidates. Recent emerging studies profiling genome-wide methylation usually included a relatively small number of subjects¹⁷, resulting in inadequate power for the identification of associated methylation biomarkers. Besides these limitations, there are a number of biases commonly encountered in conventional epidemiologic studies, including selection bias, uncontrolled confounding, and reverse causation, that make it difficult to determine whether the identified associated markers are causally associated with PrCa.

One strategy to reduce some of these biases is to use genetic variants to develop an instrument to assess the association between DNA methylation and PrCa. Such an approach is based on the principle of the random assortment of alleles from parents to offspring during gamete formation, and thus the genetically determined proportion of DNA methylation levels should, in principle, be less susceptible to selection bias and reverse causation. Research has shown that a large portion of CpG sites have high heritability^18,19. Genome-wide association studies (GWAS) have also identified a large number of genetic loci associated with DNA methylation levels^20,21. Many of these genetic variants could potentially serve as strong instrumental variables for evaluating associations between DNA methylation and PrCa risk in an adequately powered study.

Besides a potential utility in improving PrCa risk assessment, the identification of promising DNA methylation markers using a design of genetic instruments may also contribute to understanding of the genetics and etiology of PrCa. Epidemiological research provides strong support for a genetic predisposition to PrCa^22,23. To date, GWAS have identified ~150 genetic loci for PrCa^24,25,26. However, together these variants explain <30% of the familial relative risk, and the underlying biological mechanisms for a majority of the identified loci remain unclear²⁴. Recently, we performed a large transcriptome-wide association study (TWAS) of PrCa, in which we identified multiple associations between genetically predicted gene expression and PrCa risk²⁷. Interestingly, many of the associated genes were identified to be candidate target genes of GWAS-identified risk SNPs²⁷. Aligned with the recognized role of DNA methylation in regulating gene expression, we hypothesize that some GWAS-identified risk SNPs may regulate expression of their target genes through influencing DNA methylation levels. In this study, we perform a large integrative multi-omics analysis involving data of genomics, methylomics, and transcriptomics aiming to uncover novel CpG sites and genes that may contribute to PrCa development.

Results

DNA methylation prediction models

Using FHS data, we were able to build DNA methylation prediction models for 223,959 CpG sites, of which 81,432 showed a prediction performance (R²) of at least 0.01 (≥10% correlation between predicted and measured DNA methylation levels). For 77,243 of those CpG sites, there were no SNPs within the probe-binding site. Interestingly, there tended to be weak positive correlations between methylation prediction model performance and the number of input variants within the 2-Mb window of each CpG site (Pearson correlation coefficient 0.03, P = 1.60 × 10⁻¹³; Spearman correlation coefficient 0.02, P = 1.43 × 10⁻⁶). We further applied these 77,243 models to the genetic data in WHI and evaluated their performance by comparing predicted methylation levels with measured levels. Overall, DNA methylation that could be predicted well in FHS also tended to be predicted well in WHI (a correlation coefficient of 0.96 for R² between the two datasets; Supplementary Fig. 1). These 77,243 CpG sites were selected for analyses of the associations between predicted DNA methylation and PrCa risk.

Associations of genetically predicted methylation with PrCa

Of the 77,243 CpG sites tested, genetically predicted DNA methylation of 759 sites, located at 82 genomic loci, was associated with PrCa risk after Bonferroni correction (P ≤ 6.47 × 10⁻⁷) (Table 1; Supplementary Table 1 and Supplementary Data 1; Manhattan plot in Fig. 1). These included 15 CpG sites located at 10 genomic loci that were more than 500 kb away from any PrCa risk variant identified in GWAS or fine-mapping studies (Table 1). An association between a higher DNA methylation level and increased PrCa risk was detected for cg18800143, cg07645299, cg12627844, cg16397176, cg11562153, cg13866093, cg00444740, cg20100049, cg22370235, cg04739953, cg01715842, and cg23397578. Conversely, an inverse association between methylation level and PrCa risk was identified for cg24388424, cg06836406, and cg13230424. Of these 15 CpG sites at novel loci, after conditioning on the nearest PrCa risk variant, the associations of genetically predicted DNA methylation levels for four CpG sites (cg18800143, cg16397176, cg06836406, and cg13230424) remained at P ≤ 6.47 × 10⁻⁷ (Table 1).

Table 1 Fifteen novel methylation-prostate cancer associations for CpG sites located at genomic loci at least 500 kb away from any known prostate cancer risk variant^a.

**Fig. 1: A Manhattan plot of the association results from the prostate cancer methylome-wide association study using S-PrediXcan.**

For the remaining 744 CpG sites located at known PrCa risk loci (Supplementary Table 1 and Supplementary Data 1), after conditioning on the adjacent PrCa risk SNP, an association at P ≤ 6.47 × 10⁻⁷ persisted for 63 CpG sites (Supplementary Table 1). This suggests that the associations of these 63 CpG sites with PrCa risk are potentially independent of the PrCa risk SNPs identified in GWAS or fine-mapping studies (Supplementary Table 1). For the other 681 CpG sites, the associations with PrCa risk became weaker, if not completely attenuated, after conditioning on the PrCa risk SNP (Supplementary Data 1). This attenuation may arise because (1) the previously identified associations of risk SNPs with PrCa at these loci are mediated through DNA methylation of the CpG sites identified in the current study, or (2) confounding effects are present (Supplementary Data 1). We estimated that the 15 CpG sites at novel loci and the 63 CpG sites independent of PrCa risk SNPs could explain 0.69% of the familial relative risk of PrCa (methods in Supplementary Information).

Based on annotation using ANNOVAR, the “exonic” and “ncRNA exonic” regions were substantially over-represented among the identified PrCa-associated CpG sites when compared with the overall tested 77,243 CpG sites (chi-square tests: 15.28% versus 7.44%, P = 6.36 × 10⁻¹⁶; 5.53% versus 2.42%, P = 6.37 × 10⁻⁸) (Supplementary Table 2). Conversely, the proportion in the “intergenic” region was substantially lower (chi-square test: 15.42% versus 25.10%, P = 1.13 × 10⁻⁹) (Supplementary Table 2).
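A comparison of two proportions like the ones above can be sketched as a 2×2 chi-square test with one degree of freedom. This is only an illustration: the function name `chi_square_2x2` is hypothetical, the counts below are back-calculated from the reported percentages (for example, 15.28% of 759 is roughly 116), and both the choice of comparison group and the use of Yates' continuity correction are assumptions, since the text does not describe how the test was set up.

```python
import math

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    chi2 = 0.0
    for obs, row, col in ((a, row1, col1), (b, row1, col2),
                          (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        diff = abs(obs - expected)
        if yates:
            diff = max(diff - 0.5, 0.0)  # continuity correction (assumed)
        chi2 += diff * diff / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Approximate counts back-calculated from the reported percentages (assumption):
# exonic vs non-exonic among the 759 associated sites and the 77,243 tested sites
chi2, p = chi_square_2x2(116, 759 - 116, 5747, 77243 - 5747)
```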

In an annotation of the 759 PrCa-associated CpG sites using eFORGE v1.2, their positions tended to overlap with regions containing lysine 4 mono-methylated H3 histone (H3K4me1) markers across 38 of 39 cell types included in the consolidated Roadmap Epigenomics Project, including blood tissues (Supplementary Fig. 2). This suggests that the identified CpG sites associated with PrCa risk may be enriched in enhancers and may be involved in transcriptional activation. We also observed significant enrichment of the associated CpG sites at positions of genes encoding transcription factors (P = 0.001).

For the identified 759 CpG sites showing an association in the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia, we further evaluated their associations using independent UK Biobank data. In this analysis with far fewer PrCa cases, 554 CpG sites (73%) also showed an association at P < 0.05 with the same direction of effect (Supplementary Data 2). This suggests that the CpG-PrCa risk associations identified in the main analyses using data of the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia are quite robust. We performed downstream analyses focusing on these 759 CpG sites.

Potential target genes of the PrCa-associated CpG sites

Of the 759 PrCa-associated CpG sites, association analyses were performed for 689 CpG site–gene pairs, involving 613 CpG sites and 244 flanking genes. Overall, associations at a false discovery rate (FDR) < 0.05 were observed for methylation levels of 42 CpG sites with expression of 28 neighboring genes in blood tissue (Supplementary Table 3). Interestingly, we also observed several associations between DNA methylation and expression of genes encoding transcription factors at P < 0.05 (Supplementary Table 4). In the TCGA dataset of tumor-adjacent normal prostate tissue, albeit with a quite limited sample size (n = 34), 26 of the 37 associations that could be assessed showed the same direction of effect as in blood tissue (Supplementary Table 5). Among them, 11 showed statistical significance at P < 0.05 in this small dataset (Supplementary Table 5).

Associations of potential target genes with PrCa risk

Of the 28 potential target genes of the identified CpG sites based on blood tissue analyses, blood tissue gene expression prediction models were built for 22 genes, and prostate tissue prediction models were built for 14 genes with a prediction performance (R²) of at least 0.01 (≥10% correlation). Using the S-PrediXcan method, we evaluated associations between the genetically predicted expression of these genes and PrCa risk. Of the 22 genes with blood tissue prediction models built, 18 demonstrated an association at FDR < 0.05 (Table 2). For the 12 genes that also had prostate tissue prediction models, nine showed an association at P < 0.05 (Table 2). For all of these nine genes except VPS53, the direction of association was consistent between predicted expression in blood and in prostate tissue. Of the two other genes with models built for prostate tissue only, HLA-DOB showed a significant association with PrCa risk (beta = 0.068, P = 2.65 × 10⁻⁴), and C11orf21 did not (P = 0.21).

Table 2 Associations between genetically predicted mRNA expression levels of candidate target genes of identified CpG sites and prostate cancer risk.

Associations showing consistent direction of effect

There were 25 CpG sites and 14 genes with consistent directions of association for the DNA methylation–gene expression–PrCa pathway (Table 3). For example, the CpG site cg20240347 is located upstream of MDM4, and its DNA methylation level was positively associated with expression of MDM4 (coefficient 0.21; P = 1.69 × 10⁻¹⁴). Genetically predicted expression of MDM4 was inversely associated with PrCa risk (OR = 0.36; P = 1.55 × 10⁻¹⁹), and genetically predicted DNA methylation of cg20240347 was likewise associated with a decreased PrCa risk (OR = 0.93; P = 2.61 × 10⁻¹⁹). Interestingly, MDM4 has been previously implicated as a potential target gene responsible for the association signal of index SNP rs4245739 in GWAS²⁵, and in our recent TWAS²⁷. Our results highlight a possible role of the CpG site cg20240347 in the biological mechanism underlying the link between MDM4 and PrCa. Whether DNA methylation of the CpG sites at the corresponding loci of the genes in Table 3 plays a role in PrCa etiology through regulating expression of these genes warrants further investigation.

Ingenuity pathway analysis (IPA)²⁸ suggested potential enrichment of cancer-related functions for the 14 implicated genes (Supplementary Table 6). The top canonical pathways identified included cell cycle (P = 0.033) and cancer drug resistance (P = 0.039). It is worth noting that, based on the predicted DNA methylation–PrCa risk, DNA methylation–gene expression, and predicted gene expression–PrCa risk results, we also observed six CpG sites and four genes (VAMP8, C4B, BAIAP2L1, and NCOA4) with inconsistent directions of association for the DNA methylation–gene expression–PrCa pathway (Supplementary Table 7). Of these genes, NCOA4, BAIAP2L1, and VAMP8 are candidate PrCa susceptibility genes identified in earlier TWAS^27,29,30. Future work is needed to better understand these associations.

Table 3 Associations showing consistent direction of effect for the methylation–gene expression–prostate cancer risk pathway.

Discussion

This is the first large-scale study to comprehensively evaluate associations of genetically predicted DNA methylation levels with PrCa risk. We identified 759 CpG sites whose predicted DNA methylation levels demonstrated an association after Bonferroni correction, including 15 located at novel loci. Of the 744 CpG sites located at known PrCa risk loci, 63 showed an association even after conditioning on adjacent PrCa risk SNPs. In additional analyses involving gene expression, we observed some evidence suggesting that 25 CpG sites may influence PrCa risk via regulating expression of 14 candidate PrCa target genes. Our study provides substantial information to improve the understanding of the genetics and etiology of PrCa, and it also identifies multiple CpG sites as potential biomarkers for risk assessment of PrCa, the second most frequently diagnosed malignancy among men worldwide.

For processing DNA methylation data for genetic model building, we performed quantile normalization for subjects followed by rank normalization for methylation levels, a standard approach widely used in the community for DNA methylation analyses³¹. We acknowledge, however, that such an approach could be suboptimal for CpG sites whose distributions of methylation do not resemble standard normal. More sophisticated methods to deal with this are needed to pick up additional relevant signals. In this study, we identified 759 associated CpG sites, of which 42 were observed to be associated with expression of 28 flanking genes that were annotated by ANNOVAR, based on positions. For the other identified CpG sites, it is possible that genes that are not the most proximal ones could be target genes for local or distal regulation. However, determining the exact target genes of these CpG sites requires additional lines of evidence besides statistical association, which is beyond the scope of this study. We observed 25 CpG sites with consistent directions of association for the DNA methylation–gene expression–PrCa pathway. Of the 14 linked genes, 10 (MDM4, NUCKS1, PM20D1, VAMP5, GPR160, PDK1, UHRF1BP1, MCAT, LY6G5C, and VPS53) demonstrated an association with PrCa risk in recent TWAS studies^27,30. Furthermore, MDM4 and NUCKS1 have been previously implicated as potential target genes at GWAS-identified PrCa risk loci^25,32. Our results incorporating DNA methylation provide additional insight into the potential mechanism for the link between these genes and PrCa development. Interestingly, in vitro experiments have shown that silencing PDK1 could decrease cell proliferation and inhibit the invasion and migration capability of PrCa cells³³. Further functional studies are needed to better characterize whether there are potential regulatory effects of the identified 25 CpG sites on the expression of the 14 adjacent genes for PrCa development.

Importantly, our design of integrating genome, methylome, and transcriptome data provides some evidence that 25 CpG sites may regulate expression of 14 candidate target genes, which further influences PrCa risk. Through the innovative integrative analyses harnessing large-scale human subject data, our study not only identifies several associations consistent with prior findings but also uncovers potentially important roles of novel CpG sites and putative target genes (e.g., CFAP44, TRIM26, MICB, and ZDHHC7) in prostate tumorigenesis.

For the aim of identifying effective methylation biomarkers for risk assessment of PrCa, a design focusing on blood tissue would be optimal. Such a design could be suboptimal for characterizing the biological mechanism of PrCa development, when compared with the design using genetic instruments of DNA methylation levels identified in prostate tissue, considering potential tissue specificity in DNA methylation levels. On the other hand, research has shown that the genetic regulation of DNA methylation for many CpG sites tends to have a cross-tissue consistency, as indicated by studies comparing blood and different brain region tissues, and among lung, breast, and kidney tissues^20,34. Furthermore, it is challenging to obtain prostate tissues from a large number of healthy individuals. Although prostate tumor-adjacent normal tissue methylation data are available in TCGA, tumor-adjacent normal tissue samples from PrCa patients may contain cancer cells; therefore, the methylation profile of these samples could be different from that of normal prostate tissue samples from healthy men. The statistical power for the model building using TCGA data could also be low due to the relatively small sample size available. In this study, for assessing DNA methylation–gene expression associations to determine potential target genes of identified CpG sites, besides using data from blood tissue (Supplementary Table 3), we also leveraged data from tumor-adjacent normal prostate tissue in TCGA. Despite a small sample size, we observed evidence supporting many of the associations identified using blood tissue data (Supplementary Table 5). For evaluating predicted gene expression–PrCa risk associations, our analyses using prostate tissue gene expression prediction models also support many of the associations identified using blood tissue prediction models (Table 2).

In the current work, a large number of subjects (N = 1595) in the reference FHS dataset was used for the DNA methylation prediction model building. Together with the large sample size for our main association analyses for PrCa risk (79,194 cases and 61,112 controls), our study provides an unparalleled opportunity to detect DNA methylation–PrCa associations. The use of genetic instruments makes our study potentially less susceptible to several limitations commonly encountered in conventional epidemiological studies, such as selection bias and reverse causation. On the other hand, it is worth noting that, similar to TWAS, the associations observed in our analyses focusing on CpG sites are also vulnerable to confounding due to pleiotropy and co-localization of genetic signals. For instance, it would be difficult to distinguish a situation in which one causal methylation quantitative trait locus (mQTL) regulates the methylation of two CpG sites from a scenario in which two CpG sites have two causal mQTLs that are in linkage disequilibrium (LD) with each other. Correlated total methylation levels across CpG sites, correlated predicted DNA methylation across CpG sites, as well as shared genetic variants between DNA methylation genetic prediction models and gene expression prediction models, could all lead to spurious associations in our analyses³⁵. When faced with two correlated predictors, regularized regression models like elastic net may arbitrarily down-weight one of them, which may be the true causal variant. Despite these potential limitations, our study generated a list of promising PrCa-associated CpG sites that warrant further investigation. By integrating the relationship between DNA methylation, gene expression, and PrCa risk using multi-omics data from different sources, we were able to identify consistent associations of the DNA methylation–gene expression–PrCa risk pathway.

This supports the hypothesis that methylation at selected CpG sites could influence PrCa risk through the regulation of expression of adjacent target genes, which warrants further investigation. The current work generates a list of promising CpG sites showing an association with PrCa, which can be investigated further in future studies that directly measure methylation levels at these CpG sites. Identification of circulating DNA methylation biomarkers could be useful for PrCa risk assessment.

In conclusion, in a large-scale study to evaluate associations between genetically predicted DNA methylation levels and PrCa risk, we identified 759 CpG sites that showed an association, including 15 at novel loci, and an additional 63 that represent association signals independent of known risk variants. We also observed that specific CpG sites may influence PrCa risk via regulating expression of candidate PrCa target genes. Further investigation of these findings will provide additional insight into the biology and genetics of PrCa, as well as facilitate risk assessment of PrCa.

Methods

Study design

The overall study design is shown in Fig. 2. First, we built comprehensive genetic prediction models for DNA methylation levels using data from the Framingham Heart Study (FHS). After external validation, we selected methylation models with satisfactory prediction performance for association analyses of genetically predicted methylation levels with PrCa risk, using data from the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia, which together include 79,194 cases and 61,112 controls. For CpG sites showing an association with PrCa risk, we assessed associations of their methylation with expression of adjacent genes (FHS, N = 1367) to identify potential target genes of these CpG sites. For the suggested candidate target genes, we further assessed associations of their genetically predicted expression with PrCa risk.

Building of DNA methylation prediction models

We obtained the individual level genome-wide genotyping and white blood cell DNA methylation data from the FHS Offspring Cohort (dbGaP accession numbers: phs000342 and phs000724). The details of the FHS Offspring Cohort have been described elsewhere³⁶. In brief, DNA was genotyped using the Affymetrix 500 K array, and DNA methylation was profiled using the Illumina HumanMethylation450 BeadChip. The genotype data were imputed to the Haplotype Reference Consortium reference panel³⁷. SNPs with high imputation quality (R² ≥ 0.8), minor allele frequency ≥0.05, included in the HapMap Phase 2 version, and those that were not strand ambiguous were used to build DNA methylation prediction models. For DNA methylation data, the “minfi” package³⁸ was used to filter out low-quality samples, exclude low-quality methylation probes, estimate cell-type composition, and calculate methylation beta values. We performed quantile normalization to bring the methylation profile of each sample to the same scale, and rank normalization for each CpG site to map each set of DNA methylation values to a standard normal. We adjusted for age, sex, six cell-type composition variables, and the top ten principal components (PCs) derived from genotype data. Genetic and DNA methylation data from 1595 genetically unrelated subjects of European descent were used to build DNA methylation prediction models for this study.
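The rank normalization step described above can be sketched as a rank-based inverse normal transform. This is a minimal illustration rather than the authors' pipeline: the function name `rank_inverse_normal` and the offset `(rank - 0.5) / n` are assumptions, since the text does not state which rank-to-quantile convention was used, and ties are simply broken by input order.

```python
from statistics import NormalDist

def rank_inverse_normal(values):
    """Map values to standard-normal quantiles by rank (illustrative sketch).

    Assumption: quantile = (rank - 0.5) / n; ties are broken by input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices, smallest value first
    std_normal = NormalDist()  # mean 0, standard deviation 1
    out = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        out[idx] = std_normal.inv_cdf((rank - 0.5) / n)
    return out

# Hypothetical methylation beta values for one CpG site across five samples
betas = [0.82, 0.10, 0.45, 0.47, 0.95]
z = rank_inverse_normal(betas)
```

The transform keeps the ordering of samples but discards the original spacing between values, which is why it can be suboptimal for CpG sites with strongly non-normal methylation distributions.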

For each CpG site, we built a genetic model to predict DNA methylation levels using the elastic net method as implemented in the “glmnet” package of R, with α = 0.5^39,40,41 (Supplementary Software 1). Genetic variants within a 2-Mb window flanking each CpG site were used to build the model. Tenfold cross-validation was used for internal validation. Prediction R² values, the square of the correlation between predicted and measured methylation levels, were used to estimate the model prediction performance.
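To make the model-building step concrete, the sketch below implements the standard coordinate-descent update for the elastic net objective and computes a prediction R² as a squared Pearson correlation. It is a toy, pure-Python stand-in and not the glmnet code used in the study: the function names, the fixed penalty `lam=0.1`, and the data are all illustrative assumptions, the inputs are assumed to be centered, and in practice the penalty would be chosen by cross-validation.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def elastic_net_cd(X, y, lam, alpha=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - X b||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).

    Assumes the columns of X and the vector y are already centered (no intercept).
    """
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta starting at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            # Covariance of SNP j with the partial residual (its own fit added back)
            rho = sum(col[i] * (resid[i] + col[i] * beta[j]) for i in range(n)) / n
            denom = sum(c * c for c in col) / n + lam * (1.0 - alpha)
            new_beta = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                beta[j] = new_beta
    return beta

def prediction_r2(predicted, measured):
    """Squared Pearson correlation between predicted and measured values."""
    n = len(measured)
    mp, mm = sum(predicted) / n, sum(measured) / n
    cov = sum((a - mp) * (b - mm) for a, b in zip(predicted, measured))
    var_p = sum((a - mp) ** 2 for a in predicted)
    var_m = sum((b - mm) ** 2 for b in measured)
    return cov * cov / (var_p * var_m)

# Toy data: 4 samples, 2 centered SNP dosages; methylation tracks SNP 0 only
X = [[-1.0, 0.5], [0.0, -0.5], [1.0, 0.5], [0.0, -0.5]]
y = [-2.0, 0.1, 2.0, -0.1]
weights = elastic_net_cd(X, y, lam=0.1, alpha=0.5)
predicted = [sum(w * x for w, x in zip(weights, row)) for row in X]
r2 = prediction_r2(predicted, y)
```

Here the R² is computed on the training samples only to keep the example short; the study reports cross-validated prediction R², which requires predicting each held-out fold from a model fitted on the remaining folds.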

External validation of the models

To further evaluate the validity of the built methylation prediction models, we performed external validation using data from 883 unrelated healthy female participants of European descent included in The Women’s Health Initiative (WHI) (dbGaP accession numbers: phs000315, phs000675, and phs001335). Genotype data and white blood cell DNA methylation data were processed using a similar approach, as described above. The predicted DNA methylation for each CpG site was calculated using the models that were established using FHS data, and then compared with the measured level using Spearman’s correlation.

Associations between predicted methylation and PrCa

Considering that our external validation dataset (WHI) included females only, and that model performance (R²) was highly concordant between FHS and WHI, we included DNA methylation prediction models (1) with an R² ≥ 0.01 (≥10% correlation between predicted and measured methylation levels) in FHS, a standard criterion used in TWAS for gene expression^27,39,42,43,44, the heritability of which tends to be similar to that of DNA methylation in blood^31,45, and (2) for probes with no SNPs within the probe-binding site, considering that the measurement of DNA methylation levels for such probes tends to be unbiased⁴⁶. Overall, we evaluated associations between genetically predicted methylation levels of 77,243 CpG sites and PrCa risk.

We estimated the association between genetically predicted DNA methylation levels and PrCa risk using S-PrediXcan, which has been described elsewhere⁴⁷ (Supplementary Software 1). We used the summary statistics data for the association of genetic variants with PrCa risk that had been generated from 79,194 PrCa cases and 61,112 controls of European ancestry in the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia^26,48. In brief, 46,939 PrCa cases and 27,910 controls were genotyped using OncoArray, which included 570,000 SNPs (http://epi.grants.cancer.gov/oncoarray/). Also included were data from several previous PrCa GWAS of European ancestry: UK stage 1 and stage 2, CaPS 1 and CaPS 2, BPC3, NCI PEGASUS, and iCOGS. These genotype data were imputed using the June 2014 release of the 1000 Genomes Project data as reference. Logistic regression summary statistics were then meta-analyzed using an inverse variance fixed effect approach.
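For orientation, the summary-statistics approximation behind S-PrediXcan combines per-SNP GWAS Z scores using the prediction-model weights and SNP variances from a reference panel. The sketch below follows that published form as we understand it; the function name `spredixcan_z`, the argument layout, and the toy inputs are illustrative assumptions, not the Supplementary Software.

```python
import math

def spredixcan_z(weights, snp_cov, gwas_beta, gwas_se):
    """Approximate Z score for one CpG site from GWAS summary statistics.

    weights   : prediction-model weight for each SNP
    snp_cov   : SNP-by-SNP covariance matrix from a reference panel (list of lists)
    gwas_beta : per-SNP GWAS effect estimates
    gwas_se   : per-SNP GWAS standard errors
    """
    k = len(weights)
    # Variance of the predicted methylation level: w' * Cov * w
    var_pred = sum(weights[i] * snp_cov[i][j] * weights[j]
                   for i in range(k) for j in range(k))
    sd_pred = math.sqrt(var_pred)
    # Weighted sum of per-SNP Z scores, each scaled by sd(SNP) / sd(prediction)
    return sum(
        weights[i] * math.sqrt(snp_cov[i][i]) / sd_pred * gwas_beta[i] / gwas_se[i]
        for i in range(k)
    )

# Toy example: two uncorrelated SNPs with unit variance and equal weights
z_cpg = spredixcan_z(
    weights=[1.0, 1.0],
    snp_cov=[[1.0, 0.0], [0.0, 1.0]],
    gwas_beta=[0.03, 0.04],
    gwas_se=[0.01, 0.01],
)
```

With a single SNP in the model the expression reduces to that SNP's own GWAS Z score, which is a quick sanity check on any implementation.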

A Bonferroni-corrected threshold of P < 6.47 × 10⁻⁷ (0.05/77,243) was used to determine a statistically significant association. For CpG sites showing a significant association between genetically predicted methylation levels with PrCa risk, we further evaluated whether the observed associations were independent of nearby PrCa risk variants identified in GWAS or fine-mapping studies, by performing GCTA-COJO analysis⁴⁹. For this analysis, the risk SNP showing the most significant association with PrCa risk in the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia was adjusted for calculating association betas and standard errors of DNA methylation predicting SNPs with PrCa risk. These association statistics were then used for re-running the S-PrediXcan analyses.
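The threshold arithmetic is simple enough to verify directly. The snippet below recomputes 0.05/77,243 and wraps the comparison in a helper; the name `is_significant` is an illustrative choice, and the strict inequality follows the wording of this paragraph.

```python
n_tests = 77_243   # CpG sites with eligible prediction models
alpha = 0.05       # family-wise error rate
threshold = alpha / n_tests

def is_significant(p_value, cutoff=threshold):
    """Return True if a P value passes the Bonferroni-corrected cutoff."""
    return p_value < cutoff

print(f"{threshold:.2e}")  # prints 6.47e-07
```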

Familial relative risk of PrCa explained by novel CpG sites

For PrCa-associated CpG sites that were located at novel loci or independent from known PrCa risk variants, we used the linkage disequilibrium (LD) score regression method⁵⁰ to evaluate the proportion of familial relative risk of PrCa that could be explained by predicted methylation levels of these CpG sites. In brief, we firstly applied the prediction models of these CpGs to the genetic data of male controls included in the pancreatic cancer GWAS data (N = 3655) to generate the predicted methylation of these CpGs for each of the participants. Detailed information for this dataset, quality control, and imputation has been described elsewhere⁵¹. We further used the formula \(Z^2 = 1 + (N_T l / M)\,h^2\) to estimate the heritability explained by these CpG sites. Here for each CpG, \(Z\) represents the Z score of the association between the predicted methylation and PrCa risk; \(N_T\) represents the number of individuals included in the GWAS of the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia, namely, 140,306; \(l\) represents the LD score of the CpG of interest; \(M\) represents the number of CpG sites that were significantly associated with PrCa risk; and \(h^2\) is the estimated heritability of PrCa risk that could be explained by the predicted methylation of the CpG sites of interest. The LD score for each CpG was estimated by adding up the squared Pearson correlation coefficient (R²) of the CpG of interest with all the other CpG sites. Finally, after fitting a linear regression model using data of all these CpGs, the estimated heritability of PrCa risk that could be explained by the predicted methylation of the CpGs of interest, along with the standard error and P value, were estimated. Given that the heritability of PrCa was estimated to be 57%⁵², the familial relative risk of PrCa that could be explained by predicted methylation levels of these CpGs was calculated as \(h^2/0.57\).
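The regression step can be illustrated with a small sketch based on the relation Z² = 1 + (N_T·l/M)·h². The code assumes the intercept is fixed at 1 and estimates h² as a least-squares slope through the origin; the text does not specify these details, so they are assumptions, as are the function name `estimate_h2`, the value of M, the LD scores, and the "true" heritability used to generate the synthetic Z scores.

```python
import math

def estimate_h2(z_scores, ld_scores, n_total, n_sites):
    """Least-squares slope through the origin for Z^2 - 1 = (N_T * l / M) * h^2."""
    xs = [n_total * l / n_sites for l in ld_scores]
    ys = [z * z - 1.0 for z in z_scores]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

N_T = 140_306           # 79,194 cases + 61,112 controls
M = 100                 # hypothetical number of CpG sites (placeholder)
true_h2 = 0.004         # synthetic value used only to generate the toy data
ld_scores = [1.2, 2.5, 4.0]

# Synthetic, noise-free Z scores that satisfy the relation exactly
z_scores = [math.sqrt(1.0 + N_T * l / M * true_h2) for l in ld_scores]

h2_hat = estimate_h2(z_scores, ld_scores, N_T, M)
familial_fraction = h2_hat / 0.57   # share of the 57% heritability explained
```

Because the synthetic Z scores contain no noise, the estimate recovers the generating value; with real data the fit would also yield a standard error and P value.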

Validation of identified CpG sites using the UK Biobank

Individual-level data from the UK Biobank were used to validate the identified CpG sites. The UK Biobank released GWAS data on ~500,000 individuals⁵³. PrCa cases were determined by combining Hospital Episode Statistics (HES) data and self-reported data. Specifically, cases were defined by a hospital admission, cancer type, or cause of death coded as ICD-9 185.9 or ICD-10 C61, or by a self-reported cancer code. We calculated associations of genetically predicted DNA methylation of the identified CpG sites with PrCa risk, adjusting for age, age², and the top 20 PCs provided by the UK Biobank. As the number of cases in the UK Biobank is substantially smaller than that in the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia, we used results from the UK Biobank to confirm the validity of the CpG sites identified in analyses of the consortia data, rather than to filter out CpG sites.

Functional annotation of PrCa-associated CpG sites

We annotated the position and genomic region information of the identified PrCa-associated CpG sites through ANNOVAR⁵⁴. The CpG sites were annotated into one of 13 functional categories: exonic, intronic, intergenic, upstream, 3′-UTR, 5′-UTR, ncRNA intronic, ncRNA exonic, splicing, downstream, upstream/downstream, 5′-UTR/3′-UTR, and exonic/splicing. We used eFORGE⁵⁵ v1.2 to assess whether the identified CpG sites were enriched in DNase I hypersensitive sites (DHSs) and loci overlapping with various histone modification types, such as H3K27me3, H3K36me3, H3K4me3, H3K9me3, and H3K4me1, across different tissues and cell lines available in the Roadmap Epigenomics Project⁵⁶, the Encyclopedia of DNA Elements (ENCODE)⁵⁷ and the BLUEPRINT Epigenome⁵⁸. For each CpG site set of interest, eFORGE performs an overlap analysis against the functional elements for each tissue or cell line separately, and then counts the number of overlaps. A background distribution of the expected overlap counts for the CpG site set of interest is obtained by picking sets of CpG sites with the same number as the test set, matched for gene relationship and CpG island relationship annotation. The matched background sets are then overlapped with the functional elements and the background distribution of overlaps is determined. In total, 1000 matched sets are used. The enrichment value for the test set is expressed as the -log₁₀(binomial P value). Enrichments outside the nominal 95th and 99th percentiles of the binomial distribution (after Benjamini–Yekutieli multiple testing correction) are considered significant. We also evaluated whether the associated CpG sites were enriched in loci of genes encoding transcription factors⁵⁹.
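To make the enrichment value concrete, the following Python sketch computes a -log₁₀(binomial P value) for a test set against matched background sets. It is a simplified illustration and not eFORGE's implementation: it assumes the per-site overlap probability is the mean overlap fraction of the background sets and uses an upper-tail binomial test. All function names are hypothetical, and the multiple testing correction is omitted.

```python
import math
from statistics import mean
from typing import Sequence


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), by direct summation."""
    tail = sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
        for i in range(k, n + 1)
    )
    return min(1.0, tail)  # guard against rounding slightly above 1


def enrichment_score(
    test_overlaps: int, n_sites: int, background_overlaps: Sequence[int]
) -> float:
    """-log10 of the binomial P value for the test-set overlap count.

    Assumption: the null overlap probability per CpG site is estimated as
    the mean overlap count of the matched background sets divided by n_sites.
    """
    p_background = mean(background_overlaps) / n_sites
    p_value = binomial_upper_tail(test_overlaps, n_sites, p_background)
    if p_value == 0.0:
        return math.inf
    return -math.log10(p_value)
```

For example, a test set of 10 sites that all overlap, against background sets averaging 5 overlaps, gives a P value of 0.5¹⁰ and an enrichment value of about 3.01.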

Determine genes associated with identified CpG sites

For CpG sites with genetically predicted DNA methylation levels significantly associated with PrCa risk, we evaluated associations between methylation and expression levels of genes flanking their loci by using data from the FHS Offspring Cohort (dbGaP accession numbers: phs000363 and phs000724) and The Cancer Genome Atlas (TCGA). Details of the FHS Offspring Cohort, DNA methylation, and gene expression data have been described elsewhere^36,60,61. Overall, DNA methylation and gene expression data were available for 1367 unrelated individuals. For the CpG sites showing a significant association with PrCa risk, associations between the normalized methylation levels in beta values and normalized expression levels of genes flanking the CpG sites were estimated, after adjusting for age, sex, top PCs, and estimated cell-type compositions based on methylation data. The significant methylation–gene expression associations identified in blood were then further assessed in adjacent normal prostate tissue of PrCa patients in the TCGA (N = 34). The processing of DNA methylation and gene expression data has been described elsewhere^62,63.

Associations of potential target genes with PrCa risk

For genes whose expression levels were associated with DNA methylation levels, we assessed whether the genetically predicted expression levels of these genes in blood and prostate tissue were also associated with PrCa risk^44,64,65. We used prediction models developed with the PrediXcan method (Elastic Net) using version 8 data from the Genotype-Tissue Expression (GTEx) project (http://predictdb.org/). Details of the methods of building gene expression prediction models using SNPs have been described elsewhere^44,47,66. The prediction models were used to estimate the associations between genetically predicted gene expression levels and PrCa risk in the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia using S-PrediXcan⁴⁷.
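For orientation, the summary-statistic association computed by S-PrediXcan can be sketched as follows. This follows the Z-score approximation given in the S-PrediXcan publication⁴⁷, Z ≈ Σ w × (σ_SNP/σ_gene) × (β/se), where σ_gene² is the variance of predicted expression derived from the SNP covariance matrix. The function name and inputs are illustrative and do not reproduce the MetaXcan software interface.

```python
import math
from typing import Sequence


def spredixcan_z(
    weights: Sequence[float],
    snp_cov: Sequence[Sequence[float]],
    gwas_beta: Sequence[float],
    gwas_se: Sequence[float],
) -> float:
    """Approximate gene-level Z score from GWAS summary statistics.

    weights   : prediction-model weight of each SNP for the gene
    snp_cov   : SNP genotype covariance matrix from a reference panel
    gwas_beta : GWAS effect size of each SNP on the trait
    gwas_se   : standard error of each GWAS effect size
    """
    n = len(weights)
    # Variance of predicted expression: w' * Gamma * w
    var_pred = sum(
        weights[i] * snp_cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if var_pred <= 0.0:
        raise ValueError("predicted expression has non-positive variance")
    sigma_gene = math.sqrt(var_pred)
    return sum(
        weights[i] * math.sqrt(snp_cov[i][i]) / sigma_gene * gwas_beta[i] / gwas_se[i]
        for i in range(n)
    )
```

With a single-SNP model the expression reduces to the SNP's own GWAS Z score, signed by the model weight, which is a convenient sanity check.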

Associations showing a consistent direction of effect

We compared three sets of associations to identify CpG sites with a consistent direction of effect along the DNA methylation–gene expression–PrCa risk pathway: associations between genetically predicted DNA methylation levels and PrCa risk, associations between DNA methylation and gene expression levels, and associations between genetically predicted gene expression and PrCa risk. A consistent direction could indicate that genetically predicted DNA methylation might influence PrCa risk through the regulation of expression of flanking target genes.
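One way to read "consistent direction of effect" is that the sign of the methylation–PrCa association should equal the product of the signs of the methylation–expression and expression–PrCa associations. The Python sketch below encodes that reading; it is an interpretation stated as an assumption, and the function names are hypothetical.

```python
def sign(value: float) -> int:
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def is_direction_consistent(
    meth_to_prca: float, meth_to_expr: float, expr_to_prca: float
) -> bool:
    """Check whether three effect estimates agree along the pathway.

    The direction implied by methylation -> expression -> PrCa risk is the
    product of the two signs; it must match the direct methylation -> PrCa
    sign. A zero estimate anywhere is treated as not consistent.
    """
    implied = sign(meth_to_expr) * sign(expr_to_prca)
    return implied != 0 and sign(meth_to_prca) == implied
```

For instance, higher methylation associated with lower expression, and higher expression associated with lower risk, implies that higher methylation should be associated with higher risk.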

Functional enrichment analysis

We performed functional enrichment analysis for the identified genes consistent with the DNA methylation–gene expression–PrCa risk pathway. Canonical pathways, top associated diseases and biofunctions, and top networks associated with these genes were estimated using IPA software²⁸.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The OncoArray genotype data and relevant covariate information (i.e., ethnicity, country, principal components, etc.) for the prostate cancer study are available in dbGaP (Accession no.: phs001391.v1.p1). In total, 47 of the 52 OncoArray studies, encompassing ~90% of the individual samples, are available. The previous meta-analysis summary results and genotype data are currently available in dbGaP (Accession no.: phs001081.v1.p1). The datasets of the FHS Offspring Cohort and WHI are publicly available via dbGaP (www.ncbi.nlm.nih.gov/gap): dbGaP Study Accession: phs000342 and phs000724 for FHS, and phs000315, phs000675, and phs001335 for WHI. TCGA data can be accessed through the Genomic Data Commons Data Portal.

Code availability

The relevant code is available in Supplementary Software 1.

References

1. Torre, L. A. et al. Global cancer statistics, 2012. CA: Cancer J. Clin. 65, 87–108 (2015).
2. Gaudreau, P. O., Stagg, J., Soulieres, D. & Saad, F. The present and future of biomarkers in prostate cancer: proteomics, genomics, and immunology advancements. Biomarkers in Cancer 8, 15–33 (2016).
3. Catalona, W. J., Smith, D. S., Ratliff, T. L. & Basler, J. W. Detection of organ-confined prostate cancer is increased through prostate-specific antigen-based screening. J. Am. Med. Assoc. 270, 948–954 (1993).
4. Antenor, J. A., Han, M., Roehl, K. A., Nadler, R. B. & Catalona, W. J. Relationship between initial prostate specific antigen level and subsequent prostate cancer detection in a longitudinal screening study. J. Urol. 172, 90–93 (2004).
5. Thompson, I. M. et al. Operating characteristics of prostate-specific antigen in men with an initial PSA level of 3.0 ng/ml or lower. J. Am. Med. Assoc. 294, 66–70 (2005).
6. Parekh, D. J., Ankerst, D. P., Troyer, D., Srivastava, S. & Thompson, I. M. Biomarkers for prostate cancer detection. J. Urol. 178, 2252–2259 (2007).
7. Thompson, I. M. et al. Prevalence of prostate cancer among men with a prostate-specific antigen level < or =4.0 ng per milliliter. N. Engl. J. Med. 350, 2239–2246 (2004).
8. Schroder, F. H. et al. Screening and prostate cancer mortality: results of the European Randomised Study of Screening for Prostate Cancer (ERSPC) at 13 years of follow-up. Lancet 384, 2027–2035 (2014).
9. Schroder, F. H. et al. Screening and prostate-cancer mortality in a randomized European study. N. Engl. J. Med. 360, 1320–1328 (2009).
10. Andriole, G. L. et al. Mortality results from a randomized prostate-cancer screening trial. N. Engl. J. Med. 360, 1310–1319 (2009).
11. Draisma, G. et al. Lead time and overdiagnosis in prostate-specific antigen screening: importance of methods and context. J. Natl Cancer Inst. 101, 374–383 (2009).
12. Massie, C. E., Mills, I. G. & Lynch, A. G. The importance of DNA methylation in prostate cancer development. J. Steroid Biochem. Mol. Biol. 166, 1–15 (2017).
13. Lee, W. H. et al. Cytidine methylation of regulatory sequences near the pi-class glutathione S-transferase gene accompanies human prostatic carcinogenesis. Proc. Natl Acad. Sci. USA 91, 11733–11737 (1994).
14. Mian, O. Y. et al. GSTP1 Loss results in accumulation of oxidative DNA base damage and promotes prostate cancer cell survival following exposure to protracted oxidative stress. Prostate 76, 199–206 (2016).
15. Geybels, M. S. et al. Epigenomic profiling of DNA methylation in paired prostate cancer versus adjacent benign tissue. Prostate 75, 1941–1950 (2015).
16. Kobayashi, Y. et al. DNA methylation profiling reveals novel biomarkers and important roles for DNA methyltransferases in prostate cancer. Genome Res. 21, 1017–1027 (2011).
17. FitzGerald, L. M. et al. Genome-wide measures of peripheral blood DNA methylation and prostate cancer risk in a prospective nested case-control study. Prostate 77, 471–478 (2017).
18. McRae, A. F. et al. Contribution of genetic variation to transgenerational inheritance of DNA methylation. Genome Biol. 15, R73 (2014).
19. Grundberg, E. et al. Global analysis of DNA methylation variation in adipose tissue from twins reveals links to disease-associated variants in distal regulatory elements. Am. J. Hum. Genet. 93, 876–890 (2013).
20. Hannon, E., Weedon, M., Bray, N., O’Donovan, M. & Mill, J. Pleiotropic effects of trait-associated genetic variation on DNA methylation: utility for refining GWAS loci. Am. J. Hum. Genet. 100, 954–959 (2017).
21. Bell, J. T. et al. DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines. Genome Biol. 12, R10 (2011).
22. Demichelis, F. & Stanford, J. L. Genetic predisposition to prostate cancer: update and future perspectives. Urol. Oncol. 33, 75–84 (2015).
23. Crawford, E. D. Epidemiology of prostate cancer. Urology 62, 3–12 (2003).
24. Al Olama, A. A. et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat. Genet. 46, 1103–1109 (2014).
25. Eeles, R. A. et al. Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nat. Genet. 45, 385–391 (2013).
26. Schumacher, F. R. et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat. Genet. 50, 928–936 (2018).
27. Wu, L. et al. Identification of novel susceptibility loci and genes for prostate cancer risk: a transcriptome-wide association study in over 140,000 European descendants. Cancer Res. 79, 3192–3204 (2019).
28. Kramer, A., Green, J., Pollard, J. Jr & Tugendreich, S. Causal analysis approaches in ingenuity pathway analysis. Bioinformatics 30, 523–530 (2014).
29. Emami, N. C. et al. Association of imputed prostate cancer transcriptome with disease risk reveals novel mechanisms. Nat. Commun. 10, 3107 (2019).
30. Mancuso, N. et al. Large-scale transcriptome-wide association study identifies new prostate cancer risk regions. Nat. Commun. 9, 4079 (2018).
31. Huan, T. et al. Genome-wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nat. Commun. 10, 4267 (2019).
32. Thibodeau, S. N. et al. Identification of candidate genes for prostate cancer-risk SNPs utilizing a normal prostate tissue eQTL data set. Nat. Commun. 6, 8653 (2015).
33. Li, W. et al. CD44 regulates prostate cancer proliferation, invasion and migration via PDK1 and PFKFB4. Oncotarget 8, 65143–65151 (2017).
34. Stueve, T. R. et al. Epigenome-wide analysis of DNA methylation in lung tissue shows concordance with blood studies and identifies tobacco smoke-inducible enhancers. Hum. Mol. Genet. 26, 3014–3027 (2017).
35. Wainberg, M. et al. Opportunities and challenges for transcriptome-wide association studies. Nat. Genet. 51, 592–599 (2019).
36. Kannel, W. B., Feinleib, M., McNamara, P. M., Garrison, R. J. & Castelli, W. P. An investigation of coronary heart disease in families: the Framingham Offspring Study. Am. J. Epidemiol. 110, 281–290 (1979).
37. McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).
38. Aryee, M. J. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369 (2014).
39. Wu, L. et al. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat. Genet. 50, 968–978 (2018).
40. Yang, Y. et al. Genetically predicted levels of DNA methylation biomarkers and breast cancer risk: data from 228 951 women of European descent. J. Natl Cancer Inst. 112, 295–304 (2020).
41. Yang, Y. et al. Genetic data from nearly 63,000 women of European descent predicts DNA methylation biomarkers and epithelial ovarian cancer risk. Cancer Res. 79, 505–517 (2019).
42. Shi, J. et al. Transcriptome-wide association study identifies susceptibility loci and genes for age at natural menopause. Reprod. Sci. 26, 496–502 (2019).
43. Lu, Y. et al. A transcriptome-wide association study among 97,898 women to identify candidate susceptibility genes for epithelial ovarian cancer risk. Cancer Res. 78, 5419–5430 (2018).
44. Gamazon, E. R. et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47, 1091–1098 (2015).
45. Wheeler, H. E. et al. Survey of the heritability and sparse architecture of gene expression traits across human tissues. PLoS Genet. 12, e1006423 (2016).
46. McRae, A. F. et al. Identification of 55,000 replicated DNA methylation QTL. Sci. Rep. 8, 17605 (2018).
47. Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1825 (2018).
48. Wu, L. et al. Analysis of over 140,000 European descendants identifies genetically predicted blood protein biomarkers associated with prostate cancer risk. Cancer Res. 79, 4592–4598 (2019).
49. Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).
50. Mancuso, N. et al. Integrating gene expression with summary association statistics to identify genes associated with 30 complex traits. Am. J. Hum. Genet. 100, 473–487 (2017).
51. Zhu, J. et al. Associations between genetically predicted blood protein biomarkers and pancreatic cancer risk. Cancer Epidemiol. Biomarkers Prev. 29, 1501–1508 (2020).
52. Mucci, L. A. et al. Familial risk and heritability of cancer among twins in Nordic countries. J. Am. Med. Assoc. 315, 68–76 (2016).
53. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
54. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
55. Breeze, C. E. et al. eFORGE: a tool for identifying cell type-specific signal in epigenomic data. Cell Rep. 17, 2137–2150 (2016).
56. Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
57. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
58. Adams, D. et al. BLUEPRINT to decode the epigenetic signature written in blood. Nat. Biotechnol. 30, 224–226 (2012).
59. Hu, H. et al. AnimalTFDB 3.0: a comprehensive resource for annotation and prediction of animal transcription factors. Nucleic Acids Res. 47, D33–D38 (2019).
60. Joehanes, R. et al. Gene expression signatures of coronary heart disease. Arterioscler. Thromb. Vasc. Biol. 33, 1418–1426 (2013).
61. Marioni, R. E. et al. DNA methylation age of blood predicts all-cause mortality in later life. Genome Biol. 16, 25 (2015).
62. Nikas, J. B., Mitanis, N. T. & Nikas, E. G. Whole exome and transcriptome RNA-sequencing model for the diagnosis of prostate cancer. ACS Omega 5, 481–486 (2020).
63. Nikas, J. B. & Nikas, E. G. Genome-wide DNA methylation model for the diagnosis of prostate cancer. ACS Omega 4, 14895–14901 (2019).
64. Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48, 245–252 (2016).
65. Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
66. Barbeira, A. N. et al. Integrating predicted transcriptome from multiple tissues improves association detection. PLoS Genet. 15, e1007889 (2019).

Acknowledgements

The authors thank Wanqing Wen of the Vanderbilt University School of Medicine for his help with this study. The authors would also like to thank all of the individuals for their participation in the parent studies and all the researchers, clinicians, technicians and administrative staff for their contribution to the studies. This study used resources at the Advanced Computing Center for Research and Education (ACCRE) at Vanderbilt University, Nashville, TN (NIH S10 Shared Instrumentation Grant 1S10OD023680-01 (Meiler)). A full description of funding and acknowledgments for the PRACTICAL, CRUK, BPC3, CAPS, and PEGASUS consortia is included in the Supplementary Note. Lang Wu is supported by the University of Hawaii Cancer Center Seed Grant. Yanfa Sun is partially supported by the Department of Education of Fujian Province, P. R. China.

Author information

These authors contributed equally: Lang Wu, Yaohua Yang.
Deceased: Brian E. Henderson.

Authors and Affiliations

Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA
Lang Wu, Yanfa Sun & Jingjing Zhu
Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
Yaohua Yang, Xingyi Guo, Xiao-Ou Shu, Qiuyin Cai, Xiang Shu, Wei Zheng, Jirong Long & William J. Blot
Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, TN, USA
Bingshan Li
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
Bingshan Li & Ran Tao
Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
Ran Tao
Department of Statistics, Florida State University, Tallahassee, FL, USA
Chong Wu
Research & Development, Genomix Inc, Minneapolis, MN, USA
Jason B. Nikas
College of Life Science, Longyan University, Longyan, Fujian, P. R. China
Yanfa Sun
Department of Urology, Erasmus University Medical Center, Rotterdam, The Netherlands
Monique J. Roobol
Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, 207 Bouverie St, Melbourne, VIC, 3010, Australia
Graham G. Giles
Cancer Epidemiology & Intelligence Division, Cancer Council Victoria, 615 St Kilda Rd, Melbourne, VIC, 3004, Australia
Graham G. Giles
Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
Hermann Brenner
German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
Hermann Brenner
Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
Hermann Brenner
Department of Medicine (Oncology) and Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
Esther M. John
Australian Prostate Cancer Research Centre-QLD, Institute of Health and Biomedical Innovation and School of Biomedical Science, Queensland University of Technology, Brisbane, QLD, Australia
Judith Clements & Jyotsna Batra
Translational Research Institute, Brisbane, QLD, Australia
Judith Clements
Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
Eli Marie Grindedal
Department of Cancer Epidemiology, Moffitt Cancer Center, Tampa, FL, USA
Jong Y. Park
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
Janet L. Stanford & Lisa F. Newcomb
Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA, USA
Janet L. Stanford
Division of Genetics and Epidemiology, The Institute of Cancer Research, and The Royal Marsden NHS Foundation Trust, London, UK
Zsofia Kote-Jarai, Rosalind A. Eeles & Sara Benlloch
Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
Christopher A. Haiman, Brian E. Henderson, David V. Conti & Sue Ann Ingles
Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
Fredrick R. Schumacher
Seidman Cancer Center, University Hospitals, Cleveland, OH, USA
Fredrick R. Schumacher
Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Cambridge, UK
Douglas Easton, Sara Benlloch & Ali Amin Al Olama
University of Cambridge, Department of Clinical Neurosciences, Cambridge, UK
Ali Amin Al Olama
Division of Population Health, Health Services Research and Primary Care, University of Manchester, Oxford Road, Manchester, UK
Kenneth Muir
Warwick Medical School, University of Warwick, Coventry, UK
Kenneth Muir
Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Bethesda, MD, USA
Sonja I. Berndt, Stephen Chanock, Demetrius Albanes, Stephanie Weinstein & Stella Koutros
Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
Fredrik Wiklund & Henrik Gronberg
Epidemiology Research Program, American Cancer Society, 250 Williams Street, Atlanta, GA, USA
Susan M. Gapstur & Victoria L. Stevens
SWOG Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
Catherine M. Tangen
Institute of Health and Biomedical Innovation and School of Biomedical Sciences, Queensland University of Technology, Brisbane, QLD, 4059, Australia
Jyotsna Batra
University College London, Department of Applied Health Research, London, UK
Nora Pashayan
Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Strangeways Laboratory, Cambridge, UK
Nora Pashayan
Institute of Biomedicine, Kiinamyllynkatu 10, FI-20014 University of Turku, Turku, Finland
Johanna Schleutker
Tyks Microbiology and Genetics, Department of Medical Genetics, Turku University Hospital, PO Box 52, 20521, Turku, Finland
Johanna Schleutker
Division of Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Solna, Sweden
Alicja Wolk
Department of Surgical Sciences, Uppsala University, Uppsala, Sweden
Alicja Wolk
Division of Cancer Sciences, University of Manchester, Manchester Academic Health Science Centre, Radiotherapy Related Research, Manchester NIHR Biomedical Research Centre, The Christie Hospital NHS Foundation Trust, Manchester, UK
Catharine West
Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
Lorelei Mucci
CeRePP, Tenon Hospital, Paris, France
Géraldine Cancel-Tassin
UPMC Sorbonne Universites, GRC N°5 ONCOTYPE-URO, Tenon Hospital, 4 rue de la Chine, Paris, France
Géraldine Cancel-Tassin
Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark
Karina Dalsgaard Sorensen
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
Karina Dalsgaard Sorensen
University of Cambridge, Department of Oncology, Addenbrooke’s Hospital, Cambridge, UK
David E. Neal
Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre, Cambridge, UK
David E. Neal
Nuffield Department of Surgical Sciences, Faculty of Medical Science, University of Oxford, John Radcliffe Hospital, Oxford, UK
Freddie C. Hamdy
School of Social and Community Medicine, University of Bristol, Bristol, UK
Jenny L. Donovan
Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
Ruth C. Travis
Department of Surgical Oncology, Princess Margaret Cancer Centre, Toronto, Canada
Robert J. Hamilton
Department of Radiation Oncology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Barry S. Rosenstein
Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Barry S. Rosenstein
Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, UK
Yong-Jie Lu
Division of Urologic Surgery, Brigham and Women’s Hospital, Boston, MA, USA
Adam S. Kibel
Fundación Pública Galega de Medicina Xenómica-SERGAS, Grupo de Medicina Xenómica, CIBERER, IDIS, Santiago de Compostela, Spain
Ana Vega
Centre for Research in Environmental Epidemiology (CREAL), Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
Manolis Kogevinas
CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
Manolis Kogevinas
IMIM (Hospital del Mar Research Institute), Barcelona, Spain
Manolis Kogevinas
Universitat Pompeu Fabra (UPF), Barcelona, Spain
Manolis Kogevinas
Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital/Harvard Medical School, Boston, MA, USA
Kathryn L. Penney
International Hereditary Cancer Center, Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland
Cezary Cybulski
Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
Børge G. Nordestgaard
Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev, Denmark
Børge G. Nordestgaard
Institute for Human Genetics, University Hospital Ulm, Ulm, Germany
Christiane Maier
The University of Texas M. D. Anderson Cancer Center, Department of Genitourinary Medical Oncology, Houston, TX, USA
Jeri Kim
Department of Genetics, Portuguese Oncology Institute of Porto, Porto, Portugal
Manuel R. Teixeira
Biomedical Sciences Institute (ICBAS), University of Porto, Porto, Portugal
Manuel R. Teixeira
Department of Population Sciences, Beckman Research Institute of the City of Hope, Duarte, CA, USA
Susan L. Neuhausen
Ghent University, Faculty of Medicine and Health Sciences, Basic Medical Sciences, Gent, Belgium
Kim De Ruyck
Department of Surgery, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
Azad Razack
Department of Urology, University of Washington, Seattle, WA, USA
Lisa F. Newcomb
Institute of Human Genetics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
Marija Gamulin
Molecular Medicine Center, Department of Medical Chemistry and Biochemistry, Medical University, Sofia, Bulgaria
Radka Kaneva
Department of Oncology, Cross Cancer Institute, University of Alberta, Edmonton, Alberta, Canada
Nawaid Usmani
Division of Radiation Oncology, Cross Cancer Institute, Edmonton, Alberta, Canada
Nawaid Usmani
Molecular Endocrinology Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
Frank Claessens
Division of Cancer Sciences, Manchester Cancer Research Centre, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, NIHR Manchester Biomedical Research Centre, Health Innovation Manchester, University of Manchester, Manchester, UK
Paul A. Townsend
Genomic Medicine Group, Galician Foundation of Genomic Medicine, Instituto de Investigacion Sanitaria de Santiago de Compostela (IDIS), Complejo Hospitalario Universitario de Santiago, Servicio Galego de Saúde, SERGAS, Santiago De Compostela, Spain
Manuela Gago Dominguez
University of California San Diego, Moores Cancer Center, La Jolla, CA, USA
Manuela Gago Dominguez
Cancer & Environment Group, Center for Research in Epidemiology and Population Health (CESP), INSERM, University Paris-Sud, University Paris-Saclay, Villejuif, France
Florence Menegaux
Clinical Gerontology Unit, University of Cambridge, Cambridge, UK
Kay-Tee Khaw
Division of Genetic Epidemiology, Department of Medicine, University of Utah School of Medicine, Salt Lake City, UT, USA
Lisa Cannon-Albright
George E. Wahlen Department of Veterans Affairs Medical Center, Salt Lake City, UT, USA
Lisa Cannon-Albright
The University of Surrey, Guildford, Surrey, UK
Hardev Pandha
Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
Stephen N. Thibodeau
Program in Genetic Epidemiology and Statistical Genetics, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
David J. Hunter
International Epidemiology Institute, Rockville, MD, USA
William J. Blot
Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, SW7 2AZ, UK
Elio Riboli

Authors

Lang Wu
Yaohua Yang
Xingyi Guo
Xiao-Ou Shu
Qiuyin Cai
Xiang Shu
Bingshan Li
Ran Tao
Chong Wu
Jason B. Nikas
Yanfa Sun
Jingjing Zhu
Monique J. Roobol
Graham G. Giles
Hermann Brenner
Esther M. John
Judith Clements
Eli Marie Grindedal
Jong Y. Park
Janet L. Stanford
Zsofia Kote-Jarai
Christopher A. Haiman
Rosalind A. Eeles
Wei Zheng
Jirong Long

Consortia

The PRACTICAL consortium

Rosalind A. Eeles, Brian E. Henderson, Christopher A. Haiman, Zsofia Kote-Jarai, Fredrick R. Schumacher, Douglas Easton, Sara Benlloch, Ali Amin Al Olama, Kenneth Muir, Sonja I. Berndt, David V. Conti, Fredrik Wiklund, Stephen Chanock, Susan M. Gapstur, Victoria L. Stevens, Catherine M. Tangen, Jyotsna Batra, Judith Clements, Henrik Gronberg, Nora Pashayan, Johanna Schleutker, Demetrius Albanes, Stephanie Weinstein, Alicja Wolk, Catharine West, Lorelei Mucci, Géraldine Cancel-Tassin, Stella Koutros, Karina Dalsgaard Sorensen, Eli Marie Grindedal, David E. Neal, Freddie C. Hamdy, Jenny L. Donovan, Ruth C. Travis, Robert J. Hamilton, Sue Ann Ingles, Barry S. Rosenstein, Yong-Jie Lu, Graham G. Giles, Adam S. Kibel, Ana Vega, Manolis Kogevinas, Kathryn L. Penney, Jong Y. Park, Janet L. Stanford, Cezary Cybulski, Børge G. Nordestgaard, Hermann Brenner, Christiane Maier, Jeri Kim, Esther M. John, Manuel R. Teixeira, Susan L. Neuhausen, Kim De Ruyck, Azad Razack, Lisa F. Newcomb, Marija Gamulin, Radka Kaneva, Nawaid Usmani, Frank Claessens, Paul A. Townsend, Manuela Gago Dominguez, Monique J. Roobol, Florence Menegaux, Kay-Tee Khaw, Lisa Cannon-Albright, Hardev Pandha, Stephen N. Thibodeau, David J. Hunter, William J. Blot & Elio Riboli

CRUK Consortium

Rosalind A. Eeles, Zsofia Kote-Jarai, Catharine West, David E. Neal, Freddie C. Hamdy, Jenny L. Donovan, Ruth C. Travis & Elio Riboli

BPC3 Consortium

Brian E. Henderson, Christopher A. Haiman, Fredrick R. Schumacher, Sonja I. Berndt, Stephen Chanock, Susan M. Gapstur, Victoria L. Stevens, Demetrius Albanes, Stephanie Weinstein, Lorelei Mucci, Stella Koutros, Ruth C. Travis, Kathryn L. Penney, David J. Hunter & Elio Riboli

CAPS Consortium

Fredrik Wiklund & Henrik Gronberg

PEGASUS Consortium

Sonja I. Berndt, Stephen Chanock, Demetrius Albanes, Stephanie Weinstein & Stella Koutros

Contributions

J.L. and W.Z. conceived the study. L.W. and Y.Y. contributed to the study design. L.W. performed statistical analyses and wrote the paper, with significant contributions from Y.Y. and J.L.; X.G. contributed to study discussion. C.W., J.B.N., Y.S., and J.Z. contributed to statistical analyses. X.-O.S., Q.C., X.S., B.L., R.T., M.J.R., G.G.G., H.B., E.M.J., J.C., E.M.G., J.Y.P., J.L.S., Z.K.-J., C.A.H., R.A.E., and W.Z. contributed to paper revision and/or PRACTICAL data management. The PRACTICAL, CRUK, BPC3, CAPS, and PEGASUS consortia investigators contributed to the collection of the data and biological samples for the original studies. All authors have reviewed and approved the final paper.

Corresponding authors

Correspondence to Lang Wu or Jirong Long.

Ethics declarations

Competing interests

R.A.E. has received speakers bureau honoraria and has provided expert testimony for GU-ASCO, RMH FR MTG, and the University of Chicago. The remaining authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Francesca Demichelis and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Software 1

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wu, L., Yang, Y., Guo, X. et al. An integrative multi-omics analysis to identify candidate DNA methylation biomarkers related to prostate cancer risk. Nat Commun 11, 3905 (2020). https://doi.org/10.1038/s41467-020-17673-9

Received: 11 December 2019
Accepted: 28 June 2020
Published: 06 August 2020
DOI: https://doi.org/10.1038/s41467-020-17673-9

This article is cited by

Identification of candidate DNA methylation biomarkers related to Alzheimer’s disease risk by integrating genome and blood methylome data
- Yanfa Sun
- Jingjing Zhu
- Lang Wu
Translational Psychiatry (2023)
