Genetic variant in a BaP-activated super-enhancer increases prostate cancer risk by promoting AhR-mediated FAM227A expression

Genetic variants in super-enhancers (SEs) are increasingly implicated as a mechanism driving disease risk. Previous studies have reported associations between benzo[a]pyrene (BaP) exposure and the risk of some malignant tumors. However, it remains unclear whether BaP is involved in the effect of genetic variants in SEs on prostate cancer risk, and the underlying molecular mechanisms are unknown. In the current study, using logistic regression analysis, we found that rs5750581T>C in 22q-SE was significantly associated with prostate cancer risk (odds ratio = 1.26, P = 7.61 × 10⁻⁵). Further bioinformatics analysis showed that rs6001092T>G, which is in high linkage disequilibrium with rs5750581T>C (r² = 0.98), is located in a regulatory aryl hydrocarbon receptor (AhR) motif and may interact with the FAM227A promoter. We then performed a series of functional and acute BaP exposure experiments to assess the biological function of the genetic variant and its target gene. Biologically, the rs6001092-G allele increased the binding affinity for the transcription factor AhR, thereby upregulating FAM227A, especially upon exposure to BaP, which induced malignant phenotypes of prostate cancer. The current study highlights that AhR acts as an environmental sensor of BaP and is involved in SE-mediated prostate cancer risk, which may provide new insights into the etiology of prostate cancer associated with inherited SE variants under environmental carcinogen stressors.

The associations between rs6001092 and the expression levels of genes within a 1-Mbp window. A: The expression of the significant genes in eQTL analyses among prostate tissues with the TT, TG, and GG genotypes of rs6001092 from the GTEx database. B: The expression of the corresponding genes in tumor tissues and normal tissues in the TCGA database. The mRNA expression levels of the genes were log2(TPM + 1) transformed and presented as median and interquartile range. Two-tailed Student's t-test was used when the data were normally distributed, and the Mann-Whitney U test was used otherwise. **** P < 0.0001. Abbreviations: TCGA, The Cancer Genome Atlas; TPM, transcripts per million; ns, not significant.

Fan L et al. J Biomed Res, 2024, 0(000). Genetic variants in super-enhancers and prostate cancer risk.

Inhibition of mRNA and protein expression levels of AHR and FAM227A by siRNA. A: siAHR-1 had the best interference efficiency and was selected for both PC-3 and LNCaP cells. B: siFAM227A-3 had the best interference efficiency and was selected for both LNCaP and PC-3 cells. Data are presented as mean ± standard deviation. * P < 0.05 compared with the control group by two-tailed Student's t-test.

Summary of in silico analyses for rs6001092. A: Manhattan plot for the association between SNPs in super-enhancers and prostate cancer risk. B: Overview of the epigenetic profiling of H3K4me1, H3K4me3, and H3K27ac chromatin modifications, and the distributions of DNase I hypersensitivity clusters and transcription factor binding sites in the LD region for the human prostate cancer cell line PC-3, released by the UCSC database. C: AhR binding motif predicted through RegulomeDB. Abbreviations: SNP, single nucleotide polymorphism; H3K4me1, histone H3 lysine 4 monomethylation; H3K4me3, histone H3 lysine 4 trimethylation; H3K27ac, histone H3 lysine 27 acetylation; LD, linkage disequilibrium; ChIP-seq, chromatin immunoprecipitation sequencing.
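The log2(TPM + 1) transform and the median and interquartile range summary named in the expression captions above can be sketched as follows. This is a minimal illustration only: the function names and the TPM values are hypothetical, not taken from the study, and the quartile convention ("inclusive") is an assumption, since the captions do not state which one was used.

```python
import math
import statistics


def log2_tpm(tpm_values):
    """Apply the log2(TPM + 1) transform to a list of TPM values."""
    return [math.log2(v + 1.0) for v in tpm_values]


def median_iqr(values):
    """Return (median, Q1, Q3); the 'inclusive' quartile method is an assumption."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


# Hypothetical TPM values for illustration; not data from the study.
tpm = [0.0, 1.0, 3.0, 7.0, 15.0]
transformed = log2_tpm(tpm)
med, q1, q3 = median_iqr(transformed)
print(f"median = {med}, IQR = [{q1}, {q3}]")
```

Adding 1 before taking the logarithm keeps genes with zero TPM defined after the transform.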

Supplementary Fig. 3
Stratified analyses of demographic and clinicopathologic characteristics for the association between rs6001092 and prostate cancer risk in the dominant genetic model. a P-values adjusted for age, body mass index, smoking status, and family history of prostate cancer in the logistic regression model. b P-values for heterogeneity. Abbreviations: OR, odds ratio; CI, confidence interval; BMI, body mass index; PSA, prostate specific antigen.
The expression levels of AHR in prostate cancer tissues and cell lines. A: The mRNA expression levels of AHR in hormone-naive prostate cancer (HNPC) tissues and castration-resistant prostate cancer (CRPC) tissues in the GEO dataset GSE70768. B: The mRNA expression levels of AHR in patients with different Gleason scores in the TCGA database. C: The mRNA and protein expression levels of AHR in prostate cancer and normal cell lines. Gene expression data were log2 transformed and presented as median and interquartile range. In prostate cancer cells, data are shown as mean ± standard deviation from three repeated experiments (n = 3). * P < 0.05 and *** P < 0.001 by two-tailed Student's t-test. Abbreviation: GS, Gleason score.

The stratification analysis of FAM227A expression based on the TCGA database. A-D: Differences in FAM227A mRNA expression levels between normal and cancer tissues stratified by Gleason score (GS), pathologic T stage (pT), pathologic N stage (pN), and biochemical recurrence (BCR) status, respectively. The gene expression data were log2(TPM + 1) transformed and presented as median and interquartile range. Two-tailed Student's t-test was used when the data were normally distributed, and the Mann-Whitney U test was used otherwise. * P < 0.05, ** P < 0.01, and *** P < 0.001. Abbreviation: TPM, transcripts per million.

Gene Set Enrichment Analysis (GSEA) for hallmark gene sets between high and low expression of FAM227A. Significantly enriched activated and suppressed hallmark terms are shown.

Cell viability and AhR activation after treatment with different concentrations of BaP in vitro. LNCaP and PC-3 cells were treated with the indicated concentrations of BaP for 24 h. A: Cell viability was examined by CCK-8 assay. B: The protein expression of CYP1A1 was detected by Western blotting. GAPDH was used as a loading control. Abbreviation: BaP, benzo[a]pyrene.

Effects of BaP and the AhR inhibitor CH223191 on prostate cancer cell apoptosis and cell cycle distribution. PC-3 cells were treated with BaP (10 μmol/L), CH223191 (10 μmol/L), or BaP (10 μmol/L) combined with CH223191 (10 μmol/L). A: Flow cytometry detection of apoptosis. B: Flow cytometry detection of the cell cycle. All experiments were performed in triplicate (n = 3). Data are presented as mean ± standard deviation. * P < 0.05 compared with the control group by two-tailed Student's t-test. Abbreviation: BaP, benzo[a]pyrene.

Quantitative analysis of protein expression. A: Quantitative analysis of FAM227A protein expression. LNCaP and PC-3 cells were treated with BaP (10 μmol/L). B: Quantitative analysis of CYP1A1 and FAM227A protein expression. LNCaP and PC-3 cells were treated with BaP (10 μmol/L) and CH223191 (10 μmol/L) alone or in combination. Data are presented as mean ± standard deviation. * P < 0.05 compared with the control group by two-tailed Student's t-test. ns, not significant. Abbreviation: BaP, benzo[a]pyrene.

Supplementary Table 1 Baseline characteristics in prostate cancer cases and controls
Adjusted for age, body mass index, smoking status and family history of prostate cancer in the dominant genetic model. Age at trial entry, computed from date of birth and randomization date. Abbreviations: OR, odds ratio; CI, confidence interval; BMI, body mass index.

a P for two-sided χ² test. b Age at trial entry, computed from date of birth and randomization date. c The PSA level from the most recent PSA test the participant received prior to diagnosis. d Combined clinical and pathologic stage.

a Reference allele/effect allele. b Hardy-Weinberg equilibrium (HWE) in control subjects. c P for additive genetic model adjusted for age, body mass index, smoking status and family history of prostate cancer in the logistic regression model. d P after false discovery rate correction. Abbreviations: SNPs, single nucleotide polymorphisms; MAF, minor allele frequency; HWE, Hardy-Weinberg equilibrium; SE, super-enhancer; OR, odds ratio; CI, confidence interval.

b P-value for heterogeneity. c
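Two of the quantities named in the footnotes above, the Hardy-Weinberg equilibrium test in control subjects and the false discovery rate correction, can be illustrated with a short sketch. The footnotes do not say which HWE test or which FDR procedure was used; a 1-df chi-square goodness-of-fit test and the Benjamini-Hochberg procedure are assumed here. The function names, genotype counts, and P-values are hypothetical and are not data from the study.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Assumes both alleles are present, so every expected count is non-zero.
    Returns (chi-square statistic, P-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical control genotype counts and raw P-values, for illustration only.
chi2_stat, hwe_p = hwe_chi2(30, 40, 30)
fdr_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(chi2_stat, hwe_p, fdr_p)
```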

Supplementary Table 8 Stratification analysis of clinicopathologic variables for the association between rs6001092 and prostate cancer risk in the dominant genetic model
a Adjusted for age, body mass index, smoking status and family history in the dominant genetic model. b P-value for heterogeneity. Abbreviations: OR, odds ratio; CI, confidence interval; PSA, prostate specific antigen.
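The dominant genetic model referred to above compares carriers of the effect allele (TG + GG) with TT homozygotes. The sketch below computes a crude odds ratio with a Wald 95% confidence interval from genotype counts. It is a simplification under stated assumptions: the study reports estimates from logistic regression adjusted for covariates, which this unadjusted calculation does not reproduce, and the function name and counts are hypothetical.

```python
import math


def dominant_model_or(case_counts, control_counts, z=1.96):
    """Crude odds ratio for carriers (TG + GG) versus TT, with a Wald CI.

    Each argument is a (TT, TG, GG) tuple of genotype counts.
    This is unadjusted; it is not the covariate-adjusted logistic model.
    """
    a = case_counts[1] + case_counts[2]        # carrier cases
    b = case_counts[0]                         # non-carrier (TT) cases
    c = control_counts[1] + control_counts[2]  # carrier controls
    d = control_counts[0]                      # non-carrier (TT) controls
    odds_ratio = (a * d) / (b * c)
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * std_err)
    upper = math.exp(math.log(odds_ratio) + z * std_err)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (TT, TG, GG); not data from the study.
or_est, ci_low, ci_high = dominant_model_or((100, 80, 20), (120, 60, 20))
print(f"OR = {or_est:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```

Stratified analyses such as those in Supplementary Table 8 would repeat this kind of estimate within each subgroup before testing for heterogeneity across strata.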