Epigenome-wide gene–age interaction analysis reveals reversed effects of PRODH DNA methylation on survival between young and elderly early-stage NSCLC patients

DNA methylation changes during aging, but it remains unclear whether the effect of DNA methylation on lung cancer survival varies with age. Such an effect could decrease prediction accuracy and treatment efficacy. We performed a methylation–age interaction analysis using 1,230 early-stage lung adenocarcinoma patients from five cohorts. A Cox proportional hazards model was used to investigate lung adenocarcinoma and squamous cell carcinoma patients for methylation–age interactions, which were further confirmed in a validation phase. We identified one adenocarcinoma-specific CpG probe, cg14326354PRODH, with effects significantly modified by age (HRinteraction = 0.989; 95% CI: 0.986–0.994; P = 9.18×10–7). The effect of low methylation was reversed for young and elderly patients categorized by the boundary of 95% CI standard (HRyoung = 2.44; 95% CI: 1.26–4.72; P = 8.34×10-3; HRelderly = 0.58; 95% CI: 0.42–0.82; P = 1.67×10-3). Moreover, there was an antagonistic interaction between low cg14326354PRODH methylation and elderly age (HRinteraction = 0.21; 95% CI: 0.11–0.40; P = 2.20×10−6). In summary, low methylation of cg14326354PRODH might benefit survival of elderly lung adenocarcinoma patients, providing new insight to age-specific prediction and potential drug targeting.


INTRODUCTION
Population aging has resulted in a rapid increase in lung cancer cases as well as corresponding surgeries among elderly patients [1]. Indeed, the median age at diagnosis of lung cancer is 70 years old [2]. Further, lung cancer leads as a cause of cancer deaths among men ≥40 years old and women ≥60 years old [3].
Progression of lung cancer is, in part, due to accumulation of genomic instability as well as age-related declines in system integrity and function [4]. Thus, even for individuals exposed to similar levels of risk factors, lung cancer severity can vary as a function of individual differences in aging. Therefore, compared to predictive guidance for the overall population, effective predictive guidance for age-specific populations, especially elderly patients, is needed to better guide postoperative treatment and improve survival. Developing such guidance necessitates identifying exclusive prognostic indicators of lung cancer for the elderly.
Epigenetic mechanisms represent the molecular interface mediating gene-environment interactions throughout the lifecycle [5]. DNA methylation, a reversible epigenetic modification, correlates with tumor prognosis in almost all cancers including nonsmall cell lung cancer (NSCLC) [6][7][8][9]. DNA methylation events may potentially be cancer biomarkers as well as therapeutic targets to improve cancer treatment [10].
Alterations to DNA methylation often occur during aging [11]. One of these alterations, known as "epigenetic drift", may further contribute to tumorigenesis in the elderly [12]. Changes in DNA methylation also can contribute to senescence [13].
However, it remains largely unclear whether alterations of methylation patterns resulting from aging, accumulating environmental exposures throughout life [14], and other events also have varied effects on cancer survival during aging. Such phenomena may further explain the increased alteration of cancer mortality risk with age and may increase the effectiveness of cancer prediction and treatment.
We hypothesized that the methylation effect on cancer survival changes during aging. Thus, identifying agespecific signatures will be critical for prognosis prediction, underpinning potential preventative strategies, and improving survival for elderly patients. However, most epigenome-wide association studies are designed to identify main effects of DNA methylation and fail to provide knowledge about changes in epigenetic effects during aging. Thus, we performed an epigenome-wide methylation-age interaction analysis to identify age-specific, prognosis-associated epigenetic biomarkers using NSCLC patients from four cohorts, along with an independent population from The Cancer Genome Atlas (TCGA) to confirm our results.

RESULTS
After quality control (QC) procedures, 1,230 lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) patients with 311,891 CpG probes remained for subsequent association analysis. There were 613 (NLUAD = 492; NLUSC = 121) patients in the discovery phase, and 617 (NLUAD = 332; NLUSC = 285) patients in the validation phase. The average age was 66.4 and 66.5 years for patients in the discovery and validation phases, respectively. Most NSCLC patients were in stage I (77.5% in discovery; 63.7% in validation) (Supplementary Table 1).

AGING
We only observed two significant methylation-age interactions for LUAD patients in the discovery phase   Table 3), remained significant.

AGING
In addition, we evaluated the joint effect of cg14326354PRODH methylation level (low vs high) and age (elderly vs young) on LUAD survival ( Table 1). The group with the best survival (young patients with high methylation) was used as the reference to evaluate effects of low methylation, elderly age, and their interaction. The main effect of low cg14326354PRODH methylation was HR = 2.84 (95% CI: 1.59-5.08, P = 4.35×10 −4 ), and the main effect of elderly age was HR = 3.18 (95% CI: 1.85-5.46, P = 2.64×10 −5 ). However, the joint effect was HR = 1.86 (95% CI: 1.08-3.19, P = 2.42×10 −2 ), which was less than the product of the two main effects (2.84×3.18 = 9.03). This result indicates an antagonistic interaction between low cg14326354PRODH methylation and elderly age (HRinteraction = 0.21; 95% CI: 0.11-0.40; P = 2.20×10 −6 ).  Table 1. Joint effect and interaction of low methylation and elderly age on the prognosis of early-stage lung adenocarcinoma (LUAD). Further, cis-regulation and genome-wide transregulation analyses were conducted in the TCGA cohort. We observed significant correlation between cg14326354PRODH and PRODH expression (r = -0.23; P = 3.38 × 10 -5 ) in LUAD patients ( Figure 3A), indicating that cg14326354PRODH cis-regulated gene expression. Moreover, genome-wide trans-regulation analysis revealed that expression of 821 genes was significantly correlated with methylation level of cg14326354PRODH (Supplementary Material 3, Figure 3B). KEGG enrichment analysis based on 821 trans-regulated genes showed several significant immune-or inflammationrelated pathways, such as chemokine signaling, T cell receptor and B cell receptor signaling, and cellular pathways such as cell differentiation and cell cycle ( Figure 3C). Notably, these trans-regulated genes were also enriched in senescence-related pathways (e.g., cellular senescence) and cancer-related pathways (e.g.,

DISCUSSION
In this two-stage study using five independent cohorts, we systematically investigated methylation-age interactions on an epigenome-wide scale. Our results show an antagonistic interaction between elderly age and low methylation of cg14326354PRODH, indicating opportunities for epi-drug intervention due to the inherent reversibility of epigenetic events [16] and increasing treatment efficiency based on age-specific drug targeting.
PRODH is located in chromosome 22q11.2, a region often deleted in various human tumors. This gene encodes a mitochondrial inner membrane-associated enzyme that acts as a tumor suppressor in vitro and in vivo [17]. However, PRODH plays a paradoxical role in tumors. Hypoxia and nutrient depletion are important characteristics of the tumor microenvironment, where PRODH may serve as a tumor survival factor [18]. Indeed, PRODH supports tumor metastasis formation, and inhibiting its activity impairs cancer cell growth, indicating PRODH is a potential drug target [19]. A metabolic enzyme, PRODH can catabolize proline to pyrroline-5-carboxylate (P5C). The process can donate electrons that enter the electron transport chain to produce reactive oxygen species (ROS), which then participate in protective autophagy rather than apoptotic cell death [18].
Autophagy, a self-digestion process, plays an important role in maintaining intracellular homeostasis. Autophagy can clear intracellular abnormally folded protein and dysfunctional organelles, inhibit cell stress response, and prevent genetic damage in early phases of tumorigenesis. However, autophagy helps tumor cells survive nutritional deficiencies and hypoxic conditions when tumors develop and accumulate more mutations to promote malignant progression [20]. Further, tumorsurrounding normal cells, which are active and essential parts of the microenvironment, support tumor proliferation by autophagy. Besides, autophagy in distant organs may also support growth of tumor tissue [21]. Additionally, autophagy can act as a mechanism of tumor resistance to chemotherapy agents and lead to antagonistic effects of gefitinib combined with cisplatin in NSCLC treatment, which may contribute to poor therapeutic effectiveness and patient prognosis [22,23]. Further, our results suggest that low methylation of AGING cg14326354PRODH may potentially promote PRODH expression, further heighten autophagy to some extent [24], and then result in poor prognosis ( Figure 4).
Age is an independent risk factor for lung cancer survival [25]. Individual aging implies a higher abundance of senescent cells in aged tissues and reflects an increase in the generation of senescent cells [26]. At old age, senescent cells generate a protumorigenic microenvironment, though at young age these cells may protect against transformation into primary tumors [27]. A previous study also shows that p53 function declines during aging [28] and might promote tumor growth and decrease cancer survival [29]. Moreover, senescent cells can promote reprogramming of tumor stem cells, increase cancer stemness, and accelerate tumor growth [30]. Thus, combined with our results, increased generation of senescent may be relevant to poor NSCLC prognosis for elderly patients (Figure 4). AGING Autophagy is reduced in aging, likely through several mechanisms [31]. Lipofuscin produced during aging can destroy the function of lysosomes, restricting binding between autophages and lysosomes [32]. In addition, expression of lysosome-associated membrane glycoprotein (LAMP2a), which assists autophagy, decreases during aging and thus can inhibit autophagy [33]. Further, by guaranteeing stability of the cellular proteome and proper organelle turnover, autophagy can prevent or slow down aging and extend lifespan [34]. The antagonistic effect exists between aging and the autophagy level resulting from low methylation of cg14326354PRODH, in spite of the harmful effect of both, which could provide a possible mechanism of the cg14326354PRODH-age interaction ( Figure 4).
The 821 significant trans-regulated genes we identified were enriched in KEGG pathways including inflammation and immune-related pathways. Notably, cellular senescence was involved in these pathways, again indicting potential indirect induction of cg14326354PRODH on senescence. Meanwhile, the NF-κB pathway, with the ability to upregulate genes responsible for inflammation, cell survival, proliferation, invasion, angiogenesis, and metastasis, often plays a critical role in initiation, promotion, progression, and therapy resistance of cancers [35,36]. Further, NF-κB family members can activate or inhibit signaling pathways, leading to induction of autophagy or transcription of certain pro-autophagicregulating genes [35], and can induce senescence [37]. Because cell proliferation can be associated with both senescence and survival [38,39], we also analyzed several proliferation-associated genes retrieved from the KEGG database. Expression of these genes were significantly correlated with cg14326354PRODH methylation and affected LUAD survival, including MKI67, BTG2, KIAA1524, and CDC123 (Supplementary Table 7). Our previous study of BTG2 expression and methylation already indicated it is a prognostic biomarker of NSCLC [7]. These results also indicate the potential role of cg14326354PRODH in indirect induction of autophagy, senescence, and survival. Further functional studies are warranted to elucidate the mechanism of cg14326354PRODH and age interaction on LUAD survival.
Age represents a complex surrogate for a host of underlying phenomena, although its measurement is simple and accurate [40]. A previous study suggested that gene-age interactions may partially be surrogates for gene-gene and gene-environment interactions [41]. In a study investigating the efficacy of metronomic vinorelbine to treat patients with advanced unresectable NSCLC, age was an important factor that decreased treatment efficacy [42]. Our study might provide a novel explanation of age effects on treatment efficacy from the cg14326354PRODH-age interaction perspective. Further clinical studies will provide additional insight into cg14326354PRODH and its age-specific effects in tumors, which may lead to new age-specific biomarkers and therapeutic strategies that improve prediction accuracy and treatment efficacy.

AGING
Our study has several strengths. First, this is the first epigenome-wide study to investigate the interaction between DNA methylation and age on NSCLC survival, which provides new evidence to account for the missing heritability of complex diseases [43] and may further reveal the role of age in heterogeneity of NSCLC prognosis and treatment efficacy. Second, to identify stable and reliable biomarkers, a two-stage study design along with Bonferroni correction and sensitivity analysis was used to exhaustively search for interactions, which is quite conservative in controlling for false positives. Finally, with a large sample size to analyze DNA methylation-age interactions, our study has improved statistical power to identify complex associations with small-medium effect size.
Nonetheless, several limitations also need to be acknowledged. First, we did not observe robustly significant methylation-age interactions on survival for LUSC, which may be due to limited sample size and thus insufficient power. However, there was no significant heterogeneous effect between LUAD and LUSC groups (Supplementary Table 8). Second, the association was no longer significant in young LUAD patients when the analysis used UN standards to define age groups. However, we still observed a significant association in patients <57 years old. This effect might be because >62% (240/385) of young patients defined using the UN standard (57-65 years) attenuated the effect of cg14326354PRODH methylation. Therefore, high methylation of cg14326354PRODH might benefit survival of young LUAD patients. Third, although widespread methylation-age interactions may exist, we only identified one interaction, which may be due to the most conservative correction method used in the discovery phase and limited statistical power in the validation phase due to low event rate of survival time in the TCGA population. We may need longer time to followup early-stage patients in TCGA for their events to occur. Nevertheless, the interaction between cg14326354PRODH and age was successfully confirmed, indicating it was a conservative and robust association. Fourth, our analysis was based on the assumption of linear additive interaction, and new statistical models can be used to properly capture non-linear methylationage interactions. Last, the cis-regulation pattern of cg14326354PRODH requires more biological evidence, although methylation is believed to play a crucial role in regulating gene expression [44] and further influence disease gene function [45]. Therefore, our findings should be interpreted with caution, and functional experiments are warranted to confirm these associations.
In conclusion, low methylation of cg14326354PRODH benefited survival of elderly LUAD patients. Our results have implications for not only age-specific prediction of cancer survival, but also possible methylation-specific drug targeting.

Lung cancer study populations
Only early-stage (stage I-II) LUAD and LUSC patients were included in our study. DNA methylation data was harmonized from five cohorts: Harvard, Spain, Norway, Sweden, and TCGA.

Harvard
Since 1992, patients have been recruited at Massachusetts General Hospital (MGH) and histologically confirmed as primary NSCLC [46]. We profiled 151 early-stage patients from this cohort. During curative surgery, tumor specimens were collected with complete resection and snap-frozen. A MGH pathologist evaluated each specimen for tumor cell amount (tumor cellularity > 70%) and quality. Specimens were classified histologically according to World Health Organization (WHO) criteria. The Institutional Review Boards at Harvard T.H. Chan School of Public Health and MGH approved the study. All patients provided written informed consent.

Spain
The Spain cohort included 226 early-stage NSCLC patients recruited from eight sub-centers in 1991-2009 [47]. Tumor DNA was extracted from fresh-frozen tumor specimens and further checked for integrity and quantity. Patients provided written consent, and tumors were surgically collected. The study was approved by the Bellvitge Biomedical Research Institute Institutional Review Boards.

Norway
The Norway study population consisted of 133 earlystage NSCLC patients from Oslo University Hospital-Riks Hospitalet, Norway, in 2006-2011 [48]. Tumor tissues were snap-frozen in liquid nitrogen and stored at -80°C until DNA isolation. The project was developed with approval of the Oslo University Institutional Review Board and Regional Ethics Committee (S-05307). All patients provided informed consent.

Sweden
Tumor DNA was collected from 103 early-stage NSCLC patients, including 80 LUAD and 23 LUSC patients, at the Skane University Hospital in Lund, Sweden [49]. The study was developed under the approval of the Regional Ethical Review Board in Lund, Sweden (registration no. 2004/762 and 2008/702). All patients provided written informed consent.
AGING TCGA A total of 332 LUAD and 285 LUSC cases with full DNA methylation, survival time, and covariate data were included. Level-1 HumanMethylation450 DNA methylation data of early-stage NSCLC patient were downloaded on October 01, 2015.
Quality control for DNA methylation data DNA methylation was assessed using Illumina Infinium HumanMethylation450 BeadChips (Illumina Inc.). Raw image data were imported into Genome Studio Methylation ModuleV1.8 (Illumina Inc.) to calculate methylation signals and to perform normalization, background subtraction, and QC. Unqualified probes were excluded if they fit any of the following criteria: (i) failed detection (P > 0.05) in ≥5% samples, (ii) coefficient of variance (CV) <5%, (iii) all samples methylated or all unmethylated, (iv) common single nucleotide polymorphisms located in probe sequence or in 10-bp flanking regions, (v) cross-reactive probes [50], or (vi) data did not pass QC in all cohorts. Samples with >5% undetectable probes were excluded. Methylation signals were further processed for quantile normalization (betaqn function in R package minfi [51]), type I and II probe correction (BMIQ function in R package lumi [52]), and adjusted for batch effects (ComBat function in R package sva) [53]. Details of QC process are described Supplementary Figure 6.

Quality control for gene expression data
The TCGA workgroup completed mRNA sequencing data processing and QC. Raw counts were normalized using RNA-sequencing by expectation maximization (RSEM). Level-3 gene quantification data were downloaded from the TCGA data portal and were further checked for quality. Expression of genes was extracted and log2-transformed before analysis. Normalization results were then evaluated through boxplots of the distribution of gene expression across all samples (Supplementary Figure 7).

Statistical analysis
Statistical analysis flow is presented in Figure 1. Patients from Harvard, Spain, Norway, and Sweden study cohorts were assigned to the discovery phase, while patients in TCGA were assigned to the validation phase.
In the discovery phase, histology-stratified analysis and Cox proportional hazards model adjusted for age, smoking status, sex, clinical stage, and study center were applied to test the methylation-age interaction effect on overall survival in LUAD and LUSC patients using the R package survival [54]. Hazard ratio (HR) and 95% confidence interval (CI) were described per 5% level of methylation decrement. The P-value threshold for multiple testing was established using the Bonferroni method, which set the significance level to 0.05 divided by number of tests. This way, overall type I error was controlled at the 0.05 level. In our study, significance level of interaction analysis was defined as 1.60×10 -07 = 0.05/311,891. Interactions with P ≤ 1.60×10 -07 were screened out and then further confirmed in the validation phase. Robustly significant probes were retained if they fit both of the following criteria: (i) interaction P ≤ 0.05, and (ii) direction of interaction effects was consistent across both phases. In sensitivity analysis, patients were excluded if their methylation values were out of range mean ± 3×standard deviation (SD) on logit2-transformed scale. Genome-wide expression correlation analysis was performed to identify potential trans-regulation genes in TCGA. KEGG enrichment analysis of potential transregulation genes (Bonferroni adjusted P < 0.05) was conducted using R package clusterProfiler [55].
Continuous variables were summarized as mean ± SD; categorical variables were described as n (%). Kaplan-Meier survival curves were used to illustrate survival difference between patients in low and high methylation groups (categorized by median value). We used two classification criteria to define young and elderly patients: (i) the UN standard (1956) of 65 years old as the cut-off value to distinguish elderly and young people [56], (ii) and a cut-off value calculated based on BoCI of HR of the CpG probe. All statistical analyses were performed in R version 3.5.1 (The R Foundation).  Patients from Harvard, Spain, Norway, and Sweden cohorts were assigned to discovery phase; patients in TCGA were assigned to validation phase. HR: hazard ratio; 95% CI: 95% confidence interval Supplementary Table 7. Correlation analysis of association between cg14326354PRODH methylation and proliferationassociated gene expression in lung adenocarcinoma (LUAD) patients using The Cancer Genome Atlas data, as well as survival analysis of proliferation-associated genes (from KEGG database).

Abbreviations
Correlation coefficient (r), 95% CI, and P-values were derived from Pearson correlation analysis; survival analysis HR, 95% CI, and P-values were derived from Cox proportional hazards model. HR: hazard ratio; 95% CI: 95% confidence interval.