Integrative analysis of clinical and epigenetic biomarkers of mortality

Abstract DNA methylation (DNAm) has been reported to be associated with many diseases and with mortality. We hypothesized that the integration of DNAm with clinical risk factors would improve mortality prediction. We performed an epigenome‐wide association study of whole blood DNAm in relation to mortality in 15 cohorts (n = 15,013). During a mean follow‐up of 10 years, there were 4314 deaths from all causes including 1235 cardiovascular disease (CVD) deaths and 868 cancer deaths. Ancestry‐stratified meta‐analysis of all‐cause mortality identified 163 CpGs in European ancestry (EA) and 17 in African ancestry (AA) participants at p < 1 × 10−7, of which 41 (EA) and 16 (AA) were also associated with CVD death, and 15 (EA) and 9 (AA) with cancer death. We built DNAm‐based prediction models for all‐cause mortality that predicted mortality risk after adjusting for clinical risk factors. The mortality prediction model trained by integrating DNAm with clinical risk factors showed an improvement in prediction of cancer death with 5% increase in the C‐index in a replication cohort, compared with the model including clinical risk factors alone. Mendelian randomization identified 15 putatively causal CpGs in relation to longevity, CVD, or cancer risk. For example, cg06885782 (in KCNQ4) was positively associated with risk for prostate cancer (Beta = 1.2, P MR = 4.1 × 10−4) and negatively associated with longevity (Beta = −1.9, P MR = 0.02). Pathway analysis revealed that genes associated with mortality‐related CpGs are enriched for immune‐ and cancer‐related pathways. We identified replicable DNAm signatures of mortality and demonstrated the potential utility of CpGs as informative biomarkers for prediction of mortality risk.

Cognitive Epidemiology (CCACE), which is supported by funding from the BBSRC, the Medical Research Council (MRC), and the University of Edinburgh as part of the cross-council Lifelong Health and Wellbeing initiative (MR/K026992/1). W.D.H. is supported by a grant from Age UK (Disconnected Mind Project). The VA Normative Aging Study is sponsored by the CSP/ERIC program of the U.S. Department of Veterans Affairs and is a research component of the Massachusetts Veterans Education Research and Information Center (MAVERIC). The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. 75N92022D00001, 75N92022D00002, 75N92022D00003, 75N92022D00004, and 75N92022D00005. Funding was also supported by 5RC2HL102419 and R01NS087541. The authors thank the staff and participants of the ARIC study for their important contributions. J.M. is supported by a grant from NIH HL135075. W.D.H. is supported by a Career Development Award from the Medical Research Council (MRC) [MR/T030852/1] for the project titled "From genetic sequence to phenotypic consequence: genetic and environmental links between cognitive ability, socioeconomic position, and health."

| INTRODUC TI ON
Despite substantial evidence of heritability of human longevity (h 2 = 10-30%), genome-wide association studies (GWAS) have reported few loci associated with human longevity Pilling et al., 2017;Timmers et al., 2019;van den Berg et al., 2017). DNA methylation (DNAm), the covalent binding of a methyl group to the 5′ carbon of cytosine-phosphate-guanine (CpG) dinucleotide sequences, reflects a wide range of environmental exposures and genetic influences at the molecular level and altered DNAm has been shown to regulate gene expression (Jones & Takai, 2001). Recent studies have reported DNAm patterns associated with age in humans (Hannum et al., 2013;Horvath, 2013;Levine et al., 2018;Lu et al., 2019). Estimates of biological age based on DNAm referred to as "epigenetic age" or "DNAm age" have been validated in numerous studies, although the functions of these age-associated CpGs are largely unknown Lu et al., 2019;.
DNAm age also has been shown to be predictive of many age-related diseases and of all-cause mortality (Chen et al., 2016;Dugué et al., 2018;Levine et al., 2018;Lu et al., 2019;. Despite the association of DNAm age with a variety of ageassociated outcomes, age-related CpGs are different from those that are most strongly associated with mortality. Relatively few DNAm studies have focused on mortality as the primary outcome (Colicino et al., 2020;Svane et al., 2018;Zhang et al., 2017). Moreover, due to sample size limitations, most DNAm mortality studies have not typically investigated cause-specific mortality such as death due to cardiovascular disease (CVD) and cancer. Additionally, little is known about the prediction performance of DNAm-based mortality models and whether or not such approaches improve mortality prediction above and beyond established clinical risk factors.
We hypothesized that inter-individual variation in DNAm is associated with all-cause mortality risk and with cause-specific mortality, and that we could build models incorporating CpGs that would improve mortality prediction beyond established clinical risk factors. In this study, we report the results of a meta-analysis of epigenome-wide association studies (EWAS) of all-cause mortality and cause-specific mortality including death from CVD and cancer in up to 15,013 individuals from 15 prospective cohort studies in which DNAm was measured in whole blood. We built all-cause mortality risk prediction models using penalized regression and machine learning methods and integrated DNAm and established mortality clinical risk factors and validated the models' performance. Additionally, using Mendelian randomization, we identified putatively causal CpGs for mortality.
Last, we investigated the downstream gene expression and pathway changes of the mortality-related CpGs by testing associations between DNAm and gene expression. Figure 1 summarizes the multistep study design.

| 5 of 22
HUAN et Al. Table 1 presents the major clinical characteristics of the 15,013 study participants including 11,684 European ancestry (EA, mean age 65, 55% women) and 3329 African ancestry (AA, mean age 59, 70% women) participants from 15 cohorts ( Table S1 summarizes
Overall genomic inflation in meta-analysis (λ) was estimated at 1.15 or less, indicating low inflation and low risk of false-positive findings.
Even though cohort-specific analysis showed slightly higher genomic inflation in some cohorts (λ > 1.5 in two cohorts, Table S4), forest plots show that the results were not driven by results from one or several cohorts (Fig. S1). Sensitivity analysis results including metaanalysis after correcting for λ in each cohort, meta-analysis after excluding results from two cohorts with λ > 1.5, and meta-analysis after excluding RS cohort are included in Tables S5 and S6. Results of the sensitivity analysis remained consistent with the main results in terms of direction and effect estimates with Pearson's correlation r = 0.99 (in EA, corrected for λ in each cohorts), r = 1.00 (in EA, after removing two cohorts with λ > 1.5), r = 1.00 (in EA, after removing RS) and r = 1.00 (in AA, corrected for λ in each cohorts).
Methylation at the remaining 26 (15%) CpGs was positively associated with mortality, with HRs >1 (range 1.13 to 1.32). The 177 CpGs are annotated to 121 genes and 43 intergenic regions.
Because ARIC had longer follow-up than the other cohorts, in sensitivity analysis, we truncated ARIC follow-up at 15 years. The HRs for the significant CpGs (at p < 1 × 10 −5 ) remained consistent with the main results in terms of direction and effect estimates with Pearson's correlation r = 1.00 and r = 0.99 in EA and AA participants, respectively (Tables S2-S3 and Fig. S2).

| Associations of DNAm with CVD death and cancer death
In comparison with results for all-cause mortality, fewer CpGs were associated with CVD death (at p < 1 × 10 −7 , n = 4 in EA, and n = 15 in AA) and cancer death (n = 0 in EA, and n = 1 in AA); Tables S7-S8 report the corresponding results at p < 1 × 10 −5 . Among the 163 all-cause mortality-related CpGs identified in EA participants at p < 1 × 10 −7 , 41 CpGs were associated with CVD death, 16 with cancer death and 5 with both (at p < 0.05/163, Table S2). Among the 17 CpGs identified in AA participants at p < 1 × 10 −7 , 15 were associated with CVD death, 9 with cancer death and 8 with both (at p < 0.05/17, Table   S3). Figure 2 shows the effect sizes and direction of effect for the top CpGs associated with all-cause mortality, and their consistency with the results of analyses of CVD death and cancer death. If a CpG was positively correlated with all-cause mortality, it also was positively correlated with CVD death and cancer death, and vice versa.

| Mortality prediction model
To investigate whether DNAm can be used to predict mortality risk, we constructed prediction models for all-cause mortality and evaluated their prediction of all-cause mortality, CVD death, and cancer death. To ensure unbiased validation, we split the EA cohorts into separate discovery and replication sets ( Figure 1 shows the analysis flowchart). The discovery cohorts consisted of 8288 participants (including 2173 deaths from all causes) from 10 cohorts, excluding FHS (n = 2427) and ARIC (n = 969), which were used as replication cohorts. The meta-analysis of the discovery set identified 74 CpGs at p < 1 × 10 −7 , 158 CpGs at p < 1 × 10 −6 , 357 CpGs at p < 1 × 10 −5 , 931 CpGs at p < 1 × 10 −4 , 2717 CpGs at p < 1 × 10 −3 , and 28,323 CpGs at p < 0.05. We evaluated three types of input features: (a) clinical risk factors only (i.e., clinical risk factor models); (b) CpGs identified in the meta-analysis of the discovery set (i.e., CpG models); and (c) the input features including both CpGs and clinical risk factors (i.e., integrative models). We also compared four prediction methods including Elastic net-Cox proportional hazards (Elasticcoxph;Friedman et al., 2010), Random survival forest (RSF) (Ishwaran et al., 2008), Cox-nnet (Ching et al., 2018), and DeepSurv (Katzman et al., 2018) (see Methods for details). In general, the four prediction methods did not show major differences in predicting mortality outcomes as assessed by multiple evaluation metrics (Table S9 lists the evaluation metrics across all four methods). To simplify the presentation of results, we focused on the Elastic-coxph method (Figure 3).

| Clinical risk factors strongly predict all-cause mortality and CVD death
The C-index of the clinical risk factor models (age, sex, and 12 clini-

| DNAm predicts mortality independently of age and clinical risk factors
The models using all-cause mortality-related CpGs identified in the discovery cohorts as the sole input feature (the CpG model) were predictive of all-cause mortality, CVD death, and cancer death in the replication set. As shown in Fig. S3, when more discovery CpGs were added to the model, the prediction performance metrics did not always improve. In FHS, the models with discovery CpGs at p < 1 × 10 −3 showed the best predictive performance for all-cause mortality (C-index = 0.77) and CVD death (C-index = 0.82), but the model with discovery CpGs at p < 1 × 10 −5 showed the best predictive performance for cancer death (excluding prevalent cancer . The final CpG models that were trained using all FHS participants are provided in Table S12 including 76 CpGs to predict all-cause mortality and CVD death, and in Table S13 including 56 CpGs to predict cancer death (excluding prevalent cancer cases 3.0 × 10 −5 ).

F I G U R E 2
Effect sizes (log hazards ratios) and 95% confidence intervals of CpGs related to mortality identified by meta-analysis, comparing the results for all-cause mortality, CVD death, and cancer death. (a) Results of meta-analysis of European ancestry (EA); (b) Results of meta-analysis of African ancestry (AA). These figures showed the CpGs associated with all-cause mortality identified by the metaanalysis, which were also associated with either CVD death or cancer death passing Bonferroni-corrected threshold. Figure 1a shows 51 CpGs in EA, including 41 CpGs associated with CVD death, 16 with cancer death, and 5 with both. Figure 1b shows 16 CpGs in AA, including 15 CpGs associated with CVD death, 8 with cancer death, and 7 with both 2.5.3 | The integrative model (trained by CpGs and clinical risk factors) moderately improved upon the clinical risk factor model for all-cause mortality and CVD death, and substantially improved the prediction of cancer death As shown in Table 3, the integrative models demonstrated robustness for predicting mortality outcomes, with a good C-index, HR, and low brier error rate. The final integrative models trained using data from all FHS participants are provided in Table S14

F I G U R E 3
Kaplan-Meier estimates of mortality risk scores with respect to mortality outcomes in ARIC study. (a) Survival curves with respect to all-cause mortality; (b) survival curves with respect to CVD death; (c) survival curves with respect to cancer death. The results were obtained from ARIC European ancestry participants with follow-up truncated at 15 years. For cancer death, we excluded samples who had any type of cancer before blood drawn for DNA methylation measurements. The mortality risk scores for (a) and (b) were computed by the model (Table S10), and for (c) was computed by the model (Table S11) and 0.76 (ARIC) for cancer death. Kaplan-Meier survival curves for the mortality risk scores (split into high-, middle-, and low-risk groups) in the ARIC EA cohort (computed by the integrative models using clinical risk factors and CpGs at discovery p < 1 × 10 −6 , Tables S14-S15) illustrate the higher death rate for those with a higher mortality risk score (log-rank p < 1 × 10 −6 , Figure 4). In comparison with the clinical risk factor models, the integrative models slightly improved prediction of all-cause mortality (0.7% increase in C-index with addition of CpGs in FHS and 2% increase in ARIC), and of CVD death (2% increase in C-index in FHS, but no increase in ARIC We also tested the mortality prediction models' performance using the entire ARIC EA data (without truncation, Note: The clinical risk factor models were trained by using clinical risk factors as the sole input features. The CpG Models were trained by using CpGs selecting in the discovery meta-analysis. The integrative model was trained by using both clinical risk factors and CpGs selecting in the discovery meta-analysis. The Clinical Risk Factor Model used to predict all-cause mortality and CVD death was shown in Table S10, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S11. The CpG model used to predict all-cause mortality and CVD death was shown in Table S12, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S13. The integrative model used to predict all-cause mortality and CVD death was shown in Table S14, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S15. a HR, C-index and IBS values in FHS reflect the average values of 10 times cross-validation. b The results were obtained from ARIC European ancestry participants with follow-up truncated at 15 years.  (Table S18).

| Associations of DNAm with genetic variants and Mendelian randomization analysis
Among the 177 all-cause mortality-related CpGs (union of EA and AA results at p < 1 × 10 −7 ), 123 CpGs had significant associations with genetic variants (i.e., cis-or trans-meQTL variants). meQTL variants for 80 CpGs could be linked to 618 GWAS Catalog (Buniello et al., 2019) index SNPs associated with 432 complex traits or diseases (Table S19).
The genes located at these CpGs or cis-eQTM mRNAs were not enriched for any biological processes or pathways. For the 719 allcause mortality-related CpGs at p < 1 × 10 −5 , genes located at CpG sites were enriched for multiple immune functions, cellular response to organic substance, and negative regulation of cell communica-  (Kanehisa & Goto, 2000), p < 0.05, Table S21). There were 79 cis-DNAm-mRNA pairs (63 CpGs and 67 mRNAs, Table S22). Our study is one of the largest EWAS of mortality to date (Colicino et al., 2020;Svane et al., 2018;Zhang et al., 2017), and it revealed many replicable DNAm signatures for all-cause mortality.

| DISCUSS ION
Our results are consistent with those from previous EWAS of allcause mortality; the vast majority of CpGs (85% in our study, 84% in , and 67% in (Colicino et al., 2020)) were inversely associated with mortality suggesting a greater mortality risk with lower CpG methylation. Our study identified more CpGs in EA cohorts (n = 163) than in AA cohorts (n = 17). As shown in Table 2, the effect sizes (i.e., HR) of mortality-related CpGs in EA and AA participants were quite similar. We speculate that our study identified many more CpGs in EA participants than AA participants due the greater statistical power of the larger EA sample size. Using different DNAm data normalization methods (such as Noob (Triche Jr et al., 2013), SWAN (Maksimovic et al., 2012), BMIQ (Teschendorff et al., 2013), and Dasen (Pidsley et al., 2013), see File S1) in different cohorts may also affect the reproducibility of the results. For the 177 all-cause mortality-related CpGs (union of EA and AA results at p < 1 × 10 −7 ), we examined their overlap with trait-associated F I G U R E 4 Hazard ratios per standard deviation increment with 95% confidence intervals for mortality. (a) With respect to all-cause mortality; (b) with respect to CVD death; and (c) with respect to cancer death. The results were obtained from ARIC European ancestry participants with follow-up truncated at 15 years. For cancer death, samples who had any type of cancer before blood drawn for DNA methylation measurements were excluded. Cox regression models were used to relate mortality outcomes to inversely transformed mortality risk scores computed by Integrative models (Tables S12-S13) and CpG models (Tables S10-S11), and inversely transformed DNAm age including GrimAge (Lu et al., 2019), PhenoAge (Levine et al., 2018), Horvath Age (Horvath, 2013), and Hannum Age (Hannum et al., 2013). Adj age and sex indicated the association further adjusted for age and sex. Adj age, sex and risk factors indicated the association further adjusted for age, sex and the other clinical risk factors CpGs in the EWAS catalog (   Table   S20). Among the four CpGs passing a Bonferroni-corrected threshold in MR analyses, cg06885782 in KCNQ4 was reported to be associated with risk for prostate cancer (beta = 1.2, p MR = 4.1 × 10 −4 ) and negatively associated with longevity (beta = −1.9, p MR = 0.02).

KCNQ4 (potassium voltage-gated channel subfamily Q member 4)
was previously reported to be associated with age-related hearing impairment (Van Eyken et al., 2006), and it contains genetic variants associated with all-cause mortality and survival free of major diseases (Walter et al., 2011). cg07094298 in the gene body of TNIP2 was previously identified as causal for lung cancer. A recent study reported TNIP2-ALK fusion in lung adenocarcinoma . cg04907244 (in TSS1500 of SNORD93) was identified as causal for prostate cancer by MR. SNORD93 and its methylation was reported to be associated with several cancer types including uveal melanoma (Gong et al., 2017), breast cancer (Patterson et al., 2017), and renal clear cell carcinoma (Zhao et al., 2020). Pathway analysis further supported a role of mortality-related CpGs in relation to cancer risk. The intragenic CpGs were enriched for genes in cancer pathways, possibly as a consequence of the expression of nearby genes (cis-eQTMs analysis, Table S21) related to immune function.  et al., 2013), and GrimAge (Lu et al., 2019) were associated with mortality before accounting for risk factors. Only GrimAge, however, remained associated with mortality after adjusting for clinical risk factors. In contrast, the other three DNAm age models were no longer associated with mortality ( Figure 4). One possible explanation is that the three DNAm age predictors (i.e., PhenoAge, Horvath Age, and Hannum Age) identify CpGs associated with age, but are not specific for all-cause or cause-specific mortality risk. Of note, the CpGs that serve as DNAm mortality predictors and those that predict DNAm age in the three models do not overlap. Among the top CpGs (N = 177) associated with all-cause mortality in our EWAS, only cg00687674 in TMEM84 is included in PhenoAge (Levine et al., 2018), and only cg19935065 in DNTT appears in Hannum Age (Hannum et al., 2013).
GrimAge may have outperformed the other three DNAm age models in predicting mortality because the CpGs that it uses are associated with the levels of 80 CVD-related blood proteins, and with lifestyle and clinical risk factors (such as smoking), and mortality (Ho et al., 2018;Shah et al., 2019;Yao et al., 2018). However, because the CpGs in the GrimAge model are not disclosed (i.e., they are proprietary), we were unable to determine whether any of the mortality-related CpGs in our study overlap with CpGs in the GrimAge model. Of note, our mortality prediction models (both the CpG only model and the integrative model that included CpGs and the clinical risk factors) outperformed GrimAge in prediction of mortality outcomes.

We tested and compared four prediction methods including
Elastic-coxph (Friedman et al., 2010), a regression-based method, and three machine learning methods (Ching et al., 2018;Ishwaran et al., 2008;Katzman et al., 2018). The machine learning models did not outperform Elastic-coxph (Table S9 and Fig. S2). The clinical risk factor model trained by machine learning methods did not perform well in independent external replication. For example, the C-index of the clinical risk factor model for all-cause mortality was 0.67 using RSF 17 versus 0.75 using Elastic-coxph in ARIC participants. Based on this metric, the machine learning methods did not outperform the regressionbased methods when there were relatively few features as inputs.
The primary outcome of our study was all-cause mortality. We did not train prediction models for CVD death or cancer death, but we tested the prediction ability of the all-cause mortality predictor on CVD death and cancer death. The CpGs in the model were restricted to all-cause mortality-related CpGs. As shown in Figure 1, the top DNAm signatures for all-cause mortality showed the same direction of effect for CVD death and cancer death. It is possible that some CpGs show opposite directions in relation to CVD death and cancer death, but we did not train separate models for these outcomes. Therefore, developing separate prediction models for CVD death and cancer death with a very large sample size would be an important next step.

| CON CLUS IONS
In conclusion, the ancestry-stratified epigenome-wide meta-analyses in 15 population-based cohorts identified replicable DNAm signatures of all-cause and cause-specific mortality. The top mortalityassociated CpGs were linked with genes involved in immune-and cancer-related pathways, and were reported to be associated with human longevity, CVD risk factors, and several types of cancer. We constructed and validated DNAm-based prediction models that predicted mortality risk independent of established clinical risk factors. for ARIC (mean 20.0 for EA and 18.6 for AA). The protocol for each study was approved by the institutional review board of each cohort. Further details for each cohort were included in File S1.

| Mortality ascertainment and clinical phenotypes
Outcomes including death from all causes, deaths from CVD, and deaths from cancer were prospectively ascertained in each cohort. Survival status and details of death were ascertained using multiple strategies, including routine contact with participants for health history updates, surveillance at the local hospital, review of obituaries in the local newspaper, and National Death Index queries. Death certificates, hospital and nursing home records prior to death, and autopsy reports were requested and reviewed. Date and cause of death were determined separately for each cohort following review of all available medical records and /or were register-based.
The clinical and lifestyle risk factors (referred to as clinical risk factors for simplicity thereafter) used as covariates in this study included age, sex, body mass index (BMI), smoking, alcohol consumption, physical activity, educational attainment, and prevalent diseases including hypertension, coronary heart disease (CHD), heart failure, stroke, type II diabetes, and cancer. Fourteen clinical risk factors were chosen based on prior knowledge; most of these are key CVD risk factors.
The clinical risk factors were ascertained at the time of blood draw for DNAm measurements. BMI was calculated as weight (kg) divided by height squared (m 2 ). Educational attainment (years of educational schooling), physical activity (frequency, intensity, or the metabolic equivalent of task [MET] scores), smoking status (yes/no, or cigs/day), and alcohol consumption (drinks per day) were self-reported or ascertained by an administered questionnaire at routine research clinic visits. Diabetes was defined as a measured fasting blood glucose level of >125 mg/dl or current use of glucose-lowering prescription medication. Hypertension was defined as a measured systolic blood pressure (BP) ≥140 mm Hg or diastolic BP ≥90 mm Hg or use of antihypertensive prescription medication. Cancer was defined as the occurrence of any type of cancer excluding non-melanoma skin cancer.

| DNA methylation measurements and quality control
For each cohort, DNA was extracted from whole blood and bisulfiteconverted using a Zymo EZ DNA methylation kit. DNAm was measured using the Illumina Infinium HumanMethylation450 (450K) BeadChip platform (Illumina Inc., San Diego, CA). Each cohort conducted independent laboratory DNAm measurement, quality control (including sample-wise and probe-wise filtering, and probe intensity background correction; see File S1).

| Cohort-specific epigenome-wide association analysis
The correction of methylation data for technical covariates was cohort specific. Each cohort performed an independent investigation to select an optimized set of technical covariates (e.g., batch, plate, chip, row, and column), using measured or imputed blood cell type fractions, surrogate variables, and/or principal components. Most cohorts had previous publications using the same dataset for EWAS of different traits, such as EWAS of alcohol drinking and smoking (Mendelson et al., 2017;Michailidou et al., 2017). In this study, those cohorts used the same strategies as they did previously for correcting for technical

| Meta-analysis
The meta-analysis was performed for all-cause mortality, CVD death, and cancer death in EA (n = 11,684) and AA (n = 3329) participants, respectively, using inverse variance-weighted random-effects models implemented in metagen() function R packages (https://rdrr. io/cran/meta/man/metag en.html). We chose a random-effects model because of the heterogeneity in follow-up length and population demographics in the different cohorts (Table S1). We excluded the EWAS results for a study with <20 deaths. We excluded probes mapping to multiple locations on the sex chromosomes or with an underlying SNP (MAF > 5% in 1000 Genome Project data) at the CpG site or within 10bp of the single base extension. In addition, the meta-analysis was constrained to methylation probes passing filtering criteria in five or more cohorts (see File S1), which resulted in ~400,000 CpGs that were included in the final analyses. The statistical significance threshold was p < 0.05/400,000 ≈ 1 × 10 −7 .
Three types of sensitivity analyses were performed including (1) correcting for λ values in each cohorts (Devlin et al., 2001), (2) excluding two cohorts with λ > 1.5 from the meta-analysis, and (3) excluding results of RS, because the cohort-specific analysis in RS having a strange distribution of top hits. There were 157 CpGs identified at p < 1e-7 in the RS cohort-specific analysis. The number is much more than the number of all-cause mortality-associated CpGs identified in the other cohorts.

| Mortality prediction models
Mortality prediction models based on clinical risk factors and with the addition of DNAm were built and tested in EA cohorts. The analysis flowchart is shown in Figure 1. To ensure unbiased validation, we split the EA cohorts into discovery and replication sets. The discovery cohorts consisted of 8288 participants from 10 cohorts, excluding FHS (n = 2427) and ARIC (n = 969
Elastic-coxph is a Cox regression model regularized with elastic net penalty (Friedman et al., 2010). Performing this method requires to identify best values of two parameters, α and λ. We tuned each model by iterating over a number of α and λ values under crossvalidation. α indicated linearly combined penalties of the lasso (α=0) and ridge (α=1) regression. λ is the shrinkage parameter, when λ =0 indicated no shrinkage, and as λ increases, the coefficients are shrunk ever more strongly. Effectively this will shrink some coefficients close to 0 for optimizing a set of features. The α value was set to 0.5, and the λ value was set to lambda.min when training models.
RSF is an ensemble tree model that is based on the random forest method for survival analysis (Ishwaran et al., 2008). The optimized values of parameters in RSF models, including the number of trees (nTrees=100) and nodeSize =15, were chosen by iterating over a number of values which maximized the accuracy of RSF models tested in the replication sets under cross-validation. RSF can compute feature importance scores for feature selection.
Cox-nnet is an artificial neural network-based method for survival analysis (Ching et al., 2018). Cox-nnet includes two layer neural network: one hidden layer and one output layer. The output layer was used to perform Cox regression based on the activation levels of the hidden layer. Cox-nnet could also compute feature importance scores for feature selection. For each model training, the L2 regularization parameter is optimized using the L2CVProfile Python function by iterating over a number of values under cross-validation.
DeepSurv is a deep learning-based survival prediction method (Katzman et al., 2018). DeepSurv uses a multi-layer feed forward neural network, of which the hidden layers consist of a fully con- The 2427 FHS participants were randomly split into 5 equal sets (n=485 or 486 in each set), and each set included approximately equal numbers of deaths. We then used 3 of the 5 sets (60%) for model training and the remaining 2 sets (40%) for model testing. In doing so, we obtained 10 combinations. In each training / testing combination, we constructed a model using the training data, and then used the model to generate a mortality risk score based on the testing data. We assessed associations of the predicted mortality risk score (after inversely normal transformation) with all-cause mortality, CVD death, and cancer death in the testing data using time-to-event proportional hazards models. This data partitioning and cross-validation strategy was only used to assess the robustness of prediction models when using different features and methods, and to select the optimized parameters for training models. The final models reported were built on all FHS participants using the optimized parameters. We also repeated the same analysis steps using FHS participants without cancer at baseline (n=2038; 238 deaths from all causes, 70 from CVD, and 42 from cancer).

| Validation
The prediction models built using all FHS participants were tested in ARIC EA participants for the prediction of mortality outcomes.
We performed tests on all-cause mortality and CVD death on all ARIC EA participants truncated at 15 years of follow-up, and tests on cancer death after excluding prevalent cancer. We further tested the all-cause mortality prediction model in the CARDIA study. The CARDIA study has 12 years of follow-up, during which there were 27 deaths from all causes in 905 participants with DNA methylation.

| Evaluation of model performance
We used four evaluation metrics to assess model performance, including the concordance index (C-index) (Harrell Jr et al., 1996), hazards ratio of predicted risk score (inversely transformed) for prediction of mortality, the integrated brier score (IBS) (Brier, 1950), and Kaplan-Meier (KM) survival curves for high-, medium-, and low-risk groups (Kaplan & Meier, 1958). The C-index reflects the percentage of individuals whose predicted survival times are correctly ordered. A C-index of 0.50 reflects no improvement in prediction over chance. The brier score measures the mean of the difference between the observed and the estimated survival beyond a certain time. The brier score ranges between 0 and 1, and a larger score indicates higher inaccuracy. The integrated brier score is the brier score averaged over the entire time interval.

| DNAm Age
We compared the prediction performance of DNAm age with our
Instrumental variables (IVs) for each CpG site consisted of independent cis-meQTLs pruned at linkage disequilibrium (LD) r 2 < 0.01, retaining only one cis-meQTL variant with the lowest SNP-CpG p value in each LD block. LD proxies were defined using 1000 genomes imputation in EA. Inverse variance-weighted (IVW) MR tests were performed on CpGs with at least three independent cis-meQ TL variants, which is the minimum number of IVs needed to perform multiple instruments MR. The multiple instruments improved the precision of IV estimates and allowed the examination of underlying IV assumption (Palmer et al., 2012). Among 177 all-cause mortalityrelated CpGs at p < 1 × 10 −7 , MR tests were performed on 17 CpGs having ≥3 independent cis-meQTL SNPs. To test the validity of IVW-MR results, we performed heterogeneity and MR-EGGER pleiotropy tests for all IVs. The statistical significance threshold for MR is p MR < 0.05/17, and both p heter and p pleio were required to be >0.05. with DNAm as a fixed effect, and batch as a random effect by fitting LME models. Residuals (DNAm_resid) were retained. The gene expression values were adjusted for age, sex, predicted blood cell fraction, a set of technical covariates, the two top PCs and 25 SVs, with gene expression as a fixed effect, and batch as a random effect by LME, and residuals (mRNA_resid) were retained. Then, linear regression models were used to assess pair-wise associations between DNAm_resid and mRNA_resid. SVs were calculated using the SVA package in R. A cis-CpG-mRNA pair was defined as a CpG residing ± 1 Mb of the TSS of the corresponding gene encoding the mRNA (cis-eQTM). The annotations of CpGs and transcripts were obtained from annotation files of the HumanMethylation450K BeadChip and the Affymetrix exon array S1.0 platforms. We estimated that there were 1.6 × 10 8 potential cis-CpG-mRNA pairs. We only used cis-eQTMs in this study because trans-eQTMs were not replicated in independent external studies. The statistical significance threshold was p < 3 × 10 −10 (0.05/1.6 × 10 8 ).

| Gene ontology and pathway enrichment analysis
Gene ontology and pathway enrichment analyses were performed on the genes annotated in relation to the 177 all-cause mortalityrelated CpGs at p < 1 × 10 −7 or p < 1 × 10 −5 as well as the cis-eQTM genes associated with those CpGs. To improve focus in this study, we only used results of KEGG and Gene Ontology-biological process (GO-BP) terms. Enrichment tests used gometh function in miss-Methy R package, which can take into account two types of bias in DNA methylation study: (1) the differing number of probes per gene present on the array, and (2) CpGs that are annotated to multiple genes (Maksimovic et al., 2021).

ACK N OWLED G EM ENTS
The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute; the National Institutes of Health; or the U.S.

Department of Health and Human Services. For a list of all the inves-
tigators who have contributed to WHI science, please visit: https:// s3-us-west-2.amazo naws.com/www-whi-org/wp-conte nt/uploa ds/ WHI-Inves tigat or-Long-List.pdf.

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.  Women's Health Initiative (WHI). All study participants provided written informed consent.

DATA AVA I L A B I L I T Y S TAT E M E N T
The DNA methylation data and phenotype data are available in dbGaP for some of the cohorts in this study (https://www.ncbi.nlm.