DNA methylation-based measures of biological age: meta-analysis predicting time to death

Estimates of biological age based on DNA methylation patterns, often referred to as “epigenetic age” or “DNAm age”, have been shown to be robust biomarkers of age in humans. We previously demonstrated that, independent of chronological age, epigenetic age assessed in blood predicted all-cause mortality in four human cohorts. Here, we expanded our original observation to 13 different cohorts for a total sample size of 13,089 individuals, including three racial/ethnic groups. In addition, we examined whether incorporating information on blood cell composition into the epigenetic age metrics improves their predictive power for mortality. All considered measures of epigenetic age acceleration were predictive of mortality (p≤8.2×10⁻⁹), independent of chronological age, even after adjusting for additional risk factors (p<5.4×10⁻⁴), and within the racial/ethnic groups that we examined (non-Hispanic whites, Hispanics, African Americans). Epigenetic age estimates that incorporated information on blood cell composition led to the smallest p-values for time to death (p=7.5×10⁻⁴³). Overall, this study a) strengthens the evidence that epigenetic age predicts all-cause mortality above and beyond chronological age and traditional risk factors, and b) demonstrates that epigenetic age estimates that incorporate information on blood cell counts lead to highly significant associations with all-cause mortality.


INTRODUCTION
DNA methylation-based biomarkers, often referred to as "epigenetic age" or "epigenetic clock", are robust estimators of the chronological age of an individual [1][2][3][4]. For example, a measure of epigenetic age based on levels of methylation in 353 CpG dinucleotide markers (cytosine linked to guanine by a phosphate group) allows the estimation of the age of an individual. This estimate is consistent across most types of biological specimens, including whole blood, brain, breast, kidney, liver, lung, and saliva, and across cell types, including CD4+ T cells, monocytes, B cells, glial cells, and neurons [3].
Recent studies suggested that epigenetic age is associated with age-related health outcomes above and beyond chronological age. For example, we and others have shown that individuals whose epigenetic age was greater than their chronological age (i.e., individuals exhibiting epigenetic "age acceleration") were at an increased risk for death from all causes, even after accounting for known risk factors [5][6][7]. Further, we recently showed that the offspring of semi-supercentenarians (subjects who reached an age of 105-109 years) have a lower epigenetic age than age-matched controls [8]. Based on these findings, it has been hypothesized that epigenetic age captures some aspect of biological age and the resulting susceptibility to disease and multiple health outcomes. A first step in testing this hypothesis is to test whether epigenetic age predicts longevity in multiple populations and across ethnic groups.
In many studies epigenetic age is estimated from DNA derived from blood samples. It is well known that blood cell composition changes with age and some of these changes might be independent predictors of mortality [9][10][11][12]. Thus, it is of interest to understand whether considering information on blood cell composition in measures of epigenetic age improves their predictive power for mortality.
Here, we evaluated the ability to predict time to death for blood-based epigenetic age measures, both published and novel measures that incorporate information on blood cell composition. Due to the well documented age-related changes in blood cell composition, we distinguished epigenetic measures of age that were independent of changes in blood cell composition (cell-intrinsic measures), and measures that incorporated age-related changes in blood cell composition ("extrinsic" measures). By increasing the number of independent cohort studies, we more than doubled the number of mortality events available for analysis, which allowed for detailed subgroup analyses including those based on race/ethnicity.

Cohort studies
Our meta-analysis included 13 population-based cohorts. An overview of the cohorts is provided in Table 1. Our study involved 3 racial/ethnic groups: non-Hispanic whites (n=9,215), Hispanics (n=431), and Blacks (n=3,443). Detailed descriptions of each cohort can be found in the Supplemental Materials.

Epigenetic age estimation
We used two methods for estimating the epigenetic age of each blood sample (Table 2). First, we used the approach by Horvath (2013) based on 353 CpGs, as described in [3] and Methods. Second, we used the approach by Hannum et al. (2013) based on 71 CpGs [2]. Both epigenetic age estimates were correlated with chronological age at the time of blood draw (Table 1), with biweight midcorrelation coefficients ranging from 0.65 to 0.89. Birth cohorts were excluded from this correlation analysis because it is not meaningful to calculate correlations with chronological age in this situation. The Horvath and Hannum estimates were also highly correlated with each other (r=0.76) even though the underlying sets of CpGs share only 6 CpGs (Supplementary Table 1).
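The biweight midcorrelation mentioned above is a median-based, outlier-resistant alternative to the Pearson correlation. The following Python sketch illustrates the idea, assuming the commonly used definition with the raw (unscaled) median absolute deviation. The function names and the example ages are hypothetical, and the analysis itself may have used a different implementation.

```python
from math import sqrt
from statistics import median


def _weighted_deviations(values):
    """Deviations from the median, down-weighted by Tukey's biweight."""
    med = median(values)
    # Assumption: raw (unscaled) median absolute deviation.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the biweight midcorrelation is undefined")
    deviations = []
    for v in values:
        u = (v - med) / (9.0 * mad)
        # Observations with |u| >= 1 are treated as outliers and get weight 0.
        weight = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        deviations.append((v - med) * weight)
    return deviations


def bicor(x, y):
    """Biweight midcorrelation between two equally long sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    a = _weighted_deviations(x)
    b = _weighted_deviations(y)
    norm = sqrt(sum(ai * ai for ai in a) * sum(bi * bi for bi in b))
    return sum(ai * bi for ai, bi in zip(a, b)) / norm


# Hypothetical chronological and epigenetic ages (years) for six subjects.
chronological = [42, 51, 58, 63, 70, 77]
epigenetic = [45, 50, 61, 60, 72, 74]
print(round(bicor(chronological, epigenetic), 3))
```

Because extreme observations receive zero weight, a single aberrant sample affects this coefficient less than it would affect a Pearson correlation.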

Estimated blood cell counts that relate to chronological age
We estimated the abundance of ten blood cell types based on observed DNA methylation patterns (Methods): exhausted/senescent CD8+ T cells (CD8+CD28-CD45RA-), CD8+ naïve, CD8+ total, CD4+ naïve, CD4+ total, natural killer cells, B cells, monocytes, granulocytes, and plasmablasts. To study age-related changes in blood cell composition, we correlated these estimated blood cell counts with chronological age in all of the cohort studies (Supplementary Table 2). Our results are congruent with findings from flow cytometric studies that demonstrate that the abundance of naïve CD8+ T cells decreases with age (reflecting thymic involution), whereas exhausted/senescent CD8+ T cells increase with age [9][10][11][12].

Table 1 notes. The last 3 columns report robust correlation coefficients (biweight midcorrelation) between chronological age and two epigenetic age estimates (Horvath and Hannum). * Median (25th percentile - 75th percentile). † Biweight midcorrelation coefficient of chronological age with epigenetic age using the Horvath method. ‡ Biweight midcorrelation coefficient of chronological age with epigenetic age using the Hannum method.

Measures of epigenetic age acceleration
Despite high correlations, epigenetic age can deviate substantially from chronological age at the individual level. The difference between epigenetic age and chronological age can be used to define "delta age" but the resulting measure exhibits a negative correlation with chronological age. By contrast, all of our measures of epigenetic age acceleration are defined such that they are uncorrelated with chronological age. An overview of several measures of epigenetic age acceleration is presented in Table 2. One such measure (denoted as AgeAccel) is defined as the residual that results from regressing epigenetic age on chronological age. Thus, a positive value of AgeAccel indicates that the epigenetic age is higher than expected, based on chronological age. The Horvath and Hannum based measures of age acceleration are denoted by AgeAccel Horvath and AgeAccel Hannum, respectively. For the sake of brevity and consistency with other publications from our group, we abbreviate AgeAccel Horvath as AgeAccel.

Table 2. Description of the differences between epigenetic age and age acceleration measures. Column "Correlation with blood counts" relates to Supplementary Table 4. Column "Conserved in breast tissue" relates to Figure 1.
AgeAccel Hannum and, to a lesser extent, AgeAccel were previously shown to correlate with blood cell counts [5]. Thus, we distinguished two broad categories of measures of epigenetic age acceleration when dealing with DNA methylation from blood or peripheral blood mononuclear cells (PBMCs): intrinsic and extrinsic epigenetic measures, which are independent of, or enhanced by, blood cell count information, respectively. We define intrinsic epigenetic age acceleration (IEAA) as the residual resulting from regressing epigenetic age on chronological age and measures of blood cell counts (Methods). By definition, IEAA is not correlated with chronological age and is weakly correlated with estimated measures of blood cell counts (Supplementary Table 4). IEAA is meant to capture cell-intrinsic properties of the aging process that exhibit some preservation across various cell types and organs. Compared to our other measures of age acceleration, IEAA, adapted from the Horvath measure of epigenetic age, exhibited significant correlations with epigenetic age acceleration in breast tissue (r=0.48, p=0.0011, Figure 1B) and saliva (r=0.67, p=8.8×10⁻⁹, Figure 1F). By contrast, an analogous measure of IEAA based on the Hannum measure showed much weaker correlations (r=0.073 in breast and r=0.41 in saliva; Figure 1D, 1H). For this reason, we focused on the Horvath measure of IEAA.
The age-related changes to blood cell composition (Supplementary Table 4) can be leveraged to capture aspects of immunosenescence. Using these measures, we derived a novel extrinsic epigenetic age acceleration (EEAA) measure by up-weighting the blood cell count contributions of AgeAccel Hannum (Methods and Supplementary Table 4).
Descriptive statistics (minimum, maximum, median) of the measures of epigenetic age acceleration can be found in Supplementary Table 3.
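The definition of AgeAccel given in this section (the residual from regressing epigenetic age on chronological age) can be illustrated with a short Python sketch. It assumes ordinary least squares with an intercept; the function name and the example values are hypothetical.

```python
from statistics import mean


def age_accel(epigenetic_age, chronological_age):
    """Residuals from a least squares fit of epigenetic age on chronological age."""
    mean_age = mean(chronological_age)
    mean_epi = mean(epigenetic_age)
    sxx = sum((a - mean_age) ** 2 for a in chronological_age)
    if sxx == 0:
        raise ValueError("chronological age must vary across subjects")
    slope = sum(
        (a - mean_age) * (e - mean_epi)
        for a, e in zip(chronological_age, epigenetic_age)
    ) / sxx
    intercept = mean_epi - slope * mean_age
    # Positive residual: epigenetic age is higher than expected for that age.
    return [
        e - (intercept + slope * a)
        for a, e in zip(chronological_age, epigenetic_age)
    ]


# Hypothetical example data (years).
ages = [30, 40, 50, 60, 70]
dnam_ages = [35, 41, 49, 63, 68]
print([round(r, 2) for r in age_accel(dnam_ages, ages)])
```

By construction, least squares residuals have mean zero and zero sample covariance with the regressor, which is why this measure is uncorrelated with chronological age whereas the simple difference ("delta age") need not be.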

Cox regression models of all-cause mortality
We used Cox regression models to assess the predictive value of our measures of epigenetic age acceleration for all-cause mortality. All of our Cox models were adjusted [...] (Figure 2). Further, when both EEAA and AgeAccel Hannum were included in the same Cox model, only EEAA remained significant in the WHI data and FHS univariate models. Since these results indicate that EEAA outperforms the closely related measure AgeAccel Hannum when it comes to mortality prediction, we removed the latter from subsequent analyses.

Interpreting effect sizes and variance of epigenetic age acceleration
Subjects differed substantially in terms of their measures of epigenetic age acceleration, e.g. EEAA ranged from -28 to 28 years in the WHI (standard deviation=6.4 years, Supplementary Table 3).
About five percent of the participants of the WHI exhibited an EEAA value larger than 10, which is associated with a 48% increased hazard of death, as can be seen from the following calculation. The HR of EEAA is 1.040 if EEAA=1 (Figure 2c), but it is HR=1.040^10=1.48 if EEAA=10. Negative values of age acceleration were associated with a lower hazard of mortality. For example, 20% of subjects had an EEAA value less than -5, which is associated with an 18% decrease in the hazard of death (HR=1.040^(-5)=0.82).
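The calculation above follows from the log-linear form of the Cox model: a hazard ratio per one-year increase is raised to the power of the number of years. A minimal Python sketch, with a hypothetical function name, reproduces the two figures quoted in the text.

```python
def hazard_ratio(hr_per_year, eeaa_years):
    """Hazard ratio implied by a per-year hazard ratio for a given EEAA value."""
    return hr_per_year ** eeaa_years


print(f"{hazard_ratio(1.040, 10):.2f}")  # 1.48
print(f"{hazard_ratio(1.040, -5):.2f}")  # 0.82
```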

Subgroup analysis
With few exceptions, we found that the associations between EEAA and time to death remained highly significant in subgroups stratified by race, sex, follow-up duration, body mass index, smoking status, and physical activity (Table 3) and in subgroups stratified by prevalent disease at baseline such as cancer, coronary artery disease, hypertension and type 2 diabetes (Table 4). Only one subgroup led to a non-significant finding (p>0.05) in our univariate model analysis, namely subjects with less than 5 years of follow-up (Table 3). For multivariate models, we failed to observe significant associations for the following subgroups: i) less than 5 years of follow-up, ii) between 5 and 10 years of follow-up, iii) current smokers, iv) obese individuals, v) Hispanics, vi) individuals with cancer, and vii) subjects with coronary artery disease (CAD). The non-significant results in multivariate models in cancer patients or CAD patients might reflect the relatively low sample sizes or that epigenetic age acceleration is dwarfed by other predictors of mortality in subjects with severe diseases. Hazard ratio estimates remained highly consistent across all subgroups examined.
We did not observe significant differences in the estimated hazard ratios across any subgroup (Tables 3 and 4). Specifically, racial/ethnic differences in HR were not observed (interaction p=0.62 in age-adjusted models and p=0.14 in full models). Overall, these subgroup analysis results confirm that epigenetic age acceleration is an independent predictor of earlier mortality even after adjusting for possible confounders and within major subgroups of the population.

Hazard ratio of death versus follow up time and median age
The large number of cohorts allowed us to relate cohort characteristics (such as median age or median follow-up time) to the strength of association with mortality. We did not find a statistically significant relationship between the hazard ratio of death and either the median age of the cohort or the follow-up time (Figure 4).

Robustness analysis
To assess the robustness of our findings, we also carried out a leave-one-out analysis by re-running the meta-analysis after removing data from individual cohorts. The resulting p-values are highly robust with respect to removing a single data set from the analysis (Supplementary Table 5). In our study, we used a fixed effects meta-analysis method for the sake of consistency with previous analyses [5]. However, our results remain qualitatively the same after using a random effects meta-analysis method (Supplementary Figure 4).
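The leave-one-out procedure described above can be sketched in Python as follows. This is an illustration only: it assumes inverse-variance fixed effects pooling of per-cohort log hazard ratios, and the cohort labels and numbers are hypothetical rather than taken from the study.

```python
from math import sqrt


def pool_fixed(log_hrs, std_errs):
    """Inverse-variance weighted mean of log hazard ratios and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / sum(weights)
    return pooled, sqrt(1.0 / sum(weights))


def leave_one_out(cohorts):
    """Re-pool after dropping each cohort in turn.

    cohorts maps a cohort label to (log hazard ratio, standard error).
    Returns a dict: omitted label -> (pooled estimate, standard error, z).
    """
    results = {}
    for omitted in cohorts:
        kept = [est for label, est in cohorts.items() if label != omitted]
        pooled, se = pool_fixed([b for b, _ in kept], [s for _, s in kept])
        results[omitted] = (pooled, se, pooled / se)
    return results


# Hypothetical per-cohort estimates (not values from the study).
example = {
    "cohort_A": (0.04, 0.01),
    "cohort_B": (0.02, 0.01),
    "cohort_C": (0.06, 0.02),
}
for omitted, (pooled, se, z) in leave_one_out(example).items():
    print(omitted, round(pooled, 4), round(se, 4), round(z, 2))
```

If the pooled estimate and z statistic change little regardless of which cohort is omitted, no single data set drives the overall result.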

DISCUSSION
The current study corroborates previous findings regarding the predictive power of DNA methylation-based biomarkers of age for mortality [5,6,8]. We further examined novel variants of these measures that are either independent of blood cell counts or are enhanced by changes in blood cell sub-populations. We showed that the extrinsic measure EEAA outperforms previous measures of age acceleration when it comes to predicting all-cause mortality. Furthermore, the associations between epigenetic age acceleration and mortality did not differ significantly across subgroups of race/ethnicity, sex, BMI, smoking status, physical activity status, or major chronic diseases. The consistency of the associations across multiple subgroups lends support to the notion that epigenetic age acceleration captures some aspect of biological aging over and above chronological age and other risk factors.
The development of suitable measures of biological age has been a key goal in the field of aging research [13]. Many biomarkers of age have been posited, including epigenetic alterations of the DNA (e.g., DNA methylation), transcriptomic changes in blood [14], telomere length [15], and whole-body function such as gait speed (reviewed in [16]). The current study does not aim to replace existing blood-based biomarkers but rather to demonstrate that epigenetic age complements existing markers. Above all, this study shows that epigenetic age captures an aspect of biological age, as assessed through lifespan, above and beyond chronological age, blood cell composition, and a host of traditional risk factors of mortality.
The measures of epigenetic age acceleration are attractive because they are highly robust and because their measurement only involves DNA methylation data. While actual flow cytometry data will always be preferable to imputed blood cell count data (based on DNA methylation data), the measures of age acceleration do not require the measurement of flow data. Rather, measures of intrinsic and extrinsic epigenetic age use blood cell count estimates resulting from DNA methylation data. The measure of extrinsic age acceleration EEAA reflects aspects of immunosenescence because, by construction, it correlates with age-related changes in blood cell composition, such as T lymphocyte populations, which underlie much of the age-related decline in the protective immune response [9][10][11][12]. Thus, the high predictive significance of EEAA for all-cause mortality probably reflects the fact that it assesses multiple aspects of the biological age of the immune system including both changes in blood cell composition and cell-intrinsic epigenetic changes. It has been known for decades that poor T cell functioning is predictive of mortality [17].
The findings surrounding the predictive utility of intrinsic epigenetic age acceleration are biologically compelling and point to a new frontier in aging research. Our study strongly suggests that IEAA is reflective of an intrinsic epigenetic clock that is associated with mortality independent of chronological age, changes in blood cell composition, and traditional risk factors of mortality. IEAA probably captures a cell-type independent component of the aging process for the following reasons. First, IEAA is moderately preserved across different tissues and cell types collected from the same subject (Figure 1). Second, IEAA but not EEAA is predictive of lung cancer [18]. Third, only IEAA and AgeAccel relate to centenarian status [8].
Overall, our results inform the ongoing debate about whether epigenetic biomarkers of age capture an aspect of biological age. While epigenetic processes are unlikely to be the only mediators of chronological age on mortality (in fact, multiple risk factors have stronger effects on mortality), our results suggest that at least one of the mediating processes relates to the epigenetic age of blood tissue and that this process is independent of age-dependent changes in blood cell composition. Future studies will be useful for gaining a mechanistic understanding of this intrinsic epigenetic aging process.

Measures of epigenetic age
We used an epigenetic biomarker of age based on 353 CpG markers as one measure of epigenetic age because: a) it is an accurate measurement of age across multiple tissues [3]; b) we previously showed that it is predictive of all-cause mortality [5]; c) it correlated with measures of cognitive/physical fitness and neuro-pathology in the elderly [19,20]; and d) it was associated with conditions that are of interest in aging research including Down's syndrome [21], Huntington's disease [22], Parkinson's disease [23], obesity [24], HIV infection [25], menopause [26], centenarian status [27], ethnicity and sex [28], and cellular senescence [3,29]. This epigenetic age estimator not only lends itself to measuring aging effects in elderly subjects but also applies to prenatal brain samples [30] and blood samples from minors [31]. Epigenetic age is defined as the predicted value of age based on the DNA methylation levels of 353 CpGs. Mathematical details and software tutorials for estimating epigenetic age can be found in the additional files of [3]. All of the described epigenetic measures of aging and age acceleration are implemented in our freely available software (https://dnamage.genetics.ucla.edu) [3].

DNA methylation age estimate by Hannum et al (2013)
We also used an alternative measure of epigenetic age developed by Hannum et al (2013) [2]. The resulting age estimate is based on the 71 CpGs and coefficient values from the third supplementary table of [2]. The authors developed this age prediction method by using an elastic net regression model for predicting chronological age based on DNA methylation levels from whole blood.

Measures of epigenetic age acceleration
Table 2 provides an overview of our measures of epigenetic age acceleration. The universal measure of age acceleration (AgeAccel), which is valid for a wide range of tissue types, is defined as the residual resulting from a linear regression model that regresses the Horvath estimate of epigenetic age on chronological age. Thus, a positive value for AgeAccel indicates that the observed epigenetic age is higher than that predicted, based on chronological age. AgeAccel has a relatively weak correlation with blood cell counts [25], but it still relates to estimated blood cell counts, as seen in Supplementary Table 4.
To estimate "pure" epigenetic aging effects that are not influenced by differences in blood cell counts ("intrinsic" epigenetic age acceleration, IEAA), we obtained the residual resulting from a multivariate regression model of epigenetic age on chronological age and various blood immune cell counts (naive CD8+ T cells, exhausted CD8+ T cells, plasmablasts, CD4+ T cells, natural killer cells, monocytes, and granulocytes) imputed from methylation data.
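The residual-based definition of IEAA can be illustrated with a small Python sketch. It assumes ordinary least squares with an intercept, solved through the normal equations, and for brevity uses only two of the cell-count covariates listed above. The function name and all numbers are hypothetical.

```python
def ols_residuals(response, covariates):
    """Residuals of `response` regressed on an intercept plus covariate columns."""
    n = len(response)
    rows = [[1.0] + [float(col[i]) for col in covariates] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations: [X'X | X'y].
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(p)]
        + [sum(r[a] * y for r, y in zip(rows, response))]
        for a in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("covariates are collinear")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(p):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    beta = [aug[a][p] / aug[a][a] for a in range(p)]
    return [y - sum(b * v for b, v in zip(beta, r)) for y, r in zip(response, rows)]


# Hypothetical data: epigenetic age, chronological age, and two imputed
# cell measures (naive and exhausted CD8+ T cells).
dnam_age = [47, 50, 63, 61, 73, 71]
age = [45, 52, 60, 63, 70, 74]
naive_cd8 = [250, 230, 190, 200, 150, 140]
exhausted_cd8 = [5, 9, 8, 14, 12, 18]

ieaa = ols_residuals(dnam_age, [age, naive_cd8, exhausted_cd8])
print([round(v, 2) for v in ieaa])
```

The residuals are orthogonal to every covariate in the model, so the resulting measure is uncorrelated in-sample with chronological age and with the included cell measures.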
Extrinsic epigenetic age acceleration measures capture both cell-intrinsic methylation changes and extracellular changes in blood cell composition. Our measure of EEAA is defined using the following three steps. First, we calculated the epigenetic age measure from Hannum et al [2], which already correlated with certain blood cell types [5]. Second, we increased the contribution of immune blood cell types to the age estimate by forming a weighted average of Hannum's estimate with 3 cell types that are known to change with age: naïve (CD45RA+CCR7+) cytotoxic T cells, exhausted (CD28-CD45RA-) cytotoxic T cells, and plasmablasts, using the Klemera-Doubal approach [32]. The weights used in the weighted average are determined by the correlation between the respective variable and chronological age [32]. The weights were chosen on the basis of the WHI data. Thus, the same (static) weights were used for all data sets. Third, EEAA was defined as the residual variation resulting from a univariate model regressing the resulting age estimate on chronological age. By construction, EEAA is positively correlated with the estimated abundance of exhausted CD8+ T cells and plasmablast cells, and negatively correlated with naive CD8+ T cells. Blood cell counts were estimated based on DNA methylation data as described in the next section. By construction, the measures of EEAA track both age-related changes in blood cell composition and intrinsic epigenetic changes. None of our four measures of epigenetic age acceleration are correlated with chronological age.
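The weighted-average step can be illustrated with the basic Klemera-Doubal estimator, in which each marker is regressed on chronological age and markers with a steeper slope relative to their residual spread receive more weight. The Python sketch below rests on several assumptions: it uses the basic estimator without chronological age as an additional marker, it fits the marker regressions on whatever data are supplied (whereas the weights described above were fixed from the WHI data and are not given in this section), and all names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def fit_marker(ages, values):
    """Regress one marker on chronological age.

    Returns (intercept q, slope k, residual standard deviation s).
    """
    mean_age, mean_val = mean(ages), mean(values)
    k = sum((a - mean_age) * (v - mean_val) for a, v in zip(ages, values)) / sum(
        (a - mean_age) ** 2 for a in ages
    )
    q = mean_val - k * mean_age
    rss = sum((v - (q + k * a)) ** 2 for a, v in zip(ages, values))
    return q, k, sqrt(rss / (len(ages) - 2))


def klemera_doubal_age(marker_values, marker_fits):
    """Basic Klemera-Doubal estimate for one subject.

    marker_values: the subject's value for each marker.
    marker_fits: matching (q, k, s) tuples, e.g. from fit_marker.
    """
    numerator = sum(
        (x - q) * k / s ** 2 for x, (q, k, s) in zip(marker_values, marker_fits)
    )
    denominator = sum((k / s) ** 2 for _, k, s in marker_fits)
    return numerator / denominator


# Hypothetical fits: an epigenetic age estimate that rises with age and a
# naive CD8+ T cell measure that declines with age.
fits = [(5.0, 1.0, 4.0), (300.0, -2.0, 20.0)]
print(klemera_doubal_age([65.0, 180.0], fits))
```

A subject whose markers all lie exactly on their regression lines receives an estimate equal to chronological age. Under this sketch, an EEAA-like quantity would then be the residual from regressing the combined estimate on chronological age.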

Estimating blood cell counts based on DNA methylation levels
We estimated blood cell proportions using two different software tools. Houseman's estimation method [33], which is based on DNA methylation signatures from purified leukocyte samples, was used to estimate the proportions of cytotoxic (CD8+) T cells, helper (CD4+) T cells, natural killer cells, B cells, and granulocytes. The software does not allow us to identify the type of granulocytes in blood (neutrophil, eosinophil, or basophil) but we note that neutrophils tend to be the most abundant granulocyte (~60% of all blood cells compared with 0.5-2.5% for eosinophils and basophils). To estimate the percentage of exhausted CD8+ T cells (defined as CD28-CD45RA-), plasmablasts, and the number (count) of naïve CD8+ T cells (defined as CD45RA+CCR7+), we used the "Horvath method" [25], which is implemented in the advanced analysis option of the epigenetic age calculator software [3]. We and others have shown that imputed blood cell counts have moderately high correlations with corresponding flow cytometric data, e.g. r=0.86 for naïve CD4+ T cells, r=0.68 for naïve CD8+ T cells, and r=0.49 for exhausted CD8+ T cells [28].

Cox regression models and meta-analysis
Here, we used Cox models for analyzing the censored survival time data (from the age at blood draw until age at death or last follow-up). We regressed the censored survival times on covariates using Cox regression models implemented in the R function coxph in the survival package. The resulting coefficient values (interpreted as log hazard ratios) and standard errors were combined using the R software package metafor [34]. The meta-analysis was carried out with the R command rma (with arguments method="FE" to get fixed effects estimates). The forest plots were created using the R function forest (with argument atransf=exp to exponentiate the estimate of the log hazard ratios).
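The pooling step can be illustrated outside of R with a short Python sketch of an inverse-variance fixed effects meta-analysis of log hazard ratios. This is not the metafor implementation; it assumes a normal approximation for the test statistic and confidence interval, and the function name and input values are hypothetical.

```python
from math import erfc, exp, sqrt


def fixed_effects_meta(log_hrs, std_errs):
    """Inverse-variance fixed effects pooling of per-cohort log hazard ratios."""
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    z = pooled / se
    return {
        "hr": exp(pooled),  # exponentiate to report a hazard ratio
        "ci_low": exp(pooled - 1.96 * se),
        "ci_high": exp(pooled + 1.96 * se),
        "z": z,
        "p": erfc(abs(z) / sqrt(2.0)),  # two-sided normal p-value
    }


# Hypothetical Cox coefficients and standard errors from three cohorts.
result = fixed_effects_meta([0.04, 0.02, 0.06], [0.01, 0.01, 0.02])
print({name: round(value, 4) for name, value in result.items()})
```

Exponentiating the pooled coefficient and its interval limits mirrors the transformation applied to the forest plots described above.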

Sample exclusions
In addition to cohort-specific quality checks, we further excluded individuals who had ever been diagnosed with leukemia (ICD-9: 203-208), who reported receiving chemotherapy, or whose methylation beta value distributions deviated substantially from a gold standard (according to the quality statistic corSampleVSgoldstandard<0.80 from the online age calculator [35][36][37]).