Quantification of the pace of biological aging in humans through a blood test: The DunedinPoAm DNA methylation algorithm

Biological aging is the gradual, progressive decline in system integrity that occurs with advancing chronological age, causing morbidity and disability. Measurements of the pace of aging are needed to serve as surrogate endpoints in trials of therapies designed to prevent disease by slowing biological aging. We report a blood DNA-methylation measure that is sensitive to variation in the pace of biological aging among individuals born the same year. We first modeled longitudinal change in 18 biomarkers tracking organ-system integrity across 12 years of follow-up in the Dunedin birth cohort. Rates of change in each biomarker were composited to form a measure of aging-related decline, termed Pace of Aging. Elastic-net regression was used to develop a DNA-methylation predictor of Pace of Aging, called DunedinPoAm for Dunedin (P)ace (o)f (A)ging (m)ethylation. Validation analyses showed DunedinPoAm was associated with functional decline in the Dunedin Study and more advanced biological age in the Understanding Society Study, predicted chronic disease and mortality in the Normative Aging Study, was accelerated by early-life adversity in the E-risk Study, and DunedinPoAm prediction was disrupted by caloric restriction in the CALERIE trial. DunedinPoAm generally outperformed epigenetic clocks. Findings provide proof-of-principle for DunedinPoAm as a single-time-point measure of a person’s pace of biological aging.


INTRODUCTION
Aging of the global population is producing forecasts of rising burden of disease and disability (Harper, 2014). Because this burden arises from multiple age-related diseases, treatments for single diseases will not address the burden challenge (Goldman et al., 2013).
Geroscience research suggests an appealing alternative: treatments to slow aging itself could prevent or delay the multiple diseases that increase with advancing age, perhaps with a single therapeutic approach (Gladyshev, 2016;Kaeberlein, 2013). Aging can be understood as a gradual and progressive deterioration in biological system integrity (Kirkwood, 2005). This deterioration is thought to arise from an accumulation of cellular-level changes. These changes, in turn, increase vulnerability to diseases affecting many different organ systems (Kennedy et al., 2014;López-Otín et al., 2013). Animal studies suggest treatments that slow the accumulation of cellular-level changes can extend healthy lifespan (Campisi et al., 2019;Kaeberlein et al., 2015). However, human trials of these treatments are challenging because humans live much longer than model animals, making it time-consuming and costly to follow up human trial participants to test treatment effects on healthy lifespan. This challenge will be exacerbated in trials that will give treatments to young or middle-aged adults, with the aim to prevent the decline in system integrity that antedates disease onset by years. Involving young and midlife adults in healthspan-extension trials has been approved for development by the National Advisory Council on Aging (2019 CTAP report to NACA). In midlife trials of treatments to slow aging, called geroprotectors (Moskalev et al., 2016), traditional endpoints such as disease diagnosis or death are too far in the future to serve as outcomes. Translation of geroprotector treatments to humans could be aided by measures that quantify the pace of deterioration in biological system integrity in human aging. Such measures could be used as surrogate endpoints for healthy lifespan extension (Justice et al., 2016(Justice et al., , 2018Moskalev et al., and Raj, 2018). Methylation-clock algorithms have been developed to identify methylation patterns that characterize individuals of different chronological ages. However, a limitation is that individuals born in different years have grown up under different historical conditions (Schaie, 1967). For example, people born 70 years ago experienced more exposure to childhood diseases, tobacco smoke, airborne lead, and less exposure to antibiotics and other medications, and lower quality nutrition, all of which leave signatures on DNA methylation (Bell et al., 2019).
As a result, the clocks confound methylation patterns arising from early-life exposures to methylation-altering factors with methylation patterns related to biological aging during adulthood. An alternative approach is to study individuals who were all born the same year, and find methylation patterns that differentiate those who have been aging biologically faster or slower than their same-age peers. The current article reports four steps in our work toward developing a blood DNA methylation measure to represent individual variation in the pace of biological aging.

In
Step 1, which we previously reported , we collected a panel of 18 blood chemistry and organ-system function biomarkers at three successive waves of the Dunedin Longitudinal Study of a 1972-73 population-representative one-year birth cohort (N=1037). We used repeated-measures data collected when Study members were aged 26, 32, and 38 years old to quantify rates of biological change. We modelled the rate of change in each biomarker and calculated how each Study member's personal rate-of-change on that biomarker differed from the cohort norm. We then combined the 18 personal rates of change across the panel of biomarkers to compute a composite for each Study member that we called the Pace of Aging. Pace of Aging represents a personal rate of multi-organ-system decline over a dozen years. Pace of Aging was normally distributed, and showed marked variation among Study members who were all the same chronological age, confirming that individual differences in biological aging do emerge already by age 38, years before chronic disease onset.

In
Step 2, which we previously reported, we validated the Pace of Aging against known criteria. As compared to other Study members who were the same chronological age but had slower Pace of Aging, Study members with faster Pace of Aging performed more poorly on tests of physical function; showed signs of cognitive decline on a panel of dementia-relevant neuropsychological tests from an early-life baseline; were rated as looking older based on facial photographs; and reported themselves to be in worse health .
Subsequently, we reported that faster Pace of Aging is associated with early-life factors important for aging: familial longevity, low childhood social class, and adverse childhood experiences (Belsky et al., 2017a), and that faster Pace of Aging is associated with older scores on Brain Age, a machine-learning-derived measure of structural MRI differences characteristic of different age groups . Notably, Pace of Aging was not well-correlated with published epigenetic age clocks, which were designed to measure how old a person is rather than how fast they are aging biologically .

In
Step 3, which we report here, we distill the Pace of Aging into a measurement that can be obtained from a single blood sample. Here we focused on blood DNA methylation as an accessible molecular measurement that is sensitive to changes in physiology occurring in multiple organ systems (Birney et al., 2016;Bolund et al., 2017;Chambers et al., 2015;Chu et al., 2017;Hedman Åsa K. et al., 2017;Ma et al., 2019;Mill and Heijmans, 2013;Morris et al., 2017;Wahl et al., 2017). We used data about the Pace of Aging from age 26 to 38 years in the Dunedin Study along with whole-genome methylation data at age 38 years. Elastic-net regression was applied to derive an algorithm that captured DNA methylation patterns linked with variation among individuals in their Pace of Aging. The algorithm is hereafter termed "DunedinPoAm".
DunedinPoAm is qualitatively different from previously published DNA methylation measures of aging that were developed by comparing older individuals to younger ones. Those measures, often referred to as "clocks," are state measures. They estimate how much aging has occurred in an individual up to the point of measurement. DunedinPoAm is a rate measure. It is based on comparison of longitudinal change over time in 18 biomarkers of organ-system integrity among individuals who are all the same chronological age. DunedinPoAm estimates how fast aging is occurring during the years leading up to the time of measurement. Rather than a clock that records how much time has passed, DunedinPoAm is designed to function as a outcome data using Cox proportional hazard regression. For analysis of repeated-measures longitudinal DNA methylation data in the Normative Aging Study, we used generalized estimating equations to account for non-independence of repeated observations of individuals (Ballinger, 2004), following the method in previous analysis of those data (Gao et al., 2018), and econometric fixed-effects regression (Wooldridge, 2012) to test within-person change over time. For analysis in E-Risk, which include data on twin siblings, we clustered standard errors at the family level to account for non-independence of data. For analysis of longitudinal change in clinical-biomarker biological age in CALERIE, we used mixed-effects growth models (Singer and Willett, 2003) following the method in our original analysis of those data (Belsky et al., 2017b).
For regression analysis, methylation measures were adjusted for batch effects by regressing the measure on batch controls and predicting residual values. Dunedin Study, Understanding Society, E-Risk, and CALERIE analyses included covariate adjustment for sex (the Normative Aging Study included only men). Understanding Society, Normative Aging Study, and CALERIE analyses included covariate adjustment for chronological age. (Dunedin and E-Risk are birthcohort studies and participants are all the same chronological age.) Sensitivity analyses testing covariate adjustment for estimated leukocyte distributions and smoking are reported in Supplemental Tables 3-7.

Capturing Pace of Aging in a single blood test
We derived the DunedinPoAm algorithm using data from Dunedin Study members for whom age-38 DNA methylation data were available (N=810). We applied elastic-net regression (Zou and Hastie, 2005) using Pace of Aging between ages 26 to 38 years as the criterion. We included all methylation probes that appear on both the Illumina 450k and EPIC arrays as potential predictor variables. We selected this overlapping set of probes for our analysis to facilitate application of the derived algorithm by other research teams using either chip. We fixed the alpha parameter to 0.5, following the approach reported by Horvath (Horvath, 2013).
This analysis selected a set of 46 CpG sites (Supplemental Table 1

, Supplemental Materials
Section 2). The 46-CpG elastic-net-derived DunedinPoAm algorithm, applied in the age-38 Dunedin DNA methylation data, was associated with the longitudinal 26-38 Pace of Aging measure (Pearson r=0.56, Supplemental Figure 1). This is likely an overestimate of the true outof-sample correlation because the analysis is based on the same data used to develop the DunedinPoAm algorithm; bootstrap-cross-validation analysis estimated the out-of-sample correlation to be r=0.33; Supplemental Materials Section 3).  Cognitive Functioning. We evaluated cognitive functioning from tests of perceptual reasoning, working memory, and processing speed, which are known to show aging related declines already by the fifth decade of life (Hartshorne and Germine, 2015;Park et al., 2002).

DunedinPoAm in midlife predicted future functional limitations
Study members with faster age-38 DunedinPoAm performed more poorly on all age-45 cognitive tests (r=0.14-0.28, p<0.001 for all). Effect-sizes are graphed in the light-blue bars of  (Belsky et al., 2017a;Deary and Batty, 2006). Therefore, to evaluate whether associations between DunedinPoAm and cognitive test performance at age 45 might reflect reverse causation instead of early cognitive decline, we next conducted analysis of cognitive decline between adolescence and midlife. We evaluated cognitive decline by comparing Study-members' cognitive-test performance at age 45 to their cognitive-test performance three decades earlier when they were ages 7-13 years. Cognitive performance was measured from composite scores on the Covariate adjustment to models for estimated cell counts (Houseman et al., 2012) and did not change results. Covariate adjustment for smoking history at age 38 years modestly attenuated some effect-sizes and attenuated DunedinPoAm associations with cognitive decline to near zero. Results for all models are reported in Supplemental Table 3.
In comparison to DunedinPoAm, effect-sizes for associations with functional limitations were smaller for the Horvath, Hannum, and Levine epigenetic clocks and, in the cases of the Horvath and Hannum clocks, were not statistically different from zero at the alpha=0.05 threshold for most outcomes. Effect-sizes are reported in Supplemental Table 3

Evaluating DunedinPoAm and other methylation clocks in the Understanding Society study
To test variation in DunedinPoAm and to compare it with published methylation measures of biological aging, we conducted analysis using data on N=1,175 participants aged 28-95 years (M=58, SD=15; 42% male) in the UK Understanding Society cohort. In this mixedage sample, the mean DunedinPoAm was 1.03 years of biological aging per each calendar year (SD=0.07).
We first tested if higher DunedinPoAm levels, which indicate faster aging, were correlated with older chronological age. Mortality rates increase with advancing chronological age, although there may be some slowing at the oldest ages (Barbi et al., 2018). This suggests the hypothesis that the rate of aging increases across much of the adult lifespan. Consistent with this hypothesis, Understanding Society participants who were of older chronological age tended to have faster DunedinPoAm (r=0.11, [0.06-0.17], p<0.001; Figure 3 Panel A). We also compared DunedinPoAm with three methylation measures of biological age: the epigenetic clocks proposed by Horvath, Hannum, and Levine (Hannum et al., 2013;Horvath, 2013;Levine et al., 2018). These epigenetic clocks were highly correlated with chronological age in the Understanding Society sample (Horvath Clock r=0.91,Hannum Clock r=0.92,Levine Clock r=0.88).
Next, to test if DunedinPoAm captured similar information about aging to published epigenetic clocks, we regressed each of the published clocks on chronological age and predicted residual values, following the procedure used by the developers of the clocks. These residuals are referred to in the literature as measures of "epigenetic age acceleration." None of the 46 CpGs included in the DunedinPoAm algorithm overlapped with CpGs in these epigenetic clocks.
Nevertheless, DunedinPoAm was moderately correlated with epigenetic age acceleration measured from the clocks proposed by Hannum (r=0.24) and Levine (r=0.30). DunedinPoAm was less-well correlated with acceleration measured from the Horvath clock (r=0.06).
Associations among DunedinPoAm and the epigenetic clocks in the Understanding Society sample are shown in Figure 3 Panel B.
Finally, we tested correlations of DunedinPoAm with (a) a measure of biological age derived from blood chemistry and blood pressure data, and (b) a measure of self-rated health.
We computed biological age from Understanding Society blood chemistry and blood pressure data following the Klemera and Doubal method (KDM)  and the procedure described by Levine  In comparison to DunedinPoAm, effect-sizes for associations with self-rated health and KDM Biological Age were smaller for the epigenetic clocks and, in the cases of the Horvath and Hannum clocks, were not statistically different from zero at the alpha=0.05 threshold (Figure 3 Panels C and D). Effect-sizes are reported in Supplemental Table 4 and plotted Supplemental To test if faster DunedinPoAm was associated with morbidity and mortality, we analyzed data from N=771 older men in the Veterans Health Administration Normative Aging Study (NAS; at baseline, mean chronological age=77, SD=7).
We first tested if higher DunedinPoAm levels, which indicate faster aging, were associated with increased risk of mortality. During follow-up from 1999-2013, 46% of NAS participants died over a mean follow-up of 7 years (SD=7). Those with faster DunedinPoAm at baseline were at increased risk of death (Hazard Ratio (HR)=1.29 [1.16-1.45], p<0.001; Figure 4) Covariate adjustment to models for estimated cell counts (Houseman et al., 2012) and smoking status did not change results, with the exception that the effect-size for DunedinPoAm was attenuated below the alpha=0.05 threshold of statistical significance in smoking-adjusted analysis of chronic disease incidence. Results for all models are reported in Supplemental Table   5.
In comparison to DunedinPoAm, effect-sizes for associations with mortality and chronic disease were smaller for the epigenetic clocks and were not statistically different from zero in many of the models (Supplemental Table 5 and Supplemental Figure 4).

Childhood exposure to poverty and victimization were associated with faster DunedinPoAm in young adults in the E-Risk Study
To test if DunedinPoAm indicated faster aging in young people with histories of exposure thought to shorten healthy lifespan, we analyzed data from N=1,658 members of the E-Risk Longitudinal Study. The E-Risk Study follows a 1994-95 birth cohort of same-sex twins.
Blood DNA methylation data were collected when participants were aged 18 years. We analyzed two exposures associated with shorter healthy lifespan, childhood low socioeconomic status and childhood victimization. Socioeconomic status was measured from data on their parents' education, occupation, and income . Victimization was measured from exposure dossiers compiled from interviews with the children's mothers and home-visit assessments conducted when the children were aged 5, 7, 10, and 12 (Fisher et al., 2015). The dossiers recorded children's exposure to domestic violence, peer bullying, physical and sexual harm by an adult, and neglect. 72% of the analysis sample had no victimization exposure, 21% had one type of victimization exposure, 4% had two types of exposure, and 2% had three or more types of exposure. Differences in DunedinPoAm across strata of childhood socioeconomic status and victimization are graphed in Figure 5.
In comparison to DunedinPoAm, effect-sizes for associations with childhood socioeconomic circumstances and victimization were smaller for the epigenetic clocks and, in the cases of the Horvath and Hannum clocks, were not statistically different from zero at the alpha=0.05 threshold. Effect-sizes are reported in Supplemental Table 6 and plotted in Supplemental Figure 5.

DunedinPoAm measured at baseline in the CALERIE randomized trial predicted future rate of aging measured from clinical-biomarker data
The CALERIE Trial is the first randomized trial of long-term caloric restriction in nonobese adult humans. CALERIE randomized N=220 adults on a 2:1 ratio to treatment of 25% caloric restriction (CR-treatment) or control ad-libitum (AL-control, as usual) diet for two years . We previously reported that CALERIE participants who were randomized to CR-treatment experienced a slower rate of biological aging as compared to participants in the AL-control arm based on longitudinal change analysis of clinical-biomarker data from the baseline, 12-month, and 24-month follow-up assessments (Belsky et al., 2017b). Among control participants, the rate of increase in biological age measured using the Klemera-Doubal method (KDM) Biological Age algorithm was 0.71 years of biological age per 12-month follow-up interval. (This slower-than-expected rate of aging could reflect differences between CALERIE Trial participants, who were selected for being in good health, and the nationally representative NHANES sample in which the KDM algorithm was developed (Belsky et al., 2017b).) In contrast, among treatment participants, the rate of increase was only 0.11 years of biological age per 12month follow-up interval (difference b=-0.60 [-0.99, -0.21]). We subsequently generated DNA methylation data from blood DNA that was collected at the baseline assessment of the CALERIE trial for a sub-sample (N=68 AL-control participants and 118 CR-treatment participants). We used these methylation data to calculate participants' DunedinPoAm values at study baseline.
We then tested if baseline DunedinPoAm could predict participants' future rate of biological aging as they progressed through the trial.
We first replicated our original analysis within the methylation sub-sample. Results were the same as in the full sample (Supplemental Table 7). Next, we compared DunedinPoAm between CR-treatment and AL-control participants. As expected, there was no group difference at baseline (AL M=1.00, SD=0.05; CR M=1.01, SD=0.06, p-value for difference =0.440). Finally, we tested if participants' baseline DunedinPoAm was associated with their rate of biological aging over the 24 months of follow-up, and if this association was modified by randomization to caloric restriction as compared to ad libitum diet. For AL-control participants, faster baseline DunedinPoAm predicted faster biological aging over the 24-month follow-up, although in this small group this association was not statistically significant at the alpha=0.05 level (b=0.

DISCUSSION
Breakthrough discoveries in the new field of geroscience suggest opportunities to extend healthy lifespan through interventions that slow biological processes of aging (Campisi et al., 2019). To advance translation of these interventions, measures are needed that can detect changes in a person's rate of biological aging . We previously showed that the rate of biological aging can be measured by tracking change over time in multiple indicators of organ-system integrity . Here, we report data illustrating the potential to streamline measurement of Pace of Aging to an exportable, inexpensive and non-invasive blood test, and thereby ease implementation of Pace of Aging measurement in studies of interventions to slow processes of biological aging.
We conducted machine-learning analysis of the original Pace of Aging measure using elastic-net regression and whole-genome blood DNA methylation data. We trained the algorithm to predict how fast a person was aging. We called the resulting algorithm "DunedinPoAm" for "(m)ethylation (P)ace (o)f (A)ging". There were four overall findings: First, while DunedinPoAm was not a perfect proxy of Pace of Aging, it nevertheless captured critical information about Dunedin Study members' healthspan-related characteristics. Across the domains of physical function, cognitive function, and subjective signs of aging, Study members with faster DunedinPoAm at age 38 were worse off seven years later at age 45 and, in repeated-measures analysis of change, they showed signs of more rapid decline. Effect-sizes were equal to or greater than those for the 18-biomarker 3-time point Alternatively, integration of methylation data with additional molecular datasets (Hasin et al., 2017;Zierer et al., 2015) may be needed to achieve precise measurement of Pace of Aging from a single time-point blood sample.

Second, DunedinPoAm analysis of the Understanding Society and Normative Aging
Study samples provided proof-of-concept for using DunedinPoAm to quantify biological aging.
Age differences in DunedinPoAm parallel population demographic patterns of mortality risk. In the Understanding Society sample, older adults had faster DunedinPoAm as compared to younger ones. In the Normative Aging Study sample, participants' DunedinPoAm values increased as they aged. These observations are consistent with the well-documented acceleration of mortality risk with advancing chronological age (Robine, 2011). However, it sets DunedinPoAm apart from other indices of biological aging, which are not known to register this acceleration (Finch and Crimmins, 2016;Li et al., 2020). DunedinPoAm may therefore provide a novel tool for testing how the rate of aging changes across the life course and whether, as demographic data documenting so-called "mortality plateaus" suggest, processes of aging slow down at the oldest chronological ages (Barbi et al., 2018).
DunedinPoAm is related to but distinct from alternative approaches to quantification of biological aging. DunedinPoAm was moderately correlated with aging rates measured by the epigenetic clocks proposed by Hannum et al. and Levine et al. (Hannum et al., 2013; as well as KDM Biological Age derived from clinical biomarker data , and with self-rated health. Consistent with findings for the measured Pace of Aging , DunedinPoAm was only weakly correlated with the multi-tissue clock proposed by Horvath. DunedinPoAmwas more strongly correlated with a clinical-biomarker measure of biological age, with self-rated health, with functional testperformance and decline, and with morbidity and mortality as compared to the epigenetic clocks. Third, DunedinPoAm is already variable by young adulthood and is accelerated in young people at risk for eventual shortened healthspan. E-Risk young adults who grew up in socioeconomically disadvantaged families or who were exposed to victimization early in life already showed accelerated DunedinPoAm by age 18, consistent with epidemiological observations of shorter healthy lifespan for individuals with these exposures (Adler and Rehkopf, 2008;Danese and McEwen, 2012). We previously found that Dunedin Study members with histories of early-life adversity showed accelerated Pace of Aging in their 30s (Belsky et al., 2017a). DunedinPoAm analysis of the E-Risk cohort suggests effects may be already manifest at least a decade earlier. DunedinPoAm may therefore provide a useful index that can be applied to evaluate prevention programs to buffer at-risk youth against health damaging effects of challenging circumstances.
Fourth, DunedinPoAm analysis of the CALERIE trial provided proof-of-concept for using DunedinPoAm to quantify biological aging in geroprotector intervention studies. DunedinPoAm measures the rate of aging over the recent past. Control-arm participants' baseline DunedinPoAm correlated positively with their clinical-biomarker pace of aging over the two years of the trial, consistent with the hypothesis that their rate of aging was not altered. In contrast, there was no relationship between DunedinPoAm and clinical-biomarker pace of aging for caloric-restriction-arm participants, consistent with the hypothesis that caloric restriction altered participants' rate of aging. Ultimately, data on DunedinPoAm for all CALERIE participants (and participants in other geroprotector trails) at trial baseline and follow-up will be needed to establish utility of DunedinPoAm as a surrogate endpoint. In the mean-time, these data establish potential to use DunedinPoAm as a pre-treatment covariate in geroprotector trials to boost statistical power (Kahan et al., 2014) or to screen participants for enrollment, e.g. to identify those who are aging more rapidly and may therefore show larger effects of treatment.
We acknowledge limitations. Foremost, DunedinPoAm is a first step toward a singleassay cross-sectional measurement of Pace of Aging. The relatively modest size of the Dunedin cohort and the lack of other cohorts that have the requisite 3 or more waves of repeated biomarkers to measure the Pace of Aging limited sample size for our machine-learning analysis to develop methylation algorithms. As Pace of Aging is measured in additional cohorts, more refined analysis to develop DunedinPoAm-type algorithms will become possible. In addition, our work thus far has not addressed population diversity in biological aging. The Dunedin cohort in which DunedinPoAm was developed and the Understanding Society and E-Risk cohorts and CALERIE trial sample in which it was tested were mostly of white Europeandescent. Follow-up of DunedinPoAm in more diverse samples is needed to establish crosspopulation validity. Finally, because methylation data are not yet available from CALERIE followup assessments, we could not test if intervention modified DunedinPoAm at outcome.
Ultimately, to establish DunedinPoAm as a surrogate endpoint for healthspan, it will be necessary to establish not only robust association with healthy lifespan phenotypes and modifiability by intervention, but also the extent to which changes in DunedinPoAm induced by intervention correspond to changes in healthy-lifespan phenotypes (Prentice, 1989).
Within the bounds of these limitations, our analysis establishes proof-of-concept for DunedinPoAm as a single-time-point measure that quantifies Pace of Aging from a blood test. It can be implemented in Illumina 450k and EPIC array data, making it immediately available for testing in a wide range of existing datasets as a complement to existing methylation measures of aging. Critically, DunedinPoAm offers a unique measurement for intervention trials and natural experiment studies investigating how the rate of aging may be changed by behavioral or drug therapy, or by environmental modification. DunedinPoAm may be especially valuable to . Graphs for changes in balance, grip-strength, physical limitations, cognition, and facial aging are binned scatterplots. Plotted points reflect average xand y-coordinates for "bins" of approximately ten Study members. Fitted slopes show the association estimated from the raw, un-binned data. The y-axis scale on graphs of balance, gripstrength, and physical limitations shows change scores (age 45 -age 38) scaled in terms of age-38 standard deviation units. The y-axis scale on the graph of cognitive change shows the difference in IQ score (age 45 -baseline). The graph of change in facial aging shows the change in z-score between measurement intervals (age 45 -age 38). Effect-sizes reported on the graphs are standardized regression coefficients interpretable as Pearson r. Models included covariate adjustment for sex. The graph for self-rated health plots the predicted probability (fitted slope) and 95% confidence interval (shaded area) of incident fair/poor health at age 45.
The effect-size reported on the graph is the incidence-rate ratio (IRR) associated with a 1-SD increase in DunedinPoAm estimated from Poisson regression. The model included covariate adjustment for sex.

A. B.
z-score

Figure 5. DunedinPoAm levels by strata of childhood socioeconomic status (SES) and victimization in the E-Risk Study.
Panel A (left side) plots means and 95% CIs for DunedinPoAm measured at age 18 among E-Risk participants who grew up low, middle, and high socioeconomic status households. Panel B (right side) plots means and 95% CIs for DunedinPoAm measured at age 18 among E-Risk participants who experienced 0, 1, 2, or 3 or more types of victimization through age 12 years.

Figure 6. Change in KDM Biological Age over 24-month follow-up by treatment condition and baseline DunedinPoAm in the CALERIE Trial.
Fast DunedinPoAm is defined as 1 SD above the sample mean. Slow DunedinPoAm is defined as 1 SD below the cohort mean. Slopes are predicted values from mixed effects regression including a 3-way interaction between trial condition, time, and continuous DunedinPoAm at baseline. The figure shows that in the Ad Libitum (AL) arm of the trial, participants with fast DunedinPoAm at baseline experience substantially more change in KDM Biological Age from baseline to follow-up as compared to AL participants with slow DunedinPoAm. In contrast, there was little difference between participants with fast as compared to slow DunedinPoAm in the Caloric Restriction (CR) arm of the trial. The Dunedin Study is a longitudinal investigation of health and behavior in a complete birth cohort. Study members (N=1,037; 91% of eligible births; 52% male) were all individuals born between April 1972 andMarch 1973 in Dunedin, New Zealand (NZ), who were eligible based on residence in the province and who participated in the first assessment at age 3. The cohort represents the full range of socioeconomic status on NZ's South Island and matches the NZ National Health and Nutrition Survey on key health indicators (e.g., BMI, smoking, GP visits) . The cohort is primarily white (93%) . Assessments were carried out at birth and ages 3, 5, 7, 9, 11, 13, 15, 18, 21, 26, 32, 38 and, most recently, 45 years, when 94% of the 997 study members still alive took part. At each assessment, each study member is brought to the research unit for a full day of interviews and examinations. Study data may be accessed through agreement with the Study investigators (https://moffittcaspi.trinity.duke.edu/research-topics/dunedin). Dunedin Study analysis to develop the Pace of Aging algorithm included 18 biomarkers measured at the age 26, 32, and 38 assessments: (in order of listing in Figure 3 of ) glycated hemoglobin, cardiorespiratory fitness, waist-hip ratio, FEV1/FVC ratio, FEV1, mean arterial pressure, body mass index, leukocyte telomere length, creatinine clearance, blood urea nitrogen, lipoprotein (a), triglycerides, gum health, total cholesterol, white blood cell count, high-sensitivity C-reactive protein, HDL cholesterol, ApoB100/ApoA1 ratio.

Supplemental Table 1. Physical and cognitive functioning and subjective signs of aging measures in the Dunedin Study
Physical Functioning (N=800 with DunedinPoAm data) Balance Balance was measured using the Unipedal Stance Test as the maximum time achieved across three trials of the test with eyes closed (Bohannon et al., 1984;Springer et al., 2007;Vereeck et al., 2008). Gait Speed Gait speed (meters per second) was assessed with the 6-m-long GAITRite Electronic Walkway (CIR Systems, Inc) with 2-m acceleration and 2-m deceleration before and after the walkway, respectively. Gait speed was assessed under 3 walking conditions: usual gait speed (walk at normal pace from a standing start, measured as a mean of 2 walks) and 2 challenge paradigms, dual-task gait speed (walk at normal pace while reciting alternate letters of the alphabet out loud, starting with the letter "A," measured as a mean of 2 walks) and maximum gait speed (walk as fast as safely possible, measured as a mean of 3 walks). We calculated the mean of the 3 individual walk conditions to generate our primary measure of composite gait speed (Rasmussen et al., 2019).

Steps in Place
The 2-min step test was measured as the number of times a participant lifted their right knee to mid-thigh height (measured as the height half-way between the knee cap and the iliac crest) in 2 minutes at a self-directed pace (Jones and Rikli, 2002;Rikli and Jones, 1999).
Chair Stands Chair rises were measured as the number of stands a participant completed in 30 seconds from a seated position Jones and Rikli, 2002).

Grip Strength
Handgrip strength was measured for the dominant hand (elbow held at 90°, upper arm held tight against the trunk) as the maximum value achieved across three trials using a Jamar digital dynamometer (Mathiowetz et al., 1985;Rantanen T et al., 1999).

Motor Coordination
At ages 38 and 45, we measured motor functioning as the time to completion of the Grooved Pegboard Test with the dominant hand.

Physical Limitations
Physical limitations were measured with the 10-item RAND 36-Item Health Survey 1.0 physical functioning scale (Ware and Sherbourne, 1992). Participant responses ("limited a lot", "limited a little", "not limited at all") assessed their difficulty with completing various activities, e.g., climbing several flights of stairs, walking more than 1 km, participating in strenuous sports, etc. Scores were reversed to reflect physical limitations so that a high score indicates more limitations.

Decline in Physical Functioning
Tests of balance and grip strength and interviews about physical limitations were completed at both the age-38 and age-45 Dunedin Study assessments. We measured decline across the 7-year measurement interval by subtracting the age-38 test score from the age-45 test score. Cognitive Functioning (N=795 with mPoA data) Cognitive Functioning The Wechsler Adult Intelligence Scale-IV (WAIS-IV) (Wechsler, 2008) was administered to the participants at age 45 years, yielding the IQ. In addition to full scale IQ, the WAIS-IV measures four specific domains of cognitive function: Processing Speed, Working Memory, Perceptual Reasoning, and Verbal Comprehension. Cognitive Decline IQ is a highly reliable measure of general intellectual functioning that captures overall ability across differentiable cognitive functions. We measured IQ from the individually administered Wechsler Intelligence Scale for Children-Revised (WISC-R; averaged across ages 7, 9, 11, and 13) (Wechsler, 2003) and the Wechsler Adult Intelligence Scale-IV (WAIS-IV; age 45) (Wechsler, 2008). We measured IQ decline by comparing scores from the WISC-R and the WAIS-IV. Subjective Signs of Aging (N=802 with mPoA data) Self-rated Health Study members rated their health on a scale of 1-5 (poor, fair, good, very good, or excellent). Facial Aging Facial Aging is the subjective perception of aged appearance based on a facial photograph and is proposed as a clinically-useful marker of mortality risk (Christensen et al., 2009). Facial Aging measurement in the Dunedin Study was based on ratings by an independent panel of 8 raters of each participant's facial photograph Shalev et al., 2014). Facial Aging was based on two measurements of perceived age. First, Age Range was assessed by an independent panel of 4 raters, who were presented with standardized (non-smiling) facial photographs of participants and were kept blind to their actual age. Raters used a Likert scale to categorize each participant into a 5-year age range (i.e., from 20-24 years old up to 70+ years old) (interrater reliability = .77). Scores for each participant were averaged across all raters. Second, Relative Age was assessed by a different panel of 4 raters, who were told that all photos were of people aged 45 years old. Raters then used a 7item Likert scale to assign a "relative age" to each participant (1="young looking", 7="old looking") (interrater reliability = .79). The measure of perceived age at 45 years, Facial Age, was derived by standardizing and averaging Age Range and Relative Age scores. Subjective Decline Self-rated Health and Facial Aging were measured at both the age-38 and age-45 assessments. We measured decline in self-rated health as incident fair/poor health reported at the age-45 assessment. We measured acceleration in Facial Aging by computing the difference in Facial Aging Z-scores between the age-45 and age-38 assessments.
Understanding Society is an ongoing panel study of the United Kingdom population (https://www.understandingsociety.ac.uk/). During 2010-12, participants were invited to take part in a nurse's exam involving a blood draw. Of the roughly 20,000 participants who provided clinical data in this exam, methylation data have been generated for just under 1,200. We analyzed data from 1,175 participants with available methylation and blood chemistry data. Documentation of the methylation (University of Essex, n.d.) and blood chemistry (University of Essex, n.d.) data resource is available online (https://www.understandingsociety.ac.uk/sites/default/files/downloads/documentation/healt h/user-guides/7251-UnderstandingSociety-Biomarker-UserGuide-2014.pdf).
Klemera-Doubal method (KDM) Biological Age. We measured KDM Biological age from blood chemistry, systolic blood pressure, and lung-function data using the algorithm proposed by  trained in data from the NHANES following the method originally described by  and using the dataset compiled by Hastings (Hastings et al., 2019). We included 8 of Levine's original 10 biomarkers in the algorithm: albumin, alkaline phosphatase (log), blood urea nitrogen, creatinine (log), C-reactive protein (log), HbA1C, systolic blood pressure, and forced expiratory volume in 1 second (FEV1). We omitted total cholesterol because of evidence this biomarker shows different directions of association with aging in younger and older adults (Arbeev et al., 2016). Cytomegalovirus optical density was not available in the Understanding Society database.
Self-rated Health. Understanding Society participants rated their health as excellent, very-good, good, fair, or poor. We standardized this measure to have Mean=0, Standard Deviation=1 for analysis.

The Normative Aging Study (NAS)
is an ongoing longitudinal study on aging established by the US Department of Veterans Affairs in 1963. Details of the study have been published previously (Bell et al., 1972). Briefly, the NAS is a closed cohort of 2,280 male veterans from the Greater Boston area enrolled after an initial health screening to determine that they were free of known chronic medical conditions. Participants have been re-evaluated every 3-5 years on a continuous rolling basis using detailed on-site physical examinations and questionnaires. DNA from blood samples was collected from 771 participants beginning in 1999. We analyzed blood DNA methylation data from up to four repeated assessments conducted through 2013 (Gao et al., 2019a;. Of the 771 participants with DNA methylation data, n=536 (46%) had data from 2 repeated assessments and n=178 (23%) had data from three or four repeated assessments. We restricted the current analysis to participants with at least one DNA methylation data point. The NAS was approved by the Department of Veterans Affairs Boston Healthcare System and written informed consent was obtained from each subject before participation.
Mortality. Regular mailings to study participants have been used to acquire vital-status information and official death certificates were obtained from the appropriate state health department to be reviewed by a physician. Participant deaths are routinely updated by the research team and the last available update was on 31 December 2013.. During follow-up, n=355 (46%) of the 771 NAS participants died.
Chronic Disease Morbidity. We measured chronic disease morbidity from participants medical histories and prior diagnoses (Gao et al., 2019b(Gao et al., , 2019cLepeule et al., 2018;Nyhan et al., 2018). We counted the number of chronic diseases to compose an ordinal index with categories of 0, 1, 2, 3, or 4+ of the following comorbidities: hypertension, type-2 diabetes, cardiovascular disease, chronic obstructive pulmonary disease, chronic kidney disease, and cancer.
The Environmental Risk Longitudinal Twin Study tracks the development of a birth cohort of 2,232 British participants. The sample was drawn from a larger birth register of twins born in England andWales in 1994-1995. Full details about the sample are reported elsewhere (Moffitt and E-risk Team, 2002). Briefly, the E-Risk sample was constructed in 1999-2000, when 1,116 families (93% of those eligible) with same-sex 5-year-old twins participated in home-visit assessments. This sample comprised 56% monozygotic (MZ) and 44% dizygotic (DZ) twin pairs; sex was evenly distributed within zygosity (49% male). Families were recruited to represent the UK population of families with newborns in the 1990s, on the basis of residential location throughout England and Wales and mother's age. Teenaged mothers with twins were overselected to replace high-risk families who were selectively lost to the register through nonresponse. Older mothers having twins via assisted reproduction were under-selected to avoid an excess of well-educated older mothers. The study sample represents the full range of socioeconomic conditions in the UK, as reflected in the families' distribution on a neighborhood-level socioeconomic index (ACORN [A Classification of Residential Neighborhoods], developed by CACI Inc. for commercial use): 25.6% of E-Risk families lived in "wealthy achiever" neighborhoods compared to 25.3% nationwide; 5.3% vs. 11.6% lived in "urban prosperity" neighborhoods; 29.6% vs. 26.9% lived in "comfortably off" neighborhoods; 13.4% vs. 13.9% lived in "moderate means" neighborhoods, and 26.1% vs. 20.7% lived in "hardpressed" neighborhoods. E-Risk underrepresents "urban prosperity" neighborhoods because such households are likely to be childless.
Home-visits assessments took place when participants were aged 5, 7, 10, 12 and, most recently, 18 years, when 93% of the participants took part. At ages 5, 7, 10, and 12 years, assessments were carried out with participants as well as their mothers (or primary caretakers); the home visit at age 18 included interviews only with participants. Each twin was assessed by a different interviewer. These data are supplemented by searches of official records and by questionnaires that are mailed, as developmentally appropriate, to teachers, and co-informants nominated by participants themselves. The Joint South London and Maudsley and the Institute of Psychiatry Research Ethics Committee approved each phase of the study. Parents gave informed consent and twins gave assent between 5-12 years and then informed consent at age 18. Study data may be accessed through agreement with the Study investigators (https://moffittcaspi.trinity.duke.edu/research-topics/erisk).
Childhood Socioeconomic Status (SES) was defined through a standardized composite of parental income, education, and occupation . The three SES indicators were highly correlated (r=0.57-0.67) and loaded significantly onto one factor. The populationwide distribution of the resulting factor was divided in tertiles for analyses.
Childhood Victimization. As previously described , we assessed exposure to six types of childhood victimization between birth to age 12: exposure to domestic violence between the mother and her partner, frequent bullying by peers, physical and sexual harm by an adult, and neglect.
CALERIE. The CALERIE trial is described in detail elsewhere . Briefly, N=220 normal-weight (22.0 ≤ BMI < 28 kg/m 2 ) participants (70% female, 77% white) aged 21-50 years at baseline were randomized to caloric restriction or ad libitum conditions with a 2:1 ratio (n=145 to caloric restriction, n=75 to ad libitum). "Ad libitum" (normal) caloric intake was determined from two consecutive 14-day assessments of total daily energy expenditure using doubly labeled water . Average percent caloric restriction over six-month intervals was retrospectively calculated by the intake-balance method with simultaneous measurements of total daily energy expenditure using doubly labeled water and changes in body composition (Racette et al., 2012;Wong et al., 2014). Over the course of the trial, participants in the caloric-restriction arm averaged 12% reduction in caloric intake (about half the prescribed reduction). Participants in the ad libitum condition reduced caloric intake by <2% . CALERIE data are available at https://calerie.duke.edu/samples-dataaccess-and-analysis.
Klemera-Doubal method (KDM) Biological Age. KDM Biological age was measured according to the procedure described in our previous article (Belsky et al., 2017). Briefly, we computed KDM Biological Age from CALERIE blood chemistry and blood pressure data using the algorithm proposed by  trained in data from the NHANES following the method originally described by Levine  and NHANES data from years matched to the timing of the CALERIE Trial. We included 8 of Levine's original 10 biomarkers in the algorithm: albumin, alkaline phosphatase (log), blood urea nitrogen, creatinine (log), C-reactive protein (log), HbA1C, systolic blood pressure, and total cholesterol. Cytomegalovirus optical density and lung function were not measured in CALERIE. We supplemented the algorithm with data on uric acid and white blood cell count.

Sensitivity analyses.
We tested sensitivity of mPoA to alternative methods of normalizing DNA methylation data. We normalized data using the 'methylumi' and 'minfi' packages and computed correlations between mPoA measures derived from these two datasets. The correlation was r=0.94. The elastic net model selected 46 CpGs to compose the mPoA. One of these CpGs, cg11897887, has been identified as an mQTL (Volkov et al., 2016). To evaluate sensitivity of results to the exclusion of this SNP, we computed a version of the mPoA excluding this CpG and repeated analysis. This version of the score was correlated with the full mPoA at r=1. Results were the same in analyses with both versions (available from the authors upon request).
Another CpG selected in the elastic net, cg05575921, is located within the gene AHRR, previously identified as a methylation site modified by tobacco exposure and associated with lung cancer and other chronic disease, e.g. (Fasanelli et al., 2015;Reynolds et al., 2015). We tested sensitivity of results to the exclusion of this probe using the method described above. This version of the score was correlated with the full mPoA at r=0.94. Again, results were the same in analyses with both versions (available from the authors upon request).

Bootstrap repetition analysis to estimate out-of-sample correlation between mPoA and longitudinal Pace of Aging.
The Dunedin Study is the only dataset to include measured 12-year longitudinal Pace of Aging. To estimate the out-of-sample correlation between mPoA and the original Pace of Aging measure, we conducted 90/10 crossfold validation analysis. We randomly selected 90% of the cohort to serve as the "training" sample in which the mPoA algorithm was developed. We used the remaining 10% to form a "test" sample to estimate the correlation between mPoA and Pace of Aging. We repeated this analysis across 100 bootstrap repetitions. In each repetition, we randomly sampled 90% of the cohort to use in the training analysis and reserved the remaining 10% for testing.
The mPoA algorithms developed across the 100 bootstrap repetitions included different sets of CpGs (range of 21-209 CpGs selected, M=54, SD=27 CpGs). However, the resulting algorithms were highly correlated (mean pairwise r=0.90, SD=0.14). The average correlation between the 90%-trained mPoA and longitudinal Pace of Aging in the 10% test samples was Contact the corresponding author for details of the DNA methylation algorithm used to calculated DunedinPoAm: daniel.belsky@columbia.edu

Supplemental Table 2. Comparison of age-38 DundedinPoAm and age 26-38 Pace of Aging effect-sizes for analysis of healthspan-related characteristics.
The table shows effect-sizes for analysis of healthspan-related characteristics at age 45 years. mPoA was measured from blood DNA methylation collected when Study members were aged 38 years. Pace of Aging was measured from longitudinal change in 18 biomarkers across measurements made at ages 26, 32, and 38 years. Sample restricted to N=810 Study members with data on mPoA and Pace of Aging. Effect-sizes correspond to the analysis reported in Supplemental Figure 1B and are reported in terms of standard deviation differences in the age-45 outcome associated with a 1 standard deviation increase in mPoA (i.e. effect-sizes are interpretable as Pearson r). All models were adjusted for sex.  Ferris, 1978) completed by participants at each assessment wave. We classified participants as being current, former, or never smokers (Gao et al., 2019a). Pack years is the total number of cigarettes smoked across the participants lifetime in units equivalent to the number of cigarettes smoked during a year of smoking 1 pack (20 cigarettes) per day. Age acceleration residuals for epigenetic clocks were calculated by regressing epigenetic age on chronological age and predicting residual values. All methylation measures were standardized to M=0 SD=1 for analysis. Effect-sizes thus reflect risk associated with a 1-SD increase in the methylation measure.
Supplemental Table 6. Effect-sizes for associations of socioeconomic status (SES) and victimization exposure with DunedinPoAm and epigenetic clocks at age 18 in the E-Risk Study.
The table shows effect-sizes reported as standardized regression coefficients (b) and 95% confidence intervals (CIs) from models in which childhood family socioeconomic status (SES) and victimization were predictor variables and the dependent variables were the DNA methylation measures. Model 1 included covariate adjustment for sex. Model 2 additionally included covariates for estimated cell counts. (Chronological age was the same for all twins in the birth cohort.) Models 3 and 4 adjusted for smoking status. Only 33% of the analysis sample had ever smoked. Model 3 included covariates adjusting for smoking status at age 18 (never, former, current). Model 4 included nonsmokers only. Methylation measurements were derived from blood DNA methylation collected when Study members were aged 18 years and were residualized for plate number prior to analysis. All E-Risk participants are the same chronological age. Epigenetic clock measures were differenced from this chronological age prior to analysis. Effect-sizes are reported in terms of standard deviation differences in the outcome associated with a 1 standard deviation increase in methylation measures. All models were adjusted for sex. Standard errors were clustered at the family level to account for nonindependence of twin data.   Figure 3. Effect-sizes for associations of DunedinPoAm and epigenetic clocks with KDM Biological Age and self-rated health in Understanding Society. Effect-sizes are standardized regression coefficients (interpretable as Pearson r). Effect-sizes are computed for "epigenetic-age acceleration" values of the clocks (i.e. epigenetic ages residualized for chronological age). The dependent variable in analysis of KDM Biological Age was the difference between KDM Biological Age and chronological age. Self-rated health scores were reversed for analysis so that higher values correspond to ratings of poorer health. All models were adjusted for chronological age and sex.

KDM Biological Age
Self-rated Health (reversed) -.  Figure 4. Effect-sizes for association of DunedinPoAm and epigenetic clocks with mortality and morbidity in the Normative Aging Study. Time-to-event analysis of mortality and chronic disease incidence were conducted using Cox proportional hazard models to estimate hazard ratios (HRs) in N=771 participants. Repeated-measures analysis of chronic disease prevalence was conducted using Poisson regression to estimate incidence rate ratios (IRRs) within a generalized estimating equations framework to account for the nonindependence of repeated observations of individuals (N=1,488 observations of 771 individuals). All models included covariate adjustment for chronological age. Age acceleration residuals for epigenetic clocks were calculated by regressing epigenetic age on chronological age and predicting residual values. All methylation measures were standardized to M=0 SD=1 for analysis. Effect-sizes thus reflect risk associated with a 1-SD increase in the methylation measure.  Figure 5. Effect-sizes for associations of childhood socioeconomic status and victimization exposure with DunedinPoAm and epigenetic clocks. Effect-sizes are standardized regression coefficients and 95% confidence intervals from models in which childhood family socioeconomic status (SES) and victimization were predictor variables and the dependent variables were the methylation measures. All E-Risk participants were the same chronological age (18 years) at the time of blood collection. Epigenetic clock measures were differenced from this chronological age prior to analysis. Effect-sizes are reported in terms of standard deviation differences in the methylation measures associated with a 1 standard deviation increase in the predictor measure. All models were adjusted for sex. Standard errors were clustered at the family level to account for non-independence of twin data.  Figure 6. Effect-sizes for associations of DunedinPoAm and epigenetic clocks with future rate of change in KDM Biological Age in Ad Libitum (AL) control group and Caloric Restriction (CR) intervention group participants in the CALERIE Trial. Effect-sizes are stratified marginal effects computed from regressions of predicted slopes of change in KDM Biological Age on treatment condition, baseline values of the methylation measures, and the interaction of treatment condition and the methylation measures. Effect-sizes for association of baseline DunedinPoAm with rate of change in KDM Biological Age over follow-up are plotted separately by treatment condition (AL for Ad Libitum control, plotted as solid-colored circles, CR for caloric restriction, plotted as hollow circles). Effect-sizes reflect the predicted increase in the rate of annual change in KDM Biological Age over the 2 years of follow-up associated with a 1 SD increase in the methylation measure. For example, for DunedinPoaM, a value of 0.2 for participants in the AL control condition indicates that having DunedinPoAm 1 SD higher at baseline is associated with an increase in the aging rate of 0.2 years of physiological change per 12 months of follow-up. Models included covariate adjustment for sex and chronological age at baseline.