Development of the VISAGE enhanced tool and statistical models for epigenetic age estimation in blood, buccal cells and bones

DNA methylation is known as a biomarker for age with applications in forensics. Here we describe the VISAGE (VISible Attributes through GEnomics) Consortium’s enhanced tool for epigenetic age estimation in somatic tissues. The tool is based on eight DNA methylation markers (44 CpGs), bisulfite multiplex PCR followed by sequencing on the MiSeq FGx platform, and three statistical prediction models for blood, buccal cells and bones. The model for blood is based on six CpGs from ELOVL2, MIR29B2CHG, KLF14, FHL2, TRIM59 and PDE4C, and predicts age with a mean absolute error (MAE) of 3.2 years, while the model for buccal cells includes five CpGs from PDE4C, MIR29B2CHG, ELOVL2, KLF14 and EDARADD and predicts age with MAE of 3.7 years, and the model for bones has six CpGs from ELOVL2, KLF14, PDE4C and ASPA and predicts age with MAE of 3.4 years. The VISAGE enhanced tool for age estimation in somatic tissues enables reliable collection of DNA methylation data from small amounts of DNA using a sensitive multiplex MPS assay that provides accurate estimation of age in blood, buccal swabs, and bones using the statistical model tailored to each tissue.


INTRODUCTION
The use of epigenome-wide association studies has provided a deeper insight into individual differences and fluctuations in DNA methylation levels in the human genome. Many studies have identified agerelated differentially methylated regions (DMRs) and sites (DMSs) that have the potential to predict epigenetic age in various human tissues [e.g. [1][2][3][4]. These studies indicated that epigenetic age is highly DNA-based age prediction in forensic applications can be used for intelligence purposes to gain information from unknown individuals who have left their DNA at a crime scene or whose remains are subjected to genetic identification. Perpetrators, who remain unknown to the investigating authorities from their forensic DNA profiles, cannot be identified with standard DNA profiling techniques. In recent years, forensic DNA phenotyping has emerged as an approach that can obtain information of an unknown crime scene sample donor on their appearance and bio-geographic ancestry from crime scene DNA [9]. This allows for focused police investigations to help characterize unknown perpetrators, where estimating age from crime scene DNA would also be highly informative. Age is an important phenotypic feature that manifests itself through a set of visible attributes that are difficult to hide or modify, and therefore can be very effective in narrowing down the number of potential suspects in the context of forensic DNA phenotyping. In particular, aging is reflected in several features of human visible characteristics like hair greying, hair loss, facial wrinkles and other signs of aging skin [10]. Thus, reliable DNA-based prediction of appearance traits as a forensic intelligence tool is ideally accompanied with age estimation.
A number of methods that use DNA methylation markers to predict age have been reported. Such methods can be classified into two main categories according to the number of CpG sites included. Large scale methods incorporate hundreds of loci [2,3,5,11] and require DNA microarray technology to collect the data necessary for using predictive algorithms. Since forensic genetics is very demanding in terms of the sensitivity of the methods applied, smaller sets of markers that can be analyzed using lower DNA input methods are more suitable. Most of the epigenetic age prediction methods proposed in the forensic field have been designed to predict age in blood. Weidner et al. (2014) developed a 3-CpG model involving genes ASPA, ITGA2B and PDE4C, and using pyrosequencing technology that allowed age estimation with a mean absolute error (MAE) of 4.12 years. Several smaller age predictive tests were then proposed in forensics [12,13]. The test developed by Zbieć-Piekarska offered a set of markers suitable to predict age from blood in Europeans and Asians [14]. The method involved analysis of five DMSs in ELOVL2, TRIM59, C1orf132/MIR29B2CHG, KLF14 and FHL2, that provided prediction accuracy in blood with a MAE of 3.7 and 4.2 years in Polish and Korean populations, respectively [12,14]. Importantly, these markers also showed similar accuracy for age prediction in saliva (MAE = 3.6 years) [15] and buccal swabs (MAE = 4.3 years) [15]. Moreover, some of these markers also showed a correlation with age in bones and teeth [16][17][18]. Blood (especially bloodstains), buccal swabs, and skeletal remains are commonly analyzed in forensic laboratories for human identification. Age estimation methods developed for these tissues may provide additional information to assist with the identification process. The VISAGE consortium has implemented these five markers in a basic tool for sensitive multiplex PCR amplification of bisulfite converted DNA followed by a massively parallel sequencing (MPS) on a MiSeq FGx instrument [19]. Indeed, due to the quantitative character of DNA methylation variation and the wellknown method-to-method bias of DNA methylation analysis, predictive models based on data generated with one method, including public datasets based on DNA methylation microarrays [12,15] cannot be easily adopted to interpret DNA methylation data generated with another method. Methylation analysis methods widely vary in forensic use and have hampered comparisons of the efficiency of qPCR [20], SNaPshot [21] and MPS [22].
MPS offers a universal solution to DNA variation analysis that can be applied to study DMSs as well as the established variation of single nucleotide polymorphism (SNPs) and short tandem repeats (STRs) [23]. Differences exist between MPS analysis of both marker types since the completely quantitative nature of DMS analysis contrasts with the mainly qualitative nature of SNP and STR analysis, so the MPS multiplex capacity is markedly lower for DMS analysis compared to SNP and STR analysis. Genotyping of bisulfiteconverted DNA has become the standard for DNA methylation analysis. Although the design of targeted PCR-based MPS tests is difficult to apply to bisulfiteconverted DNA, small scale multiplexing is possible and this could be a solution to the problem of measuring DNA methylation in forensic genetic tests, offering the right balance between sensitivity, throughput and reliability. The forensic community has made the first steps towards implementation of MPS for DNA methylation analysis [24][25][26].
In this study, we have advanced the development of epigenetic age estimation in forensics and present the VISAGE enhanced tool for age estimation of DNA from somatic tissues, combining eight age-informative AGING DNA methylation markers into a bisulfite multiplex PCR for simultaneous targeted MPS and three new statistical models to predict age in blood, buccal cells and bones.

Assay development and optimization
Eight age-informative DNA methylation markers containing 44 CpG sites (Supplementary Table 1) were selected from the literature based on their reported age correlation in different somatic tissues such as blood, buccal cells and bones [11,12,[27][28][29][30][31], and were successfully combined into one multiplex PCR assay for bisulfite-converted DNA. Five of these selected markers were included in the previously described VISAGE basic tool for estimating age from blood [19]. Primer pairs targeting three newly selected markers, EDARADD, PDE4C and ASPA were added sequentially to the multiplex assay of the VISAGE basic age tool [19] to monitor the effect of each added marker individually and to finally achieve a functioning multiplex for all eight markers ( Table 1).
The design was first tested with artificially methylated DNA standards (N = 10) at an optimum input DNA amount for bisulfite conversion of 200 ng using the MiSeq Reagent Kit v2. Sequence read depth at all 44 CpGs covered by the primer design of the eight DNA methylation markers exceeded the minimum of 1,000 reads (mean = 42,012.1 ± 21,282.7 paired reads) set for accurate methylation quantification [32]. Measured methylation values of the differentially methylated DNA standards showed robust quantification with an average difference between duplicates of 1.3% ± 2.1% (one CpG per marker; Supplementary Figure 1). As expected from the lower PCR product yields for PDE4C, CpG positions located in the target sequence of this marker produced low read depths (mean = 3,386.3 ± 1,630.4 paired reads) compared to the other markers. Figure 1A illustrates normalized read depth of one CpG site per marker, clearly indicating the lower sequence output of PDE4C C5. Furthermore, the read depths of the CpGs located in PDE4C (7 CpGs) and ELOVL2 (9 CpGs) were lower for positions that were not covered during sequencing from both ends ( Figure 1C, 1D). This results from the read length of the used sequencing kit (2 x 150 cycles), which is not sufficient to cover the whole targeted region of the two longest amplicons from both directions (ELOVL2: 267 bp and PDE4C: 215 bp).
Aiming at a balanced PCR multiplex, the concentrations of the PDE4C primers were increased equimolarly and more balanced amplicon yields were obtained at an assay concentration of 1 µM (Supplementary Figure 2). The altered PCR multiplex was tested using the MiSeq Reagent Kit v3 and methylated DNA standards (N = 12). Read depth of PDE4C CpGs was increased to 23,005.7 ± 4,659.5 paired reads. When comparing normalized read depths to earlier generated data, the performance of PDE4C was enhanced, but a fully balanced sequence read distribution was not achieved. This was observed in the relative decrease of read depth at MIR29B2CHG ( Figure 1A). However, on average the v3 kit led to an approximately doubled sample coverage (v3: 701,330.8 ± 147,289.9 versus v2: 348,679.2 ± 62,647.1 paired reads) and an increase in mean read depth of 45,109.4 paired reads (overall mean = 87,121.5 ± 54,854.7, Figure 1B). The number of reads at target CpG positions on MIR29B2CHG still yielded a mean of 20,2435 ± 8,741 paired reads. Additionally, the longer read length of the v3 kit enabled constant read depths at all the CpGs of ELOVL2 and PDE4C across the targeted sequences ( Figure 1C, 1D). The sequence quality control showed that all 44 target CpG positions had a misincorporation rate below 0.4% (mean = 0.04%) and the calculated bisulfite conversion efficiency exceeded 99.6% for all samples.

Evaluation of MPS assay performance
Seven differentially methylated DNA standards were assessed with the final optimized assay using the v3 kit. The methylation quantification of all 44 CpGs versus the expected methylation is shown in Figure 2. Measured methylation levels at most CpGs were close to the line of identity, indicating good overall concordance between experimentally determined and actual methylation levels. CpG sites at MIR29B2CHG and EDARADD showed an overestimation of methylation levels, while CpGs at ELOVL2 exhibited a bias towards unmethylated Cs. The methylation quantification obtained appeared to be robust with an average difference between duplicates of 1.9% ± 1.2% across markers and methylation levels ranging from 5% to 75% (Supplementary Table 2). However, two outliers were detected: both the 50% methylated sample at PDE4C positions and the 25% methylated sample at MIR29B2CHG showed higher variation (12% and 7.5%, respectively) between duplicates. Potential variability between the target CpG sites throughout the same amplicon was explored by calculating the maximum difference in methylation at one marker and sample (excluding ASPA, N = 1). Overall, these differences were low with a mean of 1.8% ± 2% across markers and ratios. The highest variation was detected at KLF14 with 3.9% ± 2.4% when comparing amplicons, which at the same time showed the most stable methylation quantification (mean difference between duplicates = 0.8% ± 1%). Further evaluation of DNA standards at AGING Differences of mean methylation obtained for duplicates from 100 ng to 1 ng DNA input at all 44 CpGs were compared to the mean methylation level obtained for the 200 ng reference sample. The average differences and standard deviations per marker are shown in Figure  3. At 100 ng and 50 ng DNA inputs, methylation levels were close to those of the reference with 1.6% ± 1.1% and 1.7% ± 1.4% differences across the eight markers, respectively. Variability increased slightly from 50 ng to 20 ng (3.4% ± 3.9%) and from 20 ng to 10 ng (4.4% ± 2.7%). In particular, PDE4C showed higher variation at 10 ng with a mean difference of 11% to the optimum DNA input. A more detailed analysis of the 10 ng replicates showed an increased difference between duplicates (median = 6.0%) compared to the higher DNA input samples of the sensitivity study (median difference = 1.5% at 200 ng to 3.0% at 20 ng). Higher deviations from methylation values obtained for the reference DNA input were observed for the low quantity 10 ng sample (2.6 nM library) possibly introducing greater variation. Additionally, higher base misincorporation rates were observed within the PDE4C amplicon sequence of this duplicate (Supplementary Figure 3B). At 1 ng DNA input, methylation quantification became unreliable with extensive deviation from the values of the reference DNA input (14.0% ± 20.5%).

Development of prediction model for blood, buccal cells and bones
DNA methylation data generated with the MPS tool in blood (N = 160), buccal swab (N = 160) and bone (N = 161) DNA samples were randomly divided into a training set (N = 112 each) and testing set (N = 48 blood and buccal cells and 49 bones) by retaining comparable distributions of age and sex between both sets as far as possible ( Table 2). The correlation of DNAm in AGING particular CpG sites with age in blood, buccal cells and bones was investigated in the appropriate training sets using univariate linear regression analysis. All 44 CpG sites at the eight DNA methylation markers covered by the MPS tool showed a statistically significant correlation with age in blood (Supplementary Table 1). For 26 CpG sites (59%) located in ELOVL2, FHL2, TRIM59 and PDE4C very high β values were observed (>0.9) with single CpGs explaining R 2 > 80% of the variation observed in age in blood samples (>90% for 8 CpG sites within ELOVL2 and FHL2). The highest statistical significance was noted for ELOVL2 C9 (β = 0.963; P-value = 9.724×10 -65 ; R 2 = 0.928). As with blood, all cytosines also showed significant correlation with age in buccal cells, but only in 13 of the sites (30%) were β values > 0.9 recorded. The highest significance was achieved for PDE4C C5 (β = 0.965; Pvalue = 3.648×10 -65 ); by itself explaining R 2 = 93.1% of the variation observed in age in buccal cells ). Since DNA methylation-age correlations may show non-linear patterns, various types of data transformation were also tested. Curve estimation analysis indicated a non-linear pattern for CpG sites within ELOVL2 with power transformation best fitting the DNA methylation data for blood and buccal cells and thus power transformed data were used in modelling ( Figure 4). AGING DNA methylation data generated with the MPS tool for 112 blood samples, 112 buccal cell samples and 112 bone samples were next used to train the three respective models. Because of the lack of the data in PDE4C C5 or ELOVL2 C2, two samples from the training sets were rejected and therefore the final number of samples used to train the models for buccal cells and bones was 111. The multivariate stepwise   linear regression method was applied to select markers from the available set of 44 CpGs from eight genes and to train the final models. The data for the selected CpG positions in ELOVL2 were power transformed before prediction analysis and this treatment proved to improve age prediction for DNA from blood and buccal cells. The analysis of blood samples showed that the optimal age model for blood centered on 6 CpG sites from six genes, ELOVL2, MIR29B2CHG, KLF14, TRIM59, FHL2 and PDE4C (Table 3). Effect sizes expressed by β values were different and except for marker MIR29B2CHG C1 (β = -0.234), were positively correlated with age. The largest effect size in the model was attributed to the power transformed ELOVL2 C7 position (β = 0.328; P-value = 3.24×10 -7 ) and the smallest one to TRIM59 C8 (β = 0.096; P-value = 4.48×10 -4 ). This model explains 98.2% of age variation (R 2 ) observed in the training set and predicts age with an accuracy of MAE = 2.2 years in the training set and MAE = 3.2 years in the testing set ( Figure 5 and Table  4). Table 5 outlines the optimal model for buccal cells comprising 5 CpG sites from 5 genes, PDE4C, MIR29B2CHG, ELOVL2, KLF14 and EDARADD. The largest β value was observed for marker PDE4C C5 (β = 0.351; P-value = 1.29×10 -7 ) and this position was found to explain most of the age variation explained by the model (R 2 = 93.1%). Power transformed ELOVL2 C9 had the second largest effect (β = 0.244; P-value = 4.81×10 -5 ). Negative and weak correlation was observed for a CpG in EDARADD not included in the blood model (β = -0.098). The final model explains R 2 = 97.5% of variation observed in the training set and predicts age with an accuracy of MAE = 2.5 years in the training set and MAE = 3.7 years in the testing set ( Figure 5 and Table 4). Table 6 shows the optimal model for bones comprising six CpGs from four genes, ELOVL2, PDE4C, KLF14   Figure 5 and Table 4).

AGING
Our data show an increase in the MAE value with increased age of the sample donors for blood and buccal cell models. The highest MAE value was observed in the 3 rd (age 41-60 years) and 4 th (>60 years) age categories (Table 4). When age was analyzed as a continuous variable it was significantly correlated with MAE in both tissue types (Pearson correlation P-value of 0.001 and 3.86×10 -4 for blood training and testing sets, respectively and P-values of 0.011 and 2.52×10 -4 for buccal cells training and testing sets, respectively). This effect was not seen in bones, neither in the training (P-value = 0.122) nor the testing set (P-value = 0.070). AGING Table 4. MAE in different age categories. Training set  Testing set  Training set  Testing set  Training set  Testing set  1 1

Age prediction in blood from methylation data obtained with the VISAGE basic tool
To enable age prediction from methylation data collected with the previously reported VISAGE basic tool for age estimation from blood, comprising a 5plex MPS assay [19], we re-trained our 112-sample containing blood training set for the five CpGs based on data generated with the VISAGE enhanced tool. This model predicts age in the training set with an accuracy of MAE = 2.7 and in the testing set with MAE = 3.8. The lower accuracy achieved with the 5markler model compared to the full 6-marker model (see above) can be explained by the addition of PDE4C in the VISAGE enhanced model. Moreover, when performing age prediction modelling using data for these five markers obtained from buccal cells and bones, we achieved higher errors for buccal cells with MAE = 3.9 and 4.3 for training and testing sets, respectively, and bones with MAE = 4.7 and 4.0 years for training and testing sets, respectively. Notably, the buccal and bone models based on the eight DNA methylation markers of the VISAGE enhanced tool achieved more accurate age prediction than those based on the five markers in the VISAGE basic tool for blood, lacking the three additional markers, covering all three somatic tissues in the VISAGE enhanced tool.

AGING
To check the compatibility of the two VISAGE age tools, the VISAGE basic tool for estimating age from blood and the VISAGE enhanced tool for estimating age from somatic tissues including blood, buccal cells and bones, methylation results obtained for the seven artificially methylated DNA standards generated with the 8-plex assay of the VISAGE enhanced tool were compared to those produced with the 5-plex assay of the VISAGE basic tool [19]. Both assays showed very similar amplification bias for DNA methylation quantifications at the overlapping five CpGs (Supplementary Figure 4A). When subtracting methylation levels obtained with the VISAGE 8-plex assay from values obtained for the 5-plex assay, the 5plex assay of the VISAGE basic tool appeared to produce slightly higher methylation values at the five relevant CpG sites (mean differences = 2.4%) than the 8-plex assay of the VISAGE enhanced tool did. The absolute differences between methylation quantifications at the same expected DNA methylation level were on average 3.0% ± 2.7% across the seven DNA standards and markers for the two assays (Supplementary Figure 4B). To explore whether these differences were significantly different, observed methylation values with the 8-plex enhanced tool assay were plotted against those of the 5-plex basic tool (Supplementary Figure 5). Regression models showed no statistically significant performance differences between the two assays (Bonferroni corrected P-values: ELOVL2 C7: 0.681, KLF14 C1: 1.000, MIR29B2CHG C1: 1.000, FHL2 C2: 0.756, TRIM59 C7: 1.000). These results indicate that the blood model developed here for the five CpGs in the VISAGE basic tool for age estimation from blood [19] can be used for estimating age in blood, based on data generated with the VISAGE basic tool. However, due to the lower errors achieved with the blood model based on the 8-plex data, use of the VISAGE enhanced tool, including 8-plex data and the 6-marker predictive model, is advised to estimate age from blood.

Accuracy of developed predictive models in predicting age in various human tissues
To assess the performance of the developed prediction models in predicting age in human tissues other than blood, buccal cells or bones, we also applied the VISAGE enhanced age tool to 24 DNA samples collected postmortem from various tissues of 22 males and 2 females ranging in age from 21 to 73 years at the time of death (mean age = 41.0 ± 12.5). Table 7 summarizes the results of these experiments. This limited dataset confirmed that the developed blood model can accurately predict age in blood samples of deceased individuals (MAE = 3.1). However, the prediction accuracy obtained with postmortem cartilage and muscle samples estimated with the models individually developed for blood, buccal cells and bones was unsatisfactory (Table 7). Univariate association testing conducted using linear regression in this small set of 24 samples showed positive signals of association for TRIM59, FHL2, MIR29B2CHG, ELOVL2 and PDE4C in muscle and cartilage (data not shown). Although we found our eight DNA methylation markers were associated with age in the vast majority of CpG sites in different tissue types, detailed analysis of the DNA methylation-age correlation for particular CpG sites showed a different pattern of DNA methylation changes in different tissues (Supplementary Figure 6), and this underlies the high predictive error observed for muscle and cartilage when using the developed models.

DISCUSSION
Although the use of hundreds of CpGs for age estimation delivers small prediction errors [2,7], such marker densities are currently impossible in forensic applications due to the lack of suitably sensitive DNA technology. Smaller sets of DNA methylation age markers that can be analyzed with forensically suitable technology typically predict age in different tissues with different accuracies. To develop a universal epigenetic age prediction tool for forensic applications, the VISAGE enhanced tool for age estimation from somatic tissues represents a significant step forward in age estimation for criminal investigations. This tool includes a MPS-based assay for eight DNA methylation markers (44 CpGs) and three different statistical models appropriate for blood, buccal cells and bones as DNA sources. The eight markers previously demonstrated age correlation in various forensically relevant DNA sources [12,13,15,28,30,33]. The data we used here to train and test the prediction models were generated with the developed MPS tool, as method-to-method bias prevents the usage of datasets available from the literature.
A crucial step in the optimization of the MPS assay of the VISAGE enhanced age tool was the development of the multiplex PCR for bisulfite converted DNA. The severe chemical treatment during bisulfite conversion not only leads to DNA degradation and loss, but also markedly reduces the complexity of the DNA sequence. Consequently, primer specificity is potentially reduced and the formation of primer dimers is favored [34]. Additionally, most targeted CpGs were located in CpG islands (except ASPA and EDARADD) that represent regions difficult to amplify. We successfully combined ASPA, EDARADD and PDE4C with the five markers of the VISAGE basic assay (ELOVL2, MIR29B2CHG, KLF14, FHL2, TRIM59) into an 8-plex PCR assay for the final VISAGE forensic MPS tool described here. Samples processed with the re-configured assay using the MiSeq Reagent Kit v3, which offers increased read length and is commercially available in forensic quality, yielded high quality sequences and high read depths for all eight amplicons. The observed variability between read depths at targeted CpGs most likely results from differences in primer pair efficiencies that could not be fully balanced by adjusting primer concentrations ( Figure 1A, 1B). Assay performance was confirmed with DNA methylation standards that allowed the assessment of methylation quantification for robustness and linearity. Since the introduction of bisulfite conversion for DNA methylation analysis, PCR bias towards the unmethylated or methylated template molecule has been repeatedly described [35][36][37], highlighting the difficulties in achieving a completely methylation-independent amplification. To avoid accentuating such bias, mismatch primers were used for CpG sites within primer sequences. Although methylation quantification of most markers was close to the line of identity, we observed a bias towards methylated DNA templates (MIR29B2CHG and EDARADD) as well as an underestimation of methylation levels (ELOVL2). Interestingly, the avoidance of a CpG or the inclusion of one or two CpGs in the primer sequences did not appear to noticeably change the strength of methylation bias. The observed PCR bias appeared stable, with minor differences between duplicates (mean = 1.9% ± 1.2%), which is important for reliability of age prediction. The most challenging marker for multiplex development was PDE4C, for which a higher variability (4.8% ± 3.7%) was observed in comparison to other markers. Nevertheless, methylation quantification of samples used for predictive modelling showed that PDE4C has a wide DNA methylation range and a large effect size, which makes it a reliable marker for all three age prediction models. Assuming that the range of DNA methylation throughout a person's lifespan determines the required minimum accuracy for a marker, the highest accuracy would be needed for KLF14. In line with this consideration, the methylation quantification at optimum DNA input for KLF14 showed the smallest differences between duplicates. For further analysis of the used DNA methylation controls, we compared maximum differences in methylation values obtained at the same target and sample. Surprisingly, the highest variability was observed at 100% methylation level (3.8% ± 2.6%) and not, as expected at 50% methylation. As the fully methylated DNA standards represent an artificial system, some variability may be inherent in the methylation controls.
Like all quantitative methods, DNA methylation analysis to attempt chronological age prediction is impacted by stochastic effects when the DNA input amount is low [38]. This poses a limitation to the application of quantitative DNA methylation to forensic samples that contain only minute amounts of DNA. Furthermore, the DNA is not necessarily equally distributed in such tissues, which adds additional variation due to a sampling effect. Additionally, DNA loss during bisulfite conversion and further stochastic processes during the subsequent multiplex PCR step count up to this variation. Previous studies suggested that 20 ng to 10 ng DNA template used for PCR are required for a reliable methylation quantification [38,39], although a higher sensitivity (10 ng DNA input for bisulfite conversion ~2 ng at PCR) has recently been reported [26]. We tested a dilution series of the initial DNA amount used for bisulfite conversion to perform a first sensitivity evaluation of the VISAGE enhanced tool age assay. When comparing mean quantified DNA methylation levels to the optimum DNA input, we observed a robust quantification down to 20 ng DNA input for most markers. According to previous studies investigating DNA loss during bisulfite conversion, the DNA amount used for PCR would be estimated to be from 8.8 ng (45% DNA loss [40]) to 11.8 ng (26% DNA loss [41]), which is in agreement with considerations regarding low DNA quantities. Results of this sensitivity study clearly indicate that methylation quantification of 1 ng samples was unreliable with increased differences to the reference DNA input as well as an increased variability between targeted CpGs of the same amplicon. Additionally, samples at 1 ng DNA input showed CpGs that appeared completely methylated (e.g. at ELOVL2, PDE4C) or unmethylated (e.g. PDE4C), indicating an amplification bias towards the methylated or unmethylated DNA template during PCR. Overall, the results from development and optimization of the VISAGE enhanced tool for age estimation from somatic cells showed promising results for future application in routine forensic DNA analyses. However, small differences between sequencing runs, may affect the final age prediction outcome, as recently described by Han et al.
(2020) [42]. The study reported a shift in DNA methylation levels between sequencing runs that led to an increased MAE in an independent validation set. The extent of such variabilities and the impact on age prediction, particularly when analyzing low DNA input samples, needs to be further addressed before application in real casework. Further validation studies including inter-laboratory exercises will bring deeper insight into the assay's robustness, reproducibility and sensitivity.
The developed 8-plex MPS assay was used to collect DNA methylation data necessary for the development of the three age prediction models. The model for blood comprises six CpGs from ELOVL2, MIR29B2CHG, KLF14, FHL2, TRIM59 and PDE4C and predicts age with MAE of 3.2 years. As expected, ELOVL2 was the top ranked marker, which alone explains 95% of variation in this age model. Non-linear age-DNA methylation correlation was observed in this DMRs, in line with other studies [28,43]. The correlation of DNA methylation in ELOVL2 with age in blood was first suggested by Garagnani et al. (2012) [44] and was soon confirmed in independent studies, making ELOVL2 the most important epigenetic age predictor in a range of fields including forensics [3,4,39,45]. The five remaining predictors include blood markers widely validated in studies of different populations from Europe and Asia [13-15, 22, 30, 46].
The model for buccal cells includes five CpG sites from PDE4C, ELOVL2, MIR29B2CHG, KLF14 and EDARADD and predicts age with a MAE of 3.7 years. PDE4C was the top ranked marker and alone explained 93.1% of variation in age. Early studies suggested PDE4C as an age predictor in blood and saliva [27,47] and it was rapidly adopted in age prediction models for blood [11,28]. In our study, this marker had a higher predictive value in buccal cells, which confirms the conclusions of the study of Eipel et al. (2016) [31], whose markers showed higher correlation with age in saliva and buccal cells than in blood. ELOVL2, MIR29B2CHG and KLF14 come from the five best predictors for blood selected by the study of Zbieć-Piekarska [12], which had been shown to be suitable for predicting age in saliva and buccal cells in Asian populations [15]. The buccal cell tissue marker EDARADD, was included in the first age prediction algorithm developed for saliva [27] and replicated in other studies that investigated blood, saliva and buccal cells [28,[48][49][50][51].
Furthermore, we present here an age prediction model for bones which is based on only six CpGs from four In agreement with other studies, an increased MAE in age predictions of older individuals was observed in blood and buccal cells, which can be explained by a combination of genetic and environmental factors influencing the individual rate of aging [6,43,52]. Therefore, we calculated the MAE for different age groups, which allows to account for the corresponding age category in the interpretation of real casework. In addition, the MAE range can be provided along with the predicted age (e.g. from 2.2 to 5.5 years for blood).
Our study confirms the importance of ELOVL2 and PDE4C for epigenetic age prediction and provides further evidence that MIR29B2CHG (ranked 3 rd in the blood model and 2 nd in the buccal cell model), is a valuable age predictor in forensics. In the first two markers, hypermethylation with age is observed, and in the third hypomethylation. All three are characterized by a wide range of DNA methylation levels during an individual's lifespan (50-70%). KLF14 is characterized by the lowest range of DNA methylation over a lifespan but this marker is consistently suitable in all three predictive models [21,51,53]. ASPA was chosen only for use as a bone age predictor in our study, while FHL2 and TRIM59 were selected exclusively as blood age predictors, although correlation with other tissue types has been demonstrated for these markers. Our study aimed to select a universal set of DMSs for epigenetic age prediction in various somatic tissues, but since the pattern of DNA methylation change differed in various cell types tested, the DNA methylation data collected for individual tissues had to be incorporated into separate training sets for tailored prediction models. In addition, our pilot study of postmortem samples, which analyzed tissues from 24 deceased persons, indicated the models developed did not predict cartilage and muscle age correctly.
In summary, the study outlined here presents a complete tool for estimating a person's age from DNA in forensic applications that deal with low amounts of DNA from the three forensically relevant tissue types of blood, buccal cells and bones. The VISAGE enhanced tool for age estimation in somatic tissues comprises a single bisulfite MPS assay targeting 44 CpGs from eight carefully selected DNA methylation markers and three separate predictive models for blood, buccal cells and bones. The MPS assay provided reliable and reproducible methylation quantifications, enabling accurate age prediction in samples down to a minimum of 20 ng of DNA. The three individual tissue models provide a good balance of marker number and accuracy given the capacity limitations of the DNA methylation measurement technology used. Future work could focus on increasing the model testing datasets to investigate the reliability of reported error estimates for the three models. It will also be useful to gauge the performance of the age prediction models for data produced using the VISAGE enhanced tool with additional forensically relevant somatic tissues. Notably, DNA methylation variation in non-somatic tissues, such as semen, is known to differ from that in somatic tissues; the development of an epigenetic tool for age estimation in semen is currently in progress by the VISAGE Consortium.

Selection of DNA methylation markers for age prediction in somatic tissues
Five age markers previously described in Zbieć-Piekarska et al. (2015) [12] were used as the basis for developing the VISAGE enhanced tool for age estimation of forensic DNA from somatic cells. We performed a comprehensive literature search and extended the original marker set comprising ELOVL2, MIR29B2CHG, TRIM59, KLF14 and FHL2 with the three additional markers of EDARADD, PDE4C and ASPA. It has been shown in multiple studies that these three markers have considerable capacity to further improve prediction of age in buccal cells/saliva and have been demonstrated to correlate with age in other somatic cells including bones [11,12,[27][28][29][30][31]. The eight marker combination selected for inclusion in the expanded VISAGE MPS multiplex comprised a total of 44 individual CpG sites (Supplementary Table 1).

Multiplex PCR
Development of the multiplex PCR assay for targeted bisulfite sequencing used primer designs established for the VISAGE basic test [19] (ELOVL2, KLF14, TRIM59, FHL2 and MIR29B2CHG) with primers for the three additional markers (PDE4C, ASPA and EDARADD), either newly designed using MethPrimer [54] and PrimerSuite [55] or gathered from the literature. When CpG sites within the primer sequences were unavoidable, a deliberate mismatch was introduced. In the specific cases of PDE4C and MIR29B2CHG, degenerate primers carrying a Y at CpG positions were also designed and tested, but no increase in amplicon yield was observed (data not shown) and consequently, mismatch primers were utilized in the final multiplex PCR. All newly tested primer pairs are listed in Supplementary Table 3. The formation of non-specific PCR products and primer dimers was evaluated in silico using BiSearch [34] and AutoDimer [56]. Primer sequences and final multiplex PCR concentrations are listed in Table 1.
PCR optimizations were performed with DNA extracted from 10 ml EDTA venous whole blood from three samples using the Blood Maxi Kit (Qiagen, Hilden, Germany) and quantified by real-time quantitative PCR [57].

AGING
Technologies, Santa Clara, CA, USA). The final multiplex PCRs were carried out using the thermocycler steps: initial denaturation at 95° C for 15 min; 38 cycles of 95° C for 10 s, 57° C for 30 s, 72° C for 30 s; final elongation at 72° C for 10 min.
PCR products were also assessed with Sanger sequencing to verify all amplicons before massively parallel sequencing as described in [58]. In brief, reactions were carried out using BigDye Terminator v1.1 Cycle Sequencing kit (Thermo Fisher Scientific -TFS, Waltham, MA, USA) in 10 µl reaction volumes and 0.3 µM primer (listed in Table 1). The thermal cycling comprised steps: 96° C for 1 min; 25 cycles of 95° C for 15 s, 50° C for 5 s and 60° C for 4 min. Cycle sequencing products were purified using centrifugation over Sephadex G-100 columns (Amersham, Little Chalfont, UK). Electrophoresis of sequencing products was performed on an ABI3500 instrument (TFS) using standard settings. Raw sequences were analysed with the Sequencer 5.1 (Gene Codes Corporation, Ann Arbor, MI, USA) software and assembled with an inhouse prepared reference (bisuflite converted reference for targeted amplicons).   Figure 3A), whereas a low level of contamination at TRIM59 cannot be fully excluded. However, read depth was very low compared to the overall read depth at this amplicon (overall mean = 141,623.1 ± 33,757.9 paired reads).

Samples used in predictive modelling
Peripheral blood was collected in EDTA-tubes from 160 unrelated, healthy individuals: 80 males and 80 females in

DNA extraction and quantification
DNA from blood was extracted using a modified salting out procedure [59], PrepFiler Express™ Forensic DNA Extraction Kit (TFS) or standard phenol-chloroform method. Previously used DNA extracts were stored frozen, at 4° C or room temperature (the percentage of methylation detected from different storage conditions was checked randomly and compared with previous Pyrosequencing results) [12]. DNA from buccal swabs and postmortem samples including bones was extracted using a silica-based method with Sherlock AX kit (A&A Biotechnology, Gdansk, Poland). Bone surfaces were cleared of soft tissue with a sterile scalpel and the entire exterior was abraded with a grindstone attached to a Dremel rotary tool to remove potential contaminants. Before the extraction bone pieces (~1cm 3 ) were treated with 15% bleach for 1min, repeatedly shaken with 100% ethanol and distilled water (dH2O), and finally subjected to UV irradiation. The thoroughly dried samples were pulverized using a FreezerMill 6750 apparatus (Spex CertiPrep, NJ, USA) and EDTA decalcification applied to each of the samples. DNA concentration was measured in all samples using Qubit dsDNA HS Assay Kit with the Qubit instrument.  Table 4).

MPS data analysis
Alignment was carried out relative to a custom reference genome containing only the targeted sequences (Supplementary Table 5) using an adapted Burrows-Wheeler alignment for bisulfite converted DNA sequencesbwa-meth [60]. An additional quality control step was performed on the raw data (fastq) for the samples used for predictive modelling, which was reviewed in detail with FastQC software [61]. Bam file creation, sorting, filtering and indexing was performed with Samtools [62]. Alignments of all samples were inspected using the Integrative Genomics Viewer (IGV) [63]. Total numbers of read information was extracted from amplicon positions using bam-readcount with minimum mapping quality and minimum base quality thresholds set to 30 [https://github.com/genome/bamreadcount]. At target CpG sites, obtained C reads were divided by the sum of C reads and T reads to calculate beta values. Observed methylation values refer to percentage beta values. Bisulfite conversion efficiency of samples was estimated by calculating the percentage of mean reversed beta values from all non-CpG-Cs (T reads divided by the sum of C reads and T reads). Total coverage refers to the sum of the number of reads per amplicon (one CpG site per amplicon was selected). Read depth was normalized by dividing the read depth at target positions by the total coverage. Only CpG sites with the minimum number of 1,000 reads were accepted for further analyses including the prediction modelling that was then applied to the data. For statistical analyses Microsoft Excel and R [https://www.r-project.org/] [64] were used.

Statistical analysis and prediction modelling
The correlation between age and DNA methylation levels at the 44 investigated CpGs in eight genes was analyzed in a training set of 112 carefully selected DNA samples from blood, 112 samples from buccal cells and 112 samples from bones ( Table 2). The effect size of particular loci was defined with standardized regression coefficients (β). The linearity of DNA methylation-age correlation was verified for all the tested CpGs. A clear non-linear pattern of correlation was noted for ELOVL2, which is in agreement with previous studies [28,43] and therefore, DNA methylation data for this marker were power transformed before multivariate linear regression analysis was applied. The proportion of age variance explained by individual predictors and their cumulative impact was assessed based on the calculation of R 2 coefficients. The same datasets of 112 blood, 111 buccal cells samples and 111 bone samples were used to develop linear regression age prediction models. The selection of a set of optimal markers was performed by stepwise linear regression with probability of F statistic, based on a statistical test of the improvement in model error, used as a criterion for predictors entry/removal. The developed models were further tested using an independent set of 48 blood samples, 48 buccal cells samples and 49 bone samples ( Table 2). The potential applicability of the developed models to predict age in some other human cell types was verified in a study involving three tissue types collected from 24 deceased individuals (aged 21-73). This experiment involved blood, cartilage and muscle samples. All the analyses were conducted using PS IMAGO PRO 5.1 software (IBM SPSS Statistics 25).

DNA methylation data comparison to the VISAGE basic assay
Data generated using the VISAGE basic prototype tool from 0%, 5%, 10%, 25%, 50%, 75% and 100% methylated DNA standards (Run1; 200 ng DNA input to bisulfite conversion using the Premium bisulfite kit [19]) was re-analyzed with bam-readcount. The mean methylation values from duplicates were used to calculate the differences between the two assays and to test for statistically significant differences between the two assays. Statistical testing was carried out in R using the "linearHypothesis" function implemented in the package "cars". To control for the family-wise error in multiple hypothesis testing, P-values were adjusted using the Bonferroni method.

AUTHOR CONTRIBUTIONS
WB and WP conceived the study with contributions from CP and MK. EP and WB were responsible for marker' selection and supervised the development of the predictive models. AFR contributed to marker selection. AH developed assay design, performed laboratory work and data analysis, CX contributed to assay design, laboratory work and data analysis, MdP and HN contributed to laboratory work and data analysis. AP, AW, RP, MS were responsible for the collection and preparation of blood and buccal swabs.