Identifying Methylation Patterns in Dental Pulp Aging: Application to Age-at-Death Estimation in Forensic Anthropology

Age-at-death estimation constitutes one of the key parameters for identification of human remains in forensic investigations. However, for applications in forensic anthropology, many current methods are not sufficiently accurate for adult individuals, leading to chronological age estimates erring by ±10 years. Based on recent trends in aging studies, DNA methylation has great potential as a solution to this problem. However, there are only a few studies that have been published utilizing DNA methylation to determine age from human remains. The aim of the present study was to expand the range of this work by analyzing DNA methylation in dental pulp from adult individuals. Healthy erupted third molars were extracted from individuals aged 22–70. DNA from pulp was isolated and bisulfite converted. Pyrosequencing was the chosen technique to assess DNA methylation. As noted in previous studies, we found that ELOVL2 and FHL2 CpGs played a role in age estimation. In addition, three new markers were evaluated—NPTX2, KLF14, and SCGN. A set of CpGs from these five loci was used in four different multivariate regression models, providing a Mean Absolute Error (MAE) between predicted and chronological age of 1.5–2.13 years. The findings from this research can improve age estimation, increasing the accuracy of identification in forensic anthropology.


Introduction
In forensic cases involving skeletal remains, the creation of a biological profile is essential to search missing persons reports, helping to determine potential matches and to assist in the identification of the victim [1]. Biological profiles commonly include sex, age, ancestry, and height. Thus, age estimation is one of the key components used for the identification of the remains. However, while age can be determined very accurately in childhood because it is based on the forensic anthropological assessment of growth and development [2][3][4][5][6][7], this estimation is less accurate in adult individuals. In adults, age assessment is based on degenerative changes in bones and teeth. This estimation requires a combination of different methods, such as the application of Lamendin's formula in uniradicular teeth (canines and incisors) [8], pubic symphysis assessment [9], auricular surface evaluation [10], and fourth rib assessment [11]. Using these methodologies can result in a general estimate of age with an accuracy of ±10 years. However, these results can be affected by endogenous and exogenous factors, pathological conditions, and, in mass disasters, fragmentary remains [1].
The application of biochemical techniques, based on the natural process of aging, may provide a potential solution to this problem [12]. Until recently, the racemization of aspartic acid in dentin seemed to be the most accurate technique to determine the age of adult individuals [13][14][15][16]. However, epigenetics, the study of heritable changes in gene expression that occur without alteration of the DNA sequence [17,18], has been shown to be a valuable and more accurate tool for age estimation [19]. DNA methylation, which involves the addition of a methyl group to a cytosine nucleotide at cytosine-guanine dinucleotides (CpG sites), which often cluster in regions known as CpG islands, has been shown to be particularly useful [20]. This methodology has its roots in prior studies involving the creation of "epigenetic clocks", which were based on the correlation between methylation patterns and age [21][22][23]. Building on these studies, several authors have tried to accurately determine age from biological fluids (blood, saliva, semen) [24][25][26][27][28]. However, these studies were developed from the point of view of criminalistics: determination of the age of a perpetrator from body fluids at the crime scene. Only a few studies have been performed for forensic anthropological purposes, such as estimation of the age of skeletal remains. The first study was carried out by Bekaert et al. [25] on dentin samples, applying pyrosequencing and obtaining an accuracy of ±4.9 years. Giuliani et al. [29] analyzed the methylation status of three genes in different dental tissues (cementum, dentin, and pulp), obtaining accuracies between 1.2 and 7.1 years by applying EpiTYPER, a MALDI-TOF mass spectrometry-based method. Recently, Márquez-Ruiz et al. [30] also applied pyrosequencing to determine the age in teeth based on three genes, achieving accuracies between 4.8 and 6.9 years. Correia Dias et al. [31] applied two methodologies for methylation analysis, Sanger sequencing and SNaPshot, obtaining accuracies of 2.6 years in bones and 11.4 years in teeth with the former, and 7.2 years in bones and 7.1 years in teeth with the latter. These last two studies [30,31] analyzed methylation on the whole tooth, not distinguishing between tissues. Recent work comparing methylation differences between types of dental tissues [29] indicates that isolating specific dental structures, such as pulp, might improve accuracy.
Teeth are the hardest structures in the human body, and they can survive after everything else has decomposed [15]. In particular, dental pulp is protected from external insults by hard tissues. Pulp is located in the central region of the tooth and is formed by a stromal tissue containing nerves, blood, and lymphatic vessels. The unique location and structure make it the preferred tissue for DNA analysis [32][33][34].
Based on this premise, and the need to expand the application of DNA methylation in forensic anthropology, the aim of this study was to analyze methylation patterns in pulp tissue from adult individuals and correlate these patterns with age, improving current age-at-death estimates.

CpG Sites Identified and Individual Correlations with Age
Twenty healthy erupted third molars from individuals aged between 22 and 70 years were used in this study. After DNA extraction from the pulp, bisulfite conversion, and PCR amplification, a total of 46 CpG sites located at ELOVL2, KLF14, SCGN, NPTX2, and FHL2 were identified, and their methylation levels were assessed through pyrosequencing. Correlation coefficients of these CpG sites with age were calculated (Table 1). Positive and significant correlations were found in six ELOVL2 CpG sites, with r values between 0.308 and 0.365. However, CpG5 produced an r = 0.240 and was not significant. In contrast to these results, among the seven CpGs evaluated on KLF14, only one, CpG7, showed a positive and significant correlation with age (r = 0.468). The same was true of NPTX2, as only CpG4 showed a significant and positive correlation with age (r = 0.327). Correlations between age and the methylation levels of CpGs located in FHL2 were negative. However, although two sites in FHL2, CpG1 and CpG2, showed a relatively strong correlation with age (r = −0.367 and r = −0.376, respectively), these results were not significant.
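The per-CpG correlations described above can be illustrated with a minimal sketch. It assumes one methylation percentage per individual for a single CpG site; the function names and the example values are hypothetical and are not data from this study, whose analyses were run in SPSS.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom.

    Undefined when |r| == 1 (division by zero)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical values for illustration only (NOT data from the study).
ages = [22, 30, 41, 55, 70]
methylation = [35.0, 41.0, 44.0, 52.0, 60.0]  # % methylation at one CpG

r = pearson_r(ages, methylation)
t = t_statistic(r, len(ages))
print(f"r = {r:.3f}, t = {t:.3f}, df = {len(ages) - 2}")
```

Significance would then be judged by comparing the t statistic with a t distribution with n − 2 degrees of freedom, which statistical packages report directly as a p value.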

Construction of Prediction Models for Age Estimation
A backward stepwise multiple linear regression analysis was performed to create prediction models for age estimation based on these five genes. Candidate predictors were the individual CpGs producing r values higher than 0.2, selected after assessing, within each gene, the individual and significant contribution of each CpG to age. These models are shown in Table 2. The model with the highest coefficient of determination (R² = 0.975) produced a Mean Absolute Error (MAE) between chronological and estimated age of 1.5474 years. This model included the majority of the ELOVL2, NPTX2, KLF14, and SCGN CpGs, as well as certain CpGs from FHL2. Although some individual FHL2 CpGs produced correlations that were less discriminatory than those of other CpG sites, they did appear to improve age estimation when combined with the other markers. The second model removed KLF14 CpG7, obtaining a high and significant coefficient of determination (R² = 0.972) and an MAE of 1.711 years. The next model eliminated FHL2 CpG6, obtaining an R² of 0.961 and an MAE of 2.047 years. The last model removed ELOVL2 CpG5 (p = 0.0001), producing an R² of 0.955 and an MAE of 2.1313 years.
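The two summary statistics reported for each model, MAE and R², can be computed from paired chronological and predicted ages as in the following sketch. The function names and the example ages are hypothetical and are not taken from Table 2; for an ordinary least-squares model with an intercept, the R² computed this way on the training data matches the model R².

```python
def mean_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over all individuals."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical ages in years (NOT data from the study).
chronological = [22, 35, 48, 61, 70]
predicted = [24, 34, 47, 63, 69]

mae = mean_absolute_error(chronological, predicted)
r2 = r_squared(chronological, predicted)
print(f"MAE = {mae:.2f} years, R^2 = {r2:.4f}")
```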
Due to the sample size (20 teeth in total), leave-one-out cross-validation was chosen to validate the models. Using this procedure, Model 1 showed the highest estimation error, followed by Model 4. Cross-validation of Models 2 and 3 produced the best accuracy. Additionally, Pearson correlations were calculated to compare the predicted and chronological ages, resulting in nearly the same correlation in the first three models (r = 0.98) and a slightly lower correlation in the fourth model (r = 0.97). These correlations between predicted and chronological ages are depicted in Figure 1.

Discussion
The objective of this study was to improve age-at-death estimates in forensic anthropology based on epigenetics, particularly the assessment of DNA methylation. This was supported by previous work applied to this discipline [25,29,30,31]. Teeth are the hardest tissues in the human body; thus, the present study focused on pulp, as it is the inner layer of the tooth, well protected from environmental insult. The methodological approach was based on bisulfite-modified PCR and pyrosequencing. Applying this technique, it was possible to evaluate the DNA methylation levels of five genes (ELOVL2, NPTX2, KLF14, SCGN, and FHL2). Certain CpGs of these genes showed a significant correlation with age, particularly in the case of ELOVL2. After statistical analysis, four multivariate regression models were proposed based on differential methylation of these genes. These equations were then used to estimate age, with MAEs between 1.5 and 2.13 years.
There are few studies focused on the analysis of DNA methylation in teeth for age estimation. The first, developed by Bekaert et al. [25], used dentin samples and assessed four genes (ASPA, PDE4C, ELOVL2, and EDARADD), proposing a quadratic regression model (R² = 0.74) with a Mean Absolute Deviation (MAD) of 4.86 years. This estimation error was higher than the one obtained in the present study; however, the differences in the tissues and genes evaluated in the two works should be considered. In fact, Giuliani et al. [29] pointed out this difference in age estimates among the three dental layers (cementum, dentin, and pulp), assessing three genes (ELOVL2, PENK, and FHL2). As in our study, Giuliani et al. demonstrated that multivariate models gave the most accurate age estimates, with the best result in pulp (difference between chronological and predicted age of 2.25 years) and the worst in dentin (difference between chronological and predicted age of 7.07 years). A combination of cementum and pulp analyses gave the best estimation (difference between chronological and predicted age of 1.20 years). Our work improves age estimates in pulp by expanding the analysis to other genes and using more CpG sites.
Unlike previous studies that have demonstrated a difference in age estimation among tooth layers, more recent work examining teeth used the DNA methylation levels of the whole tooth for age correlations. The study of Márquez-Ruiz et al. [30] developed multivariate models based on three genes (ELOVL2, ASPA, and EDARADD). As in the present study, certain CpGs that were less significant but still correlated with age were included in the models, improving the estimates and providing an MAE between 4.80 and 5.08 years. Additionally, these results demonstrated that the type of tooth and sex did not have an impact on the estimates. The current work only used third molars, as they are the most protected teeth in the jaw.
Correia Dias et al. [31] also used the whole tooth to assess DNA methylation levels, using two different methodologies: Sanger sequencing and SNaPshot. In contrast to other studies, results with EDARADD and MIR29B2C using Sanger sequencing showed no significant correlation with age, while ELOVL2, FHL2, and PDE4C showed weak correlations. However, their ELOVL2 correlations were similar to the ones obtained in the present study and in previous works [25,30]. In addition, the proposed model was a simple linear regression based on FHL2 CpG4, providing estimates with a MAD of 11.35 years. When they assessed the genes by SNaPshot, FHL2 showed no significant age correlation, but again ELOVL2 showed a significant correlation with age, as did KLF14 and TRIM59. These correlations were higher than those obtained in the present study. However, with SNaPshot, the best model was a multivariate linear regression combining ELOVL2 and KLF14, producing a MAD of 7.07 years.
The above two works showed larger differences between chronological and estimated age, probably due to the use of the whole tooth for the analysis instead of separating the layers [29]. The differences in composition and cell types among the layers may impact the methylation status of the chosen genes. Another important issue could be differences in methodology. To our knowledge, this study is the first work that applies pyrosequencing to assess DNA methylation in pulp tissue. The correlations obtained in this study were similar to the ones presented by Márquez-Ruiz et al. [30] and Bekaert et al. [25], who used pyrosequencing, and to some extent to the ones obtained using Sanger sequencing by Correia Dias et al. [31]. Correia Dias et al. found differences in methylation when using SNaPshot [31], producing different correlations than those obtained with Sanger sequencing. The work of Giuliani et al. [29], applying MALDI-TOF mass spectrometry, found the highest correlations with ELOVL2 and FHL2, and included more CpGs than the previous work. Thus, the technique used may have an influence on the results, particularly if bias in methylation levels occurs due to poor coverage or lower input levels. However, as the present study demonstrates, ELOVL2 [25,29,30,31] and FHL2 [29,31] can produce useful age estimation in teeth.
ELOVL2 encodes a transmembrane protein involved in the synthesis of long ω3 and ω6 polyunsaturated fatty acids (PUFA) [35] and has been included in several studies of age estimation for criminalistics applications [27,36,37,38,39,40]. In fact, Zbiec-Piekarska et al. [41] assessed the usefulness of ELOVL2 and two of its CpG sites as unique markers for age estimation in blood samples, obtaining prediction errors of approximately ±7 years. Although this determination is better than current forensic anthropological estimates [25,29,30], more accurate determinations may be obtained by including additional genes, as shown in this and other studies.
In the present study, additional genes were selected based on the composition of pulp, which contains blood capillaries, nerves, and odontoblasts [29]. FHL2 is a transcriptional co-factor involved in different processes: cell cycle regulation, bone formation, and wound healing [42,43]. Apart from the aforementioned teeth studies [29,31], its role in age estimation has been demonstrated in blood samples [31,39,40], saliva, and buccal cells [27]. The present work also supports the usefulness of KLF14, although it was included only in the first model. KLF14 is a member of the Krüppel-like factor family of transcription factors, regulating gene expression in adipose tissues [44]. Its usefulness in age estimation has been demonstrated in blood samples [40] and saliva [26]. The study of Alghanim et al. [45] tested this marker along with another gene assessed in the present work, SCGN, and with DLX5. In saliva, a multivariate model of two CpG sites of KLF14 gave a MAD of 5.8 in the training set, and the combined model of KLF14 and SCGN produced a MAD of 6.2. In contrast, the best model for blood samples was a multivariate combination of two CpG sites of KLF14 and one CpG site of SCGN, producing a MAD of 6.6. SCGN encodes secretagogin, a protein involved in potassium chloride-stimulated calcium flux as well as cell proliferation [46]. In the present study, SCGN seemed to play a key role in age estimation, as both CpG3 and CpG8 were included in the four models and produced some of the highest correlations among the genes analyzed (except ELOVL2 CpG1 and CpG7). NPTX2 (neuronal pentraxin II) is involved in synapse formation [47]; as pulp also contains nerve cells, the methylation levels of this gene may also change with age. Only one CpG, CpG4, was kept through the four models. Silva et al. [48] assessed the role of DNA methylation at this gene for age estimation in blood and saliva samples. Applying a multivariate regression model with six CpG sites, including CpG4, they obtained an average difference of 9 years in saliva samples, with similar results for blood samples. In our results, NPTX2 played a small role in this determination, as only one CpG was selected in our models.
The present study validated the usefulness of ELOVL2 and FHL2 as targets for age estimation in dental tissues and also identified new potential markers: KLF14, SCGN, and NPTX2. Compared with other methodologies for age estimation, DNA methylation provides similar results to aspartic acid racemization, as our study was able to estimate age with an error of 1.5-2.13 years, depending on the model. One of the disadvantages of aspartic acid racemization is that it is influenced by temperature, so it cannot be applied to burnt remains [49]. It would be interesting to focus future studies on the effects of fire and other taphonomic factors on DNA methylation in age estimation.
The main limitation of this study is the sample size, in this case twenty teeth. This sample size could influence the significance of individual CpG correlations with age. While our sample size was similar to that used by Giuliani et al. [29], larger sample sizes can cover a finer gradation of ages, possibly leading to improved models. Methodology can also impact the results, as discussed above. We believe that pyrosequencing has inherent advantages over other procedures, including high coverage and precision [50], making it a useful tool to assess DNA methylation. Its relatively low cost should make it accessible to forensic labs.

Sample Collection and Teeth Processing
Twenty healthy erupted third molars were collected from patients in a dental clinic in Spain (age range: 22-70 years old). All dental elements originated from different individuals. Florida International University's ethical committee approved all procedures related to the experimentation with human subjects. Sample data were limited to sex, age, and population group. Teeth were washed with a soft toothbrush under running sterile distilled water and dried at room temperature. Teeth were irradiated for 15 min per side with ultraviolet light (254 nm) to eliminate exogenous DNA. Enamel and cementum were removed using a diamond brush. In the midline between cementum and enamel, crowns were separated from the roots using a diamond-cutting disc. The roots were cut along the midline and the pulp was removed using a spoon excavator.

DNA Extraction
DNA was extracted from pulp using the DNeasy Blood and Tissue Kit (Qiagen, GmbH, Hilden, Germany), according to the manufacturer's protocol. DNA was eluted in 35 µL of Buffer AE.

DNA Quantification
DNA was quantified using the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA), according to the manufacturer's protocol.

Bisulfite Conversion and PCR
A total of 200 ng of DNA was bisulfite converted using the EpiTect Fast Bisulfite Conversion Kit (Qiagen, GmbH, Hilden, Germany). Converted DNA was eluted with 20 µL of elution buffer. A 1.5 µL aliquot of converted DNA was amplified by singleplex PCR in a reaction containing 0.2 µM of primers for KLF14, NPTX2, ELOVL2, FHL2, or SCGN and 2x Qiagen PyroMark PCR Master Mix (Qiagen, GmbH, Hilden, Germany). All primer sequences are listed in Table S1. PCR reactions consisted of an initial hold at 95 °C for 15 min followed by 45 cycles of 30 s at 94 °C, 30 s at 56 °C, and 30 s at 72 °C. PCR amplification ended with a final extension step at 72 °C for 10 min.

Pyrosequencing
Methylation levels were assessed after loading 15 µL of PCR product into the PyroMark Q48 Instrument (Qiagen, GmbH, Hilden, Germany), and pyrosequencing was performed with 0.4 µM of sequencing primers following the manufacturer's instructions.
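For readers unfamiliar with how a methylation level is derived from bisulfite pyrosequencing, the underlying arithmetic can be sketched as follows. After bisulfite conversion and PCR, an unmethylated cytosine is read as T and a methylated cytosine as C, so the methylation percentage at a CpG is the C signal divided by the combined C and T signal. The function name and signal values are hypothetical; in practice the instrument software reports these percentages directly.

```python
def methylation_percent(c_signal, t_signal):
    """Percent methylation at one CpG from C and T signal intensities.

    Assumes the assay reads the strand on which a methylated cytosine
    appears as C and an unmethylated cytosine appears as T.
    """
    if c_signal < 0 or t_signal < 0:
        raise ValueError("signal intensities must be non-negative")
    total = c_signal + t_signal
    if total == 0:
        raise ValueError("no signal at this position")
    return 100.0 * c_signal / total

# Hypothetical peak heights for three CpG positions (NOT data from the study).
peaks = [(30.0, 70.0), (12.5, 37.5), (45.0, 5.0)]
for index, (c, t) in enumerate(peaks, start=1):
    print(f"CpG{index}: {methylation_percent(c, t):.1f}% methylated")
```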

Methylation Results Analyses and Statistics
Pyrosequencing results were analyzed using the PyroMark Q48 Autoprep software (Qiagen, GmbH, Hilden, Germany). Statistical analyses were performed using IBM SPSS 26 (IBM, Armonk, NY, USA). First, simple correlation analyses were performed between age and the methylation levels of the KLF14, NPTX2, ELOVL2, FHL2, and SCGN CpGs. Multivariate linear regression models were then fitted to predict age. Validation of the models was performed by leave-one-out cross-validation (LOOCV), in which one observation is left out and used for validation, while the remaining samples are used as a training set. This was repeated 20 times, once per sample, so that a complete LOOCV was performed.
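The LOOCV procedure described above can be sketched as follows. For brevity, this illustration fits a simple linear regression on a single hypothetical CpG rather than the multivariate models used in the study, and it is not the SPSS workflow; all names and values are assumptions made for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loocv_mae(x, y):
    """Leave-one-out cross-validated mean absolute error.

    Each observation is held out once, the line is refitted on the
    remaining observations, and the held-out value is predicted.
    """
    errors = []
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(y[i] - (slope * x[i] + intercept)))
    return sum(errors) / len(errors)

# Hypothetical values (NOT data from the study): % methylation and age.
methylation = [35.0, 41.0, 44.0, 52.0, 60.0]
ages = [22.0, 30.0, 41.0, 55.0, 70.0]
print(f"LOOCV MAE = {loocv_mae(methylation, ages):.2f} years")
```

Because each held-out sample never contributes to the model that predicts it, the LOOCV error is usually somewhat larger than the error computed on the training data, which is why it is the more conservative figure for small samples.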

Conclusions
In conclusion, the present study describes for the first time the application of DNA methylation for age determination in pulp tissue by applying pyrosequencing. The results provide high accuracy for age-at-death estimation in adult individuals, with a Mean Absolute Error (MAE) between predicted and chronological age of 1.5-2.13 years. Although the mathematical models are complex, their accuracy surpasses that of other techniques used for forensic anthropology estimates. Apart from ELOVL2 and FHL2, we also identified three additional genes (NPTX2, KLF14, and SCGN) as potential markers for age estimation. Future research will expand these results by increasing the number of teeth, exploring other markers, and investigating taphonomic and environmental insults.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/ijms22073717/s1, Table S1: Primer Sequences.

Informed Consent Statement: Teeth were extracted for clinical purposes, not specifically for this study. As a result, patient consent was waived.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy issues.