Association of cord blood methylation with neonatal leptin: An epigenome wide association study

Background Neonatal adiposity is a risk factor for childhood obesity. Investigating contributors to neonatal adiposity is important for understanding early life obesity risk. Epigenetic changes of metabolic genes in cord blood may contribute to excessive neonatal adiposity and subsequent childhood obesity. This study aims to evaluate the association of cord blood DNA methylation patterns with anthropometric measures and cord blood leptin, a biomarker of neonatal adiposity. Methods A cross-sectional study was performed on a multiethnic cohort of 114 full term neonates born to mothers without gestational diabetes at a university hospital. Cord blood was assayed for leptin and for epigenome-wide DNA methylation profiles via the Illumina 450K platform. Neonatal body composition was measured by air displacement plethysmography. Multivariable linear regression was used to analyze associations between individual CpG sites as well as differentially methylated regions in cord blood DNA with measures of newborn adiposity including anthropometrics (birth weight, fat mass and percent body fat) and cord blood leptin. False discovery rate was estimated to account for multiple comparisons. Results 247 CpG sites as well as 18 differentially methylated gene regions were associated with cord blood leptin but no epigenetic changes were associated with birth weight, fat mass or percent body fat. Genes of interest identified in this study are DNAJA4, TFR2, SMAD3, PLAG1, FGF1, and HNF4A. Conclusion Epigenetic changes in cord blood DNA are associated with cord blood leptin levels, a measure of neonatal adiposity.


Introduction
Adiposity at birth may be a predictor of obesity in childhood and adulthood. [1] Obesity has a current estimated prevalence of 17% among children aged 2-19 and has become increasingly challenging to treat once present. [2] Obesity related co-morbidities, such as type 2 diabetes and metabolic syndrome are now occurring earlier and more frequently in individuals with early life obesity. [3,4] The etiology of obesity is complex, multifactorial, and includes genetic, nutritional, and environmental determinants. Researchers studying the developmental origins of health and disease have proposed that the intrauterine environment of the developing fetus contributes to adipose tissue deposition via fetal programming, suggesting that obesity risk is present before birth. [5,6] Epigenetics, the study of modifications to DNA that alter gene expression without changing gene sequence, is one mechanism contributing to the early life development of excess adiposity and future risk of an adverse metabolic phenotype. [7] Aberrant methylation of CpG dinucleotides in DNA is a type of epigenetic modification that can be both heritable and modifiable by one's environment. Prenatal exposures, maternal pregnancy characteristics, and the intrauterine milieu can alter susceptibility of neonatal DNA to methylation, leading to changes in the child's gene expression, gene regulation, and metabolic risk. [8,9] Prior studies have demonstrated associations between maternal phenotypes and offspring DNA methylation, supporting the hypothesis that the intrauterine environment impacts fetal epigenetics. [10][11][12] Methylation patterns present at birth which are associated with newborn adiposity or later childhood obesity suggest that epigenetics may be partially responsible for the perinatal origins of obesity risk and predict future obesity. 
Identifying DNA methylation patterns associated with perinatal adiposity measures is one potential tool that can be used to detect high risk individuals early in life, a critical time point for early intervention before obesity develops.
Several research groups have identified epigenetic changes associated with newborn size; however, many of these studies have used a targeted approach, examining only specific genes of interest such as HIF3A and AHRR. [13,14] Studies that have taken an epigenome wide approach tested associations with indirect measures of newborn adiposity, such as birth weight. [15] Replication of specific epigenetic changes within independent cohorts is limited.
In this study, we evaluated the relationship between cord blood methylation patterns and markers of neonatal adiposity in a cohort of healthy, full-term infants born to mothers with normal glucose tolerance. Maternal hyperglycemia, even below the diagnostic threshold for gestational diabetes, is a well-described risk factor for neonatal adiposity; therefore, elimination of this known confounder is key when investigating contributors to adiposity. [16] We used an epigenome wide approach to examine differentially methylated regions in cord blood DNA and their association with newborn anthropometrics (birth weight, fat mass, and percent body fat) as well as cord blood leptin levels, a biomarker of neonatal adiposity, with the aim of identifying both novel and previously described epigenotype-phenotype relationships. [8,9,11-15] We hypothesized that cord blood epigenetic changes would be associated with measures of neonatal adiposity such as percent body fat, fat mass, and cord blood leptin, and would provide insight into early life mechanisms of future obesity risk.

Subjects
Our study population consisted of a cohort of 114 healthy maternal-neonatal pairs for whom cord blood DNA was available. Participants were recruited between 2011 and 2014 at a large academic medical center in Chicago, Illinois, USA, as previously described in detail. [17] Women carrying a singleton pregnancy with normal glucose tolerance on a fasting two-hour 75 g OGTT performed between 24 and 28 weeks gestation were eligible for the study. [18] Women were excluded if they had a history of more than 3 term pregnancies, were on chronic medications such as glucocorticoids, insulin, or anti-hypertensives, or smoked during pregnancy, as these factors can be associated with excess or restricted growth. [19] Newborns were full-term and were excluded if they required intensive care, were too ill to undergo body composition measurements within the first 24-72 hours of life, or had congenital anomalies, as some of these are independently associated with abnormal fetal growth. Cord blood was collected after birth by labor and delivery staff and processed within 30 minutes. Cord blood to be used for DNA extraction and future leptin assays was stored at -70˚C until laboratory assays were performed. Initially, 115 maternal-neonatal pairs were included in the study; one pair was excluded due to poor DNA quality. Of the 114 neonates included in the final analysis, 105 had body composition data available. This study was approved by the Northwestern University Institutional Review Board, and each mother provided written informed consent for herself and her neonate at the time of study enrollment.

Body composition measurements
Body composition measurements of each neonate were obtained between 24 and 72 hours of life by one of two trained examiners. Length was obtained using a hard-surface measuring board. Measurements were recorded to the nearest 0.1 cm, performed in duplicate, and the results averaged for the final research measure. Weight and adiposity measurements were obtained by air displacement plethysmography (PeaPod, Cosmed, Rome, Italy), a noninvasive, non-user dependent modality that has been validated in comparison with deuterium dilution in full-term infants between ages 0.4-21.7 weeks and of weight 2-8 kg. [20,21] To measure weight, the infant was undressed and placed on the calibrated PeaPod scale, and weight was recorded to the nearest 0.0001 kg. Next, the infant was placed inside the PeaPod volume chamber for two minutes to determine body volume. Density was calculated, after which age- and sex-specific fat-free mass density values were used to determine absolute fat-free mass and fat mass. Percent body fat was subsequently calculated from these values. [21]

Laboratory measurements
Samples for leptin were batched and measured in duplicate with a radioimmunoassay kit (Millipore Corp, Billerica, MA, USA). The inter- and intra-assay coefficients of variation for leptin were 3.7-5.9% and 3.0-4.0%, respectively.
DNA was purified from neonatal cord blood using an Autopure LS Automated DNA Purification System with Autopure reagents (Autogen, Inc., Holliston, MA). The purified DNA samples were stored below -20˚C. DNA quality and quantity were assessed by Nanodrop (ThermoFisher Scientific, MA, USA). Bisulfite conversion was performed on 500 ng of DNA using the EZ DNA Methylation Kit (Zymo Research, CA, USA). Methylation levels were measured using the Infinium HumanMethylation 450K Beadchip array (Illumina, Inc., CA, USA), which targets ~486,000 CpG sites, in 114 samples that passed DNA quality testing. Samples were randomly plated on each chip with regard to neonatal sex. BeadChips were scanned with an Illumina iScan and analyzed using Illumina GenomeStudio software. All experiments were conducted following manufacturer protocols in the Genomics Core Facility at the Center for Genetic Medicine at Northwestern University.

DNA methylation data processing
Raw Illumina IDAT data were preprocessed per previously published methodology. [10] Briefly, one DNA sample containing more than 5% of CpG probes with detection p-values greater than 0.01, as well as 191 CpG probes that were not detectable (detection p-value > 0.01) in more than 5% of samples, were removed during quality control. We further removed 65 built-in SNP probes, 3,091 non-CpG probes, 36,535 probes containing proximal SNPs, and 11,648 probes on sex chromosomes. Signal intensities of the filtered probes were corrected for background noise and channel color bias. There were 434,506 CpG sites used in the final analysis. Methylated and unmethylated intensities were then quantile-normalized and corresponding β values were calculated (i.e., the proportion of methylated probe intensity out of total intensity). The function ComBat in the R package sva was applied to the normalized β values to adjust for potential batch effects, and then the function sva in the same R package was applied to generate surrogate variables that were used to account for other unwanted variation in the data, including the confounding effects of cell type heterogeneity on methylation profiles. [22]
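The detection p-value filters and the β value definition described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which was run in R. The function names, the nested-list layout (probes as rows, samples as columns), and the choice to evaluate both filters on the unfiltered matrix are assumptions made for illustration.

```python
def beta_values(meth, unmeth):
    """Return beta = M / (M + U) for each probe (row) and sample (column).

    Assumes every total intensity M + U is positive.
    """
    return [[m / (m + u) for m, u in zip(m_row, u_row)]
            for m_row, u_row in zip(meth, unmeth)]


def detection_filters(det_p, p_cut=0.01, max_fail=0.05):
    """Flag which probes (rows) and samples (columns) pass quality control.

    A measurement fails when its detection p-value exceeds p_cut. A probe or
    sample is kept when its failure fraction does not exceed max_fail.
    """
    n_probes, n_samples = len(det_p), len(det_p[0])
    fail = [[p > p_cut for p in row] for row in det_p]
    keep_probes = [sum(row) / n_samples <= max_fail for row in fail]
    keep_samples = [
        sum(fail[i][j] for i in range(n_probes)) / n_probes <= max_fail
        for j in range(n_samples)
    ]
    return keep_probes, keep_samples
```

With the thresholds reported above (p_cut of 0.01 and max_fail of 0.05), a probe is dropped once it is undetected in more than 5% of samples, and a sample is dropped once more than 5% of its probes are undetected.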

Bioinformatics and statistical analysis
We conducted an epigenome wide association study using previously described methodology. [23] Briefly, linear regression models were used with neonatal adiposity measures as dependent variables and methylation as the variable of interest. Models were adjusted for maternal age at delivery, race, gestational age in days, and infant sex, as these variables can affect neonatal body composition. Models were additionally adjusted for two surrogate variables generated by sva. An adjusted p value (FDR) < 0.05 following the Benjamini-Hochberg procedure was considered significant. [24] We also examined the association of percent body fat, fat mass, and log10-transformed cord blood leptin levels with differentially methylated regions (DMRs) using the R package DMRcate. [25] T-statistics of CpG sites from the EWAS were smoothed by chromosome using a Gaussian kernel smoothing function with bandwidth λ = 1000 base pairs and scaling factor C = 2. DMRs were assigned by grouping significant CpG sites (FDR < 0.05). We used Stouffer's method to compute a combined FDR as the statistical inference for each region, and the mean of the coefficients from the regression models summarized the regional effect. [26] Within each DMR identified, given the identical sample size and similar distribution of methylation values across all the adjacent CpGs, we applied an inverse-variance weighting approach to compute the weighted mean of coefficients across the CpGs, weighted by the inverse of the corresponding standard error. The weighted mean coefficients were then re-expressed as percent changes in cord blood leptin levels with every 0.01 methylation β value increase. All analyses were conducted using R software (version 3.3.1).
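Three of the statistical steps named above can be sketched in a few lines of Python: the Benjamini-Hochberg adjustment, Stouffer's combination, and the weighted mean of regression coefficients. This is an illustrative sketch only. The analysis itself was run in R with DMRcate, whose internal implementation may differ. The function names are hypothetical, and the unweighted form of Stouffer's method is an assumption.

```python
from statistics import NormalDist


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stouffer(pvals):
    """Combine p-values (each strictly between 0 and 1) with unweighted Stouffer's Z."""
    nd = NormalDist()
    z_scores = [nd.inv_cdf(1.0 - p) for p in pvals]
    z_combined = sum(z_scores) / len(z_scores) ** 0.5
    return 1.0 - nd.cdf(z_combined)


def weighted_mean_coef(coefs, std_errors):
    """Mean of coefficients weighted by 1 / standard error, as the text describes."""
    weights = [1.0 / se for se in std_errors]
    return sum(w * c for w, c in zip(weights, coefs)) / sum(weights)
```

A CpG site would count as significant when its adjusted value from `benjamini_hochberg` falls below 0.05. A region's summary effect would be the output of `weighted_mean_coef` over the CpGs it contains.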

Regulatory elements
Transcriptional regulation can be controlled by complex interactions between regulatory elements, such as histone modifications and DMRs. To further explore potential functional implications of differentially methylated regions, we used the Encyclopedia of DNA Elements (ENCODE) Project [27] to find regulatory elements that overlapped or were near the significant DMRs. We used DNase I hypersensitivity sites (DNase), transcription factor binding sites (TFBS), and annotations of histone modification ChIP peaks pooled across cell lines (data available in the ENCODE Analysis Hub at the European Bioinformatics Institute http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/). The hg19 human assembly from Genome Browser was used to provide DMR location information.
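The overlap test between a DMR and a regulatory element reduces to an interval comparison on a shared chromosome. The sketch below is a hypothetical illustration, not the authors' tooling. It assumes closed intervals given as (chromosome, start, end) tuples. The `flank` parameter is an invented way to express "near". The element coordinates in the usage lines are made up, while the DMR coordinates are those reported for DNAJA4 in the Results.

```python
def overlaps(dmr, element, flank=0):
    """True if two (chrom, start, end) intervals overlap.

    Intervals are treated as closed. flank widens the DMR by that many
    base pairs on each side, so nearby elements can also be counted.
    """
    chrom_a, start_a, end_a = dmr
    chrom_b, start_b, end_b = element
    return (chrom_a == chrom_b
            and start_a - flank <= end_b
            and start_b <= end_a + flank)


# DNAJA4 DMR coordinates as reported in the Results (hg19).
dnaja4_dmr = ("chr15", 78555907, 78557584)
# Hypothetical regulatory elements, for illustration only.
inside = ("chr15", 78557000, 78558000)
nearby = ("chr15", 78558000, 78559000)

hits = [overlaps(dnaja4_dmr, inside),
        overlaps(dnaja4_dmr, nearby),
        overlaps(dnaja4_dmr, nearby, flank=500)]
```

Real ENCODE annotation files use their own coordinate conventions, such as half-open intervals in BED files, so the comparison would need adjusting before use on actual data.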

Results
Descriptive statistics of the study participants are shown in Table 1. The majority of infants had a birth weight that was appropriate for gestational age and percent body fat of the cohort was normally distributed. 61% of mothers were normal weight while 34% were overweight or obese. No mothers in this cohort smoked during pregnancy.
We did not identify any associations between individual CpG sites or CpG gene regions and percent body fat, fat mass, or birth weight. When studying methylation of individual CpG sites, we found associations of 247 unique CpG sites with cord blood leptin, a marker of neonatal adiposity. Among them, 177 (72%) were negatively associated with leptin and 70 (28%) were positively associated with leptin (Fig 1, S1 Table). Using the DAVID Functional Annotation Tool (version 6.8) [28,29], the genes targeted by the 177 negatively associated CpGs were enriched in lipid metabolic process (GO:0006629) (p = 0.0075), while the genes targeted by the 70 positively associated CpGs were enriched in regulation of CD8-positive, alpha-beta T cell proliferation (GO:2000564) (p = 0.0077). The top 10 CpG sites are displayed in Table 2, with the remainder listed in S1 Table. All analyses were adjusted for maternal age at delivery, maternal race, neonatal sex, and gestational age. We also identified DMRs in 18 genes that were associated with cord blood leptin levels. The name and function of each gene, along with the location of each DMR, are detailed in Table 3. Increased DMR methylation in 9 of these genes was negatively associated with cord blood leptin levels, while hypermethylation in the other 9 genes was positively associated with leptin. The function or proposed function of all genes is listed; however, 2 genes have not been well studied with regard to their role in disease development.
Increased methylation across 14 CpG sites in the DnaJ heat shock protein family (Hsp40) member A4 (DNAJA4) gene promoter (chromosome 15, position 78555907 to 78557584, hg19 genome assembly) was negatively associated with cord blood leptin levels (Fig 2). For every 0.01 increase in methylation in this region, neonatal cord blood leptin decreased by 3.2% (FDR adjusted p = 0.0114). Furthermore, multiple gene regulatory elements were found to overlap with the hypermethylated region (Fig 2). All CpG sites evaluated in the DNAJA4 gene are displayed in S2 Table. A 0.01 increase in methylation across 4 CpG sites in the Transferrin receptor 2 (TFR2) gene (chromosome 7, position 100230781 to 100231672) was associated with a 6.2% increase in neonatal cord blood leptin (FDR adjusted p = 0.0358). These CpG sites are located at the promoter region of one TFR2 transcript (Fig 3). All CpG sites evaluated in the TFR2 gene are displayed in S2 Table. In addition, there are several regulatory elements that overlap with this differentially methylated region (Fig 3). We did not find any differentially methylated regions associated with anthropometric measures of neonatal adiposity.
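Because leptin was log10-transformed before regression, the percent changes quoted above are multiplicative rather than additive. The source does not state the exact conversion the authors used. The sketch below assumes the common back-transformation, in which a slope on the log10 scale per unit β maps to a percent change of (10^(slope × Δβ) − 1) × 100. Both function names are hypothetical.

```python
import math


def percent_change(slope, delta_beta=0.01):
    """Percent change in leptin for a delta_beta rise in methylation.

    slope is the change in log10(leptin) per unit change in beta.
    """
    return (10 ** (slope * delta_beta) - 1.0) * 100.0


def slope_from_percent(pct, delta_beta=0.01):
    """Invert percent_change: recover the log10-scale slope."""
    return math.log10(1.0 + pct / 100.0) / delta_beta


# Slope implied by the reported 3.2% decrease per 0.01 increase (DNAJA4).
dnaja4_slope = slope_from_percent(-3.2)
# Effect of a larger, purely illustrative 0.05 increase in methylation.
change_for_005 = percent_change(dnaja4_slope, delta_beta=0.05)
```

Under this assumption, a 0.05 increase in DNAJA4 methylation would correspond to a decrease in leptin of roughly 15%, slightly less than five times 3.2%, because the effect compounds on the log scale.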

Discussion
This epigenome wide association study demonstrates a relationship between neonatal cord blood leptin levels and 1) 247 individual CpG sites and 2) differentially methylated regions of 18 genes. SMAD3, PLAG1, FGF1, HNF4A, DNAJA4, and TFR2 are six genes identified in this study that have previously been reported in relation to tissue growth and adiposity. We chose to highlight these genes for their potential impact on adipose tissue development.
When evaluating associations between individual CpG sites and cord blood leptin levels, four genes are worth noting. SMAD3 encodes for smad3, a signaling effector that mediates transforming growth factor-beta's (TGF-beta) inhibition of adipocyte differentiation. [30] We identified that increased methylation at cg03480935 within SMAD3 was associated with decreased cord leptin levels. This finding is unexpected as increased methylation is typically associated with decreased gene expression. In this case, potential decreased expression of SMAD3 would lead to less inhibition of adipocyte differentiation and presumably higher leptin levels.
PLAG1 is a transcription factor whose activation results in upregulation of target genes important for cell proliferation. PLAG1 mutations are associated with lipoblastoma, a benign adipocytic tumor, suggesting its role in adipocyte growth and/or proliferation. [31] We found that increased methylation at cg21448513 within PLAG1 was negatively associated with cord leptin levels (FDR adjusted p = 0.02). The increased methylation at cg21448513 in PLAG1 may lead to decreased adipocyte proliferation and thus lower cord blood leptin levels.
FGF1 is another gene clinically important for cell growth and described to have an adipogenic effect. [32] In a mouse model, mice fed a high fat diet and ob/ob mice had higher FGF1 mRNA in their adipose tissue than mice fed a normal diet or lean control mice. [33] We identified that decreased methylation at a CpG site in the FGF1 gene body (cg13724550, FDR adjusted p = 0.0412) is associated with leptin levels.
Lastly, HNF4A is another gene of interest. Increased methylation at cg16121136 (FDR adjusted p = 0.0307) within the transcription start site was associated with lower leptin levels. Clinically, mutations in HNF4A can cause Maturity Onset Diabetes of the Young, Type 1 (MODY1). Further study is necessary to understand the importance of altered methylation at these individual CpG sites on gene expression.
In our analysis of differentially methylated gene regions, DnaJ heat shock protein family (Hsp40) member A4 (DNAJA4) and Transferrin receptor 2 (TFR2) emerged as two interesting genes that have been previously reported in the literature to be associated with tissue growth and adipocytes, respectively. [34,35] DNAJA4 encodes a heat shock protein and has been reported to play a role in fetal growth as well as growth of sarcomas. [36,37] We report that DNAJA4 promoter hypermethylation across 14 CpG sites was negatively associated with neonatal leptin levels, a measure of adiposity. The CpG region identified in our study overlaps with a region previously reported by Roifman et al. in a twin study of placental DNAJA4 methylation and severe growth discordance. [34] While the direction of aberrant methylation and the tissue in which methylation was measured differed from our study, our replication of the same CpG region highlights that this gene region may play a role in the regulation of fetal growth. Our study is the first to report a relationship between cord blood DNAJA4 and leptin, and thus further study into DNAJA4 methylation patterns as they relate to fetal fat tissue accumulation and adipocyte leptin production is necessary.
We also found a positive association between TFR2 hypermethylation and cord blood leptin levels in a region of the gene with multiple overlapping regulatory elements. Given the proximity to regulatory elements, methylation changes in this area may impact gene expression; however, expression analyses are necessary to fully evaluate this possibility. TFR2, expressed primarily in the liver, encodes transferrin receptor 2, which functions as a mediator of cellular uptake of transferrin-bound iron. Iron homeostasis has been previously associated with adiposity, perhaps mediated by obesity induced inflammation. [38] In one study of mice fed high fat diets, TFR2 was more highly expressed in adipose tissue compared to mice fed a usual diet. [35] However, the direction of our findings differs from this report, as in our study, increased TFR2 methylation was positively associated with leptin levels. Furthermore, our study is the first to associate TFR2 methylation with a marker of adiposity in a human population. TFR2 may play a role in the regulation of adipose tissue, but additional study is needed to fully elucidate these mechanisms.
Using ENCODE, we found multiple regulatory elements overlapping with the DNAJA4 and TFR2 DMRs, suggesting that the significant DMRs are in areas of active gene regulation (Figs 2 and 3). The potential for clinical impact due to altered methylation in these areas is physiologically plausible, but without gene expression analyses, we cannot confirm or refute this possibility.
The genes identified in this study are novel in comparison to prior studies of newborn epigenetics and body size. Our previously unreported findings may be accounted for by the fact that the field of newborn epigenetics continues to emerge and evolve and that methylation patterns are both tissue and age dependent. Furthermore, this study is not directly comparable to the majority of studies examining newborn methylation and childhood obesity, as prior studies have largely focused on outcomes such as birth weight [14,39] as opposed to anthropometric measures or biomarkers of adiposity or childhood body composition. [11,12,40] Pan et al.'s study [13] had an aim similar to ours and identified HIF3A methylation in cord blood to be associated with newborn adiposity. While previously reported genes did not emerge in the current study, the genes we did identify were all implicated in tissue growth, leptin expression or adipocyte development. We propose that altered methylation in these genes leads to impaired adipocyte tissue differentiation and/or growth, thus altering adipocyte mediated leptin production. With more adipose tissue growth, we expect higher leptin production, perhaps inducing a state of leptin resistance, which, in turn, is associated with impaired insulin sensitivity, insulin signaling and satiety signaling. [41,42] Together, these factors can impact energy homeostasis and body composition. While the mechanisms underlying the development of leptin resistance and its impact on obesity development are still being elucidated, the literature suggests that high leptin levels in obese states are associated with adverse metabolic risk.
Strengths of our study include the use of a cohort of healthy mothers with documented normal glucose tolerance, allowing us to remove the well-known confounder of maternal gestational diabetes on fetal adiposity development. In addition, the determination of infant body fat utilized a validated, non-user dependent method, contributing to the accuracy and precision of our measurements. Use of the Infinium Illumina 450K array allowed us to examine approximately half a million CpG sites across 99% of RefSeq genes within the genome and identify novel associations. [43] Finally, our DMR analysis helped improve statistical power for detecting weak associations, as neighboring CpGs with similar effects reinforce each other. CpGs have been suggested to function not only individually, but also as a group to impact gene expression. [44] It is also believed that DMRs can control cell-type specific transcriptional repression of an associated gene. DMRcate is a data-driven approach to identifying DMRs that can remove the bias incurred from irregularly spaced methylation sites, which is particularly useful with the design of the Illumina platform, and it is more powerful than other algorithms in detecting DMRs where CpG coverage is sparse.
One limitation of our study is our inability to determine the impact that the identified methylation changes have on gene expression due to lack of available RNA. Expression analysis in relevant tissues, such as adipose tissue, is necessary to study the biological plausibility of our reported findings. Our study did not reveal any associations between cord blood gene methylation and direct anthropometric measures of adiposity such as neonatal % body fat, fat mass, and birth weight. Potential explanations for our inability to observe a statistically significant association with newborn fat include our small sample size and inclusion of infants born to women with documented normal glucose tolerance, which removed the well-known confounder of fetal hyperinsulinism and gestational diabetes on newborn macrosomia. [45] However, we did identify epigenetic associations with cord blood leptin levels. In this cohort [46] and others, [47] leptin has been highly correlated with measures of newborn fat, and higher levels may impair normal satiety signaling and energy expenditure and promote the development of insulin resistance. [48] Therefore, the reported associations are notable findings that may reflect subtle, early life changes in a newborn's metabolic state that may have long-term impact on metabolic health. Furthermore, there is little published literature to suggest whether the gene relationships identified in this study play a role in adipose tissue accretion or long-term obesity risk, or have a significant clinical impact. However, the findings in the DNAJA4, TFR2, PLAG1, and FGF1 genes do suggest a possible underlying physiological mechanism for adipocyte growth mediated by these genes. Additionally, the cross-sectional design of our study only allows us to report associations, and we are thus unable to comment on causality or the long-term impact of our findings.
Lastly, generalizability is limited given our small sample size and our sole inclusion of mothers with documented normal glucose tolerance during pregnancy.
This study is among the first to examine associations of individual CpG sites and differentially methylated regions in cord blood DNA with cord blood leptin, a marker of neonatal adiposity. In particular, DNAJA4, TFR2, SMAD3, PLAG1, FGF1 and HNF4A methylation represent novel and possibly physiologically relevant markers of neonatal adiposity. While studies in adults suggest that methylation changes are the consequence of adiposity, the direction in early life and the impact of the in-utero environment are not well characterized. Further study in larger cohorts is necessary to reproduce our reported findings, elucidate how gene methylation is controlled and determine whether these methylation changes impact gene expression. Following a longitudinal cohort over time is also necessary to determine how specific methylation patterns at birth translate to body composition and metabolic risk in childhood and adolescence.

Supporting information
S1 Table. Differentially methylated CpG sites associated with cord blood leptin levels.
S2 Table. Evaluated CpG sites in the DNAJA4 and TFR2 genes. ^Percent change in leptin for every 0.01 increase in methylation beta value. Flagged CpG sites are those included in the differentially methylated region (DMR). Hg19 Human Assembly was used to provide DMR location. (PDF)