450K Epigenome-Wide Scan Identifies Differential DNA Methylation in Newborns Related to Maternal Smoking during Pregnancy

Background: Epigenetic modifications, such as DNA methylation, due to in utero exposures may play a critical role in early programming for childhood and adult illness. Maternal smoking is a major risk factor for multiple adverse health outcomes in children, but the underlying mechanisms are unclear. Objective: We investigated epigenome-wide methylation in cord blood of newborns in relation to maternal smoking during pregnancy. Methods: We examined maternal plasma cotinine (an objective biomarker of smoking) measured during pregnancy in relation to DNA methylation at 473,844 CpG sites (CpGs) in 1,062 newborn cord blood samples from the Norwegian Mother and Child Cohort Study (MoBa) using the Infinium HumanMethylation450 BeadChip (450K). Results: We found differential DNA methylation at epigenome-wide statistical significance (p-value < 1.06 × 10⁻⁷) for 26 CpGs mapped to 10 genes. We replicated findings for CpGs in AHRR, CYP1A1, and GFI1 at strict Bonferroni-corrected statistical significance in a U.S. birth cohort. AHRR and CYP1A1 play a key role in the aryl hydrocarbon receptor signaling pathway, which mediates the detoxification of the components of tobacco smoke. GFI1 is involved in diverse developmental processes but has not previously been implicated in responses to tobacco smoke. Conclusions: We identified a set of genes with methylation changes present at birth in children whose mothers smoked during pregnancy. This is the first study of differential methylation across the genome in relation to maternal smoking during pregnancy using the 450K platform. Our findings implicate epigenetic mechanisms in the pathogenesis of the adverse health outcomes associated with this important in utero exposure.
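The epigenome-wide significance threshold quoted in the Results appears consistent with a Bonferroni correction across the 473,844 CpGs tested. The short sketch below checks that arithmetic; the family-wise alpha of 0.05 is an assumption (the conventional value), not something stated in the abstract.

```python
# Bonferroni-style threshold for an epigenome-wide scan (illustrative check).
ALPHA = 0.05        # assumed family-wise error rate; not stated in the abstract
N_CPGS = 473_844    # number of CpGs analyzed, as reported in the Methods

# Dividing alpha by the number of tests gives the per-CpG significance cutoff.
threshold = ALPHA / N_CPGS

print(f"Per-CpG threshold: {threshold:.3g}")
```

Under that assumption the result rounds to the reported cutoff of 1.06 × 10⁻⁷.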


Quality Control
Bisulfite conversion for the MoBa samples was evaluated according to methods previously described (Bibikova et al. 2011). Additionally, we included 28 blind replicate samples (14 study subjects run in duplicate, included on each plate), 26 plate control samples provided by Illumina (DNA from two cell lines run on each of the 13 96-well plates), 25 plate control samples provided by us [DNA from 2 control individuals on each of 12 plates (1 on each half plate) and from 1 control individual on the 13th plate], and 8 samples prepared by mixing methylated and non-methylated DNA, described as follows. Human HCT116 DKO methylated DNA (Cat# D5014-2) and human HCT116 DKO non-methylated DNA (Cat# D5014-1) were purchased from Zymo Research (Irvine, CA). The fully methylated DNA was mixed with non-methylated DNA to provide a series of methylation controls (10%, 35%, 60%, and 85% methylated) to be included on the first and twelfth plates.
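The methylation control series can be illustrated with a simple proportional mix. This is a minimal sketch, assuming both DNA stocks are at the same concentration; the function name `mix_volumes` and the 20 µL total volume are hypothetical and are not taken from the protocol.

```python
def mix_volumes(fraction_methylated, total_volume_ul):
    """Return (methylated_ul, non_methylated_ul) for a target methylation fraction.

    Assumes the fully methylated and non-methylated stocks have equal
    DNA concentration, so volume fraction equals methylation fraction.
    """
    if not 0.0 <= fraction_methylated <= 1.0:
        raise ValueError("fraction_methylated must be between 0 and 1")
    methylated = fraction_methylated * total_volume_ul
    return methylated, total_volume_ul - methylated


TARGETS = [0.10, 0.35, 0.60, 0.85]  # control levels named in the text
PLATES_WITH_CONTROLS = 2            # first and twelfth plates
TOTAL_UL = 20.0                     # hypothetical total volume per control

for target in TARGETS:
    meth_ul, unmeth_ul = mix_volumes(target, TOTAL_UL)
    print(f"{target:.0%} methylated: {meth_ul:.1f} uL methylated + "
          f"{unmeth_ul:.1f} uL non-methylated")

# Four levels on two plates is consistent with the 8 mixed samples reported.
n_mixed_samples = len(TARGETS) * PLATES_WITH_CONTROLS
```

If the stocks differed in concentration, the volumes would need to be scaled by the concentration ratio before mixing.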
We received data for a total of 1,204 samples from Illumina (San Diego, CA). Samples with an average detection p-value across all probes greater than 0.05 and/or indicated by Illumina to have failed (N=49) were omitted from further analysis, along with 1 sample erroneously included in the dataset. Multidimensional scaling (MDS) plots based on chromosome X data, in which males and females separated into two distinct clusters, were used to identify gender outliers. Samples falling in the wrong cluster (males in the female cluster or females in the male cluster) or not belonging to a distinct cluster were omitted (N=13). Blind duplicate samples were highly correlated (Spearman rho = 0.997), and the mean difference in beta was 0.0043 (standard error = 0.00012). For each of the 14 blind duplicate pairs, one of the two samples was selected at random to retain in the dataset and the other was omitted from further analysis. CpGs with missing chromosome data (N=65, mostly control probes), missing more than 10% of data across individuals (N=20), or on chromosome X (N=11,232) or Y (N=416) were omitted, resulting in 473,844 probes for analysis.
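The probe-level exclusions can be sketched as a sequence of filters applied in the order listed. The probe records and IDs below are toy, hypothetical data, and the array total of 485,577 probes is an assumption drawn from the 450K platform specification rather than from the text; the reconciliation also assumes the reported exclusion counts do not overlap.

```python
from collections import Counter


def exclusion_reason(probe, max_missing=0.10):
    """Return why a probe is dropped, or None if it is kept.

    Rules are checked in the order the text lists them.
    """
    if probe["chrom"] is None:
        return "missing chromosome"
    if probe["missing_fraction"] > max_missing:
        return "missing >10% of data"
    if probe["chrom"] in ("X", "Y"):
        return f"chromosome {probe['chrom']}"
    return None


# Toy probe annotations (hypothetical IDs and values, for illustration only).
probes = [
    {"id": "probe_a", "chrom": "1", "missing_fraction": 0.00},
    {"id": "probe_b", "chrom": None, "missing_fraction": 0.00},
    {"id": "probe_c", "chrom": "7", "missing_fraction": 0.25},
    {"id": "probe_d", "chrom": "X", "missing_fraction": 0.01},
    {"id": "probe_e", "chrom": "Y", "missing_fraction": 0.00},
    {"id": "probe_f", "chrom": "5", "missing_fraction": 0.10},
]

kept = [p["id"] for p in probes if exclusion_reason(p) is None]
dropped = Counter(r for r in map(exclusion_reason, probes) if r is not None)
print("kept:", kept)
print("dropped:", dict(dropped))

# Reconcile the reported counts against the assumed array size.
ARRAY_TOTAL = 485_577  # assumption: total probes on the 450K array
REPORTED_EXCLUSIONS = {
    "missing chromosome": 65,
    "missing >10% of data": 20,
    "chromosome X": 11_232,
    "chromosome Y": 416,
}
remaining = ARRAY_TOTAL - sum(REPORTED_EXCLUSIONS.values())
print("probes remaining:", remaining)
```

Under these assumptions the remaining count matches the 473,844 probes carried forward for analysis.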
The laboratory analysis plan was designed to exclude batch effects. All samples were run with a single set of reagents on a single machine at Illumina, Inc. (San Diego, CA). Bisulfite conversion and methylation measurements including reruns were performed in March 2011.
Variables representing the chip (12 samples), chip set (four contiguous chips or half of a plate), and plate (96 samples) were included as covariates in statistical models to evaluate potential confounding. In addition, the distributions of beta and log-ratio values were compared across chips, chip sets, and plates. We found that chip, chip set, and plate were not appreciable sources of variability.
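The three batch variables are nested: 12 samples per chip, 4 chips per chip set, and 96 samples (2 chip sets) per plate. The sketch below derives all three labels from a sample's position, assuming samples are numbered sequentially from zero in loading order; that numbering scheme and the function name `batch_labels` are hypothetical.

```python
SAMPLES_PER_CHIP = 12    # from the text
CHIPS_PER_SET = 4        # four contiguous chips, i.e. half a plate
SAMPLES_PER_PLATE = 96   # from the text


def batch_labels(position):
    """Map a zero-based sample position to chip, chip set, and plate indices.

    Assumes samples fill chips, then chip sets, then plates in order.
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    chip = position // SAMPLES_PER_CHIP
    return {
        "chip": chip,
        "chip_set": chip // CHIPS_PER_SET,
        "plate": position // SAMPLES_PER_PLATE,
    }


# Positions at the boundaries between chip sets and plates.
for pos in (0, 47, 48, 95, 96):
    print(pos, batch_labels(pos))
```

Labels of this kind could then be entered as categorical covariates in a regression model to check for confounding by batch.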
The NEST data quality control followed a protocol similar to the one described above. In addition to the 18 smokers (9 males, 9 females) and 18 non-smokers (9 males, 9 females), eight plate control samples and three samples representing 10%, 50%, and 85% methylation were included on a single plate.
It is possible that SNPs at or near CpGs could influence methylation intensities and thereby the associations we observed. We searched online databases to determine the presence of an underlying SNP for the 105 most statistically significant CpGs. Information was obtained for SNPs with minor allele frequency ≥ 5% in the CEU (Utah residents with Northern

Supplemental Material, Figure S1. Histograms showing the distribution of methylation levels in our data. A bimodal distribution was observed when considering all 473,844 CpG sites, whereas an approximately normal distribution was observed for most individually plotted CpG sites. (a) Beta across all CpGs analyzed; (b) log(beta/(1 − beta)) across all CpGs analyzed; (c) Beta for one representative CpG (cg11924019); (d) log(beta/(1 − beta)) for the representative CpG.
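The log(beta/(1 − beta)) transform shown in panels (b) and (d) maps beta values from the interval (0, 1) onto the whole real line. The following is a minimal sketch; the use of the natural logarithm is an assumption, since the caption does not state the base, and the function names are hypothetical.

```python
import math


def logit(beta):
    """Log-ratio transform log(beta / (1 - beta)) of a methylation beta value.

    Uses the natural log (an assumption; the base is not stated in the text).
    Beta must lie strictly between 0 and 1.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must be strictly between 0 and 1")
    return math.log(beta / (1.0 - beta))


def inverse_logit(value):
    """Map a log-ratio value back to a beta value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


# Values near 0 or 1 are stretched out, while 0.5 maps to zero.
for beta in (0.05, 0.5, 0.95):
    print(f"beta={beta:.2f} -> log-ratio={logit(beta):+.3f}")
```

The transform is symmetric around beta = 0.5, so equally extreme low and high methylation levels receive log-ratio values of equal magnitude and opposite sign.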