Male obesity impacts DNA methylation reprogramming in sperm

Male obesity has profound effects on morbidity and mortality, but relatively little is known about the impact of obesity on gametes and the potential for adverse effects of male obesity to be passed to the next generation. DNA methylation contributes to gene regulation and is erased and re-established during gametogenesis. Throughout post-pubertal spermatogenesis, there are continual needs to both maintain established methylation and complete DNA methylation programming, even during epididymal maturation. This dynamic epigenetic landscape may confer increased vulnerability to environmental influences, including the obesogenic environment, that could disrupt reprogramming fidelity. Here we conducted an exploratory analysis that showed that overweight/obesity (n = 20) is associated with differences in mature spermatozoa DNA methylation profiles relative to controls with normal BMI (n = 47). We identified 3264 CpG sites in human sperm that are significantly associated with BMI (p < 0.05) using Infinium HumanMethylation450 BeadChips. These CpG sites were significantly overrepresented among genes involved in transcriptional regulation and misregulation in cancer, nervous system development, and stem cell pluripotency. Analysis of individual sperm using bisulfite sequencing of cloned alleles revealed that the methylation differences are present in a subset of sperm rather than being randomly distributed across all sperm. Male obesity is associated with altered sperm DNA methylation profiles that appear to affect reprogramming fidelity in a subset of sperm, suggestive of an influence on the spermatogonia. Further work is required to determine the potential heritability of these DNA methylation alterations. If heritable, these changes have the potential to impede normal development.


Background
A growing body of evidence supports that early-life environmental exposures can increase the risk of adult chronic disease. Effects of environmental exposures may be mediated through epigenetic changes, including changes in DNA methylation [1]. The patterns of DNA methylation throughout the genome (referred to as the methylome) help to regulate temporal and spatial gene expression. Plasticity of the methylome lends itself to heightened vulnerability to potential detrimental errors during periods of epigenetic flux, especially during the methylation reprogramming events that take place immediately post-fertilization and during gametogenesis [2].
After puberty, sperm production is continuous throughout adult life. This requires that sperm-specific DNA methylation profiles established during reprogramming of the primordial germ cells are maintained in the maturing sperm cells, and final methylation patterns must be established. The DNA methyltransferase enzymes are present throughout spermatogenesis, and studies in rodents have shown that de novo DNA methylation is continued even after the spermatids transit into the epididymis for maturation [3,4]. As a result of this continual need for methylation maintenance and completion of reprogramming, DNA methylation in male gametes may be more vulnerable to exogenous and endogenous environmental influences, including an obesogenic environment [5][6][7][8][9][10][11].
We previously demonstrated that babies born to obese fathers have altered DNA methylation at several regulatory regions of imprinted genes [10,11]. Imprinted genes are defined by DNA methylation that is divergent at the same genomic locations in sperm versus oocytes. These imprinted regions are differentially established after sex determination in the embryo during gametogenesis and give rise to monoallelic gene expression. Here, the active and silenced alleles in each somatic cell are determined by the epigenetic marks that distinguish the two parental copies. The ~ 100 known imprinted genes are critical mediators of early growth and development, yet they comprise a relatively small subset of the genes throughout the genome. A follow-up study sought to understand these methylation changes at imprinted regions by analyzing methylation patterns in mature spermatozoa and semen parameters of normal weight men versus men who were overweight or obese [12]. We reported significantly altered DNA methylation in sperm of the overweight and obese men as compared to the normal weight men at multiple imprinted gene regulatory regions. Herein we greatly expand our initial studies by conducting an exploratory examination of the influence of overweight/ obesity on DNA methylation throughout the genome.

Study subjects
Study subject characteristics by BMI category are presented in Table 1. Twenty of the 67 men were categorized as overweight/obese (BMI > 25), representing 29.9% of our study population. One man was excluded from the study because of a BMI of 59 kg/m 2 , and one man was excluded due to insufficient sample availability. There was no significant difference between BMI categories in education, biological paternity, sperm concentration, or sperm motility. There were significant differences between BMI categories for age, marital status, and being a patient at the Duke Fertility Center, with overweight/ obese men being older, more likely to be married, and more likely to be patients. The majority of men in both BMI categories had not previously biologically fathered children. Using the Kendall's rank correlation, we found no significant relationships between any of the semen parameters analyzed and BMI of all study subjects, nor when study subjects were stratified by BMI (overweight/ obese or normal).

Altered DNA methylation in sperm from men with elevated BMI
The Illumina Infinium HumanMethylation450 BeadChip (hereafter, 450K) is designed to generate quantitative DNA methylation data for 485,512 CpG sites throughout the genome. In order to account for the two different probe design types on the 450K BeadChip, the CpG sites were subjected to subset quantile within-array normalization (SWAN). Dropping CpG probes with poor detection p-values resulted in retention of 485,498 probes. All unreliable probes based on prior criteria [13]  were removed, reducing the number of retained probes to 294,833. Finally, probes that were invariant across all samples (β variance < 2 × 10 -5 ) were discarded, resulting in a total of 291,061 retained CpG sites for analysis [14,15,17]. Linear regression analysis was conducted on the 291,061 CpG sites controlling for age, race, smoking status and clinic patient status. A higher BMI was associated with one CpG site, cg24769403, which passed significance at the false discovery rate (FDR p = 0.007; 9.9% higher methylation in overweight/obese men) and the more conservative Bonferroni correction (p = 1.0 × 10 -7 ; Fig. 1). This CpG site lies 70.9-kb downstream of the transcription start site of the protooncogene, ADRA1B. None of the other CpGs were significant after adjustment for multiple comparisons. We therefore analyzed the data based on the raw p values (p < 0.01), which showed there were 3,264 CpG sites significantly associated with BMI, with p values ranging from p = 0.01 to p = 2.4 × 10 -8 (average and median p = 5.3 × 10 -3 ). The FDR p values for all CpG sites can be found in Additional file 2: Table S1. Of these, 2,851 were associated with unique gene names. The probes associated with the top 20 differentially methylated CpGs are shown in Table 2. There were 315 genes with multiple significant CpG sites per gene (range [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19]. The top two genes with multiple affected CpGs were Protein Tyrosine Phosphatase Receptor Type N2 (PTPRN2; 19 sites with methylation values ranging from 3.4% lower to 3.6% higher in healthy versus overweight/obese; unadjusted p values ranging from 2.4e−5 to 0.007) and zinc finger protein 33A (ZNF33A; 10 sites with methylation values ranging from 1.3% to 6.4% lower in healthy versus overweight/obese men; unadjusted p values ranging from 0.0001 to 0.008).

Genomic imprinting results
The current comprehensive approach is also consistent with our earlier findings of imprinted genes in the same study population. We earlier reported that obesity was associated with lower methylation at the MEG3, SNRPN, and SGCE/PEG10 DMRs, and increased DNA  Notably, the latter probe CG site is included within the exact region identified as exhibiting decreased methylation in the overweight/obese from our prior study [12]. We also observed increased methylation for H19 (cg15963714: 1.4% higher, p = 0.007). There is no overlap in the other regions identified here using the 450K platform and the specific CG sites analyzed in our prior analysis.

Pyrosequencing results
To confirm results from our 450K data, we performed bisulfite pyrosequencing of all remaining sperm DNA samples for target regions of arbitrarily chosen genes ( Table 5) that were designed to measure methylation at the identified differentially methylated CpG site from the 450K platform, including Tumor protein P53 regulated apoptosis-inducing protein 1 (TP53AIP1), spermatogenesis-associated 21 (SPATA21), suppressor of glucose, autophagy associated 1 (SOGA1), and ADAM metallopeptidase domain 15 (ADAM15). Figure 2a shows the results for pyrosequencing assay performance, where input methylation using defined mixtures of fully methylated and unmethylated bisulfite converted control DNAs agreed with that measured by pyrosequencing for all four assays (Pearson R 2 = 0.96 to R 2 = 0.99). Figure 2b shows the sperm DNA methylation levels measured by pyrosequencing directly compared to the sperm DNA methylation levels measured on the 450K platform for 30 subjects. The degree of methylation at each specific CpG site measured by pyrosequencing correlated with that of the methylation measured on the 450K platform for all genes (Pearson R 2 = 0.93, p < 0.0001 for TP53AIP1; R 2 = 0.94, p < 0.0001 for SPATA21; R 2 = 0.75, p < 0.0001 for SOGA1; R 2 = 0.55, p < 0.0001 for ADAM15; data from all analyzed sites for each gene are provided in Additional file 3: Table S2). Although the overall levels of methylation for ADAM15 were very low, we nevertheless were able to detect a significant positive correlation between pyrosequencing and 450K values. We then examined the methylation levels at all four differentially methylated CpG sites of the overweight/ obese versus normal weight men with available remaining sample. The sperm of normal weight men had higher DNA methylation levels than sperm of overweight/obese men for TP53AIP1 (21.9% ± 2.7 vs 14.9% ± 2.1, p = 0.07) and in normal weight compared to overweight/obese men for SPATA21 (64.3% ± 2.7 vs 56.3% ± 2.8, p = 0.058) which was in accordance with our results from the 450K   (Fig. 3a, b). SOGA1 showed no difference in methylation between the groups by pyrosequencing (p = 0.79), and while results for ADAM15 were not significant (p = 0.24), they were in agreement with that observed on the 450K platform with respect to men with normal weight having higher methylation than those who were overweight/obese ( Fig. 3c and d; data for all CpG sites analyzed are shown in Additional file 1: Figure S2).

Cloned allele sequencing results
Since the methylation changes that were measured using the 450K and the pyrosequencing data are both representative of the averaged sperm population as a whole in the sample analyzed, we wanted to determine whether the differences in DNA methylation were evident by analyzing differentially methylated CpG sites in multiple single sperm from the same individual. By so doing, we can assess whether the methylation changes occur in a manner that is randomly distributed across all sperm or whether these changes only affect a small subset of sperm. We used bisulfite sequencing of cloned alleles to address this, whereby each individual clone reveals the methylation status of every CpG within the contiguous sequenced region of a single sperm cell, and multiple cloned alleles were sequenced for each individual analyzed. We selected seven regions for analysis ( Table 5) that showed the largest methylation differences between normal weight and overweight/obese men. The participants showing the most divergent results from the 450K platform analysis and with remaining sample were chosen for the cloned allele studies. For three genes, mitotic spindle positioning (MISP), archaelysin family metallopeptidase 1 (AMZ1), and hydroxycarboxylic acid receptor 3 (HCAR3), there were no major differences between the normal weight and overweight/obese sperm samples  Figure S1). However, the other four genes analyzed showed differences in methylation profiles between the sperm cells within a given individual as well as differences between normal weight and overweight/obese individuals. For mitogen-activated protein kinase 8 interacting protein 3 (MAPK8IP3), there were fewer methylated CpG sites across the sperm analyzed in the normal weight (average 76.7% methylation) compared to the overweight/obese sperm samples (average 87.7% methylation) (Fig. 4a). For tubulin-folding cofactor D (TBCD), the sperm of all men analyzed were mostly unmethylated except for several from each individual with a more heavily methylated profile (Fig. 4b). For XK related 6 (XKR6), 16% of the sperm were nearly completely unmethylated, while the remainder were highly or fully methylated in normal weight men (Fig. 4c). In the overweight/obese sample, all of the sperm were heavily methylated (88.5%). Finally, for SOGA1, the majority of the sperm were heavily methylated in the normal weight men, whereas there were a roughly equal number of heavily methylated and largely unmethylated sperm in the overweight/obese men (Fig. 4d).

Discussion
Approximately 34% of adult males are classified as obese in the USA [19], and obesity is one of the major contributors to male-factor infertility [20]. This relationship is likely driven through increased energy input with consequent inflammation, disruption of metabolism and endocrine signaling [21]. Thus, as we have previously suggested [22], changes in the molecular composition of sperm can impact DNA methylation. As such, we sought to expand upon our prior study of imprinted gene regulatory regions [12] to determine whether there were detectable alterations elsewhere in the genome.
We found significant differences in DNA methylation comparing sperm from overweight/obese men to normal weight men at multiple CpG sites. From our exploratory 450K dataset, there was one CpG site that was significant at the FDR that is located downstream of the adrenoreceptor alpha 1B gene (ADRA1B). ADRA1B is a protooncogene that is a member of the alpha-1-adrenergic receptor family. This family of receptors activates mitogenic responses and plays a role in the regulation of cellular growth and proliferation. In particular, ADRA1B, when transfected into NIH 3T3 fibroblasts, induces the neoplastic transformation of cells. Future studies might focus on how lifestyle factors and environmental exposure might impact DNA methylation in additional regions of this gene. Many of the identified genes have key regulatory roles in developmental, metabolic, and inflammatory processes. As such, alterations due to DNA methylation changes could have significant downstream effects. A large number of the gene ontology and KEGG terms associated with the identified differentially methylated sites are related to early embryonic and neuronal development, and the regulation (or misregulation) of transcription. Genes critical for early embryonic and neuronal development, as well as the regulation of transcription, are among the genes that are poised for post-fertilization activation in sperm, given their critical function during early-life development. Their required early activation of expression, however, appears to make them more susceptible to environmental perturbations that can disrupt their proper methylation. If changes in DNA methylation at these genes are retained postfertilization, this could lead to potential unintended consequences during development due to dysregulated expression.
From the 3264 differentially methylated CpG sites that distinguish normal weight from overweight/obese men (unadjusted p value < 0.01), we arbitrarily chose sites for validation. The DNA methylation values measured by bisulfite pyrosequencing for CpG sites associated with TP53AIP1, SPATA21, SOGA1 and ADAM15 were highly Fig. 4 Non-random distribution of methylation changes across the sperm population by bisulfite sequencing of cloned alleles. For each gene, the genomic structure and relative position of the region sequenced are shown, with the actual sequence of the region, and CpG sites queried shown below. The CpG that exhibited differential methylation on the Illumina HumanMethylation450 (450K) bead chip is indicated, along with the probe ID. The numbering of the CpGs below the sequence corresponds to each of the CpGs analyzed. For SOGA1, the bracketed sequence and the numbering of CpGs from 1′ to 6′ are the CpG sites analyzed by bisulfite pyrosequencing (refer to Figs. 2 and 3). For each region, the PCR products derived from bisulfite-modified sperm DNA were cloned and sequenced from either two (panel c) or four (panels a, b and d)  TP53AIP1 encodes a TP53-inducible protein involved in mediating apoptosis [23]. SPATA21 is involved in the differentiation of haploid spermatids. A gene-based association study has shown that SPATA21 is one of several genes implicated in adolescent idiopathic scoliosis [24]. SOGA1 regulates autophagy by playing a role in reducing glucose production in an adiponectin-mediated and insulin-dependent manner [25]. Finally, ADAM15 is a protein coding gene that is a member of the ADAM (a disintegrin and metalloproteinase) protein family which are transmembrane glycoproteins involved in cell adhesion. This protein family is thought to play diverse roles in cellular processes, one of which includes fertilization and, in fact, is among candidates that may be the binding entities at the egg membrane surface [26]. In guinea pig spermatozoa, ADAM15 interacts with the cell adhesion glycoprotein acrogranin during the fertilization process [27]. The findings of our study are consistent with the results of studies by Donkin et al. [28] as well as Potabattula et al. [29], where the potential effects of obesity on the sperm epigenome were investigated, with a focus on potential intergenerational inheritance in the latter. Donkin et al. identified 9081 unique differentially methylated genes between 13 normal weight and 10 obese men. In addition, they analyzed sperm from six obese men before and after bariatric surgery. They found that a large number of genes in sperm showed changes in DNA methylation a week after surgery and that these new profiles were maintained in the sperm for at least a year. Such rapid changes in sperm DNA methylation suggest that the alterations were induced in the maturing sperm, since the timeframe from spermatogonial differentiation to production of mature sperm is about 74 days in humans. That the alterations were detectable one year later may be indicative of a simultaneous and permanent methylation change in the spermatogonial progenitors. Potabattula et al. [29] examined DNA methylation by bisulfite pyrosequencing at seven imprinted genes and one non-imprinted gene in sperm of normal weight men, pre-obesity/obese men, and one underweight man, as well as in the cord blood of offspring. The researchers found a positive correlation at the MEG3-IG regulatory region between sperm DNA methylation and BMI. They also reported a sex-specific correlation between paternal BMI and methylation levels in cord blood for the MEG3-IG DMR, IGF2-DMR0, and HIF3A, the non-imprinted gene the group analyzed. Additionally, hypomethylation of IGF2-DMR0 in fetal cord blood was associated with increased paternal BMI in female offspring. These results support our findings that there are detectable BMI-related differences in sperm DNA methylation and support that these altered methylation patterns can be passed onto offspring.
The current analysis is consistent with our prior findings on genomic imprinting for the regions that were included on the 450K platform. We were able to compare the CpG sites that are represented on the 450K platform with what we had previously published and found agreement with what we had observed in our prior work. Further, the concordance between genes and direction of methylation change between the 450K platform used here and the pyrosequencing data from our prior study support the validity of our findings.
It has generally been thought that the reprogramming events that occur during gametogenesis and post-fertilization leave little chance to transmit any altered methylation profiles from the prior generation to the next. Gametic epigenetic reprogramming has been thought of as evolution's way to ensure undoing of any potentially harmful changes that may have occurred during a parent's lifetime [30]. On the other hand, environmentally induced epigenetic changes in gametes could be transferred to subsequent generations, which might explain how relatively fast evolutionary responses result from environmental changes [9,[31][32][33]. Recent studies have shown that a substantial number of regions of the genome are resistant to the DNA methylation erasure that occurs during gametogenesis and post-fertilization reprogramming [34][35][36]. The partial retention of DNA methylation at these "escapee" regions may provide a way of transmitting intact epigenetic information to the next generation.
The haploid nature of sperm cells makes it possible to use bisulfite sequencing of cloned alleles as a method to examine the patterns of DNA methylation present in individual sperm cells and the distribution of these patterns across multiple sperm from the same individual. Apart from the potential limitation of selecting multiple clones representing the same sperm, our analysis suggests that CpG methylation alterations in sperm at the regions analyzed are not randomly distributed throughout the entirety of the sperm population, but rather appear to be present in a small proportion of the sperm cells. Furthermore, the differences in methylation associated with an overweight/obese BMI are reflected by a shift in these proportions. From our results, we are unable to determine whether the same sperm cell is impacted by methylation changes at more than one of these regions. Nevertheless, these results indicate that overweight/obese men have an increased chance of conceiving a child with sperm carrying a skewed methylation configuration at one or more regions of the genome.
Study limitations include the exploratory nature of the study and a small sample size with recruitment of one-third of our study population from the Duke Fertility Center. We adjusted for Fertility Center patient status and excluded males with known male factor infertility, but it is possible that there is residual confounding due to inherent differences in characteristics of sperm DNA between men attending the clinic and those not attending the clinic. We also limited our study population to Caucasian men due to potential differences in DNA methylation based on race/ethnicity. One of the reasons for potential lack of validation by cloned allele analysis for some regions is that this methodology examines only a small proportion of the total sperm population, and there can be bias in PCR amplification as well as in selection of individual clones for sequencing. Nonetheless, some of our data from this analysis suggest that we were able to detect differences in the distribution of methylation for a number of the genes examined. Finally, there are more than 28 million CpG sites throughout the genome, and the ~ 486,000 included here may have missed detection of other important regions that are affected by overweight/ obese status. Study strengths include restriction to Caucasian men, thus limiting heterogeneity and increasing the power to detect true associations, and that all sample processing and data generation were performed in parallel. We used two independent methods for confirmation of our findings, including rigorously developed assays for bisulfite pyrosequencing as well as bisulfite sequencing of a large number of cloned alleles. We were especially compelled by the successful validation of some of our targets, given that these sites were indeed chosen at random, and not because of the magnitude of their methylation difference or their degree of statistical significance. Our analysis indeed showed that there are multiple different allelic methylation profiles at the same locus in sperm from the same individual.

Conclusions
Our study contributes to the growing body of evidence that the impact of paternal lifestyle on the proper maturation of the epigenetic information carried in the sperm, and potentially on subsequent embryonic and fetal development, is perhaps more important than previously appreciated. Obesity-related epigenetic changes in sperm may be reversible with weight loss, and thus, improving paternal metabolic health is anticipated to increase the proportion of sperm with a more healthy methylome and therefore decrease the chances of an adverse impact on embryonic and fetal development [28,37]. Given the obesity epidemic, it is essential to replicate our results in a larger sample size. Genetics needs to also be examined, since CpG methylation can be influenced by genotype [38][39][40]. Lastly, our results underscore an urgent need to determine the potential for inter-and transgenerational heritability of altered sperm DNA methylation profiles.

Study participation and data collection
All participants in this study were enrolled in the TIEGER study at Duke University. Subject recruitment and inclusion/exclusion criteria were previously described, and subjects were excluded if they had known male factor infertility [12]. BMI categories were defined in accordance with World Health Organization guidelines as follows: normal weight (18.5 kg/m 2 ≤ BMI ≤ 25 kg/m 2 ), overweight (25 kg/m 2 < BMI < 30 kg/m 2 ) and obese (BMI ≥ 30 kg/m 2 ). For the purpose of this study, subjects with BMI > 25 were categorized as "overweight/ obese. " Subjects completed a short questionnaire regarding information on socio-demographic and lifestyle factors, including level of education, marital status, number of children fathered, occupation, and physical activity. Semen, urine, and blood samples were collected from all subjects. The sample collection and processing have been previously described in detail [12].

DNA isolation and methylation analysis
Sperm genomic DNA was extracted using Puregene Reagents (Qiagen; Valencia CA). One microgram of purified DNA for each sample was provided to the Duke Molecular Genomics Core for generation of Illumina HumanMethylation450 BeadChip data according to the manufacturer's instructions (Illumina Inc., San Diego, CA).
The array analysis of methylation levels at each CpG site generated a β-value, which represents the proportion of signal obtained for methylation at a specific CpG site, where 1 is completely methylated and 0 is completely unmethylated. A logit transformation was applied to the β-values, due to the severe heteroscedasticity of highly methylated and unmethylated β-values. The transformed values, or M-values, are defined as: M = log (β/1 − β). The M-values were then used to find differential methylation [41].

Identification of differentially methylated CpG Sites
A site-based analysis was performed using linear regression which examined each CpG site and ordered the list of individual CpG sites by the association between level of methylation and BMI as continuous variables. Potential confounders were selected based on known or observed association with DNA methylation and with obesity. In the final analysis, all results were adjusted for age, smoking status, strenuous exercise (based on median), and clinic patient status. Exercise was categorized as a binary variable, including those who exercised 0-3 days per week and those who exercised 4-7 days per week. Although no current smokers were recruited for the study, six subjects reported a prior history of smoking and were considered for the purposes of covariate adjustment. Multicollinearity between covariates was tested, and no correlations of interest were found. After the significance level for the relationship between BMI and methylation was obtained for each CpG site, the p-values were adjusted to correct for false discovery rate (FDR), denoted as the q-value.

Bisulfite pyrosequencing
Bisulfite modification of 800 ng of sperm DNA was performed using the Zymo EZ DNA Methylation Kit, converting unmethylated cytosines to uracils while leaving methylated cytosines unaltered. The DMRs associated with the following genes were examined: TP53AIP1 (probe cg24908198; 6 CpG sites analyzed), SPATA21 (probe cg17859706; 4 CpG sites analyzed), SOGA1 (probe cg00171166; 6 CpG sites analyzed), and ADAM15 (cg27576241; 4 CpG sites analyzed). Pyrosequencing assay design was performed using PSQ Assay Design Software v1.0 (Qiagen). Forward and reverse PCR primer sequences can be found in Additional file 4: Table S3. The 5′ end of one PCR primer from each pair was conjugated to biotin to allow for retention of one DNA strand through denaturation of the double-stranded amplicons and binding of the biotin-containing strand to streptavidin beads. Bisulfite-modified sperm DNA (40 ng) was then amplified in a 10 μl PCR reaction volume using the PyroMark PCR Kit (Qiagen) with 0.3 µl 25 mM MgCl 2 and 0.24 μl each of the forward and reverse primers (10 mM). In addition, 1 μl of CoralLoad Concentrate (Qiagen) was added to each reaction in order to help visualize amplicons on an agarose gel.
Pyrosequencing assays were performed in duplicate in sequential runs (technical replicates) in CpG analysis mode on a Qiagen Pyromark Q96 MD Pyrosequencer, and the resulting percent methylation for each CpG site as well as bisulfite conversion efficiency was calculated using PyroQ CpG software v1.0 (Qiagen). The values shown represent the mean methylation for the replicate runs for the individual CpG sites that are represented on the HumanMethylation450 BeadChip platform. Validation of pyrosequencing assays was completed in triplicate using defined mixtures of unmethylated and methylated DNA (Epitect DNA; Qiagen).

Bisulfite sequencing of cloned alleles
Bisulfite sequencing of cloned alleles was used to provide more comprehensive information on the specific patterns of methylation that are present in individual haploid sperm cells in the vicinity of, and including, the single CpG site found to differ based on the Illumina beadchip platform. Regions examined included those associated with MISP (probe cg18870054; 22 CpG sites analyzed), AMZ1 (probe cg01098939; 19 CpG sites), GPR109B/ HCAR3 (probe cg18578876; 8 CpG sites), MAP-K8IP3 (probe cg05772935; 11 CpG sites), XKR6 (probe cg25398727; 11 CpG sites), TBCD (probe cg17169982; 8 CpG sites), and SOGA1 (probe cg00171166; 12 CpG sites). Bisulfite-treated DNA (20 ng) was amplified using the HotStarTaq PCR kit and forward and reverse primers that do not anneal to CpG sites. Primer sequences and PCR conditions are provided in Additional file 5: Table S4. The PCR amplicons were resolved on a 2% agarose gel, excised, and purified using GenElute agarose spin columns (Sigma-Aldrich; St. Louis, MO). The eluted DNA was purified using Zymo gDNA Clean and Concentrator (Irvine, CA). The purified DNA was ligated and transformed into competent JM109 E. coli using the pGEM ® T-Easy Vector System according to the manufacturer's instructions (Promega; Madison, WI). The bacterial transformants were plated on LB/ampicillin/ X-Gal plates and grown overnight at 37 °C. Individual colony-forming units were selected for each specimen and underwent whole-cell PCR using Qiagen HotStart Taq DNA polymerase kit with SP6 and T7 primers 5 µl of PCR product was loaded onto a 2% agarose gel to confirm band size was as expected. The remaining PCR product was purified using the Zymo gDNA Clean and Concentrator (Irvine, CA). Following purification, samples underwent PCR in preparation for BigDye Sequencing, using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Austin, TX).

Statistical analyses
Participant characteristics were calculated using a Chisquare analysis. Linear regression models were used to adjust for age, marital status, and fertility clinic patient status, which differed by BMI. When analyzing pyrosequencing data, linear regression models were used for comparing continuous data. Unpaired t-tests were used for two-way comparisons and, where relevant, a Welch's t test and Mann-Whitney U test were applied. Threeway comparisons were performed using one-way analysis of variance (ANOVA) tests with Kruskal-Wallis tests where the data were not normally distributed followed by two-group comparisons using unpaired t-tests or Mann-Whitney U tests as relevant. These statistical analyses were performed using Prism 7 for Mac OS X version 7.0a (GraphPad Software; La Jolla, CA).