Sex differences of leukocytes DNA methylation adjusted for estimated cellular proportions

DNA methylation, which is most frequently the transference of a methyl group to the 5-carbon position of the cytosine in a CpG dinucleotide, plays an important role in both normal development and diseases. To date, several genome-wide methylome studies have revealed sex-biased DNA methylation, yet no studies have investigated sex differences in DNA methylation by taking into account cellular heterogeneity. The aim of the present study was to investigate sex-biased DNA methylation on the autosomes in human blood by adjusting for estimated cellular proportions because cell-type proportions may vary by sex. We performed a genome-wide DNA methylation profiling of the peripheral leukocytes in two sets of samples, a discovery set (49 males and 44 females) and a replication set (14 males and 10 females) using Infinium HumanMethylation450 BeadChips for 485,764 CpG dinucleotides and then examined the effect of sex on DNA methylation with a multiple linear regression analysis after adjusting for age, the estimated 6 cell-type proportions, and the covariates identified in a surrogate variable analysis. We identified differential DNA methylation between males and females at 292 autosomal CpG site loci in the discovery set (Bonferroni-adjusted p < 0.05). Of these 292 CpG sites, significant sex differences were also observed at 98 sites in the replication set (p < 0.05). These findings provided further evidence that DNA methylation may play a role in the differentiation or maintenance of sexual dimorphisms. Our methylome mapping of the effects of sex may be useful to understanding the molecular mechanism involved in both normal development and diseases.


Background
Sex differences have been widely observed not only in genetics and hormones but also in expression of genes and microRNA [1][2][3][4]. DNA methylation, which is most frequently the transference of a methyl group to the 5-carbon position of the cytosine in a CpG dinucleotide, is one of the major mechanisms of epigenetic modifications. This modification plays an important role in gene expression, chromosomal stability, genomic imprinting, X-chromosome inactivation, and mammalian development [5,6]. Recent genome-wide methylome studies have revealed sex-biased DNA methylation in specific genes on the autosomes of several tissues, such as the blood, brain, and saliva [7][8][9]4]. However, researchers have not yet investigated the sex differences in DNA methylation by taking into account cellular heterogeneity, although several studies have demonstrated the effects of cellular heterogeneity on DNA methylation status [10][11][12][13][14][15][16], and cell-type proportions may vary by sex.
To reveal sex differences in DNA methylation in human blood, we conducted a genome-wide profiling of DNA methylation by using peripheral leukocytes and then examined sex-biased DNA methylation after correcting the estimated cell-type proportions of each sample.

Subjects
Ninety-three healthy subjects (49 males and 44 females; mean age: 43.6 ± 12.3 years) for our discovery set and 24 healthy subjects (14 males and 10 females, mean age: 35.3 ± 11.9 years) for our replication set were recruited from volunteers who comprised hospital staff, university students, and company employees. There was no significant age difference between male and female groups in both sample sets (p > 0.05). All subjects who joined this study were of unrelated Japanese origin and signed written informed consent forms that were approved by the institutional ethics committees of Tokushima University Graduate School and the Osaka University Graduate School of Medicine.

DNA methylation methods
Genomic DNA was prepared from peripheral blood samples. A bisulfite conversion of 500 ng of genomic DNA was performed with the EZ DNA methylation kit (Zymo Research). DNA methylation levels were assessed with Infinium HumanMethylation450 BeadChips (Illumina Inc.) according to the manufacturer's instructions. This array's technical schemes, accuracy, and high reproducibility have been described in previous papers [17][18][19]. Quantitative measurements of DNA methylation were determined for 485,764 CpG dinucleotides that covered 99 % of the RefSeq genes and were distributed across whole gene regions, including promoters, gene bodies, and 3′-untranslated regions (UTRs). The arrays also covered 96 % of the CpG islands (CGIs) from the UCSC database with additional coverage in CGI shores (0-2 kb from CGI) and CGI shelves (2-4 kb from CGI). DNA methylation data were analyzed using the methylation analysis module within the BeadStudio software (Illumina Inc.). The DNA methylation status of the CpG sites was calculated as the ratio of the signal from a methylated probe relative to the sum of both the methylated and unmethylated probes. This value, known as β, ranges from 0 (completely unmethylated) to 1 (fully methylated). For intra-chip normalization of the probe intensities, we performed color balance and background corrections on every set of 12 samples from the same chip by using internal control probes. For quality control, β values with detection p values ≥0.05 were treated as missing values. Qualified CpG sites used in statistical analyses were defined as follows: 1) autosomal CpGs with no missing values in all subjects; 2) CpGs with no probe single nucleotide polymorphism (SNPs) at minor allele frequencies ≥5 % in the HapMap-JPT population; 3) CpGs with no probe crossreactivity, and no SNPs at CpG sites and single-base extension sites in a previous paper [20].

Statistical analysis
The cell-type proportions (CD4 + T cell, CD8 + T cell, CD56 + NK cell, CD19 + B cell, CD14 + monocyte, and granulocyte) for each of the samples were estimated using a published algorithm [21,22] implemented in an R-package "Minfi," as we had done in our previous study [15]. Surrogate variable analysis (SVA), which is a method for modeling the potential confounding factors that may or may not be known, including technical factors such as batch effects, can increase the biological accuracy and reproducibility of analyses in microarray studies [23,24]. We used SVA to identify the potential confounding factors in our microarray data as surrogate variables (SVs). Then, we examined the influences of sex on DNA methylation with a multiple linear regression analysis after adjusting for age, significant SVs (8 SVs in the first set and 6 SVs in the replication set), and the estimated 6 cell-type proportions, as in a previous study [8]. Bonferroni correction was applied at the 0.05 level for multiple testing (nominal p value of 1.44 × 10 −7 ). The gene-ontology analysis was performed with the Database for Annotation, Visualization and Integrated Discovery (DAVID) [25].

Estimated cell-type proportions between males and females
We estimated 6 cell-type proportions using "Minfi", a flexible and comprehensive bioconductor package for the analysis of Infinium DNA methylation microarrays developed by Aryee et al. [21]. The average estimated cellular proportions of the male and female groups are shown in Fig. 1. Of the 6 cell types, 2 (CD8 + T cell and CD56 + NK cell) showed small but significant differences between the two groups (Welch's t test p < 0.05), which could be confounding factors in determining sex-differential DNA methylation sites.

Sex differences in DNA methylation in the blood
The DNA methylation levels of 93 subjects were evaluated using Infinium HumanMethylation450 Bead-Chips, and then the sex differences of these levels between 49 males and 44 females were assessed using a multiple linear regression analysis after adjusting for age, the estimated cell-type proportions, and the SVs identified in our SVA. Of the 345,235 CpG sites, significant sex differences in DNA methylation were observed at 292 CpG sites (nominal p < 1.44 × 10 −7 , Additional file 1). When we analyzed array data without adjusting for the estimated cell-type proportions, significant sex differences were observed at 417 CpG sites (Additional file 2), and 270 sites were common between the results from the adjusted and un-adjusted analyses. The reduction in differentially methylated sites after cell proportion adjustment suggests that the present statistical analysis has the ability to lower false-positive detections of sexdifferential DNA methylation sites. Figure 2 shows volcano plots of differentially methylated CpG sites between males and females. Figure 3 shows a quantile-quantile (Q-Q) plot of −log 10 P values, which deviates from their expected values under the null hypothesis. Of the 292 CpG sites, 237 sites (81.2 %) showed higher methylation in females than in males. Table 1 lists the top 20 CpG sites that showed significant sex differences. When these 292 CpG sites were classified into 4 different categories according to their locations in the genes (promoter, gene body, 3′-UTR, and intergenic region), 139 sites (47.6 %) were located in the promoter regions, 59 sites (20.2 %) in gene bodies, and 2 sites (0.7 %) in 3′-UTRs. When these 292 CpG sites were classified into 4 categories according to the CpG content in the genes (CGI, CGI shore, CGI shelf, and others), 139 sites (47.6 %) were located in the CGIs, 10 sites (3.4 %) in CGI shores, and 76 sites (26 %) in CGI shelves. Fig. 1 Average estimated cellular proportions of male and female groups. The y axis is each of average estimated cellular proportions of CD8 + T cell, CD4 + T cell, CD56 + NK cell, CD19 + B cell, CD14 + monocyte, and granulocyte. Significant differences between the two groups were observed in 2 cell types (CD8 + T cell and CD56 + NK cell) (Welch's t test p < 0.05) Fig. 2 Volcano plots of differentially methylated CpG sites between males and females. This volcano plot shows the result of genome-wide DNA methylation differences between 49 males and 44 females after adjusting for the estimated cell mixture proportions. Average beta difference (males-females) is shown on the x axis. Log 10 -converted p value is shown on the y-axis. CpG loci that showed a p value of less than 5 % after Bonferroni correction are colored red. Significant sex differences in DNA methylation were observed at 292 CpG sites (p < 1.44 × 10 −7 )

Gene-ontology analysis
We used DAVID to perform a gene-ontology analysis of the genes, which showed significant sex differences in DNA methylation, and revealed enrichment of genes related to secretion and secretion by cell. Table 2 lists the significant gene-ontology categories.
Validation of sex differences in an independent set of samples DNA methylation levels were measured in an independent cohort of 14 males and 10 females using the same Illumina DNA methylation arrays. Of the top 20 differentially methylated CpG sites between males and females in the first set, the same directions (male > female or male < female) were observed at all CpG sites, and significant sex differences were also observed at 16 sites in the replication set (p < 0.05) ( Table 1). Of the 292 differentially methylated CpG sites in the first set, significant sex differences were also observed at 98 sites in the replication set (p < 0.05).

Discussion
In this study, we conducted a genome-wide DNA methylation profiling of the peripheral leukocytes from non-psychiatric subjects using Infinium Human-Methylation450 BeadChips and identified sex-biased genes on autosomes by adjusting for the estimated cell-type proportions. This blood study is the first to reveal sex differences in DNA methylation by taking into account cellular heterogeneity of blood in the analysis.
We revealed that most of significant loci (81.2 %) showed higher DNA methylation in females than in males. This finding is consistent with the results of previous studies [4,[7][8][9]. However, the explanation for this phenomenon is unclear. Gene-ontology analysis of biological process revealed that genes with sex differences in DNA methylation on autosomes were related to secretion and secretion by cell. Of these 8 secretionrelated genes, 5 genes (FKBP1B, SCIN, SMPD3, STEAP2, and TRIM36) has been associated with prostate cancer and hyperplasia [26][27][28][29][30]. These results may suggest some hormone-related genes are sex-differentially regulated, perhaps via methylation.
To date, two genome-wide methylome studies have examined sex-biased DNA methylation using Illumina Infinium HumanMethylation450 BeadChips [4,9]. When we compared with the 614 sex-biased differential CpG sites on autosomes identified in a previous study using the human prefrontal cortex tissues [4], these CpG sites identified by Xu et al. were significantly enriched for those sites identified in the present study (common CpG site: 93 vs. 293, un-common CpG site: 521 vs. 344,942, odds ratio (OR) = 210; 95 % confidence intervals (CIs), 163-269; Fisher exact test p < 0.05). When we compared with the top 20 sexbiased differential CpG sites on autosomes in the study of Xu et al. [4], we observed common sexbiased DNA methylation at 17 CpG sites which covered 14 distinctive genes (ARID1B, C6orf108, GLUD1, H3F3A, KRT77, SCIN, TFDP1, WBP11P1, YARS2, and ZNF69) in our blood study. These results suggest that sex-biased DNA methylation on autosomes in the brain is also observed in peripheral blood in specific genes, although tissue-specific differences in DNA methylation have been reported [31,32]. ARID1B, which is a member of the SWI/SNF-A chromatin remodeling complex, has been implicated in intellectual disability and autism spectrum disorders [33,34]. GLUD1, which plays a role at glutamatergic synapses [35], has been implicated in schizophrenia [36]. H3F3A, which encodes the replication-independent histone 3 variant H3.3, has been implicated in glioblastoma [37,38].
When we compared with the 564 sex-biased differential genes on autosomes identified in a previous study using the human blood mononuclear cells from a high-aged cohort (over 95 years old) [9], we observed common sex-biased DNA methylation in only 15 genes (AGAP11, ANKRD11, C15orf29, HOXC4, HOXC5, HOXC6, MACROD1, NOTCH4, NSD1, OSTalpha, PEX10, PTPRN2, Fig. 3 Quantile-quantile (Q-Q) plot of DNA methylation between males and females. The x axis is the expected −log 10 P value, and the y axis is the observed −log 10 P value. This Q-Q plot shows a deviation of the observed from the expected, providing evidence of DNA methylation differences between males and females at numerous CpG sites  (2010) has demonstrated HoxC4mediated regulation of activation-induced cytosine deaminase expression, as enhanced by estrogen, and has suggested a possible role of this homeodomain transcription factor in mediating immunopotentiation in gestation and neonatal and adult life [39]. There are several limitations to the present study. First, our sample size was not large. Replication studies with larger samples will be needed. Second, the cellular proportions were created by a bioinformatics tool, so these were not based on direct observation of the relative numbers of cells in the sample. Furthermore, experimental noises may be increased due to the circular use of DNA methylation data, as these data are used first to define cell-type proportions, which are then used as covariates in the differential methylation analysis. Cell-type-specific studies will be needed. Third, we did not take other confounding factors, such as smoking or body mass index, into consideration in our analysis, which may affect DNA methylation status [40,41], because these information were not collected in the present study.