Genome-wide DNA methylation profiling identifies convergent molecular signatures associated with idiopathic and syndromic forms of autism in post-mortem human brain tissue

Autism spectrum disorder (ASD) encompasses a collection of complex neuropsychiatric disorders characterized by deficits in social functioning, communication and repetitive behavior. Building on recent studies supporting a role for developmentally moderated regulatory genomic variation in the molecular etiology of ASD, we quantified genome-wide patterns of DNA methylation in 233 post-mortem tissues samples isolated from three brain regions (prefrontal cortex, temporal cortex and cerebellum) dissected from 43 ASD patients and 38 non-psychiatric control donors. We identified widespread differences in DNA methylation associated with idiopathic ASD (iASD), with consistent signals in both cortical regions that were distinct to those observed in the cerebellum. Individuals carrying a duplication on chromosome 15q (dup15q), representing a genetically-defined subtype of ASD, were characterized by striking differences in DNA methylation across a discrete domain spanning an imprinted gene cluster within the duplicated region. In addition to the dramatic cis-effects on DNA methylation observed in dup15q carriers, we identified convergent methylomic signatures associated with both iASD and dup15q, reflecting the findings from previous studies of gene expression and H3K27ac. Cortical co-methylation network analysis identified a number of co-methylated modules significantly associated with ASD that are enriched for genomic regions annotated to genes involved in the immune system, synaptic signalling and neuronal regulation. Our study represents the first systematic analysis of DNA methylation associated with ASD across multiple brain regions, providing novel evidence for convergent molecular signatures associated with both idiopathic and syndromic autism.


Introduction
Autism spectrum disorder (ASD) encompasses a collection of complex neuropsychiatric disorders characterized by deficits in social interactions and understanding, repetitive behavior and interests, and impairments in language and communication development. ASD affects ~1% of the population and confers severe lifelong disability, contributing significantly to the global burden of disease 1, 2 .
Evidence from neuroimaging, neuropathology, genetic and epidemiological studies has led to the conceptualization of ASD as a neurodevelopmental disorder, with etiological origins before birth 3,4 . Quantitative genetic analyses have shown that ASD has a strong heritable component 5 with an emerging literature implicating rare single base-pair mutations, chromosomal rearrangements, de novo and inherited structural genomic variation, and common (polygenic) risk variants in its pathogenesis [6][7][8][9] . Despite the highly heterogeneous role of genetic variation in ASD, studies of transcriptional 10,11 and regulatory genomic variation 12 in post-mortem ASD brain provide evidence for a highly-convergent molecular pathology, with individuals affected genetically-defined subtypes of ASD sharing the core transcriptional signatures observed in idiopathic autism cases.
There is increasing evidence to support a role for non-sequence-based genomic variation in the etiology of neurodevelopmental phenotypes including ASD 13 . The epigenetic regulation of gene expression in the central nervous system is involved in modulating many core neurobiological and cognitive processes including neurogenesis 14,15 , neuronal plasticity 16 and memory formation 17,18 , and is known to be highly dynamic during human brain development 19 . The dysregulation of epigenetic mechanisms underlies the symptoms of Rett syndrome and Fragile X syndrome, two disorders with considerable phenotypic overlap with ASD [20][21][22] , and epigenetic variation has been recently associated with several neurodevelopmental phenotypes including ASD 12,[23][24][25][26][27][28][29][30][31] . Current epigenome-wide association studies (EWAS) of autism have focused primarily on DNA methylation, the best characterized and most stable epigenetic modification that acts to influence gene expression via physical disruption of transcription factor binding and through the attraction of methyl-binding proteins that initiate chromatin compaction and gene silencing 32 . Despite finding evidence for ASD-associated methylomic variation, however, these analyses have been constrained by the analysis of small sample numbers and limited to the assessment of peripheral tissues or a single brain region 12,[23][24][25][26][27][28][29] .
In this study, we present results from the most systematic analysis of DNA methylation in ASD brain yet undertaken, quantifying methylomic variation in patients with idiopathic ASD (iASD) in addition to patients with a duplication of chromosome 15q11-13 ("dup15q"), which represents the most frequent cytogenetic abnormality associated with ASD occurring in ~1% of cases 33,34 . From each donor we profiled matched post-mortem tissue from three brain regions -prefrontal cortex (PFC), temporal cortex (TC), and cerebellum (CB) -previously implicated in the pathophysiology of ASD. The frontal and temporal lobes, for example, play a role in social cognition, and animal models of ASD highlight cerebellar dysfunction [35][36][37] . We find DNA methylation differences in both groups of iASD and dup15q patients, with consistent patterns of variation seen across the two cortical regions, distinct to those identified in CB. In addition to identifying dramatic cis-effects of the dup15q duplication in all three brain regions, we identify a significant overlap with the core methylomic differences observed in idiopathic autism iASD cases, reflecting findings from studies of transcriptional variation 11 and histone modifications 12 .

Methodological overview
We quantified DNA methylation across the genome using the Illumina Infinium HumanMethylation450 BeadChip ("450K array") in 223 post-mortem tissue samples comprising prefrontal cortex (PFC), temporal cortex (TC) and cerebellum dissected from 43 donors with ASD (including 7 patients with dup15q syndrome and 36 patients with idiopathic ASD (iASD)) and 38 non-psychiatric control subjects. After implementing a stringent quality control (QC) pipeline (see Methods), we obtained high-quality DNA methylation data from 76 PFC samples (n = 36 iASD patients, n = 7 dup15q patients, n = 33 controls), 77 TC samples (n = 33 iASD patients, n = 6 dup15q patients, n = 38 controls) and 70 CB samples (n = 34 iASD patients, n = 7 dup15q patients, n = 29 controls) (Supplementary Table 1). Our primary analyses focused on identifying differentially methylated positions (DMPs) and differentially methylated regions (DMRs) associated with iASD and dup15q, controlling for cellular heterogeneity and other potential confounds, exploring the extent to which signals were shared across idiopathic and syndromic autism cases. Finally, we employed weighted gene co-methylation network analysis (WGCNA) to undertake a systemslevel view of the DNA methylation differences associated with both iASD and dup15q across the three brain regions. An overview of our experimental approach is given in

DNA methylation differences between iASD cases and controls are consistent across cortical regions
No global differences in DNA methylation -estimated by averaging across all Illumina 450K array probes (n=417,460) included in our analysis -were identified between iASD patients and control subjects in any of the three brain regions (PFC: iASD = 48.4%, controls = 48.5%; TC: iASD = 48.4%, controls = 48.4%; CB: iASD = 46.4%, controls = 46.4%). We observed a robust positive correlation between the estimated 'DNA methylation age' -calculated using an epigenetic clock based on DNA methylation values 38, 39 -and recorded chronological age for each of the brain regions (PFC: r = 0.98, TC: r = 0.97. CB: r = 0.94) (Supplementary Figure 2), with no evidence for differential 'epigenetic aging' in iASD patients (PFC: P = 0.10, TC: P = 0.24, CB: P = 0.80). These findings indicate that ASD is not associated with any systemic differences in DNA methylation across the probes included on the Illumina 450K array in the brain regions tested, reflecting findings in studies of other complex neuropsychiatric phenotypes including Alzheimer disease 40 and schizophrenia 41 .
We next used a linear model including covariates for sex, age, brain-bank and neuronal cell proportions derived from the DNA methylation data (except in the CB, as described in the Methods) to identify iASD-associated differentially methylated positions (DMPs) across the genome in each of the three brain regions. The top ranked iASD-associated DMPs in each brain region (PFC: cg08277486, which is located within CCDC144NL and hypermethylated in patients compared to controls (P = 1.11e-06); TC: cg08374799, which is located immediately upstream of ITGB7 and hypomethylated in patients compared to controls (P = 4.46e-07); CB: cg01012394, which is located immediately upstream of EYA3 and hypermethylated in patients compared to controls (P = 1.01e-06)) are shown in Figure 1a, with a list of all DMPs  Table 5). We identified 157 DMPs (P < 5e-05), with the top-ranked cross-cortex iASD-associated difference (cg14392966, P = 1.77E-08) being located in the promoter region of PUS3 (+6bp) and upstream of DDX25 (+1136bp) on chromosome 11q24.2. Using data from an RNA-seq analysis of an overlapping set of samples 11 , we explored the extent to which genes annotated to these DMPs were also differentially expressed in iASD cortex. Of the 111 DMPs annotated to a gene, 20 (18.0%) were annotated to a transcript found to be differentially expressed in iASD cortex (FDR < 0.1) (Supplementary Table 6), with an overall negative correlation (r = -0.435) between DNA methylation and gene expression effects sizes across these probe-gene pairs ( Figure 1d).

Dup15q, a genetically-defined subtype of ASD, is associated with striking differences in DNA methylation across an imprinted gene cluster within the duplicated region
The duplication of human chromosome 15q11-13 ("dup15q") is the most frequent cytogenetic abnormality associated with ASD, occurring in ~1% of cases 33,34 . In addition to the iASD cases profiled in this study, we quantified DNA methylation in PFC, TC, and CB tissue from seven individuals with dup15q; using an established method to identify copy number variation (CNV) from Illumina 450K data 43 , we confirmed dup15q status in each of the brain regions profiled in the seven carriers ( Supplementary Figures 6-8). As with the iASD patients, no global differences in DNA methylation were identified between dup15q carriers and controls in any of the three brain regions (PFC: dup15q carriers = 48.5%, controls = 48.5%; TC: dup15q carriers = 48.4%, controls = 48.4%; CB: dup15q carriers = 46.4%, controls = 46.4%).
Again, we observed a strong positive correlation between the estimated 'DNA methylation age' with recorded chronological age for each of the brain regions (Supplementary Figure 2). Although there was no evidence for differential 'epigenetic aging' in dup15q patients compared to controls in either cortical region (PFC: P = 0.06, TC: P = 0.45), there was a nominally-significant association in CB (P = 0.012) with dup15q carriers characterized by decelerated epigenetic aging compared to controls. Overall, these findings indicate that like iASD patients, dup15q carriers are not characterized by systemic differences in DNA methylation across the probes included on the Illumina 450K array in the brain regions tested. Using a linear model including covariates for sex, age, brain-bank and neuronal cell proportions derived from the DNA methylation data (except in the CB as described in   Tables   11-14) outside the vicinity of the duplication were also identified in each of the three brain regions, suggesting that structural variation on chromosome 15 may influence regulatory genomic variation at other chromosomal locations in trans. Using data from an RNA-seq analysis of an overlapping set of samples 11 , we explored the extent to which genes annotated to these DMPs were also differentially expressed in dup15q ASD cortex. Of the 699 DMPs annotated to a gene, 139 (19.9%) were annotated to a transcript found to be differentially expressed in dup15q cortex (FDR < 0.1) ( Supplementary Table 15), with an overall negative correlation (r = -0.313) between DNA methylation and gene expression effects sizes across these probegene pairs (Figure 2d). Of note, there were striking differences in the correlation between DNA methylation and gene expression for loci within the dup15q region (r = 0.162) compared to those elsewhere in the genome (r = -0.300).

Methylomic differences are shared between iASD and dup15q carriers
Building on a recent analysis of gene expression 11 that revealed a core pattern of cortical transcriptional dysregulation observed in both iASD and dup15q carriers, we next examined the extent to which disease associated DNA methylation differences are shared between these two distinct subgroups of autism patients. Effect sizes at iADS DMPs (P < 5 x10 -5 ) were significantly correlated between iASD and dup15q patients (PFC: r = 0.86, P = 7.03e-10; TC: r = 0.9, P = 9.21e-20; CB: not tested because of the small number of significant DMPs) (Figure 3a). Likewise, effect sizes at dup15q DMPs (P < 5e-05) were found to be highly correlated between dup15q and shows that dup15q carriers cluster together with iASD cases ( Supplementary   Figures 13 and 14), highlighting convergent methylomic signatures associated with both idiopathic and syndromic forms of autism.

Cortical co-methylation modules associated with ASD are enriched for immune, synaptic and neuronal processes
We next used weighted gene co-methylation network analysis (WGCNA) 45 to characterize systems-level differences in DNA methylation associated with ASD. We built co-methylation networks using all 'variable' DNA methylation sites (defined as those where the range of DNA methylation values for the middle 80% of individuals was greater than 5%; N = 251,311) using cross-cortex (PFC and TC) data from all donors (see Methods). WGCNA identified 61 co-methylation modules (Supplementary Table 16) and we used the 'module eigengene' (i.e. the first principal component) for each module to explore differences between controls (n = 29), individuals with iASD (n = 30), individuals with dup15q (n = 6), and a combined ASD group (n = 36). We identified several co-methylation modules robustly associated (FDR < 0.05) with at least one diagnostic category ( Table 1, Supplementary Figure 15). We tested whether the genes annotated to probes in each ASD-associated co-methylation module were enriched for specific gene ontology (GO) pathways using a method that groups related pathways to control for the hierarchical structure of the ontological annotations (see Table 1 and Supplementary Table 17), identifying a number of pathways relevant to the known etiology of ASD. For example, the most enriched pathways amongst genes annotated to probes in the "skyblue3" module, which was associated with both iASD (FDR = 0.0063) and the combined ASD group (FDR = 0.0033), are related to immune function (e.g. interleukin-1 beta production, P = 8.55E-20), consistent with findings from genetic 46 , transcriptomic 47 , and epidemiological data 48,49 . Amongst genes annotated to probes in the "darkorange" module, which was also associated with both iASD (FDR = 0.014) and the combined ASD group (FDR = 0.01), the topranked pathways were related to synaptic signaling and regulation, in particular phosphatidylinositol 3-kinase (PI3K) activity (P = 1.95E-08), which regulates synaptic formation and plasticity 50 . Also of note, pathways enriched amongst genes annotated to probes in the 'pink' module, which was associated with dup15q carriers (FDR = 0.00046) were related to pathways important in neurons (P = 3.41E-15) and postsynaptic density (P = 1.87E-10). A full list of all significant pathways for each of the co-methylation modules is given in Supplementary Table 17.

Discussion
In this study, we quantified DNA methylation in 223 post-mortem tissue samples isolated from the prefrontal cortex, temporal cortex and cerebellum dissected from 43 donors with ASD and 38 non-psychiatric control subjects. To our knowledge, this represents the most systematic analysis of DNA methylation in ASD using diseaserelevant tissue, and the first to compare variation identified in idiopathic and syndromic forms of ASD. We report ASD-associated DNA methylation differences at numerous CpG sites with more pronounced effects in both cortical regions compared to the cerebellum. This finding is consistent with previous gene expression studies illustrating that ASD-related molecular changes are substantially smaller in the cerebellum compared to the cortex 11 .
Although structural variation on chromosome 15 was found to be associated with striking cis-effects on DNA methylation, with a discrete differentially methylated domain spanning an imprinted gene cluster within the duplicated region, variation in DNA methylation associated with autism in the cerebral cortex was highly correlated between iASD and dup15q patients. These results suggest that there are convergent molecular signatures in the cortex associated with different forms of ASD, reinforcing the findings from a recent study of gene expression 11  This study has several strengths. Our epigenome-wide analysis of ASD is, to our knowledge, the largest post-mortem cohort so far and included tissue from three brain regions that have been previously implicated in the pathophysiology of ASD.
This contrasts with previous studies that have been undertaken on much smaller numbers of samples and focused on only one or two brain regions. The inclusion of both idiopathic ASD patients and dup15q carriers in our analyses enabled us to explore evidence for convergent molecular signatures associated with both idiopathic and syndromic forms of autism.
Despite this being the first study to quantify DNA methylation across three different brain regions from both idiopathic and syndromic ASD patients and controls, this study has a number of important limitations that should be considered when interpreting the results. First, DNA methylation was quantified using the Illumina 450K array; although this is a robust and highly reliable platform with content spanning regulatory regions associated with the majority of known annotated genes, it interrogates DNA methylation at a relatively small proportion of sites across the whole genome. Second, because epigenetic processes play an important role in defining cell-type-specific patterns of gene expression 51,52 , the use of bulk tissue from each brain region is a potential confounder in DNA methylation studies 53 .
Despite our efforts to control for the effect of cell type diversity in DNA methylation quantification in our analyses using in silico approaches, this approach is not suitable to estimate the neuronal proportion in the cerebellum and cannot inform us about disease relevant DNA methylation changes specific to individual brain cell types. Of note, our general findings are in line with those reported by Nardone and colleagues 26 which interrogated DNA methylation in cell-sorted cortical neurons (from 16 ASD cases and 15 controls). Third, there is increasing awareness of the importance of 5-hydroxymethyl cytosine (5-hmC) in the human brain 54 , although this modification cannot be distinguished from DNA methylation using standard bisulfitebased approaches. It is plausible that many of the ASD-associated differences identified in this study are confounded by modifications other than DNA methylation.
To date, no study has evaluated the role of genome-wide 5-hmC in ASD, although recent studies from our group quantified levels of 5-hmC across the genome in human cortex and cerebellum 55 and across neurodevelopment 56 .
To conclude, we identified widespread differences in DNA methylation associated with iASD, with consistent signals in both cortical regions that were distinct to those observed in the cerebellum. Individuals carrying a duplication on chromosome 15q (dup15q), representing a genetically-defined subtype of ASD, were characterized by striking differences in DNA methylation across a discrete domain spanning an imprinted gene cluster within the duplicated region. In addition to the striking ciseffects on DNA methylation observed in dup15q carriers, we identified convergent methylomic signatures associated with both iASD and dup15q, reflecting the findings from previous studies of gene expression and H3K27ac. Our study represents the first systematic analysis of DNA methylation associated with ASD across multiple brain regions, providing novel evidence for convergent molecular signatures associated with both idiopathic and syndromic autism and highlighting potential disease-associated pathways that warrant further investigation.

Post-mortem brain tissue from autism cases and controls
Tissue samples for this study were acquired from the Autism Tissue Program (ATP) brain bank at the Oxford UK Brain Bank for Autism (www.brainbankforautism.org.uk), (BA9) and denoted as 'prefrontal cortex' (PFC)), superior temporal gyrus (corresponding to BA41, BA42, or BA22 and denoted as 'temporal cortex' (TC)), and cerebellar vermis (CB) (Supplementary Figure 1). Further information about the samples is given in Supplementary Table 1. Genomic DNA was isolated from ~100 mg of each dissected brain region using a standard phenol-chloroform extraction method, and tested for degradation and purity before analysis.

DNA methylation profiling and data quality control
All samples were randomized with respect to phenotypic status, age, sex and brain-

Identification of autism-associated differential methylation
All statistical analyses were conducted using R statistical package (version 3.1.1).
Analyses were performed to test for differential methylated positions (DMPs) and differentially methylated regions (DMRs) associated with disease status for each brain region. The R package Cell EpigenoType Specific (CETS) mapper 63 designed for the quantification and normalisation of differing neuronal proportions in genomewide DNA methylation data sets was used to estimate brain cellular heterogeneity in cortex (both PFC and TC) samples. CETS based neuronal cell composition estimate was not applied on the cerebellum samples given the known high proportion of non-NeuN-expressing neurons in this brain region 41,63 . To model the effect of samplespecific variables, we performed linear regression for each probe using age, gender, brain bank, CETS (for PFC and TC, but not CB) and diagnosis as independent variables. We also applied an LME model framework for samples with data from multiple cortical regions to identify consistent differential ASD-associated DNA methylation markers across the cortex. The individual donor identified was treated as a random effect, and age, gender, brain bank, CETS, brain region and diagnosis were treated as fixed effects. ASD-associated differential methylated regions (DMRs) were identify using the Python module comb-p 44 to group spatially correlated DMPs (seed P-value<1.00E-03, minimum of 2 probes) at a maximum distance of 300 bp in each analytical groups. Significant DMRs were identified as those with at least two probes and a corrected P value < 0.05 using Sidak correction 64 .

Weighted gene correlation network analysis
For the co-methylation network analyses, modules of co-methylated probes were identified using the WGCNA package in R 45 and network was constructed for crosscortex samples. For each network, the input probe set was pruned to remove probes with minimal variability across samples. We required that each probe have a minimum range of methylation values of 5% within the middle 80% of samples, resulting in 251,311 probes for cortex. In the cross-cortex network, to enrich for modules relating to diagnosis, we regressed out the effect of CET score and age after fitting a linear model for each probe with diagnosis, age, sex, cortical region, CET score, and brain bank as independent variables. A signed network was constructed using the biweight midcorrelations between probes with a soft power of 7. This network was generated blockwise, using the WGCNA function blockwiseModules with the parameters: maxBlockSize=15000, mergeCutHeight=0.1, deepSplit=4, and minModuleSize=100. For each module, a linear mixed effects model was fit between its eigengene and diagnosis, age, sex, cortical region, batch, CET score, and brain bank as fixed effects with individual ID as a random effect. This LME model was fit for 3 subsets of samples: (iASD + dup15q) vs control, iASD vs control, and dup15q vs control.

Gene ontology pathway analysis
Illumina UCSC gene annotation, which is derived from the genomic overlap of probes with RefSeq genes or up to 1500 bp of the transcription start site of a gene, was used to create a test gene list from the probes identified in the disease-associated modules for pathway analysis. Where probes were not annotated to any gene (i.e. in the case of intergenic locations), they were omitted from this analysis; where probes were annotated to multiple genes, all were included. A logistic regression approach was used to test if genes in this list predicted pathway membership, while controlling for the number of probes that passed quality control (i.e., were tested) annotated to each gene. Pathways were downloaded from the GO website (http://geneontology.org/) and mapped to genes, including all parent ontology terms.
All genes with at least one 450 K probe annotated and mapped to at least one GO pathway were considered. Pathways were filtered to those containing between 10 and 2000 genes. After applying this method to all pathways, the list of significant pathways (P < 0.05) was refined by grouping to control for the effect of overlapping genes. This was achieved by taking the most significant pathway, and retesting all remaining significant pathways while controlling additionally for the best term. If the test genes no longer predicted the pathway, the term was said to be explained by the more significant pathway, and hence these pathways were grouped together. This algorithm was repeated, taking the next most significant term, until all pathways were considered as the most significant or found to be explained by a more significant term. Bank.

Conflicts of interest
None of the authors have any conflicts of interest to declare.     Shown is the clustering of samples based on DNA methylation levels (red = low, yellow = high) at iASD-associated DMPs (P < 5e-05) in (b) prefrontal cortex and (c) temporal cortex. In both cortical regions, the two primary clusters are clearly aligned to disease status (prefrontal cortex: cluster 1 = 77% iASD, cluster 2 = 20% iASD; temporal cortex: cluster 1 = 92% iASD, cluster 2 = 22% iASD)). (d) Effect sizes at iASD-associated DMPs are negatively correlated with gene expression level. Shown is the relationship between DNA methylation difference (X-axis) and gene expression difference (Y-axis) for genes annotated to cortical DMPs identified in a RNA-seq study performed on an overlapping set of samples.