Rare gene deletions in genetic generalized and Rolandic epilepsies

Genetic Generalized Epilepsy (GGE) and benign epilepsy with centro-temporal spikes or Rolandic Epilepsy (RE) are common forms of genetic epilepsies. Rare copy number variants have been recognized as important risk factors in brain disorders. We performed a systematic survey of rare deletions affecting protein-coding genes derived from exome data of patients with common forms of genetic epilepsies. We analysed exomes from 390 European patients (196 GGE and 194 RE) and 572 population controls to identify low-frequency genic deletions. We found that 75 (32 GGE and 43 RE) patients out of 390, i.e. ~19%, carried rare genic deletions. In particular, large deletions (>400 kb) represent a higher burden in both GGE and RE syndromes as compared to controls. The detected low-frequency deletions (1) share genes with brain-expressed exons that are under negative selection, (2) overlap with known autism and epilepsy-associated candidate genes, (3) are enriched for CNV intolerant genes recorded by the Exome Aggregation Consortium (ExAC) and (4) coincide with likely disruptive de novo mutations from the NPdenovo database. Employing several knowledge databases, we discuss the most prominent epilepsy candidate genes and their protein-protein networks for GGE and RE.


Introduction
Epilepsies are among the most widespread neurological disorders with a lifetime incidence of 3% [1]. They represent a heterogeneous group of different disease entities that, with regard to aetiology, can be roughly divided into epilepsies with an exogenous/symptomatic cause and those with a genetic cause. Genetic generalized epilepsies (GGE; formerly idiopathic generalized epilepsies) are the most common genetic epilepsies, accounting for 30% of all epilepsies. They comprise syndromes such as juvenile myoclonic epilepsy, childhood absence epilepsy and juvenile absence epilepsy. In general, they tend to take a benign course and show a good response to pharmacotherapy. Among focal genetic epilepsies, benign epilepsy with centrotemporal spikes or Rolandic epilepsy (RE) is the most common form. RE has its onset in childhood or early adolescence and usually tapers off around the age of 15.
High-throughput genomic studies raised the number of epilepsy-associated candidate genes to hundreds; nowadays, frequently mutated ones are included in diagnostic gene panels (for recent reviews see [2,3]). Large consortia initiatives such as Epi4k [4] enrolled 1,500 families, in which two or more affected members displayed epilepsy, as well as 750 individuals, including 264 trios, with epileptic encephalopathies and infantile spasms, Lennox-Gastaut syndrome, polymicrogyria or periventricular heterotopias. In addition to the detection of known and unknown risk factors, the consortium found a significant overlap between the gene network of their epilepsy candidate genes and the gene networks for autism spectrum disorder (ASD) and intellectual disability. Intriguingly, epilepsy is the medical condition most highly associated with genetic autism syndromes [5].
Genomic disorders associated with copy number variations (CNVs) appear to be highly penetrant, occur on different haplotype backgrounds in multiple unrelated individuals and seem to be under strong negative selection [6][7][8]. A number of chromosomal locations suspected to contribute to epilepsy have been identified [9][10][11] [12,13].
A genome-wide screen for CNVs using array comparative genomic hybridization (aCGH) in patients with neurological abnormalities and epilepsy led to the identification of recurrent microdeletions on 6q22 and 1q22.31 [14]. A deletion on 15q13.3 belongs to the most frequent recurrent microdeletions in epilepsy patients; it is associated with intellectual disability, autism, schizophrenia, and epilepsy [15,16]. The recurrence of some CNVs seems to be triggered by the genome structure, namely by the chromosomal distribution of interspersed repetitive sequences (like Alu transposons) or recently duplicated genome segments (large blocks of sequences >10 kbp with >95% sequence identity that constitute five to six percent of the genome) that give rise to nonallelic homologous recombination [6,17].
CNV screening in large samples showed that 34% of heterozygous deletions affect genes associated with recessive diseases [18]. CNVs are thought to account for a major proportion of human genetic variation and have an important role in genetic susceptibility to common disease, in particular neuropsychiatric disorders [19]. Genome-wide surveys have demonstrated that rare CNVs altering genes in neuro-developmental pathways are implicated in epilepsy, autism spectrum disorder and schizophrenia [3,20].
Considering all types of CNVs across two analysed cohorts, the total burden was not significantly different between subjects with epilepsy and subjects without neurological disease [21]; however, when considering only genomic deletions affecting at least one gene, the burden was significantly higher in patients. Likewise, using Affymetrix SNP 6.0 array data, it has recently been shown that there is an increased burden of rare large deletions in GGE [13]. The drawback of the latter approach is that smaller CNVs cannot be detected. Systematic searches of CNVs in epilepsy cohorts using whole-exome sequencing (WES) data, which provides the advantage to identify smaller deletions along with the larger ones, are still missing.
In the present study, we provide the CNV results of the largest WES epilepsy cohort reported so far. We aimed at (1) identifying the genome-wide burden of large deletions (>400 kb), (2) studying the enrichment for deletions of brain-expressed exons, in particular those under negative selection, (3) detecting deletions that overlap with previously defined autism and epilepsy candidate genes, and (4) browsing knowledge databases to help understand the disease aetiology.

Patient cohorts
All patients or their representatives, if participants were under age 18, and included relatives, gave their informed consent to this study. All procedures were in accordance with the Helsinki declaration and approved by the local ethics committees/internal review boards of the participating centers. The leading institution was the Ethics Commission of the University and the University Clinic of Tübingen.
RE cohort: This cohort included 204 unrelated Rolandic patients of European ancestry who were recruited from centers in Austria (n = 107), Germany (n = 84), and Canada (n = 13).
Control cohort: We used 445 females and 283 males (728 in total) from the Rotterdam Study as population control subjects [22]. The same cohort was recently used for the screening of 18 GABA-A receptor genes in RE and related syndromes [23].

Workflow for CNV detection
Our primary analysis workflow included three major steps as shown in Fig 1. These are 1) data pre-processing, 2) SNV/INDEL analysis and 3) copy number variant analysis.
Data pre-processing: Sequencing adapters were removed from the FASTQ files with cutadapt [24] and sickle [25]. GATK best practices were followed for the next steps of data pre-processing and variant calling [26]. Alignment to the GRCh37 human reference genome was performed using BWA-MEM [27] with default parameters. Conversion of SAM to BAM files was done with SAMtools [28]. Sorting of BAM files, marking of duplicate reads due to PCR amplification and addition of read group information were done using Picard (https://github.com/broadinstitute/picard) tools with default parameters. Base quality score recalibration and local realignment for INDELs were performed using GATK version 3.2.
Coverage: Mean depth of coverage and target coverage of exons were calculated from the BAM files using the depth of coverage tool from GATK. The same files were also used as input for calling of CNVs.
Variant calling: The GATK haplotype caller (version 3.2) was chosen to perform multiple sample variant calling and genotyping with default parameters. To include splice site variants in the flanking regions of the exons, exonic intervals were extended by 100 bp each upstream and downstream. Multiple sample calling is advantageous in deciding whether a variant can be identified confidently as it provides the genotype for every sample. It allows filtering variants based on the rate of missing genotypes across all samples and also according to the individual genotype.
Sample QC: Samples were excluded from the analysis based on the following criteria: 1) mean depth <30x or <70% of exon targets covered at ≥20x; 2) >3 standard deviations from the mean in number of alternate alleles, number of heterozygotes, transition/transversion ratio, number of singletons and call rate as calculated with the PLINK/SEQ i-stats tool (https://atgu.mgh.harvard.edu/plinkseq/); 3) call rate <97%; 4) ethnically unmatched samples as identified by multi-dimensional scaling analysis with PLINK version 1.9 [29]; 5) PI_HAT score >0.25 as calculated by PLINK version 1.9 to exclude related individuals.
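The outlier criterion (more than 3 standard deviations from the cohort mean on a per-sample metric) can be illustrated with a minimal Python sketch. The function name and the input layout (a dictionary from sample ID to metric value) are assumptions made for illustration; the study itself obtained these statistics from the PLINK/SEQ i-stats tool.

```python
from statistics import mean, stdev


def flag_outliers(metric_by_sample, n_sd=3.0):
    """Return the IDs of samples whose metric lies more than n_sd
    sample standard deviations away from the cohort mean.

    metric_by_sample: dict mapping sample ID -> numeric metric
    (hypothetical layout, e.g. number of singletons per sample).
    """
    values = list(metric_by_sample.values())
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        # All samples identical on this metric: nothing to flag.
        return set()
    return {s for s, v in metric_by_sample.items() if abs(v - mu) > n_sd * sd}
```

In practice the same check would be repeated for each metric (alternate allele count, heterozygote count, transition/transversion ratio, singletons, call rate) and the flagged sets combined.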
Variant QC: Initial filtering of variants was performed based on quality metrics over all the samples with the following parameters for VQSR: Tranches chosen, VQSRTrancheSNV99. To further exclude low quality variants, we also applied filtering based on quality metrics for each genotype using read depth and quality of individual genotypes. Genotypes with a read depth of <10 and GQ of <20 were converted to missing by using BCFtools [28]. Multiallelic variants were decomposed using variant-tests [30] and left-normalized using BCFtools [28].
CNV detection: In the remaining high quality samples, CNVs were detected by using XHMM as described in [35]. In the current study, we focused only on deletions, as the false positive rate for duplications is too high to allow for meaningful interpretation. CNV calls were annotated using bedtools version 2.5 [36]. NCBI RefSeq (hg19, 20150322) was used to identify the genes that lie within the deletion boundaries.

Burden analysis of large and rare deletions
Excess deletion rate of the large deletions (length >400 kb) in subjects with epilepsy compared to the controls was measured as described in [13] using PLINK version 1.9 [29]. We set the overlap fraction to 0.7 (70%) and the internal allele frequency cut-off <0.5% and evaluated the significance empirically by 10,000 case-control label permutations.
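As an illustration of the empirical significance test, the following Python sketch permutes case-control labels and compares the mean number of qualifying deletions per individual. It is a simplified stand-in for the PLINK procedure, not a reimplementation of it: the input lists, the one-sided test statistic and the add-one correction are assumptions made here.

```python
import random


def permutation_burden_test(case_counts, control_counts, n_perm=10_000, seed=0):
    """One-sided empirical p-value for a higher mean deletion count
    per individual in cases than in controls.

    case_counts / control_counts: number of large rare deletions
    carried by each individual (hypothetical input format).
    """
    rng = random.Random(seed)
    pooled = list(case_counts) + list(control_counts)
    n_cases = len(case_counts)
    n_controls = len(control_counts)

    def rate_difference(values):
        # The first n_cases entries are treated as cases, the rest as controls.
        cases, controls = values[:n_cases], values[n_cases:]
        return sum(cases) / n_cases - sum(controls) / n_controls

    observed = rate_difference(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # shuffling the values is equivalent to shuffling labels
        if rate_difference(pooled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

With 10,000 permutations the smallest attainable value under this correction is 1/10,001, i.e. about 1e-04.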

Case-only CNVs
The CNVs that are unique to cases (not present in any of the in-house controls) and occur at a low frequency, i.e., present in at most 2 independent cases while having a frequency of at most 1% in the CNVmap, the DGV gold standard dataset [37] and 1000 genomes SV [38], were selected and subjected to further analysis as described below.

Validation of CNVs
We proceeded by visual inspection of depth variation across exons of the filtered deletions; we also performed qPCR validations of three small deletions, two of which, NCAPD2 and CAPN1, passed the filtering procedure (see Table 1). For RE patients, genomic DNA samples were analysed using the Illumina OmniExpress Beadchip (Illumina, San Diego, CA, USA) [13]. Twenty-three of 60 CNVs present in the RE patients were validated by available array data (S1 Table). Generally, small CNVs cannot be reliably identified with SNP arrays [39]. Indeed, of the 37 CNVs that were not identified in the beadchip data, 23 have a size of <10 kb, whereas only 2 of the 23 validated CNVs have a size of less than 10 kb according to the array data.

Compound heterozygous mutations and protein-protein interactions
We checked for concurrence of a deletion in one allele and a deleterious variant in the second allele. We included the first order interacting partners from the protein-protein interaction network (PPIN) in this analysis [40] and assessed if any gene or its first order interacting partner carries a deletion in one allele and a deleterious variant in the other. We excluded all genes that had no HGNC (HUGO Gene Nomenclature Committee) entry resulting in a network of 13,364 genes and 140,902 interactions. This network was then further filtered for interactions likely to occur in brain tissues using a curated data set of brain-expressed genes [41]. The final brain-specific PPIN consisted of 10,469 genes and 114,533 interactions.
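The search for a deletion in one gene combined with a deleterious variant in a first-order interacting partner can be sketched as follows. This is a minimal Python illustration under stated assumptions: the interaction list is a set of gene-symbol pairs, an interaction is kept only when both partners are in the brain-expressed set, and all function names and input layouts are hypothetical rather than taken from the study's pipeline.

```python
from collections import defaultdict


def build_brain_ppin(interactions, brain_genes):
    """Keep only interactions whose two partners are both brain-expressed.

    interactions: iterable of (gene_a, gene_b) symbol pairs.
    brain_genes: set of brain-expressed gene symbols.
    Returns an adjacency map gene -> set of first-order partners.
    """
    network = defaultdict(set)
    for a, b in interactions:
        if a != b and a in brain_genes and b in brain_genes:
            network[a].add(b)
            network[b].add(a)
    return network


def second_hits(network, deleted_by_sample, deleterious_by_sample):
    """For each sample, pair every deleted gene with those first-order
    partners that carry a potentially deleterious variant in that sample.

    deleted_by_sample / deleterious_by_sample: dict sample -> set of genes.
    """
    hits = {}
    for sample, deleted in deleted_by_sample.items():
        variants = deleterious_by_sample.get(sample, set())
        pairs = set()
        for gene in deleted:
            for partner in network.get(gene, set()) & variants:
                pairs.add((gene, partner))
        if pairs:
            hits[sample] = pairs
    return hits
```

Restricting both endpoints to brain-expressed genes is one possible reading of "interactions likely to occur in brain tissues"; a looser filter requiring only one brain-expressed partner would retain more edges.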

Gene-set enrichment analysis
Genes that were expressed in brain [42] and located within deletion boundaries were used as input for an enrichment analysis using the Ingenuity Pathway Analyser (IPA) [43]. We performed the enrichment analysis with all deleted genes from the RE and GGE samples together as well as for each phenotype separately.

CNV tolerance score analysis
The CNV tolerance score was used as defined in [45]. The CNV tolerance and deletion scores for the genes that are deleted in our study were obtained from the ExAC database [46] and their enrichment in GGE and RE cases was assessed by the Wilcoxon rank sum test.
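For readers unfamiliar with the Wilcoxon rank sum test, the sketch below shows how two groups of per-gene scores (for example, scores of GGE-deleted versus RE-deleted genes) could be compared. It is a plain-Python illustration using the normal approximation with tie correction and no continuity correction; these choices are assumptions, and a library routine such as SciPy's `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum (Mann-Whitney U) test.

    Uses the normal approximation with tie correction and no
    continuity correction. Returns (U statistic for x, p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values tied: no evidence of a shift
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return u_x, p_value
```

The normal approximation is adequate for gene lists of the size analysed here; for very small groups an exact test would be more appropriate.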

Overlap with different databases
The overlap between the different data sets was obtained by gene symbol matches between the detected gene deletions and the gene lists from different databases; more details are given in the discussion section. A workflow depicting the steps above is shown in Fig 1.
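Overlap by gene symbol amounts to a set intersection between the deleted genes and each reference list. The Python sketch below illustrates this; the function name, the case-insensitive normalisation and the input layout are assumptions made for illustration.

```python
def overlap_by_symbol(deleted_genes, reference_sets):
    """Intersect deleted gene symbols with each named reference gene list.

    deleted_genes: iterable of gene symbols found within deletions.
    reference_sets: dict mapping a database name -> iterable of symbols.
    Symbols are compared after trimming whitespace and upper-casing.
    Returns dict database name -> sorted list of shared symbols.
    """
    deleted = {g.strip().upper() for g in deleted_genes}
    return {
        name: sorted(deleted & {g.strip().upper() for g in genes})
        for name, genes in reference_sets.items()
    }
```

Matching on symbols alone can miss genes listed under an alias or an outdated symbol, so mapping both sides to current HGNC symbols beforehand is advisable.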

Results
After quality control, exomes of 390 epilepsy cases (196 GGE, 194 RE) and 572 controls were used for downstream analyses (Fig 1). The final RE and GGE datasets comprised 26,476 and 30,207 variants, respectively. Seventy-five out of 390 epilepsy patients (~19%) carried a total of 104 case-only deletions spanning 260 genes (see Table 1), which covered a wide size range between 915 bp and 3.11 Mbp. Forty-three out of 194 RE patients carried deletions compared to 32 out of 196 patients with GGE; we did not observe any significant difference in the total number of deletions between the two disease entities (p-value = 0.68). In the combined dataset, 35 out of 73 were large multigene deletions. Among them were several recurrent deletions (see Table 1), including those located on 15q13.3 and 16p11.2 that were previously reported to be associated with epilepsy and other brain disorders.

Comparative analysis of Rolandic and GGE candidate genes
Because our cohort is composed of GGE and RE patients, we sought to compare the functional differences between the two subtypes of epilepsies by studying the pathways and functions that are enriched in the respective deleted genes (see Table 2). Initially we performed GO term enrichment without applying any additional filter to the deletion calls and noticed that synaptic and receptor functions are more prominent in RE cases (data not shown). If the deletion calls were filtered for brain-specific gene expression, we observed that, separately and together, GGE and RE-deleted genes are enriched for the functional terms "nervous system development and function", "behavior" and "tissue morphology"; this functional convergence might have been expected when selecting for brain-expressed genes. When analysing GGE and RE datasets separately, the top PPIN enriched in GGE is associated with "carbohydrate metabolism", "small molecule biochemistry" and "cell signaling", whereas the top network enriched in RE is associated with "neurological disease", "organismal injury and abnormalities" and "psychological disorders" (see Table 3). The top enriched network including GGE and RE-deleted genes (Fig 2) is described below.

Deletion burden analysis
We performed 10,000 case-control label permutations to test whether there is an increased burden of large and rare deletions in cases as compared to the controls (Table 4). We noticed that (1) the deletion rate per individual with at least one deletion in cases compared to the controls showed statistical significance in both GGE and RE (p-value = 1e-04, p-value = 0.011) and (2) considering the cumulative length of all large and small deletions, no significant difference between cases and controls was observed in either GGE or RE (p-value = 0.16, p-value = 0.41), indicating that there is no difference in the length of CNVs in cases and controls.

Enrichment for known epilepsy and autism-associated genes
To check the overlap between the deletions detected in our study and genes known to be associated with epilepsy, we searched for overlap with the genes listed (n = 499) in the Epilepsy-Genes database [47]. This led to the following set of 8 genes: CHRFAM7A, CHRNA7, SCN1A, CNTNAP2, GABRB3, GRIN2A, IGSF8, ITPR1. The GRIN2A deletion was found in a patient published earlier [48], whom we used as one of the positive controls in our primary CNV detection pipeline [49]. One should note that genes such as CHRNA7 and GABRB3 are located within larger deletions containing other genes, so they might be questionable as bona fide epilepsy-associated genes. Using the core autism candidate genes (n = 455 genes) present in BrainSpan [50], we identified 13 deleted genes: APBA2, ATP10A, CDH22, CDH8, GABRA5, GABRG3, NDN, NDNL2, CNTNAP2, GABRB3, GRIN2A, SCN1A and SHANK1 (Table 5). This set is particularly enriched in the GO terms "neuron parts" and "transporter complexes". Note that GABRB3 and GABRG3 belong to multigenic large deletions (Table 1).

Deletions of brain-critical exons
Disorders such as autism, schizophrenia, mental retardation and epilepsy impact fecundity and put negative selection pressure on risk alleles. In a recent report [7], exome and transcriptome data from large human population samples were combined to define a class of brain-expressed exons that are under purifying selection. These exons, which are highly expressed in brain tissues and characterized by a low mutation burden in population controls, were called "brain-critical exons" (n = 3,955); the associated genes were accordingly called "brain-critical genes" (BCG, n = 1,863) [3].
Twenty-two deleted genes are in common with the BCG set (see Table 5). The SHANK1 deletion is found in a single RE case. It spans 17,339 bp (8 exons out of 9). There is only one report on the possible implication of the deletion of this gene in childhood epilepsy [51]. A deletion of ITPR1 is observed in another RE case; this deletion also affects SUMF1, but this gene was filtered out by the BCG overlap selection. The deletion of CNTN1 in a GGE patient additionally encompasses MUC19 and LRRK2; the latter is a known Parkinson candidate gene [52].

Exome Aggregation Consortium deletions
The ExAC data comprise 60,706 unrelated individuals sequenced as part of various disease-specific and population genetic studies. Deletions annotated in ExAC (release 0.3.1 of 23/08/16) were identified, similar to the present study, by read depth analysis using XHMM [45]. We sought to compare those CNV calls with the ones detected in the present work. Out of the 260 deleted genes detected in our study, 164 genes (67%) showed deletions in ExAC too (see S2 Table). Several genes highlighted in the previous paragraphs were ranked high using the CNV tolerance score defined by [45]. However, we did not identify a significant difference between GGE and RE-deleted genes, either in CNV tolerance scores (p-value = 0.53) or in CNV deletion scores (p-value = 0.22). This may indicate that GGE and RE deletions are equally likely to fall into the same category of ExAC deletion calls.

Compound heterozygous and first order protein-protein interaction mutations
Compound heterozygous mutations play a role in many disease aetiologies such as autism and Parkinson's disease [53][54][55]. We searched for possibly deleterious non-synonymous changes in the parental undeleted gene copy, but we did not detect any hemizygous variant that had a critical intolerance score (see Methods). Subsequently, we hypothesised that simultaneous mutations in proteins which interact directly (first-order protein interactors) may increase the associated deleterious effect. Within a curated brain-specific PPIN (see Methods, [40]), we inspected first order interacting proteins with potentially deleterious mutations or exon losses (see Table 6) and found a few interesting hits, including SPTAN1 that interacts directly with SHANK1; SPTAN1 encodes alpha-II spectrin and is known to be associated with epilepsy [56,57]. A remarkable and unique case of multiple hits was observed in a patient who accumulated four hits: the originally detected ITPR1 deletion and three potentially deleterious nonsynonymous SNVs in RYR2, HOMER2 and STARD13. RYR2 (ryanodine receptor 2) and ITPR1 (inositol-1,4,5-trisphosphate receptor 1) have been independently reported to be implicated in brain disorders. RYR2 de novo mutations have been identified in patients with intellectual disability [58] and activation of ITPR1 and RYR2 can lead to the release of Ca2+ from intracellular stores affecting propagating Ca2+ waves [59]. HOMER2, a brain-expressed gene, has been reported to be involved in signalling defects in neuropsychiatric disorders [60].
The STARD13 locus has been reported to be associated with aneurysm and sporadic brain arteriovenous malformations [61,62].

Table 5. Overlap with specific gene sets.
PSD genes | BCG genes | Autism brainSpan | EpilepsyDB | clinVar
NDUFS3 | APBA2 | APBA2 | CHRFAM7A | SACS
RIMBP2 | ATRNL1 | ATP10A | CHRNA7 | CNTNAP2
TJP1 | CDH22 | CDH22 | SCN1A | GABRB3
CNTN1 | CSMD1 | CDH8 | CNTNAP2 | GRIN2A
CNTNAP2 | ETV1 | GABRA5 | GABRB3 | ITPR1
GABRB3 | FAN1 | GABRG3 | GRIN2A | SCN1A
GRIN2A | GMFB | NDN | IGSF8 |
HSPA1L | IGSF8 | NDNL2 | ITPR1 |
IGSF8 | NPR2 | CNTNAP2 | |
PSD (postsynaptic density); BCG (Brain Critical Genes). Genes common to at least two of the compared sets are highlighted in grey.
https://doi.org/10.1371/journal.pone.0202022.t005

Over-representation of gene-disease associations
DisGeNET is a discovery platform integrating information on gene-disease associations from public data sources and literature [44].

Protein-protein interaction network analysis
We searched for network modules carrying a higher deletion burden with Ingenuity Pathway Analyser (IPA). Considering GGE and RE together and using brain-expressed genes as an input for IPA, we identified a total of 12 networks. The identified network scores ranged from two to 49 and the number of focus molecules in each network ranged from one to 24. Of all the 12 identified networks, the network shown in Fig 2 is the top-ranked network with a score of 49 and 24 focus molecules. It is associated with the terms "Nervous system development and function", "Neurological disease" and "Behavior" (see Table 3). The network reveals an interesting module where the genes CAPN1, GRIN2A, ITPR1, SCN1A and CHRNA7 are central. Interestingly, CAPN1 is well ranked (no deletion or duplication) in the ExAC CNV records (S2 Table) and it is not covered by the BCG, epilepsy and autism data sets used in this study.

Enrichment for likely disruptive de novo mutations
Many studies on neuropsychiatric disorders such as autism spectrum disorder, epileptic encephalopathy, intellectual disability and schizophrenia have utilized massive trio-based whole-exome sequencing (WES) and whole-genome sequencing (WGS). Epilepsy candidate genes with de novo mutations (DNMs) were searched in the NeuroPsychiatric De Novo Database, NPdenovo [63]. DNMs were found in GABRB3, SHANK1, ITPR1, GRIN2A, SCN1A, PCDHB4 and IQGAP2.

Discussion
We analysed a WES dataset of 390 epilepsy patients (196 GGE, 194 RE) for microdeletions. The deletion rate per individual with at least one deletion in cases compared to 572 controls showed statistical significance in both GGE and RE. Enrichment for known epilepsy and autism genes led to gene sets with synaptic and receptor functions which were mainly represented in Rolandic cases. The top PPIN enriched in GGE was associated with "carbohydrate metabolism", "small molecule biochemistry" and "cell signaling", whereas the top networks associated with RE are "neurological disease", "organismal injury and abnormalities" and "psychological disorders"; this is reminiscent of our previous attempt to classify metabolic and developmental epilepsies [3]. Among single-gene deletions, CDH22, CDH12 and CDH8 are of particular interest; CDH12 is a cadherin expressed specifically in the brain and its temporal pattern of expression seems to be consistent with a role during a critical period of neuronal development [64]. Moreover, a group of cadherins, CDH7, CDH12, CDH18 and PCDH12, are reported to be associated with bipolar disease and schizophrenia [65]. The smallest deletion (1,166 bp) that we could detect in this study concerns NCAPD2; this gene is annotated in the autismkb database [66]. It is an important component of the chromatin-condensing complex, which is highly conserved across metazoans. This gene was previously found to be associated with Parkinson's disease [39] and its paralog NCAPD3 is associated with developmental delay [67].
Deletions of brain-critical exons pointed to the ITPR1 deletion, which has been reported to be associated with spinocerebellar ataxia type 16 [68,69]. CNTN1 is another deletion of interest; the gene is highly expressed in fetal brain and encodes a neural membrane protein which functions as a cell adhesion molecule and may be involved in forming axonal connections/growth and in neuronal migration in the developing nervous system [70,71]. Moreover, its paralogs CNTN2 and CNTN4 are associated with epilepsy [72] and autism [73], respectively. Interestingly, in the ExAC data, the brain-expressed genes ITPR1 and CNTN1 show the third and fourth highest intolerance score ranks, respectively (S2 Table).
Protein-protein interaction network analysis pointed to CAPN1 as an interesting candidate gene; its deletion is a double gene loss (4,270 bp) spanning CAPN1 (exons 17 to 22 out of 22 exons) and SLC22A1 (exon 1 out of 10 exons). SLC22A1, a transporter of organic ions across cell membranes, is lowly expressed in the brain, whereas CAPN1 is highly expressed in the brain. Calpain1 (CAPN1) belongs to the calcium-dependent proteases, which play critical roles in both physiological and pathological conditions in the central nervous system. They are also recognized for their synaptic and extra-synaptic neurotoxicity and neuro-protection [74]. Several ion channels, including GRIN2A [75], are calpain substrates. Further, a missense mutation in CAPN1 is associated with spino-cerebellar ataxia in the Parson Russell terrier dog breed [76] and has recently been reported in humans with cerebellar ataxia and limb spasticity [77].
Additional candidate genes can be identified on the periphery of the IPA network (see Fig 2): 1) CNTN1 (commented on above), 2) SACS, for which a large deletion (>1 Mb) was found, and 3) the single gene deletion of KCNQ1 (~57 kb). For SACS, a SNV is reported to be associated with spastic ataxia [78] and epilepsy [79]. KCNQ1 and its paralog KCNQ3 are subunits forming an expressed neuronal voltage-gated potassium channel. Further, hypomorphic mutations in either KCNQ2, an established epilepsy-associated gene [80], or KCNQ3 are reported to be highly penetrant [81]. KCNQ1 is co-expressed in heart and brain; it is found in forebrain neuronal networks and brainstem nuclei, regions in which a defect in the ability of neurons to repolarize after an action potential can produce seizures and dysregulate autonomic control of the mouse heart [82], yet one should be cautious as no validation is available in humans.
Enrichment for likely disruptive de novo mutations in several genes suggests that deletions of these genes could cause a phenotype similar to those recorded in NPdenovo and consequently will be penetrant in the heterozygous state. This is indeed the case for ITPR1, for which recessive and dominant de novo mutations causing Gillespie syndrome [83], a rare variant form of aniridia characterized by non-progressive cerebellar ataxia, intellectual disability and iris hypoplasia, have been described. Two of the genes, which we have identified as ITPR1 interactors, RYR2 and SPTAN1, are also DNM genes in NPdenovo.
In summary, by filtering and comparison to genes that are (1) evolutionarily constrained in the brain, (2) implicated in autism and epilepsy, (3) spanned by ExAC deletions, or (4) affected by neuropsychiatric associated de novo mutations, we observed a significant enrichment of deletions in genes potentially involved in neuropsychiatric diseases, namely GRIN2A, GABRB3, SHANK1, ITPR1, CNTN1, SCN1A, PCDHB4, IQGAP2, SACS, KCNQ1 and CAPN1. Interaction network analysis identified a hub connecting many of the epilepsy candidate genes identified in this and previous studies. The extended search for likely deleterious mutations in the first order protein-protein interactions and NPdenovo database pointed to the potential importance of ITPR1 deletion alone or in combination with RYR2 and SPTAN1 deleterious mutations.
We are aware that the set of epilepsy exomes that we screened for CNVs in the present study, although the largest analyzed so far, is still small given the genetic complexity of the disease and its population frequency. However, this study appears to provide a contrasting view of the genetic bases of childhood and juvenile epilepsies, as the top protein-protein interaction networks show that GGE-deleted proteins are preferentially associated with metabolic pathways, whereas in RE cases the association is biased towards neurological processes. Scrutiny of additional patients' exomes/genomes and transcriptomes should provide an efficient way to understand the disease aetiology and the biological processes underlying it. The results presented here may contribute to the understanding of epilepsy genetics and provide a resource for future validations to improve diagnostics.

Supporting information
S1 Table. Deletions present in array data. (DOCX)
S2 Table. Deletions in common with ExAC CNVs. Data is sorted from low to high deletion score (del.score) and duplication (dup) frequencies. "+" indicates expression in the brain. Deletion score increases with increasing intolerance. (DOCX)