Functional Analysis of the c.3705+5G>C Mutation in the SCN1A Gene: Cryptic Splicing Site Activation and Partial Exon Skipping

Ben Mahmoud A1*, Mansour RB3, Driss F4, Gargouri SB1, Tabebi M1, Rhouma BB1, Tlili A5, Siala O1 and Fakhfakh F1,2
1Laboratory of Human Molecular Genetics, Faculty of Medicine of Sfax, University of Sfax, Tunisia
2Department of Life Sciences, Faculty of Science of Sfax, University of Sfax, Tunisia
3ISBS (High Institute of Biotechnology of Sfax), Sfax, Tunisia
4CBS (Center of Biotechnology of Sfax), Sfax, Tunisia
5Department of Applied Biology, College of Sciences, University of Sharjah, UAE

Abstract
Several substitutions in the voltage-gated sodium channel gene SCN1A have been reported to cause aberrant splicing. Accordingly, this study aimed to investigate the potential effects of a transversion at the fifth nucleotide of the donor splice site of intron 18 of the SCN1A gene, previously described as leading to the SIGEI phenotype. Functional analyses using PCR mutagenesis followed by ex vivo splicing assays revealed that the c.3705+5G>C mutation activates a cryptic site within exon 18, leading to partial exon skipping followed by a premature stop codon at position 1253 of the SCN1A protein. Bioinformatic tools showed homology between the cryptic and normal splicing consensus sequences, especially at positions -3, -2, -1, +1, +2 and +5, confirming the crucial role of these positions in exon definition and explaining the strength of the novel donor consensus. This analysis also revealed that regions close to the new splice site are enriched in high-scoring ESE elements, underlining the importance of ESE-mediated SR protein function for accurate recognition of the new splice site. Our results demonstrate that splicing analysis of mRNA may help to understand both the functional consequences of mutations affecting splicing consensus sequences and the correlation between genotype and phenotype.

Introduction
Voltage-gated sodium channels are essential for the generation and propagation of action potentials in electrically excitable tissues such as brain, skeletal muscle and heart. Mutations in the genes encoding the channels expressed in neurons, including Nav1.1, encoded by SCN1A, have been reported to cause epilepsies [1]. These channels are hetero-multimeric protein complexes consisting of one α subunit and one or two β subunits [2]. To date, more than 600 variants have been reported within the SCN1A gene [3]. These mutations may alter the function of the channel, leading to abnormal hyperexcitability of the neural network and resulting in epileptic seizures [4].
Indeed, it is now recognized that a wide range of genetic variants, located in introns as well as in exons, may affect splicing signals either directly, by disrupting constitutive splice sites [5][6][7], or indirectly, by creating cryptic splice sites.
Several substitutions in the human SCN1A gene have been reported to cause aberrant splicing in patients with several forms of epilepsy, such as the c.694+1G>A mutation in intron 5 in a patient with generalized epilepsy with febrile seizures plus (GEFS+) [8] and the c.2946+1G>T mutation in intron 15 in a patient with clonic generalized epilepsy (CGE) [9]. Splicing mutations were also described in intron 14 (c.2589+3A>T) [9], in intron 18 (c.3705+1G>T) [10] and in intron 24 (c.4581+2A>G) [11] in patients with severe myoclonic epilepsy of infancy (SMEI). In addition, the c.3705+5G>C mutation was described in a patient with severe idiopathic generalized epilepsy of infancy (SIGEI) syndrome [12]. However, the postulated pathogenicity of these splicing mutations has not been confirmed, for lack of a functional demonstration.
Accordingly, in the present study, site-directed mutagenesis, ex vivo assays and bioinformatic tools were used to investigate the potential effects of the c.3705+5G>C mutation in intron 18 of the SCN1A gene, previously described in association with SIGEI syndrome.

Mutagenesis and minigene construct
A mutagenesis assay was performed to create the c.3705+5G>C variation in intron 18 by a three-step PCR using complementary internal primers, the forward primer SCN1AM18F 5'CTCTGGTGACTGAGATTAAG3' and the reverse primer SCN1AM18R 5'CTTAATCTCAGTCACCAGAG3' (both carry the base change). The two PCR products were then mixed and amplified using the outside primers SCN1AE18F 5'TGATATGGTAACCGGAAAAGC3' and SCN1AE18R 5'TCACATGCTAGCGTGTTAATG3' to create BstEII and NheI restriction sites (GGTAACC and GCTAGC, respectively) at the two ends of the PCR product. The resulting 735 bp product was sequenced to ensure that no errors were introduced during PCR amplification and then digested with BstEII (Invitrogen) and NheI (Invitrogen). The wild-type and mutated fragments were inserted into the splicing cassette as described previously [13]. The splicing cassette p(13,17)/cytomegalovirus [CMV] is a minigene cassette that contains the two adjacent constitutive exons 13 and 17 of the human 4.1R gene, with their downstream and upstream flanking intron sequences. The size of the recombinant cassette was 1255 bp, a feature that was used to distinguish the recombinant from the native plasmids. 2 µg of each of the normal and mutated constructs were transfected into HeLa cells using Lipofectamine (Roche), and the cells were selected for 6-8 days in the same RPMI (Roswell Park Memorial Institute) medium containing 600-800 μg/ml G418 (Geneticin, Invitrogen).

RT-PCR assay
HeLa cells were first pelleted for 5 min at 1000 rpm and washed twice with PBS. Total RNA was extracted from 1 to 3 × 10⁷ cells with Trizol reagent (Invitrogen) according to the manufacturer's instructions. RNA was then suspended in 80 µL of RNA storage solution (Takara) and treated with RNase-free DNase (DNA-free™; Takara). RNA content was measured at 260 nm using a NanoDrop ND1000 spectrophotometer (NanoDrop Technologies). 2 µg of total RNA was incubated at 70°C for 10 min with 500 ng of oligo d(T), chilled on ice and then reverse transcribed for 50 min at 37°C using 1 µg of oligo d (

Computational analyses
To analyze the effects of the c.3705+5G>C mutation on splicing, we used the Human Splicing Finder (HSF) software, which includes new algorithms derived from the Universal Mutation Database (UMD) [14,15] and allows evaluation of the strength of 5'ss, 3'ss and branch points. The analysis of a given splice site was based on two parameters, namely the consensus value (CV) and its variation (∆CV). The relative strengths of the splice sites were given as CVs ranging between 0 and 100. Splice sites with CVs higher than 80 were denoted strong splice sites, those with CVs ranging from 70 to 80 were designated less strong sites, and those with CVs ranging from 65 to 70 were referred to as weak splice sites [16]. ESEfinder 3.0 was used to identify the effect of the c.3705+5G>C variation on the composition and scores of exonic splicing enhancers (ESEs), using the consensus motifs of four SR proteins (SF2/ASF, SC35, SRp40 and SRp55). A motif score is considered high when it is greater than the threshold value defined by ESEfinder; these thresholds were defined previously.

Results
Bioinformatic prediction of the effects of the c.3705+5G>C mutation on the splicing event
In silico analysis was performed to predict the putative effects of the c.3705+5G>C mutation on the splicing event and on the strength (% values) of the acceptor and donor splice sites of exon 18. Mutation numbering was taken from the start codon, with +1 corresponding to the A of the ATG in the reference sequence of SCN1A isoform 1, according to the recommendations of the Human Genome Variation Society and based on the cDNA reference sequence CCDS54413, which refers to the full-length isoform encoding a 2,009 amino acid protein (transcript ENST00000303395.8 in the Ensembl database). This substitution affects the position 5 nucleotides downstream of the 5' donor splice site of intron 18. The splice consensus sequence of Zhang predicts this nucleotide position to be a G 75% to 86% of the time (depending on GC content), with an allele frequency in the general population of >1% [17]. The strength of the splice site was indicated by the consensus value (CV) and its variation (∆CV); the consensus sequences were (C/A)AG/GT(A/G)AGT for the donor splice site and cag/G for the acceptor site. HSF predictions showed that the score calculated for the altered 5'ss sequence (CTG/GTGACT) was 82.08%, compared with 94.09% for the wild-type sequence (CTG/GTGAGT).

Ex vivo assay revealed the effect of the c.3705+5G>C mutation on the activation of a cryptic splice site
Considering the deleterious effect of the c.3705+5G>C mutation predicted by the bioinformatic tools, we performed an experimental assay to test this result using mutagenesis and a minigene cassette.
Site-directed mutagenesis was carried out to induce a homozygous c.3705+5G>C change within the SCN1A gene using mutated primers (Figure 1). The effect of this mutation on RNA splicing was analyzed using a minigene splicing system. Wild-type and mutated genomic fragments containing exon 18 as well as 299 bp and 280 bp of introns 17 and 18, respectively, were cloned into a splicing cassette. RT-PCR performed on total RNA extracted from HeLa cells transfected with the normal construct revealed a product of 440 bp, corresponding to the expected splicing product containing exon 18 of the SCN1A gene and the two adjacent constitutive exons 13 and 17 of the human 4.1R gene. In contrast, RT-PCR performed on total RNA from cells transfected with the mutated construct displayed a shorter product of 391 bp (Figure 2), indicating an abnormal splicing process. Direct sequencing revealed partial skipping of exon 18 caused by the alteration of the donor splice consensus and the activation of a cryptic splice site (CTG/GTTTG) within the same exon of the SCN1A gene (Figure 3).

Prediction of ESE composition in normal and mutated sequences
ESEfinder 3.0 analysis was performed to compare the ESE composition of the normal exon 18 and of the partially skipped exon 18.
Results showed that regions close to the new splice site are enriched in ESEs. In fact, we identified strong constitutive ESE elements corresponding to SRp40 and SC35, with scores of 3.46 and 2.95, respectively. In addition, two SF2/ASF motifs reside upstream of the cryptic site, with scores of 5.73 and 3.03, thus significantly higher
Figure: The neuronal voltage-gated sodium-channel α subunit SCN1A is a monomer consisting of four homologous domains (DI-DIV), represented in different colors. Each domain has six transmembrane segments (S1-S6). The c.3705+5G>C mutation is located in the highly conserved S1 transmembrane segment of domain III of the channel (S1DIII) and could lead to the production of a truncated mutant Nav1.1, with the introduction of 33 new amino acids followed by a premature stop codon in the protein sequence.

Discussion
The present work is a functional analysis of the effect on splicing of the c.3705+5G>C mutation at the donor splice site of intron 18 of the SCN1A gene. Following a PCR mutagenesis approach and ex vivo splicing assays, RT-PCR analyses were carried out on mRNA isolated from HeLa cells transfected with normal or mutated vectors. The results showed that c.3705+5G>C altered the donor splice consensus of exon 18 and led to the activation of a cryptic site within the same exon, introducing 33 new amino acids followed by a premature stop codon at position 1253 of the protein sequence. Translation of the aberrant SCN1A mRNA would generate a truncated protein containing only 1252 of the 2009 residues, that is, a loss of 757 amino acid residues comprising the third and fourth transmembrane domains of the SCN1A protein, a region crucial for protein function that plays an important role in channel gating mechanisms.
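The residue counts quoted above can be checked with simple arithmetic. The following Python sketch is only a bookkeeping aid: the function and variable names are illustrative, and the inferred position of the last native residue rests on the assumption that the 33 aberrant amino acids immediately precede the stop codon at position 1253.

```python
# Bookkeeping for the predicted truncated Nav1.1 protein (values from the text).
FULL_LENGTH = 2009      # residues in full-length Nav1.1
PTC_POSITION = 1253     # position of the premature stop codon
NOVEL_RESIDUES = 33     # aberrant residues translated before the stop


def truncation_summary(full_length, ptc_position, novel_residues):
    """Return residue counts for a protein truncated at ptc_position."""
    truncated_length = ptc_position - 1
    return {
        "truncated_length": truncated_length,
        "residues_lost": full_length - truncated_length,
        # Assumption: the novel residues directly precede the stop codon.
        "last_native_residue": truncated_length - novel_residues,
        "fraction_retained": truncated_length / full_length,
    }


summary = truncation_summary(FULL_LENGTH, PTC_POSITION, NOVEL_RESIDUES)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions, the truncated product would keep roughly 62% of the full-length sequence, of which the final 33 residues are not native.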
In fact, cryptic splice site activation and exon skipping are known to be the most common outcomes of mutations affecting splice sites and consensus sequences [18]. The occurrence of cryptic splice sites within (+/-) 100 bp on each side of the exon/intron boundary has been explained by two hypotheses [19]. According to the first, when a cryptic splice site lies at a distance from an exon, a cryptic exon can be recognized only if the complementary splice site (a 3'ss for a 5' cryptic ss, and vice versa) is present at a short distance, together with a branch point and a favorable splice regulator context. According to the second, which is consistent with our results, the cryptic site lies in the vicinity of the constitutive splice site and therefore competes with it for recognition by the cellular splicing machinery. The strength of this new site, its location relative to the branch point (for a 3'ss) and the sequence context (splice regulators) should all be taken into account. Here, we used bioinformatic tools to compare the ESE composition of the normal and the partially skipped exon 18; the results showed that regions close to the new splice site are enriched in ESEs and contain strong constitutive ESE elements with high scores, underlining their importance for recognition of the new splice site.
The c.3705+5G>C mutation leads to partial skipping of exon 18. As a consequence, the reading frame is shifted and a premature termination codon appears at position 1253 of the protein; the truncated protein, if translated, would contain only 1252 of the 2009 residues.
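The frameshift is consistent with the RT-PCR product sizes reported in the Results (440 bp for the normal construct and 391 bp for the mutated one). The short Python sketch below illustrates the reasoning; it assumes that the size difference is entirely due to the segment of exon 18 removed by the cryptic site, and the function name is illustrative.

```python
def deletion_effect(wild_type_bp, mutant_bp):
    """Return (deleted length, whether the deletion shifts the reading frame)."""
    deleted = wild_type_bp - mutant_bp
    if deleted < 0:
        raise ValueError("mutant product is longer than wild type")
    # A deletion keeps the frame only when its length is a multiple of 3.
    return deleted, deleted % 3 != 0


deleted_bp, shifts_frame = deletion_effect(440, 391)
print(f"deleted: {deleted_bp} bp, frameshift: {shifts_frame}")
```

A deletion of 49 nucleotides is not a multiple of three, so the downstream codons are read out of frame, in agreement with the premature termination codon described above.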
The truncated protein, if translated, would fold aberrantly and be targeted for degradation by a "protein quality control" mechanism requiring molecular chaperones and cellular proteases [20]. However, messenger RNAs containing premature termination codons (PTCs) are generally targeted for degradation through the nonsense-mediated mRNA decay (NMD) pathway [21]. Nonsense mRNAs whose PTC lies more than 50 nucleotides upstream of the last exon-exon junction are predicted to be NMD substrates [22]. The consequences for protein sequence and function, as well as for the triggering of the NMD pathway, have previously been demonstrated for exon-skipping events in several studies [23,24].
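The 50-nucleotide rule mentioned above can be written as a one-line test. The Python sketch below is a minimal illustration under stated assumptions: the transcript coordinates used in the example calls are hypothetical placeholders, not measured positions in the SCN1A mRNA, and the function name is illustrative.

```python
def predicted_nmd_substrate(ptc_nt, last_junction_nt, min_distance=50):
    """Apply the 50-nucleotide rule.

    ptc_nt: mRNA coordinate of the premature termination codon.
    last_junction_nt: mRNA coordinate of the last exon-exon junction.
    Returns True when the PTC lies more than min_distance nt upstream.
    """
    return (last_junction_nt - ptc_nt) > min_distance


# Hypothetical coordinates, for illustration only.
far_upstream = predicted_nmd_substrate(ptc_nt=3800, last_junction_nt=5000)
near_junction = predicted_nmd_substrate(ptc_nt=4980, last_junction_nt=5000)
print(far_upstream, near_junction)
```

A PTC arising in exon 18 of a multi-exon transcript would be expected to fall in the first category, although the rule is a prediction and does not replace measuring transcript levels.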
This study demonstrates the importance of consensus sequences in splicing reactions: even a single intronic nucleotide substitution can cause aberrant splicing. The G to C substitution at position +5 reduces the strength of the consensus donor site of SCN1A exon 18 and leads to a novel definition of exon 18 mediated by a cryptic splice site within the same exon.
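The reduction in donor site strength can be quantified from the HSF scores reported in the Results (94.09 for the wild-type site and 82.08 for the mutated site). The Python sketch below applies the strength classes described in the computational analyses and computes the variation as a relative change; treating ∆CV as a relative change, and the handling of the class boundaries, are assumptions made for illustration.

```python
def classify_cv(cv):
    """Classify a splice site by its consensus value (0-100)."""
    # Boundary handling (>= versus >) is an assumption of this sketch.
    if cv > 80:
        return "strong"
    if cv >= 70:
        return "less strong"
    if cv >= 65:
        return "weak"
    return "below the weak range"


def delta_cv_percent(wild_type_cv, mutant_cv):
    """Relative change of the consensus value, in percent."""
    return (mutant_cv - wild_type_cv) / wild_type_cv * 100


wt_cv, mut_cv = 94.09, 82.08
delta = delta_cv_percent(wt_cv, mut_cv)
print(f"wild type: {classify_cv(wt_cv)}, mutant: {classify_cv(mut_cv)}")
print(f"relative change: {delta:.2f}%")
```

On these figures, the mutated site loses about 12.8% of its score while remaining in the "strong" class, which suggests that the absolute class alone may not capture the effect of the substitution.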
The consensus motif for the 5'ss of mammalian genes is CAG/GTRAGT (positions P-3 P-2 P-1/P1 to P6), where the purine (R) is either an adenine (A) or a guanine (G). The critical role of position +5 has already been reported: within the 5'ss consensus, the fifth nucleotide of the intron is a G in 84% of 5'ss from about 400 vertebrate genes [25]. A G at position +5 is also present in 80% of the introns of the SCN1A gene.
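As an illustration, the wild-type and mutated donor sites of exon 18 can be compared position by position with the CAG/GTRAGT consensus. The Python sketch below lists the mismatching positions; the helper name and the position labels (-3 to +6, with no position 0) are conventions chosen for this example.

```python
# Bases allowed by each symbol of the consensus (R = purine).
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}
CONSENSUS = "CAGGTRAGT"                      # CAG/GTRAGT without the slash
POSITIONS = [-3, -2, -1, 1, 2, 3, 4, 5, 6]   # exon positions are negative


def consensus_mismatches(site):
    """Return the positions where a 5'ss differs from the consensus."""
    bases = site.replace("/", "").upper()
    if len(bases) != len(CONSENSUS):
        raise ValueError("expected 3 exonic and 6 intronic nucleotides")
    return [pos for pos, base, symbol in zip(POSITIONS, bases, CONSENSUS)
            if base not in ALLOWED[symbol]]


wild_type = consensus_mismatches("CTG/GTGAGT")
mutant = consensus_mismatches("CTG/GTGACT")
print("wild type mismatches:", wild_type)
print("mutant mismatches:", mutant)
```

The wild-type site departs from the consensus only at position -2, whereas the mutated site additionally loses the match at position +5.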
Collectively, the present study extends the potential role of a gain of DNA methylation in the regulation of pre-mRNA splicing. DNA methylation is emerging as an important factor both for exon selection by the splicing machinery and for the regulation of alternative splicing. In the equal exon-intron GC content group, the presence of CG dinucleotides at specific positions in the 3' splice site and the 5' splice site is correlated with a very high level of DNA methylation (almost 100%) compared with the methylation level of the surrounding regions [26]. Since DNA methylation occurs predominantly at the 5-position of the cytosine pyrimidine ring in CpG dinucleotides [27], the c.3705+5G>C mutation within the SCN1A gene might lead to a gain of DNA methylation at the mutated cytosine. Traditionally, DNA methylation is known to play an essential role in the regulation of gene expression, X chromosome inactivation and maintenance of genomic imprinting [28]. Interestingly, recent studies indicate that it is also involved in alternative splicing [29]. As DNA methyltransferases (DNMTs) and some histone modifiers play a major role in the establishment and maintenance of DNA methylation, inhibiting the catalytic function of these enzymes might help to rule out a potential effect of a gain of DNA methylation at the G>C mutation site on alternative splicing in the SCN1A gene [30,31].
Alternative splicing is the joining of different 5' and 3' splice sites, allowing individual genes to express multiple mRNAs that encode proteins with diverse and even antagonistic functions. Up to 59% of human genes generate multiple mRNAs by alternative splicing [32], and ∼80% of alternative splicing results in changes in the encoded protein [33], revealing what is likely to be the primary source of human proteomic diversity. Alternative splicing generates segments of mRNA variability that can insert or remove amino acids, shift the reading frame, or introduce a termination codon. Alternative splicing also affects gene expression by removing or inserting regulatory elements controlling translation, mRNA stability, or localization. The example presented here shows that splice site selection can be altered by even a single nucleotide change. Splice site selection depends on the formation of protein complexes on the pre-mRNA; the ex vivo splicing analysis revealed that the c.3705+5G>C mutation led to aberrant splicing, changing the ratio between two isoforms: a normal transcript and a transcript with partial exon skipping. The partial exon skipping causes a premature stop codon, disrupting the functional domains of the SCN1A protein, which may alter its biological role in channel gating mechanisms. In fact, it has long been assumed that cryptic splice sites are used only when a mutation disrupts use of the authentic splice site [34,35]. More specifically, mutations that affect the 5'ss consensus of β-globin IVS1 reduce or eliminate the use of the authentic 5'ss, while activating three cryptic sites in vivo, located at positions -38, -16 or +13 from the exon 1-intron 1 junction [36]. Activation of the -16 and +13 cryptic sites is readily detectable in splicing assays performed in vitro [35,37].
In the present study, the activation of a cryptic splice site changed the donor splice site consensus of exon 18: the novel donor consensus was CTG/GTTTGA, compared with CTG/GTGAGT in the normal sequence. Interestingly, we noticed extensive homology between the two consensus sequences, especially at positions -3, -2, -1, +1, +2 and +5, confirming the crucial role of these positions in exon definition and explaining the strength of the novel donor consensus. The aberrant effect of the c.3705+5G>C mutation on the SCN1A expression pattern may derive from the disruption of base pairing between the exon 18 donor splice consensus and the 5' end of U1 snRNA, a recognition step that is crucial for U6 snRNP recruitment [38].
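The positional homology described above can be reproduced by aligning the two nine-nucleotide donor sequences. The Python sketch below is a minimal illustration; the function name and the position labels (-3 to +6, with no position 0) are conventions chosen for this example.

```python
POSITIONS = [-3, -2, -1, 1, 2, 3, 4, 5, 6]   # exon positions are negative


def shared_positions(site_a, site_b):
    """Return the splice site positions at which two 5'ss carry the same base."""
    a = site_a.replace("/", "").upper()
    b = site_b.replace("/", "").upper()
    if len(a) != len(POSITIONS) or len(b) != len(POSITIONS):
        raise ValueError("expected 3 exonic and 6 intronic nucleotides")
    return [pos for pos, x, y in zip(POSITIONS, a, b) if x == y]


normal_site = "CTG/GTGAGT"    # authentic donor site of exon 18
cryptic_site = "CTG/GTTTGA"   # cryptic donor site activated by the mutation
identical = shared_positions(normal_site, cryptic_site)
print("identical positions:", identical)
```

The two sites share six of nine positions, namely -3, -2, -1, +1, +2 and +5, and differ at +3, +4 and +6.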
Overall, at least one system correctly predicted 100% of mutations affecting invariant positions as well as the -1, +3 and +5 positions of the 5'ss. Therefore, only this area was investigated for the presence of potential splice sites. Our case demonstrates that splice mutations may affect SCN1A mRNA metabolism in manifold ways and that molecular studies including splicing analysis may help to understand both the functional consequences of such mutations and the correlation between genotype and phenotype caused by mutations in the SCN1A gene [39][40][41].
In fact, several splice site mutations in the SCN1A gene have previously been reported for a variety of epilepsy phenotypes; they can result in exon skipping, a decrease in spliceosome-mRNA binding and/or activation of cryptic splice site(s), leading to SCN1A protein deficiency (Table 1). This might be reflected in the phenotypic heterogeneity observed in patients with these SCN1A splicing mutations, and it is remarkable that the majority of them, including the c.3705+5G>C mutation studied here, are associated with severe forms of epilepsy [39][40][41].