Functional examination of MLH1, MSH2, and MSH6 intronic mutations identified in Danish colorectal cancer patients

Background Germ-line mutations in the DNA mismatch repair genes MLH1, MSH2, and MSH6 predispose to the development of colorectal cancer (Lynch syndrome or hereditary nonpolyposis colorectal cancer). These mutations include disease-causing frame-shift, nonsense, and splicing mutations as well as large genomic rearrangements. However, a large number of mutations, including missense, silent, and intronic variants, are classified as variants of unknown clinical significance. Methods Intronic MLH1, MSH2, or MSH6 variants were investigated using in silico prediction tools and a mini-gene assay to assess the effect on splicing. Results We describe in silico and in vitro characterization of nine intronic MLH1, MSH2, or MSH6 mutations identified in Danish colorectal cancer patients, of which four mutations are novel. The analysis revealed aberrant splicing of five mutations (MLH1 c.588 + 5G > A, MLH1 c.677 + 3A > T, MLH1 c.1732-2A > T, MSH2 c.1276 + 1G > T, and MSH2 c.1662-2A > C), while four mutations had no effect on splicing compared to wild type (MLH1 c.117-34A > T, MLH1 c.1039-8T > A, MSH2 c.2459-18delT, and MSH6 c.3439-16C > T). Conclusions We classify five MLH1/MSH2 mutations as pathogenic, whereas four MLH1/MSH2/MSH6 mutations are classified as neutral. This study supports the notion that in silico prediction tools and mini-gene assays are important for the classification of intronic variants, and thereby crucial for the genetic counseling of patients and their family members.


Background
Lynch syndrome, also called hereditary nonpolyposis colorectal cancer (HNPCC), is an autosomal dominantly inherited cancer predisposition syndrome primarily associated with germ-line mutations in the MLH1 (MIM# 120436), MSH2 (MIM# 609309), and MSH6 (MIM# 600678) genes [1]. Mutation carriers have an increased risk of several specific cancers, in particular colorectal, endometrial, small bowel, and ovarian cancer as well as uroepithelial tumors. The estimated lifetime risk of developing colorectal cancer with a pathogenic mutation in one of these genes is up to 70% [2], depending on the mutated mismatch repair gene and the gender of the patient.
The MLH1, MSH2, and MSH6 proteins are involved in the repair of single base mismatches and short insertion-deletion loops that arise during DNA replication [3]. Mutations in MLH1, MSH2, and MSH6 are scattered throughout the genes (http://chromium.liacs.nl/LOVD2/colon_cancer/) and include frame-shift, nonsense, missense, and splice site mutations as well as large genomic rearrangements, of which several have been identified in Danish Lynch syndrome families [4][5][6][7]. However, a large number of MLH1, MSH2, and MSH6 missense, silent, and intronic mutations are of unknown clinical significance. It is clinically important to optimize the classification of these mutations into pathogenic mutations or benign polymorphisms, not only to provide affected families with a more accurate risk assessment but also to offer predictive (presymptomatic) genetic testing to family members. The classification can be facilitated by performing functional assays (reviewed in [8]). In this study, we performed in silico analysis and functional examinations of nine intronic MLH1, MSH2, and MSH6 variants identified in Danish colorectal cancer patients, enabling us to classify five mutations as pathogenic and four variants as neutral/polymorphisms.

Patients and clinical data
Following verbal and written consent, blood samples were collected from the probands (all adults) and genetic screening was performed. Since the study is part of normal diagnostic procedures, no ethical approval was obtained (H-4-2013-FSP-082). Clinical data regarding family phenotype, individual phenotype, cancer diagnosis, age at onset, adenomas, and age at adenomas (see Additional file 1) were obtained from the Danish HNPCC register. The study was conducted in accordance with the Helsinki Declaration.

MLH1, MSH2, and MSH6 screening
Genomic DNA was purified from whole blood using Qiagen's QIAamp DNA mini kit or Promega's Maxwell DNA purification system according to the accompanying instructions. MLH1, MSH2, and MSH6 were amplified using intronic primer pairs flanking each exon. PCR products were sequenced using an ABI3730 DNA analyzer (Applied Biosystems). Moreover, genomic DNA was examined by MLPA analysis using kit P003 and P072 (MRC-Holland). Sequence variations, except well-known polymorphisms, were verified in a new blood sample. MLH1, MSH2, and MSH6 variants are numbered according to GenBank accession numbers NM_000249, NM_000251, and NM_000179, respectively. The nomenclature guidelines of the Human Genome Variation Society (www.hgvs.org/mutnomen) were used in all cases.

In silico analysis
The following five splice site prediction programs were used to predict the effect of mutations on the efficiency of splicing: Splice Site Finder (http://www.interactive-biosoftware.com); GeneSplicer (http://www.cbcb.umd.edu/software/GeneSplicer); Splice Site Prediction by Neural Network (http://www.fruitfly.org/seq_tools/splice.html); MaxEntScan (http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html); and Human Splicing Finder (http://www.umd.be/HSF/). The analysis was performed by the integrated software Alamut V.2.2.1 (http://www.interactive-biosoftware.com). The genomic sequence spanning the individual mutations and nearby exons was submitted according to the guidelines of each program and default settings were used in all predictions. A variation of more than 10% in at least two algorithms was considered as having an effect on splicing [9].
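The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Alamut implementation: the function name, the dictionary layout, and all scores in the usage example are hypothetical, and the sketch assumes that "variation" means the relative change from the wild-type score and that programs without a score for either allele are simply skipped.

```python
def predicts_splicing_effect(wt_scores, mut_scores,
                             min_change=0.10, min_algorithms=2):
    """Return True if at least `min_algorithms` programs report a relative
    score change greater than `min_change` between wild type and mutant.

    Both arguments map a program name to its splice-site score.
    """
    flagged = 0
    for program, wt in wt_scores.items():
        mut = mut_scores.get(program)
        if mut is None or wt == 0:
            # Assumption: no mutant score, or an undefined relative change,
            # means this program does not count towards the threshold.
            continue
        if abs(mut - wt) / abs(wt) > min_change:
            flagged += 1
    return flagged >= min_algorithms


# Hypothetical scores for one variant (not taken from the study).
wt_scores = {"SSF": 80.0, "MaxEntScan": 9.0, "NNSPLICE": 0.95,
             "GeneSplicer": 5.0, "HSF": 85.0}
mut_scores = {"SSF": 70.0, "MaxEntScan": 5.0, "NNSPLICE": 0.93,
              "GeneSplicer": 4.8, "HSF": 84.0}

# Two programs (SSF and MaxEntScan) change by more than 10%.
print(predicts_splicing_effect(wt_scores, mut_scores))
```

Keeping the two thresholds as parameters makes it easy to explore how a stricter criterion would change which variants are flagged for functional follow-up.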

Mini-gene assay
Wild type exons along with at least 200 bp of 5′ and 3′ intronic sequences from MLH1, MSH2, and MSH6 were PCR amplified from human genomic DNA using Pwo DNA polymerase (Roche) and forward and reverse primers carrying restriction sites for BamHI or XhoI (primer sequences are available on request). PCR products were subcloned into the pSPL3 vector and all constructs were verified by sequencing. Single nucleotide substitutions or deletions were introduced using Finnzymes' Phusion site-directed mutagenesis kit or Stratagene's QuikChange II site-directed mutagenesis kit with PfuUltra high-fidelity DNA polymerase according to the accompanying instructions. Wild type and mutant constructs were transfected in duplicate into COS-7 cells as recently described [10]. Cells were harvested after 48 hours and total RNA was extracted using NucleoSpin RNA/protein kits for total RNA and protein isolation (Macherey-Nagel). cDNA was synthesized using 1 μg/μl of RNA, M-MuLV reverse transcriptase polymerase (New England Biolabs), and 0.5 μg/μl of nucleotide oligo(dT)15 primer. cDNA was amplified with Pwo DNA polymerase using the primers dUSD2 (5′-TCTGAGTCACCTGGACAACC-3′) and dUSA4 (5′-ATCTCAGTGGTATTTGTGAGC-3′). PCR products were separated by electrophoresis on a 1% agarose gel containing ethidium bromide. Each DNA band was gel purified using GE Healthcare's Illustra GFX PCR DNA and gel band purification kit and sequenced with dUSD2 and dUSA4 primers.

Results
Since 1995, our department has conducted screening of the entire coding regions and the exon-intron boundaries of MLH1 and MSH2. Furthermore, since 2004, screening of MSH6 and MLPA analysis of all three genes have also been performed. During this period, a relatively broad spectrum of disease-causing germ-line MLH1, MSH2, and MSH6 mutations has been identified [4][5][6][7]. However, mutational screening has also identified numerous variants of unknown clinical significance, including several intronic variants. Five of these intronic variants were identified in The potential pathogenicity of the variants was investigated using five different in silico splice site prediction programs which predict changes in splice site strength. The threshold employed was a variation between the wild type and the mutation score of more than 10% in at least two different algorithms [9]. According to this criterion, seven mutations, namely, MLH1 c.588 + 5G > A, MLH1 c.677 + 3A > T, MLH1 c.1039-8T > A, MLH1 c.1732-2A > T, MSH2 c.1276 + 1G > T, MSH2 c.1662-2A > C, and MSH2 c.2459-18delT, were predicted to alter splicing.

Discussion
Mutations located in the introns of mismatch repair genes can interfere with splicing and cause aberrantly spliced mRNA transcripts, leading to non-functional mismatch repair proteins. Several cis-acting elements, including the donor splice site, the acceptor splice site, the branch point, the polypyrimidine tract, and exonic/intronic splicing enhancers and silencers, are crucial for the splicing mechanism. The donor splice site consists of the conserved dinucleotides GT, whereas the acceptor splice site consists of three regions: the conserved dinucleotides AG, the polypyrimidine tract, and the branch point [11]. Mutations in splicing motifs can lead to partial or complete skipping of the neighboring exon or inclusion of intronic sequence. Moreover, a mutation can create an ectopic splice site or activate a cryptic splice site, both of which are usually weak and only used when a mutation disrupts the normal splice site.
Ideally, RNA from a patient should be examined by RT-PCR analysis to establish whether a mutation has an effect on splicing. However, in many cases, RNA is not available from the patient. Alternatively, the mutation can be examined by mini-gene analysis [12]. In fact, a high concordance between RT-PCR analysis and mini-gene assay has previously been observed [9][13][14][15]. As a preliminary screen before the mini-gene assay, several in silico prediction tools can be used to indicate which variants require further analysis. In this study, we examined the effect on splicing of nine intronic variants identified in Danish colorectal cancer families by in silico analysis and in vitro using a mini-gene assay. The in silico analysis predicted altered splicing for MLH1 c.588 + 5G > A, MLH1 c.677 + 3A > T, MLH1 c.1039-8T > A, MLH1 c.1732-2A > T, MSH2 c.1276 + 1G > T, MSH2 c.1662-2A > C, and MSH2 c.2459-18delT, whereas MLH1 c.117-34A > T and MSH6 c.3439-16C > T were predicted to have no effect on splicing. It should be noted that three mutations in our study (MLH1 c.1732-2A > T, MSH2 c.1276 + 1G > T, and MSH2 c.1662-2A > C) are located in the highly conserved donor and acceptor splice sites and hence they are easily predicted by in silico programs. However, mini-gene analysis revealed that the two mutations MLH1 c.1039-8T > A and MSH2 c.2459-18delT had no effect on splicing, suggesting that the employed criterion (>10% difference between wild type and mutant scores in at least two programs) results in false-positive predictions, as previously shown [9].
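The comparison between in silico prediction and mini-gene outcome for the nine variants can be tabulated with a short Python sketch. The calls below are taken from the text; treating the mini-gene assay as the reference standard is an assumption made for illustration, and with only nine variants the counts should not be read as estimates of sensitivity or specificity. The variable and function names are hypothetical.

```python
# (predicted, observed): predicted = in silico analysis predicts altered
# splicing; observed = mini-gene assay shows aberrant splicing.
variants = {
    "MLH1 c.117-34A>T":   (False, False),
    "MLH1 c.588+5G>A":    (True,  True),
    "MLH1 c.677+3A>T":    (True,  True),
    "MLH1 c.1039-8T>A":   (True,  False),
    "MLH1 c.1732-2A>T":   (True,  True),
    "MSH2 c.1276+1G>T":   (True,  True),
    "MSH2 c.1662-2A>C":   (True,  True),
    "MSH2 c.2459-18delT": (True,  False),
    "MSH6 c.3439-16C>T":  (False, False),
}


def confusion_counts(calls):
    """Count true/false positives and negatives, using the mini-gene
    result as the reference (an assumption of this sketch)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for predicted, observed in calls.values():
        if predicted and observed:
            counts["TP"] += 1
        elif predicted and not observed:
            counts["FP"] += 1
        elif not predicted and not observed:
            counts["TN"] += 1
        else:
            counts["FN"] += 1
    return counts


counts = confusion_counts(variants)
false_positives = sorted(
    name for name, (predicted, observed) in variants.items()
    if predicted and not observed
)

print(counts)            # {'TP': 5, 'FP': 2, 'TN': 2, 'FN': 0}
print(false_positives)   # ['MLH1 c.1039-8T>A', 'MSH2 c.2459-18delT']
```

The two false positives are the variants located outside the canonical splice-site dinucleotides for which the 10% criterion flagged a change that the mini-gene assay did not confirm.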
Mini-gene analysis revealed that the MLH1 c.117-34A > T and MLH1 c.1039-8T > A variants had no effect on splicing. The MLH1 c.117-34A > T variant has not been described before, whereas our results regarding MLH1 c.1039-8T > A confirm previous data from the analysis of patient RNA [16]. Moreover, in one Amsterdam-positive family (H13), the MLH1 c.1039-8T > A mutation was identified together with a disease-causing MLH1 mutation (c.1276C > T, p.Gln426X). In conclusion, we classify both variants as neutral (Table 2). In contrast, the MLH1 c.588 + 5G > A and MLH1 c.677 + 3A > T mutations were found to lead to exclusion of exon 7 and exon 8, respectively. Ultimately, this leads to premature stop codons and, therefore, both mutations are classified as disease-causing. These findings confirm previous results showing that the MLH1 c.588 + 5G > A mutation causes either partial skipping/deletion of exon 7 in patient RNA [17] or skipping of exon 7 as well as of both exons 7 and 8 in a mini-gene assay [15], and are supported by results for the MLH1 c.677 + 3A > C and MLH1 c.677 + 3A > G mutations, which show skipping of exon 8 [18,19]. Moreover, our analysis found that MLH1 c.1732-2A > T, which is a Danish founder mutation identified in two Amsterdam-positive families (see Additional file 1), results in an in-frame deletion of exon 16, which contains the PMS2 interaction domain. In the two families, the mutation co-segregates with the disease, with LOD scores of 1.2 and 2.7, respectively. In agreement with previous reports [5,7,20], we therefore classify MLH1 c.1732-2A > T as pathogenic. The MSH2 c.1276 + 1G > T mutation was found to result in the activation of a cryptic splice donor site 48 bp within exon 7, leading to an in-frame deletion of 16 amino acids in the MSH6/MSH3 interaction domain. As the mutation was found to co-segregate with the disease in the affected family with a LOD score of 1.5, it is regarded as pathogenic.
This mutation has previously been described in microsatellite instability-high colorectal cancers, with immunohistochemical analysis of these tumors revealing the absence of nuclear MSH2 expression [21]. Similar results and conclusions have been reported for the MSH2 c.1276 + 1G > A mutation [16]. The MSH2 c.1662-2A > C mutation has not previously been described. We found that this mutation leads to skipping of exon 11 and consequently introduces a premature stop codon. Therefore, this mutation is classified as pathogenic. Finally, the MSH2 c.2459-18delT and MSH6 c.3439-16C > T variants were found to have no effect on splicing and are, therefore, classified as neutral. The MSH2 c.2459-18delT variant has not been described before, whereas the MSH6 c.3439-16C > T variant has previously been shown not to co-segregate with the disease and to be observed in healthy control individuals [22][23][24] and in the exome sequencing project (ESP) database (0.43%), thereby supporting the notion that this variant is neutral.
Table 2 The effect on splicing determined by mini-gene assays and an overview of the mutations listed in the literature. *The authors describe the mutation of interest, but do not examine its putative pathogenic effect. IVS = intron; NI = not identified.
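The reading-frame arithmetic behind the MSH2 c.1276 + 1G > T result (a 48 bp loss from exon 7 corresponding to 16 amino acids) can be checked with a minimal Python sketch. The function name is hypothetical, and the sketch is deliberately simplified: it considers only whether the number of deleted coding bases is a multiple of three and ignores effects at the new junction, such as a changed codon or a newly created stop codon.

```python
def deletion_consequence(deleted_bp):
    """Classify a coding-sequence deletion by its effect on the reading frame.

    A deletion whose length is a multiple of three keeps the frame and
    removes whole codons; any other length shifts the frame.
    """
    codons, remainder = divmod(deleted_bp, 3)
    if remainder == 0:
        return f"in-frame deletion of {codons} amino acids"
    return "frame-shift"


# Cryptic donor site 48 bp within exon 7 (MSH2 c.1276 + 1G > T).
print(deletion_consequence(48))  # in-frame deletion of 16 amino acids
```

The same check applies to whole-exon skipping: an exon whose length is not a multiple of three shifts the frame when skipped, which is consistent with the premature stop codons reported here for the exon-skipping mutations.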
Overall, a pathogenic mutation was identified in all Amsterdam-positive families except one (H229). The index individual in family H229 had rectal cancer at age 58 and transverse colon cancer at age 66. His sister and two maternal cousins all had adenomas, while his mother had caecum cancer at age 48. Moreover, his maternal aunt had transverse colon cancer at age 69 and his maternal grandmother had ascending colon cancer. The lack of a pathogenic mutation in this family could be due to an unidentified mutation in regions not previously examined, including the promoter region, the untranslated regions (UTRs), or deep intronic sequences of the MLH1, MSH2, or MSH6 genes, or due to a mutation in other genes such as PMS2. Future studies using exome sequencing might help identify a putative pathogenic mutation in this family.

Conclusion
In conclusion, we have examined nine MLH1/MSH2/MSH6 intronic mutations by in silico and functional assays, thus enabling us to classify five mutations as pathogenic and four variants as neutral/polymorphisms. This study supports the notion that in silico prediction tools and mini-gene assays are important for the assessment of the pathogenicity of intronic variants, together with clinical data, IHC and MSI.