Molecular analysis of inherited cardiomyopathy using next generation semiconductor sequencing technologies

Background: Cardiomyopathies are the most common clinically and genetically heterogeneous cardiac diseases, and genetic factors play a major role in patients with primary cardiomyopathies. The aim of this study was to investigate cases of inherited cardiomyopathy (IC) for potential disease-causing mutations in 64 genes reported to be associated with IC. Methods: A total of 110 independent cases or families diagnosed with various primary cardiomyopathies, including hypertrophic cardiomyopathy, dilated cardiomyopathy, restrictive cardiomyopathy, arrhythmogenic right ventricular cardiomyopathy, left ventricular non-compaction, and undefined cardiomyopathy, were collected after informed consent. A custom designed panel of 64 genes was screened using next generation sequencing on the Ion Torrent PGM platform. The best candidate disease-causing variants were verified by Sanger sequencing. Results: A total of 78 variants were identified in 73 patients. After excluding variants classified as benign or as variants of uncertain significance (VUS), 26 pathogenic or likely pathogenic variants were verified in 26 probands (23.6%), including a homozygous variant in the SLC25A4 gene. Of these variants, 15 have been reported in the Human Gene Mutation Database or the ClinVar database, while 11 are novel. The majority of variants were observed in the MYH7 (8/26) and MYBPC3 (6/26) genes. Titin (TTN) truncating mutations accounted for 13% of our dilated cardiomyopathy cases (3/23). Conclusions: This study provides an overview of the genetic aberrations in this cohort of Chinese IC patients and demonstrates the power of next generation sequencing in IC. Genetic results can provide a precise clinical diagnosis and guidance regarding medical care for some individuals. Electronic supplementary material: The online version of this article (10.1186/s12967-018-1605-5) contains supplementary material, which is available to authorized users.


Background
Cardiomyopathy is defined as the presence of a structural or functional impairment in the myocardium and is classified as either primary or secondary. Cardiomyopathy has been formally classified into five distinct forms: hypertrophic (HCM), dilated (DCM), restrictive (RCM), arrhythmogenic right ventricular cardiomyopathy (ARVC), and left ventricular non-compaction (LVNC) [1]. Genetic aberrations may contribute to disease in a significant percentage of primary cardiomyopathy patients. Approximately 60 genes have been reported to be disease-related in inherited cardiomyopathy (IC). As such, a fast, effective genetic screening approach for IC cases would be useful. With a precise molecular diagnosis, physicians can provide accurate treatment strategies and genetic counseling for patients and their family members.
Targeted sequencing of multiple genes of interest is a rapid, cost-effective alternative to whole exome sequencing (WES) or whole genome sequencing and is becoming more commonly used in clinical laboratories. Compared with high throughput sequencers, benchtop sequencers have the advantages of low cost, flexible sequencing options, and easy-to-interpret results, and they offer significant cost and time savings over the conventional Sanger sequencing method [2].
The aim of this study was to detect pathogenic variants in 64 candidate genes associated with IC in a cohort of patients with various types of primary cardiomyopathies, and to explore the potential clinical application of an Ion AmpliSeq™ custom designed panel and the Ion Personal Genome Machine (PGM) system in IC.

Inclusion and exclusion criteria
Patients diagnosed with an IC phenotype of HCM, DCM, RCM, ARVC/D, or LVNC, or with overlapping or undefined phenotypes, were included in this study. HCM, DCM, RCM, ARVC/D and LVNC were defined based on guidelines described by the American Heart Association [1]. Overlapping or undefined cardiomyopathies were defined as cardiac manifestations exhibiting at least two phenotypes (such as a significantly thickened interventricular septum together with left ventricular dilation) that cannot be ascribed to a single classical phenotype.

Sample collection and DNA extraction
A total of 110 unrelated patients diagnosed with IC, including HCM (n = 34), DCM (n = 22), RCM (n = 13), ARVC/D (n = 7), LVNC (n = 9), and overlapping or undefined cardiomyopathies (n = 25), were identified and enrolled at the Cardiology Clinic at the Peking Union Medical College from January 2012 to December 2015. Secondary cardiomyopathy was excluded in all patients. Of these patients, 18 had a family history of cardiomyopathy or sudden death, while 92 were sporadic cases. Clinical evaluation consisted of a medical history, family history, physical examination, 12-lead electrocardiogram (ECG), transthoracic and/or transesophageal echocardiography, and/or cardiac magnetic resonance imaging. Peripheral blood samples were collected from patients and their family members (if available). Genomic DNA was extracted from peripheral blood leukocytes using a QIAamp DNA Blood Midi Kit (Qiagen, Hilden, Germany), according to the manufacturer's instructions. The study was approved by the Peking Union Medical College Hospital Institutional Review Board, and all individuals provided written informed consent.

Panel design and library preparation
According to Online Mendelian Inheritance in Man (OMIM, http://omim.org) and PubMed literature retrieval, 64 candidate genes have been reported to be causes of inherited cardiomyopathy, and were selected for panel design (Additional file 1: Table S1). Primers of overlapping amplicons covering the coding sequence (CDS) region, untranslated regions (UTR), and flanking sequences (padding +25 base pairs) of each targeted gene were automatically generated by Ion AmpliSeq designer software. This produced 2231 amplicons, which were divided into 2 primer pools (Life Technologies; Thermo Fisher Scientific). Although Titin (TTN) is a major causative gene for DCM [3,4], it was not included in the 64-gene panel. As such, we used a separate 6-gene panel (including DMD, TTN, OBSCN, FBN1, TGFBR2 and TGFBR1) for 23 patients with DCM, including one patient with overlapping DCM phenotype. Amplicon libraries were prepared using the Ion AmpliSeq Library Kit v2.0 and custom designed primer pools, according to the manufacturer's instructions. DNA fragments from different samples were ligated with barcoded sequencing adaptors using the Ion Xpress Barcode Adapter 1-16 Kit. The library was quantified with a Qubit 2.0 fluorometer (Invitrogen; Thermo Fisher Scientific).

Next generation sequencing and data analysis
Fifteen barcoded samples were pooled in equimolar amounts. Amplified libraries were subjected to emulsion polymerase chain reaction performed on the Ion OneTouch system using the Ion PGM Hi-Q OT2 Kit. Next, ion sphere particles (ISPs) were recovered and enriched using the Ion OneTouch ES system. Enriched template-positive ISPs were sequenced on an Ion 318 V2 chip using the Ion PGM HI-Q SEQ Kit on the Ion Torrent PGM. Data from PGM runs were processed using Ion Torrent Suite 4.4 software to generate sequence reads. After sequence alignment and variant calling, synonymous variants, intronic variants far away from the exon/intron boundaries, and variants with a minor allele frequency (MAF) ≥ 1% in the 1000 Genomes Project, the dbSNP database, and the Exome Aggregation Consortium (ExAC) database were removed from further analysis. Variants were subsequently selected according to the prevalence of each type of cardiomyopathy in the general population (for example, MAF < 0.4%, < 0.3%, < 0.2%, and < 0.05% for DCM, LVNC, HCM, and ARVC, respectively) [5,6]. NGS reads were visualized using the Integrative Genomics Viewer (IGV). Sequence variants were confirmed by bidirectional Sanger sequencing. Annotation of the variants was performed using ANNOVAR (http://wannovar.wglab.org/) and SeattleSeq (http://snp.gs.washington.edu/SeattleSeqAnnotation137/). PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), SIFT (http://sift.jcvi.org/) and MutationTaster (http://www.mutationtaster.org/) were used to predict the functional effect of amino acid substitutions. Berkeley Drosophila Genome Project (BDGP) (http://www.fruitfly.org/seq_tools/splice.html) and Human Splicing Finder (HSF) (http://www.umd.be/HSF/) were used to predict the effect of intronic variants on splicing efficiency.
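The filtering steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Variant` record and its field names are hypothetical, the 25 bp window standing in for "far away from the exon/intron boundaries" is an assumed value borrowed from the panel padding, a variant is treated as common if any one of the three databases reports MAF ≥ 1%, and phenotypes without a stated threshold (for example RCM) fall back to the general 1% cut-off.

```python
from dataclasses import dataclass
from typing import Optional

GENERAL_MAF_CUTOFF = 0.01  # MAF >= 1% is removed (from the text)

# Phenotype-specific upper bounds quoted in the text (as fractions).
PHENOTYPE_MAF_CUTOFF = {"DCM": 0.004, "LVNC": 0.003, "HCM": 0.002, "ARVC": 0.0005}

# Assumption: the text does not say how far is "far away" from a boundary.
SPLICE_WINDOW_BP = 25


@dataclass
class Variant:
    """Hypothetical minimal variant record; field names are illustrative."""
    gene: str
    consequence: str                   # e.g. "missense", "synonymous", "intronic"
    intron_offset: int = 0             # distance to nearest exon boundary, in bp
    maf_1000g: Optional[float] = None  # None means absent from that database
    maf_dbsnp: Optional[float] = None
    maf_exac: Optional[float] = None

    def max_maf(self) -> float:
        """Highest reported frequency; 0.0 if absent from all databases."""
        known = [m for m in (self.maf_1000g, self.maf_dbsnp, self.maf_exac)
                 if m is not None]
        return max(known, default=0.0)


def keep_variant(variant: Variant, phenotype: str) -> bool:
    """Return True if the variant survives all filtering steps."""
    if variant.consequence == "synonymous":
        return False
    if variant.consequence == "intronic" and abs(variant.intron_offset) > SPLICE_WINDOW_BP:
        return False
    if variant.max_maf() >= GENERAL_MAF_CUTOFF:
        return False
    cutoff = PHENOTYPE_MAF_CUTOFF.get(phenotype, GENERAL_MAF_CUTOFF)
    return variant.max_maf() < cutoff
```

With these assumptions, a missense variant with an ExAC frequency of 0.3% would be retained for a DCM patient (threshold 0.4%) but discarded for an HCM patient (threshold 0.2%), which is the prevalence-dependent behaviour the text describes.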

Variant classification
We placed verified variants into the following categories according to guidelines from the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) [7]: pathogenic (P), likely pathogenic (LP), variant of uncertain significance (VUS), likely benign (LB) and benign (B).

Characteristics of the Ion AmpliSeq™ custom designed panel and depth of coverage
In total, 1352 target fragments with a combined size of about 409.7 kb were amplified simultaneously using 2231 amplicons designed from the exons and 25 bp exon-intron boundaries of 64 cardiomyopathy-associated genes. Amplicon sizes ranged from 63 to 189 bp, with a mean amplicon length of 145 bp. The custom designed panel covered 93.73% of the bases in the target regions. For each sample, coverage of the targeted region was approximately 96.5%, with an average depth of 100×. Target base coverage at 20× was above 94%, with a mean read length of 130 bp. After variant calling by Ion Torrent Suite 4.4, more than 200 variants were detected in each sample (data available upon request). After variants were filtered by allele frequency and mutation type, only zero to two variants required verification by Sanger sequencing in each sample (Additional file 2: Figure S1). Annotation information for all variants verified in this study is listed in Additional file 3: Table S2.
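For readers less familiar with the coverage metrics quoted above, the following sketch shows how mean depth, breadth of coverage, and the fraction of target bases at 20× or more can be derived from per-base read depths. It is an illustration only: the function name and input format are assumptions, and the study's figures were produced by Ion Torrent Suite, not by this code.

```python
from typing import Dict, Sequence


def coverage_summary(depths: Sequence[int], threshold: int = 20) -> Dict[str, float]:
    """Summarise per-base read depths over the target region.

    depths    -- read depth at each targeted base (hypothetical input format)
    threshold -- minimum depth for a base to count as adequately covered
    """
    if not depths:
        raise ValueError("depths must contain at least one base")
    n = len(depths)
    return {
        "mean_depth": sum(depths) / n,
        # Fraction of target bases covered by at least one read.
        "breadth": sum(1 for d in depths if d > 0) / n,
        # Fraction of target bases covered at or above the threshold.
        "at_threshold": sum(1 for d in depths if d >= threshold) / n,
    }
```

For a toy region with depths `[0, 10, 20, 30, 40]`, the mean depth is 20, four of five bases are covered by at least one read, and three of five reach 20×.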

Mutation detection rate
We analyzed 110 patients with different primary cardiomyopathies and screened for mutations in 64 cardiomyopathy-associated genes using the NGS method. In total, 78 distinct variants from 30 genes were identified in 73 patients after filtering NGS data and verification using Sanger sequencing. Five patients had two variants in different genes (Table 2). Clinical characteristics of these patients are shown in Additional file 4: Table S3. A pathogenic or likely pathogenic variant was identified in 26 out of 110 independent cases (23.6%) (Table 1). In 45 additional patients (40.9%), VUS were found, the pathogenicity of which needs further study (Additional file 5: Table S4). Two variants in two patients, previously reported as damaging mutations, were reclassified as B and LB. A higher detection rate of 50% (9/18) was observed in patients with a family history of cardiac disease. Among the 26 pathogenic or likely pathogenic variants, 15 were recorded in HGMD or the ClinVar database and 11 were novel. These variants consisted of 15 missense, 6 nonsense, 2 frameshift, 2 splicing, and 1 stop-loss mutation.
The prevalence of disease alleles among the cardiomyopathy genes was not equally distributed. The 26 pathogenic or likely pathogenic mutations involved 10 genes, and MYH7 (8/26) and MYBPC3 (6/26) mutations accounted for more than 50% of the variants found in cardiomyopathy cases (Fig. 1). Most of the pathogenic or likely pathogenic variants were heterozygous, with the exception of one homozygous pathogenic variant in the SLC25A4 gene. All of the mutations found in our study are private, each identified in a single proband. These data further confirm the genetic heterogeneity of cardiomyopathy.
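The percentages in this section follow directly from the reported counts. As a quick arithmetic check, the sketch below recomputes them; the counts are taken from the text, while the helper and variable names are purely illustrative.

```python
COHORT_SIZE = 110  # unrelated probands analysed


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Counts as reported in the text.
rates = {
    "P/LP detection rate": percent(26, COHORT_SIZE),
    "VUS only": percent(45, COHORT_SIZE),
    "P/LP with family history": percent(9, 18),
    "MYH7 + MYBPC3 share of P/LP": percent(8 + 6, 26),
}

# Mutation types among the pathogenic or likely pathogenic variants.
MUTATION_TYPES = {"missense": 15, "nonsense": 6, "frameshift": 2,
                  "splicing": 2, "stop-loss": 1}

for label, value in rates.items():
    print(f"{label}: {value}%")
print("P/LP variants by type sum to", sum(MUTATION_TYPES.values()))
```

The mutation types sum to 26, matching the number of pathogenic or likely pathogenic variants, and MYH7 plus MYBPC3 account for 14 of 26 variants, consistent with the "more than 50%" statement.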

TTN truncating variants in DCM patients
To limit sequencing cost, we placed TTN, the largest causative gene of DCM, in a small separate panel and sequenced it only in DCM patients. Of the patients recruited in our study, 23 had DCM, including overlapping phenotypes. Because of the extremely large size of TTN, missense variants are commonly observed in genetic screening, but no obvious frequency differences between DCM patients and healthy controls have been noted [4]. Thus, we focused on the truncating variants of TTN found in DCM patients. TTN truncations (two frameshifts and one nonsense variant, Table 1) were found in three patients, which accounted for 13% of the DCM patients in this study.

Patients with two candidate variants
Five patients had two rare variants (Table 2). In patient 50 (MYBPC3: c.527C>T, p.Ala176Val; MYPN: c.411G>C, p.Arg137Ser), both variants were classified as VUS according to ACMG guidelines. In patient 56 (BAG3: c.772C>T, p.Arg258Trp; VCL: c.133G>T, p.Ala45Ser), the BAG3 variant is regarded as a damaging mutation in HGMD, but was reclassified as "likely benign" in this study based on the following criteria: BS1 (allele frequency is greater than expected for disorder) and BP6 (reputable source recently reported variant as benign, but the evidence is not available to the laboratory to perform an independent evaluation). Patient   (EF 19%). Because three family members, including the patient's father, had suffered sudden death, blood samples from these individuals were not available. Clinical manifestations in patient 62 were consistent with the features of PRKAG2-related cardiomyopathy. The missense mutation p.His530Arg in PRKAG2 was reported as a pathogenic variant in HGMD and the ClinVar database. Although this variant likely contributed to the phenotype in patient 62, whether the LAMA4 variant produced a "double dose" gene mutation effect will require further study. Genetic screening of first-degree relatives of the proband is important in clinical diagnosis and decision-making. In patient 87 (RBM20: c.3545G>A, p.Arg1182His; SCN5A: c.2962C>T, p.Arg988Trp), both variants were reported as damaging in HGMD. The 23-year-old patient presented with generalized cardiac enlargement with heart failure, but no signs of electrical disorders. Thus, both variants were reclassified as VUS in this patient. In patient 92 (VCL: c.2630C>T, p.Pro877Leu; PSEN2: c.998A>G, p.Glu333Gly), the VCL variant was classified as likely benign because it met ACMG criterion BS2 (observed in a healthy adult individual for a recessive (homozygous), dominant (heterozygous), or X-linked (hemizygous) disorder, with full penetrance expected at an early age). The PSEN2 variant was classified as a VUS.

Discussion
More than 60 genes have been described to cause inherited cardiomyopathy. Genetic screening of these genes individually using traditional Sanger sequencing is  High-throughput sequencing technology enables researchers to find an abundance of variants in individual cases, although determining the pathogenicity of each variant identified by NGS remains a challenge. When selecting appropriate criteria for filtering (e.g. excluding common variants in databases such as dbSNP and 1000 Genomes), the prevalence of the disease must be considered. On the other hand, variants listed in mutation databases that were previously regarded as disease-causing mutations may later be proven to be benign or VUS. For example, 38 variants found in our patients were also recorded in HGMD and/or the ClinVar database. According to ACMG and AMP guidelines, only 15 of these variants were classified as pathogenic or likely pathogenic; 3 were reclassified as benign or likely benign, and the remaining 20 variants were reclassified as VUS (Table 1 and Additional file 5: Table S4). Thus, extreme caution is needed when defining a variant as disease-causing.
Forty novel single nucleotide variations (SNVs) that have not been reported in public mutation databases were found in this cohort of patients. According to ACMG guidelines and familial co-segregation analysis, 11 novel variants were classified as pathogenic or likely pathogenic mutations (Table 1). One novel variant was classified as likely benign. The remaining 28 novel variants were classified as VUS because there was no evidence supporting classification as either pathogenic or benign.
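The classification counts given for database-listed and novel variants can be reconciled with the overall totals by a simple tally. The sketch below is only a bookkeeping illustration: the counts come from the text, while the grouping labels and helper function are our own.

```python
from typing import Dict

# Classification counts as reported in the text.
DATABASE_LISTED = {"P/LP": 15, "B/LB": 3, "VUS": 20}  # recorded in HGMD and/or ClinVar
NOVEL = {"P/LP": 11, "B/LB": 1, "VUS": 28}            # absent from public mutation databases


def combine(*groups: Dict[str, int]) -> Dict[str, int]:
    """Add up counts per classification across variant groups."""
    total: Dict[str, int] = {}
    for group in groups:
        for label, count in group.items():
            total[label] = total.get(label, 0) + count
    return total


totals = combine(DATABASE_LISTED, NOVEL)

print("database-listed variants:", sum(DATABASE_LISTED.values()))
print("novel variants:", sum(NOVEL.values()))
print("per-class totals:", totals)
print("all verified variants:", sum(totals.values()))
```

The two groups sum to 38 and 40 variants respectively, giving the 78 verified variants and the 26 pathogenic or likely pathogenic variants reported in the Results.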
Targeted sequencing of a gene panel can revise the clinical diagnosis and guide management. For example, we found a homozygous SLC25A4 (also called ANT1; c.358G>A, p.Gly120Ser) mutation in one patient with an HCM and DCM overlapping phenotype. The mother of the proband was a heterozygous mutation carrier and the father's blood sample was not available. This variant was not found in the 1000 Genomes, ESP6500, or ExAC databases. All three bioinformatics tools predicted this variant to be damaging. A homozygous SLC25A4 (c.368C>A, p.Ala123Asp) mutation was previously identified in a patient with mitochondrial myopathy and cardiomyopathy [10]. An in vitro study showed that the mutant produced a loss-of-function effect on SLC25A4 activity. Both amino acid substitutions (p.Gly120Ser and p.Ala123Asp) occur at conserved residues located close to each other. Gly120 is located in the third transmembrane domain of ANT1 [11] within the dimerization motif "GXXXG". This sequence is thought to be involved in high affinity association between transmembrane domains [12]. As such, we consider this to be a disease-causing mutation. The 42-year-old patient carrying the p.Gly120Ser mutation presented with undefined cardiomyopathy and was born to non-consanguineous parents who were both reported to be unaffected. His cardiac manifestations included a significantly thickened interventricular septum together with left ventricular dilation and noncompaction, with subsequent development of heart failure. Histological examination of a muscle biopsy showed ragged-red fibers. Based on genetic testing results, the diagnosis was revised to mitochondrial DNA depletion syndrome 12B (cardiomyopathic type). The patient was nominated as a candidate for future heart transplantation.
There are no formal standards for classifying a variant as causative. All filtered results are based on genotype quality, frequencies, bioinformatics tools, and published data from databases. Additional evidence is needed to support or refute pathogenicity, such as in vitro functional studies or long-term follow-up during the clinical care of each patient. Familial co-segregation studies play a crucial role in determining the pathogenicity of variants, but incomplete penetrance and variable expressivity should be considered in cardiomyopathy. The common occurrence of sudden death in cardiomyopathy made co-segregation analysis more difficult in the present study, and most variants were classified as VUS. This result is still valuable, because it can provide genetic data for primary cardiomyopathies to disease-specific databases. Genetic