Identification of novel mutations among Iranian NPC1 patients: a bioinformatics approach to predict pathogenic mutations

Niemann-Pick disease type C (NPC) is a rare lysosomal neurovisceral storage disease caused by mutations in the NPC1 (95%) or NPC2 (5%) genes. The products of the NPC1 and NPC2 genes play considerable roles in glycolipid and cholesterol trafficking, and their disruption leads to NPC disease with variable phenotypes displaying a broad spectrum of symptoms. In the present study, 35 unrelated Iranian NPC patients were enrolled. These patients were first analysed for NPC diagnosis by the Filipin staining test of cholesterol deposits in cells. Genomic DNA was extracted from peripheral blood leukocyte samples in EDTA following the manufacturer's protocol. All coding exons and exon–intron boundaries of the NPC1 gene were amplified by polymerase chain reaction (PCR) using appropriate sets of primers. Thereafter, the PCR products were sequenced and analysed using the NCBI database (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The variants were reviewed against databases including the Human Gene Mutation Database (HGMD) (http://www.hgmd.cf.ac.uk/ac/index.php) and ClinVar (https://www.ncbi.nlm.nih.gov/clinvar). Moreover, all the variants were manually classified according to the American College of Medical Genetics and Genomics (ACMG) guideline. The sequence analysis revealed 20 different variations, 10 of which are new, including one nonsense mutation (c.406C>T); three small deletions (c.3126delC, c.2920_2923delCCTG, and c.2037delG); and six likely pathogenic missense mutations (c.542C>A, c.1970G>A, c.1993C>G, c.2821T>C, c.2872C>G, and c.3632T>A). Finally, the pathogenicity of these new variants was determined using the ACMG guidelines. The present study aimed to facilitate the prenatal diagnosis of NPC in the future.
In this regard, we identified 10 novel mutations and verified that the majority of them occurred in six NPC1 exons (5, 8, 9, 13, 19, and 21), which should be given high priority in the cost-effective evaluation of Iranian patients.


Introduction
Niemann-Pick type C (NPC) disease is a rare genetic neurodegenerative disorder caused by intracellular accumulation of free cholesterol and gangliosides in lysosomes or late endosomes [1]. NPC has been estimated to affect at least one person per 100,000 individuals [2,3]. Of note, the patients affected by this disease are clinically heterogeneous, with a broad spectrum of phenotypes and a variable age of onset. The onset and severity of illness are determined by the degree of functional disruption in cholesterol trafficking [4]. In the majority of cases, the most common symptoms are neurological and psychiatric and arise between 4 and 16 years of age [5]. However, the clinical spectrum ranges from a fatal neonatal disorder to an adult-onset chronic disease. Correspondingly, the neurological symptoms manifest as mental deterioration, dystonia, dysarthria, dysphagia, ataxia, psychomotor retardation, cataplexy, and various types of seizures, usually combined with vertical supranuclear gaze palsy [6,7]. The neurodegenerative symptoms are often preceded by visceral complications such as cholestatic jaundice and hepatosplenomegaly [8]. In NPC patients, Filipin staining of cultured fibroblasts has been used extensively as a diagnostic test [9]. Filipin staining reveals the abnormal intracellular accumulation of cholesterol in fibroblasts by showing strong fluorescence in perinuclear vesicles [10]. Besides, there are several immunological and ultrastructural similarities between the brains of patients suffering from Alzheimer's disease and NPC, such as the presence of neurofibrillary tangles and endosomal and lysosomal abnormalities [11]. Moreover, foamy histiocytes (Niemann-Pick cells) can be identified in the bone marrow [8].
NPC is a heterogeneous disorder with two genetic complementation groups [12]. Accordingly, in approximately 95% of NPC patients, mutations are present in the NPC1 gene (MIM 607623), and the remaining patients have mutations in the NPC2 gene (MIM 601015) [13]. The NPC1 gene has 25 exons spanning more than 47 kb and is located on chromosome 18q11 [14]. It encodes an mRNA of roughly 4.9 kb that gives rise to a polypeptide of 1,278 amino acids. The NPC1 protein includes 13 transmembrane domains, six small cytoplasmic loops, three large and four small luminal loops, and one cytoplasmic tail [15,16]. A cysteine-rich domain (residues 855-1,098) was identified in the third luminal loop [15]. All these functional domains are affected by mutations, which are spread throughout the NPC1 gene. Of note, most mutations are located in the cysteine-rich domain, including a hot spot region from residues 927 to 958 [17]. Moreover, the region between amino acid positions 1,038 and 1,253 is known as another hot spot region. This region was shown to have 35% similarity with the Patched 1 (PTC1) protein, namely between residues 974 and 1,180 [18].
The majority of the variations in the NPC1 gene are missense mutations, small deletions, and insertions [19,20]. The most important cause of neuronal apoptosis in NPC was recognized to be the accumulation of large amounts of intracellular free cholesterol in the late endosomes or lysosomes, which is caused by a genetic deficit in cholesterol trafficking [21,22]. Identification of the molecular defects in this disease can be considered an important confirmatory diagnostic procedure, allowing precise and fast prenatal diagnosis. In this study, the NPC1 gene was analysed in 35 Iranian patients with NPC, which led to the detection of 10 new NPC1 mutations. The present study aimed to provide additional information on the genotype of NPC disease among Iranian patients.

Patients
We studied 35 unrelated Iranian patients diagnosed with NPC by Filipin staining from 2014 to 2018. Documented consent was obtained from the patients, and the entire study protocol was approved by the NIGEB ethics committee (IR.NIGEB.EC.1397.8.23.B). Filipin staining of skin fibroblasts was performed at Centogene GmbH (Rostock, Germany). The clinical characteristics and genotypes of the NPC patients are summarized in Table 1.

Blood sampling and DNA extraction
Blood samples were obtained from the Special Medical Center (SMC) and Taban Medical Laboratory, Tehran, Iran.
Genomic DNA was extracted from the peripheral blood leukocyte samples in EDTA using the QIAamp kit (QIAamp® DNA Micro Kit #56304, QIAGEN, Hilden, Germany) according to the manufacturer's protocol.

Polymerase chain reaction (PCR) amplification and sequencing analysis
All 25 coding exons and the flanking regions of the NPC1 gene were amplified by PCR in 24 amplicons using the primers listed in Table 2. The PCR mixture contained 2 ng DNA template, 20 pmol of each primer, 2.5 μL of 10X PCR buffer, and 5 U AmpliTaq in a total volume of 25 μL. The PCR cycling conditions were as follows: an initial denaturation at 95°C for 3 min, followed by 35 cycles of denaturation at 95°C for 1 min, annealing at 60-63°C for 1 min, and elongation at 72°C for 1 min, with a final incubation at 72°C for 10 min. The PCR amplification products were analysed by 1.5% agarose gel electrophoresis. The PCR products were sequenced using BigDye Terminator sequencing chemistry (ABI) and the ABI3100 automatic DNA sequencer.
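As a quick check on the thermal profile described above, the programmed hold time of one PCR run can be added up. This is only an illustrative sketch: the constant and function names are hypothetical, and ramping time between temperatures is ignored because the text does not report it.

```python
# Thermal profile as stated in the text; all durations are in minutes.
INITIAL_DENATURATION_MIN = 3
CYCLES = 35
STEPS_PER_CYCLE_MIN = {"denaturation": 1, "annealing": 1, "elongation": 1}
FINAL_INCUBATION_MIN = 10


def total_hold_time_min() -> int:
    """Sum of programmed hold times; ramping between steps is NOT included."""
    per_cycle = sum(STEPS_PER_CYCLE_MIN.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_INCUBATION_MIN


if __name__ == "__main__":
    print(total_hold_time_min())  # 3 + 35 * 3 + 10 = 118
```

Under these assumptions one run holds for about two hours per amplicon plate, before any instrument ramping is added.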
The sequence data were analysed using freely available software (FinchTV) and compared to the reference sequence (NM_000271). The variants were reviewed against databases such as ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) and HGMD (http://www.hgmd.cf.ac.uk/ac/index.php) to determine whether they had been reported previously as pathogenic.
Moreover, all the variants were manually reviewed based on the American College of Medical Genetics and Genomics (ACMG) guideline. Besides, an aggregated knowledge-based tool, VarSome, was used to review the variants comprehensively.
PolyPhen predicts the possible impact of an amino acid substitution on the structure and function of a human protein. The other two in silico approaches, SIFT and PANTHER, were based on evolutionary conservation. Multiple sequence alignments (MSA) of NPC1 from different species were performed using the BoxShade server (version 3.21) to verify the degree of conservation. The pathogenicity of the new variants, comprising six likely pathogenic and four pathogenic variants, was determined according to the ACMG guidelines.

Sequence
The predicted functional effects of the novel missense variants were determined using the pre-computed SIFT and PANTHER values for the tolerated/deleterious classification. MSA of the NPC1 proteins from human (O15118), chimpanzee (H2QEC5), mouse (O35604), chicken (F1NQT4), and zebrafish (F1QNG7) was performed using Clustal Omega and the BoxShade server (version 3.21) (Fig. 1).
Observational studies of several national cohorts have categorized patients by age at the onset of neurological manifestations. Following these studies, the patients were categorized into early-infantile (< 2 years old), late-infantile (2-6 years old), juvenile (6-15 years old), and adult (≥ 15 years old) onset forms [25]. In this study, the juvenile, early-infantile, and late-infantile onset forms were, in that order, the most frequent.
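The age bands above can be expressed as a small lookup. This is a minimal sketch, not the authors' procedure: the function name is hypothetical, and because the stated ranges share their end points (2-6 and 6-15 years), the sketch assumes each band includes its lower bound and excludes its upper bound.

```python
def onset_form(age_years: float) -> str:
    """Map age at neurological onset (in years) to the onset form.

    Assumption: bands are half-open, e.g. late-infantile is [2, 6) years.
    """
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years < 2:
        return "early-infantile"
    if age_years < 6:
        return "late-infantile"
    if age_years < 15:
        return "juvenile"
    return "adult"


if __name__ == "__main__":
    for age in (1 / 12, 5.5, 13, 15):
        print(f"{age:.2f} years -> {onset_form(age)}")
```

With a different boundary convention, a patient with onset at exactly 2, 6, or 15 years would fall into the neighbouring band.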
In all patients included in this study, the manifestations usually started in the first decade. The mean age at onset was 5.5 years (range: 1 month to 13 years). The disease was diagnosed 2-27 years after the initial clinical presentation. Notably, the variable ages of onset and the different age-dependent manifestations make NPC a complex and underdiagnosed disease. Further, dysphagia and splenomegaly were observed in all patients, while hepatomegaly was detected in most patients. Additionally, the neurological features, including vertical supranuclear gaze palsy, cerebellar ataxia, and dementia, were highly variable.

Discussion
The clinical features, the age at onset of clinical symptoms, and the rate of progression of neurological symptoms are highly variable in NPC disease. In the present study, a younger age of onset was usually associated with a more severe disease phenotype. Moreover, more than 395 pathogenic variations have been identified for NPC (HGMD Professional). Most of these variations are missense mutations (71%) [16]. A small number of prevalent variations have been described, including p.(Ile1061Thr) and p.(Pro1007Ala) in patients of western European descent [29], p.(Arg518Gln) from Japan, and p.(Pro474Leu) from Italy. However, none of these variations was detected in our study [30]. In this work, the c.2821T>C, p.(Ser941Pro) and c.2872C>G, p.(Arg958Gly) variations, both in exon 19, were the two most common mutations found among the NPC patients analysed. Additionally, the majority of the variants (80%) were found in exons 5, 8, 9, 13, 19, and 21, and few mutations were identified in the other NPC1 exons.
Ten of the twenty variants reported here are new (not previously reported). Among these, the nonsense mutation and the three small deletions were found to be highly deleterious.
The new p.(Gln136*) mutation gives rise to a premature termination codon that results in the loss of protein expression and function. Furthermore, the three small deletions were found in patients p2, p24, and p32 in the homozygous state, and their presence results in frameshifts.
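The link between the cDNA names and these protein-level consequences can be checked with simple arithmetic. The sketch below assumes standard coding-DNA numbering, in which position 1 is the first base of the start codon, and it uses hypothetical helper names. It shows that c.406 is the first base of codon 136 and that each of the three deletions removes a number of bases that is not a multiple of three.

```python
from typing import Tuple


def codon_of(cdna_pos: int) -> Tuple[int, int]:
    """Return (codon number, position within the codon 1-3) for a 1-based coding position."""
    if cdna_pos < 1:
        raise ValueError("coding positions start at 1")
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def causes_frameshift(deleted_bases: str) -> bool:
    """A deletion shifts the reading frame unless its length is a multiple of 3."""
    return len(deleted_bases) % 3 != 0


# Deleted bases as written in the variant names reported in this study.
DELETIONS = {"c.2037delG": "G", "c.2920_2923delCCTG": "CCTG", "c.3126delC": "C"}

if __name__ == "__main__":
    print(codon_of(406))  # codon 136, first base
    for name, bases in DELETIONS.items():
        print(name, "frameshift" if causes_frameshift(bases) else "in-frame")
```

The same arithmetic places c.2821 and c.2872 in codons 941 and 958, matching p.(Ser941Pro) and p.(Arg958Gly).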
The remaining six new variants were nsSNPs (non-synonymous single nucleotide polymorphisms), which were analysed using in silico methods. The SIFT and PolyPhen web tools predicted 100% of the nsSNPs as "deleterious" and "probably damaging" to the protein's structure and function. In turn, PANTHER predicted 5 of the 6 nsSNPs as deleterious (83.3%), but one of them, p.(Ala181Asp), was classified as tolerated (Table 4). The variant p.(Gly657Asp) was identified in the homozygous state, and the heterozygous state was identified in the NPC1 gene of the patients' parents.
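The percentages above can be reproduced by tallying the per-tool calls. This is an illustrative sketch with hypothetical names. The calls are transcribed from the text rather than from Table 4, and the pairing of p.(Ala181Asp) with c.542C>A is an inference from the codon position (base 542 lies in codon 181), not something the text states directly.

```python
MISSENSE = ["c.542C>A", "c.1970G>A", "c.1993C>G", "c.2821T>C", "c.2872C>G", "c.3632T>A"]

# Assumption: c.542C>A corresponds to p.(Ala181Asp), the one PANTHER "tolerated" call.
PANTHER_TOLERATED = {"c.542C>A"}

# True means the tool called the variant deleterious / probably damaging.
CALLS = {
    "SIFT": {v: True for v in MISSENSE},
    "PolyPhen": {v: True for v in MISSENSE},
    "PANTHER": {v: v not in PANTHER_TOLERATED for v in MISSENSE},
}


def percent_deleterious(tool: str) -> float:
    """Share of the six missense variants that a tool called deleterious."""
    calls = CALLS[tool]
    return round(100 * sum(calls.values()) / len(calls), 1)


def unanimous_variants() -> list:
    """Variants called deleterious by every tool."""
    return [v for v in MISSENSE if all(CALLS[t][v] for t in CALLS)]


if __name__ == "__main__":
    for tool in CALLS:
        print(tool, percent_deleterious(tool))
    print(unanimous_variants())
```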
The cysteine-rich loop, which is known as a functionally significant protein-protein interaction site, has a ring-finger motif and contains nearly one-third of the NPC1 variations [31,32]. Fig. 2 presents the location of the detected novel variants on the NPC1 protein structure. It is noteworthy that two of the 10 detected novel variants, p.(Ser941Pro) and p.(Arg958Gly), were located in the cysteine-rich loop (residues 927-958, a hot spot region of the gene). These variations were found in the heterozygous state in four of the studied patients (p7, p8, p13, and p10).
According to the findings of our study and the ACMG guideline, the following criteria apply to some variations, including p.(Gln136*), c.2037delG, c.2920_2923delCCTG, and c.3126delC: a) they are null variants (a class that includes frameshift, nonsense, initiation codon, and splice-site variants) in a gene where loss of function (LOF) is a known mechanism of disease (PVS1); b) the deleterious effect on the gene is supported by computational evidence (PP3); c) they are absent from all population frequency databases, such as gnomAD and the 1000 Genomes Project, meaning that these mutations are rare (PM2).
Based on this evidence, these mutations can be classified as pathogenic. In the current study, the aggregated knowledge-based tool, VarSome, was used to review the variants comprehensively.
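The way the evidence codes combine into a final class can be sketched as follows. This is a minimal illustration that assumes the pathogenic and likely pathogenic combining rules of the 2015 ACMG/AMP guideline. Benign criteria and the handling of conflicting evidence are left out, the function names are hypothetical, and the sketch does not reproduce the VarSome workflow used in this study.

```python
from collections import Counter
from typing import Iterable

# Evidence strength is encoded in the criterion prefix (PVS1, PS1-4, PM1-6, PP1-5).
PREFIX_STRENGTH = (("PVS", "very_strong"), ("PS", "strong"),
                   ("PM", "moderate"), ("PP", "supporting"))


def strength(code: str) -> str:
    for prefix, level in PREFIX_STRENGTH:
        if code.startswith(prefix):
            return level
    raise ValueError(f"not a pathogenic-evidence code: {code}")


def classify(codes: Iterable[str]) -> str:
    """Combine pathogenic evidence codes; benign evidence is NOT modelled here."""
    n = Counter(strength(c) for c in codes)
    vs, s, m, p = n["very_strong"], n["strong"], n["moderate"], n["supporting"]
    pathogenic = (
        (vs >= 1 and (s >= 1 or m >= 2 or (m >= 1 and p >= 1) or p >= 2))
        or s >= 2
        or (s >= 1 and (m >= 3 or (m >= 2 and p >= 2) or (m >= 1 and p >= 4)))
    )
    if pathogenic:
        return "pathogenic"
    likely = (
        (vs >= 1 and m >= 1) or (s >= 1 and m >= 1) or (s >= 1 and p >= 2)
        or m >= 3 or (m >= 2 and p >= 2) or (m >= 1 and p >= 4)
    )
    return "likely pathogenic" if likely else "uncertain significance"


if __name__ == "__main__":
    print(classify(["PVS1", "PM2", "PP3"]))  # criteria listed for the null variants
```

Under these rules, one very strong, one moderate, and one supporting criterion are enough for a pathogenic call, which is consistent with the classification given above for the nonsense mutation and the three deletions.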
In this study, 24 of the 35 patients were found to be homozygous, and the remaining 11 patients were either heterozygous or compound heterozygous with a second mutation.
Seven patients were homozygous, and 11 patients were compound heterozygous for the novel mutations. Furthermore, p.(Gly657Asp) was detected in three patients (p3, p22, and p23), and c.2037delG, c.2920_2923delCCTG, c.3126delC, and c.406C>T p.(Gln136*) were detected in patients p2, p9, p24, and p32, respectively. Severe symptoms were observed in the patients carrying any of the three deletions or the nonsense mutation.
Based on the findings of the present study, no straightforward genotype-phenotype correlations can be established for the new mutation types, except for the deletions, whose presence was always accompanied by severe manifestations of the disease.
The high number of homozygous patients in the present study could be explained by the high prevalence of consanguineous marriages in Iran; although the patients are unrelated, they may share common ancestors.

Conclusions
In conclusion, mutation screening of 35 Iranian patients with NPC was described, which identified 10 novel pathogenic and likely pathogenic NPC1 gene variants. NPC is a neurological disorder with a broad spectrum of clinical features. Mutation screening served as an alternative tool to confirm the diagnosis of NPC. Finally, the detection of mutations will facilitate carrier screening of family members and prenatal diagnosis.