Whole-Exome Sequencing Reveals a Rare Variant of OTOF Gene Causing Congenital Non-syndromic Hearing Loss Among Large Muslim Families Favoring Consanguinity

Non-syndromic hearing loss (NSHL) is one of the most frequent auditory deficits in humans and is characterized by high clinical and genetic heterogeneity. Very few studies have reported the relationship between OTOF (locus: DFNB9) and hereditary hearing loss in India. We aimed to decipher the genetic cause of prelingual NSHL in large affected consanguineous Muslim families using whole-exome sequencing (WES). The study was performed following the guidelines and regulations of the Indian Council of Medical Research (ICMR), New Delhi. The population was identified from Jammu and Kashmir, the northernmost part of India. Nearly 100 individuals were born deaf-mute in a village of 3,000 inhabitants. A total of 103 individuals (52 cases and 51 controls) agreed to participate in this study. Our study revealed a rare homozygous non-sense mutation, NC_000002.11:g.2:26702224G>A; NM_001287489.2:c.2122C>T; NP_001274418.1:p.(Arg708*), in the 18th exon of the OTOF gene. Our study provides the first insight into this homozygous condition, which has not been previously reported in the ExAC, 1000 Genomes, and gnomAD databases. Furthermore, the variant was confirmed in the population cohort (n = 103) using Sanger sequencing. In addition to the pathogenic OTOF variant, the WES data also revealed novel and recurrent mutations in the CDH23, GJB2, MYO15A, OTOG, and SLC26A4 genes. The rare pathogenic and the novel variants observed in this study have been submitted to the ClinVar database and are publicly available online with the accessions SCV001448680.1, SCV001448682.1, and SCV001448681.1. We conclude that OTOF-related NSHL is prevalent in the region due to successive inbreeding over generations. We recommend premarital genetic testing and genetic counseling strategies to minimize and control the disease risk in future generations.


INTRODUCTION
Over 446 million people have disabling hearing loss worldwide. This estimate is projected to exceed 630 million by 2030 and may rise to 900 million by 2050 (Olusanya et al., 2019). The global prevalence of prelingual hearing loss is 1 in 500 newborns, and hearing loss is the 4th leading cause of disability among living individuals (Duman and Tekin, 2012). Phenotypically, hereditary hearing loss (HHL) can be classified into syndromic hearing loss (SHL) and non-syndromic hearing loss (NSHL). NSHL is one of the most frequent sensory deficits in humans and is characterized by high clinical and genetic heterogeneity. Approximately 90% of NSHL cases with severe to profound congenital deafness exhibit an autosomal recessive (AR) pattern of inheritance (DFNB forms). The prevalence of prelingual NSHL is approximately 2.7 cases per 1,000 live births (Vona et al., 2015).
Genotype-phenotype correlations help in understanding the mechanistic etiology, progression, and prognosis of an inherited genetic disorder. However, in the case of NSHL, diagnosis is more challenging because of high clinical complexity and genetic heterogeneity. Molecular elucidation of such a complex disorder with precise genomic approaches can reveal its genetic origin and functional consequences, which can in turn inform medical investigation, prognosis, and the choice of therapeutic targets. To date, a total of 116 NSHL genes have been identified (see footnote 1). Of these, 45 genes are linked with autosomal dominant NSHL (AD-NSHL), 5 with X-linked NSHL, and about 75 with AR-NSHL. The products of these genes have multiple roles in maintaining the normal physiology of the inner ear. A mutation in any of these genes may have a deleterious impact, altering normal hearing physiology (Shearer et al., 1993).
Hearing loss is a severe but grossly neglected disorder in India. According to World Health Organization (2018) data, the prevalence of auditory deficit in the country is over 63 million (about 6.3% of the total population). The populations of India and other South Asian countries are genetically complex because of the admixture of their genomes over evolutionary timescales (Patterson et al., 2012). Discovering the genetic variations underlying a heterogeneous disorder in such an ethnic group and/or geographically isolated population may require well-equipped and rigorous diagnostic approaches. Massively parallel sequencing (MPS), or next-generation sequencing (NGS), technologies provide an opportunity to precisely explore the genetic architecture of the disease, which can furthermore be used as a baseline for medical genetic testing (Shearer et al., 2010). Very few studies have reported the relationship between OTOF (locus: DFNB9) and HHL in India.
Here, we employed the whole-exome sequencing (WES) approach to precisely identify the functional pathogenic mutations causing congenital hearing loss in a Muslim population favoring consanguinity.

1 http://hereditaryhearingloss.org/

Study Population
The cohort was selected from a village in the Doda District of Jammu and Kashmir, North India. The village has approximately 3,000 residents, who belong to a Muslim community favoring consanguinity in their ancestry. To date, more than 100 individuals have been born deaf-mute in this village; some of them migrated to other states (i.e., Punjab, India) and a few to nearby districts. In our preliminary survey, we identified 74 living and 3 deceased deaf-mute cases. Figure 1 presents the pedigree and inheritance details of all 77 cases. Of the 74 living deaf-mutes, only 52 agreed to participate in the study. A total of 103 individuals aged 20-60 years (52 cases and 51 controls) were recruited for this study. WES was performed on 7 samples, comprising 2 cases (F7 and G8) and 5 controls (C4, D5, H9, K11, and M12), while the rest (n = 96) were subjected to Sanger sequencing to confirm the candidate pathogenic variant.

Sample Collection, DNA Isolation and Quantification
Blood samples were collected using a 21-gauge syringe, drawing up to 5 ml into a vacutainer containing Tris-EDTA. For saliva samples, the participants were provided with collection tubes and instructed to spit into the tube up to the red fill line marked on the tube (approximately 2.5 ml), excluding the froth. DNA was extracted using the QIAamp DNA Mini Kit 50 (Cat# 51304, Qiagen). The DNA samples were quantified on a QIAxpert, and purity was checked by measuring the 260/280 nm absorbance ratio. DNA samples were also subjected to agarose gel electrophoresis and, after passing the DNA quality check (Supplementary Table 1), proceeded to library preparation.

Whole-Exome Sequencing and Bioinformatic Analysis
The WES library was prepared using the Agilent SureSelect XT Reagent Kit for Illumina (ILM) platforms. Biotinylated oligonucleotide capture probes (V5 + UTR), also called baits, which were designed for the human exons, were provided with the kit and used to enrich the region of interest (whole exome) by hybridization. The workflow involved shearing of DNA, end repair, and adenylation of 3' ends, followed by adapter ligation. At each step, the products were purified using AMPure beads. Adapter sequences were added onto the ends of DNA fragments to generate paired-end libraries. The resulting adapter-ligated library was purified, quantified, and hybridized with an exome-specific biotinylated capture library. After hybridization, the targeted molecules were captured on streptavidin beads. The resulting enriched DNA library was multiplexed by adding index tags through amplification, followed by purification. The indexed captured libraries were assessed for quality and quantity. Sequencing was carried out on an Illumina HiSeq X10 to generate 2 × 150 bp reads at an average sequencing depth of 100×. Only samples in which more than 75% of the sequenced bases reached a quality score of Q30 were considered for data processing. The base quality distribution, base content, and GC content are presented in Supplementary Figure 1. The overall alignment percentage (alignment to GRCh37/hg19) in all samples was greater than 97% (Supplementary Table 2). The average coverage and on-target percentage for the samples were around 99 and 90%, respectively (Supplementary Table 3). The distribution of sequencing depth is shown in Supplementary Figure 2. The sequencing reads were processed and analyzed using the Broad Institute's Genome Analysis Toolkit (GATK) (DePristo et al., 2011), and variant calling was performed against the complete human reference genome (hg19, NCBI release GRCh37).
The bioinformatic pipelines (alignment, variant calling, and variant annotation) used in our study are shown in Supplementary Figure 3. All common polymorphisms with a minor allele frequency (MAF) higher than 0.01 were filtered out using several public databases: the 1000 Genomes database (1000 Genomes Project Consortium et al., 2015), the Ensembl GRCh37 genome browser (Zerbino et al., 2018), the Exome Aggregation Consortium database (ExAC) (Lek et al., 2016), the Genome Aggregation Database (gnomAD) (Karczewski et al., 2020), and the database of single nucleotide polymorphisms (dbSNP). The ClinVar database was used to check previously reported mutations and associated phenotypes. Intronic and synonymous variants, in-frame insertions/deletions (InDels), and mutations in untranslated regions were excluded, whereas missense and non-sense variations and frameshift InDels located in exons or splice sites were prioritized. The remaining variants were then verified in dbSNP and NCBI databases.
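The annotation-level filtering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, the consequence labels, and the placeholder gene names are hypothetical, and only the 0.01 MAF cut-off, the excluded and prioritized variant classes, and the OTOF allele frequency are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

MAF_CUTOFF = 0.01  # variants with MAF above this are treated as common polymorphisms

# Hypothetical consequence labels; a real annotator uses its own vocabulary.
EXCLUDED = {"intronic", "synonymous", "inframe_indel", "utr"}
PRIORITIZED = {"missense", "nonsense", "frameshift_indel", "splice_site"}


@dataclass
class Variant:
    gene: str
    consequence: str
    maf: Optional[float]  # None means the variant is absent from the population databases


def keep_variant(v: Variant) -> bool:
    """Return True if the variant survives the MAF and consequence filters."""
    if v.maf is not None and v.maf > MAF_CUTOFF:
        return False  # common polymorphism
    if v.consequence in EXCLUDED:
        return False  # class excluded in the text
    return v.consequence in PRIORITIZED


variants = [
    Variant("OTOF", "nonsense", 0.00004284),  # allele frequency quoted from gnomAD in the text
    Variant("GENE_A", "missense", 0.25),      # placeholder: too common
    Variant("GENE_B", "synonymous", 0.001),   # placeholder: rare but excluded class
    Variant("GENE_C", "frameshift_indel", None),  # placeholder: novel and prioritized
]
kept = [v.gene for v in variants if keep_variant(v)]
print(kept)
```

Treating a variant that is absent from every database (`maf=None`) as rare is a design choice of this sketch; it matches the way novel variants are retained in the study.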

In silico Evaluation for the Pathogenicity of Candidate Mutation
The altered amino acid was checked for its evolutionary conservation across different species, including primates and other mammals, using the genome browser of the University of California at Santa Cruz (UCSC) (Kent et al., 2002). In silico programs including MutationTaster2 (Schwarz et al., 2014), PolyPhen-2 (Adzhubei et al., 2010), and Sorting Intolerant From Tolerant (SIFT) (Kumar et al., 2009) were used to predict the possible impact of the detected variants.

Audiometric Characteristics
The pure-tone audiometry (PTA) records of the subjects were noted. The hearing level grades were categorized according to WHO and National Hearing Test (NHT) guidelines, viz., normal (<20 dB), mild (20-40 dB), moderate (41-70 dB), severe (71-90 dB), and profound (>90 dB). The average of the values for both ears (left and right) was used to calculate the hearing threshold level. The age of the individual at the time of audiometry was also recorded.
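The grading rule above can be written as a short Python function. This is a sketch only: the function name and the handling of non-integer averages (for example, 40.5 dB is placed in the moderate grade) are assumptions, since the stated ranges are given for whole-decibel values.

```python
def hearing_grade(left_db: float, right_db: float) -> str:
    """Grade hearing from the mean of the left- and right-ear thresholds (dB)."""
    mean_db = (left_db + right_db) / 2
    if mean_db < 20:
        return "normal"
    if mean_db <= 40:
        return "mild"
    if mean_db <= 70:
        return "moderate"
    if mean_db <= 90:
        return "severe"
    return "profound"


print(hearing_grade(10, 15))   # thresholds typical of a control in this study
print(hearing_grade(95, 100))  # thresholds typical of a case in this study
```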

Co-segregation Analysis
Whole-exome sequencing (at 100× depth) provides sufficient detail to confirm the candidate variant. However, Sanger sequencing was also performed to revalidate the mutation identified by NGS. Primers (PXL-A0145439) for exon 18 of the OTOF gene (reference sequence NM_001287489.2; context region Chr2:26669916-26791779 in hg19) were manufactured and supplied by Pxlence. The thermal cycler program was set according to the manufacturer's instructions, using the conditions: initial incubation at 98 °C for 2 min, 98 °C for 20 s, followed by 35 cycles at 60 °C for 30 s, 72 °C for 40 s, final extension at 72 °C for 10 min, and hold at 4 °C. PCR products were confirmed by electrophoresis on a 1.7% w/v agarose gel. The PCR products were then processed for cycle sequencing (BigDye terminator assay), followed by Sephadex (column-based) purification. Finally, the purified products were loaded onto the DNA sequencer (SeqStudio Genetic Analyzer).

Identification of a Pathogenic Mutation Using Whole-Exome Sequencing
The variants produced from exome sequencing are presented in Supplementary Tables 4-6. WES analysis generated approximately 72,000 genetic variants (including SNPs, insertions, and deletions) in each sample. The probable causative mutations remaining after the narrowing-down filtering approach are presented in Supplementary Tables 7-9. The variant filtering strategy used to find the most promising causative mutation is presented in Figure 2, and the most promising causative variants are presented in Supplementary Table 10. After the removal of duplicates and common mutations, the variants were filtered for rare (<1%), evolutionarily conserved, and functional homozygous recessive mutations using the online GenIO database (an integrated pipeline based on the RefGene, NHLBI-ESP, 1000 Genomes, dbSNP, ClinVar, COSMIC, gnomAD, OMIM, and M-CAP databases), following the American College of Medical Genetics and Genomics and Association for Molecular Pathology (ACMG-AMP) guidelines (Koile et al., 2018). We observed approximately 3 homozygous and 140 heterozygous mutations (likely to be pathogenic, global MAF <0.01) in the two cases (F7 and G8). After removing variants that overlapped with controls and retaining only evolutionarily conserved sites (GERP score >0), we identified the disease-causing rare variant NC_000002.11:g.2:26702224G>A; NM_001287489.2:c.2122C>T; NP_001274418.1:p.(Arg708*) (rs80356590) in the 18th exon of the OTOF gene. This OTOF variant is very rare, and no homozygous occurrence has previously been reported in the gnomAD, ExAC, and 1000 Genomes databases; however, a few heterozygous cases have been reported in gnomAD. We confirmed the OTOF mutation using BAM and VCF files in IGV 2.5.3 software.
The functional consequence of the variant was predicted in silico using MutationTaster (Table 1), which indicated a deleterious effect in the homozygous condition through premature termination of the otoferlin protein at amino acid 708. The substitution of arginine by a premature stop codon, p.(Arg708*), produces a truncated otoferlin that remains non-functional in the sensory hair cells, potentially causing profound hearing loss.
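The correspondence between the coding-sequence position c.2122 and codon 708 can be checked with simple arithmetic. The sketch below is illustrative: the helper name is hypothetical, and the reference codon is not read from the transcript but inferred from the standard genetic code as the only arginine codon that a C>T change at the given position converts into a stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def codon_of(cds_position: int) -> tuple:
    """Map a 1-based coding-sequence position to (codon number, position within codon)."""
    codon_number = (cds_position - 1) // 3 + 1
    position_in_codon = (cds_position - 1) % 3 + 1
    return codon_number, position_in_codon


codon_number, position_in_codon = codon_of(2122)

# Arginine codons in which C>T at that position yields a stop codon.
compatible = [
    codon
    for codon in sorted(ARG_CODONS)
    if codon[position_in_codon - 1] == "C"
    and codon[: position_in_codon - 1] + "T" + codon[position_in_codon:] in STOP_CODONS
]

residues_retained = codon_number - 1  # residues translated before the premature stop
print(codon_number, position_in_codon, compatible, residues_retained)
```

Under these assumptions, c.2122 is the first base of codon 708, so the change is consistent with a CGA (Arg) to TGA (stop) substitution, and translation would stop after 707 residues.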
In addition to the rare pathogenic OTOF variant, the WES data also revealed some novel and rare recurrent genomic mutations in the extended family. The novel variants (not found in ExAC, 1000 Genomes, and gnomAD databases) of

FIGURE 2 | Variant filtering strategy. The narrowing-down approach applied to whole-exome-sequenced samples to explore the most promising causative mutation.

Audiometric and Genotypic Characteristics
All subjects (cases and controls) showed normal clinical features (i.e., no structural or observable phenotypic deformity) other than hearing. The audiometric and genomic characteristics of the proband are presented in Figure 3. The normal individuals (controls) showed hearing levels between 0 and 20 dB at frequencies of 0.125, 0.25, 0.5, 1, 2, 3, 4, 6, and 8 kHz. However, the threshold levels of the deaf-mute subjects fell in the profound (>90 dB) category at the same frequencies (Figure 3A). Cases F7 (aged 31 years) and G8 (aged 25 years) exhibit the characteristics of profound prelingual NSHL. The WES data of the cases (F7 and G8) and controls C4 (aged 43 years), D5 (aged 35 years), H9 (aged 60 years), K11 (aged 22 years), and M12 (aged 34 years) are shown in Figure 3B. The cases (F7 and G8) carry the homozygous mutation; among the controls, C4, H9, and M12 are heterozygous, and D5 and K11 are homozygous for the reference allele (carrying no OTOF mutation).

Co-segregation Analysis
The family pedigree and the segregation of OTOF with the disease phenotype are shown in Supplementary Figure 4a. The WES data (Figure 3B) and the Sanger sequencing data (Supplementary Figure 4B) provide evidence for perfect segregation of the OTOF variant with the auditory phenotype. The data clearly indicate autosomal recessive non-syndromic hearing loss (ARNSHL). Further, Sanger sequencing was performed on the remaining 96 samples to reconfirm the mutation pattern of the OTOF gene. The genotypic details of the candidate OTOF variant for all 103 subjects are displayed in Figure 4. The ancestral lineage of the subjects has a history of consanguineous marriages, which accounts for the presence of the OTOF phenotypic spectrum in the current generation.
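The segregation logic can be made explicit with a small check: under a fully penetrant autosomal recessive model, every case should be homozygous for the variant and no control should be. The sketch below is a hypothetical illustration, not the analysis code of the study; it uses only the seven WES genotypes reported in the text, and the function and label names are assumptions.

```python
from collections import Counter


def segregates_recessively(samples: dict) -> bool:
    """True if every case is hom_alt and every control is het or hom_ref."""
    return all(
        (genotype == "hom_alt") == (status == "case")
        for status, genotype in samples.values()
    )


# Genotypes of the seven whole-exome-sequenced samples, as reported in the text.
wes_samples = {
    "F7": ("case", "hom_alt"),
    "G8": ("case", "hom_alt"),
    "C4": ("control", "het"),
    "H9": ("control", "het"),
    "M12": ("control", "het"),
    "D5": ("control", "hom_ref"),
    "K11": ("control", "hom_ref"),
}

genotype_counts = Counter(genotype for _, genotype in wes_samples.values())
print(segregates_recessively(wes_samples), dict(genotype_counts))
```

The same function could be applied to the Sanger genotypes of the remaining 96 subjects, which are shown only in Figure 4 and are therefore not reproduced here.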

DISCUSSION
The results of our study indicate that a non-sense mutation in the OTOF gene terminates the peptide chain and generates a truncated protein variant, which ultimately leads to prelingual neurosensory non-syndromic DFNB9 (OMIM #601071) hearing loss. The functional evidence for the relationship between OTOF and DFNB9 impairment has been well established by earlier studies (Roux et al., 2006; Pangršič et al., 2012). OTOF-related NSHL has previously been reported in populations from India, Pakistan, China, Japan, the Altai Republic, Korea, Spain, Turkey, and Iran (Yasunaga et al., 2000; Tekin et al., 2005; Rodríguez-Ballesteros et al., 2008; Choi et al., 2009; Wang et al., 2010; Mahdieh et al., 2012; Churbanov et al., 2016; Kim et al., 2018; Iwasa et al., 2019). In addition to OTOF-related deafness, mutations in autosomal genes such as CDH23, Claudin14, GJB2, GJB6, MYO6, MYO15A, SLC26A4, TMC1, TMIE, TMPRSS3, TRIOBP, USH1C, and others are predominant causes of HHL among Indian and Pakistani populations (Yan et al., 2015). Previous studies show that mutations in the GJB2, GJB6, SLC26A4, and TMC1 genes are more common in the ethnic groups of the eastern and southern parts of India (Padma et al., 2009; Ganapathy et al., 2014; Adhikary et al., 2015; Singh et al., 2018). In the present study, we have also identified the missense/non-sense heterozygous variants in the OTOG p.

Otoferlin is a transmembrane vesicular protein (1997 amino acids in humans) encoded by the OTOF gene, located on the short arm of chromosome 2 (2p23.3). Otoferlin is expressed in the inner hair cells (IHCs) (Liu et al., 2014) and plays a significant role in the neuronal synapse and exocytosis (Shin, 2014). The protein consists of six C2 domains (C2A-F), two Ferlin conserved motifs (Fer-1 and Fer-B), and a transmembrane domain (TMD). Using the TreeFam database (Schreiber et al., 2014), the close relationship of otoferlin across different species shows that the domains and motifs are highly conserved (Figure 5A). The high level of protein conservation at the site of the OTOF mutation (NP_001274418.1:p.Arg708*) in 8 vertebrates is displayed in Figure 5B and Supplementary Figure 5a (using the UCSC Genome Browser). The mRNA levels in 20 different tissues reveal that OTOF is highly expressed in the brain (Supplementary Figure 5b). The protein-protein interactions of OTOF in STRING v11.0 confirm its functional involvement in the neuronal synapse, exocytosis, and hearing functions (Figure 5C; Szklarczyk et al., 2019). That non-sense mutations in the OTOF gene produce truncated versions of the protein and cause DFNB9 deafness has been well established using in vitro and in vivo models (Pangršič et al., 2012; Chatterjee et al., 2015; Hams et al., 2017). In the present study, a similar non-sense mutation (NM_001287489.2:c.2122C>T) results in protein truncation at p.(Arg708*), eventually altering normal otoferlin physiology. The truncated otoferlin (N-terminal C2A-C), lacking C2D-F and the TMD, remains unbound, with no membrane fusion or exocytosis, which ultimately halts neurotransmitter release and causes profound hearing loss. Based upon the substantial evidence from in silico testing (Table 1) and previously well-established proofs of protein truncation due to different OTOF non-sense mutations (Chatterjee et al., 2015; Hams et al., 2017), we propose a schematic model showing the role of the otoferlin protein in normal physiology and in the physiology altered by p.(Arg708*) in the current study (Figure 6). Otoferlin, held on synaptic vesicles via the TMD and C2 domains, interacts with the presynaptic membrane via SNAREs to execute exocytosis under Ca2+ influx. Previous studies have linked missense mutations in the C2B and C2C domains to hearing loss (Mirghomizadeh et al., 2002; Helfmann et al., 2011). The N-terminal domains provide structural stability to the protein, and the C-terminal domains (C2D-F) may play a functionally conserved or redundant role in otoferlin physiology (Chatterjee et al., 2015). Our findings predict that otoferlin synthesis stops at amino acid 708 (between the C2C and FerB domains), leaving almost half of the protein (the N-terminal part with the C2A-C domains), which lacks the most conserved and functional part (C2D-F with the TMD at the C-terminus).

FIGURE 3 | Audiometric characteristics and whole exome data. (A) Audiometric data show five normal hearing subjects (levels between 0 and 10 dB) and two cases (levels >90 dB) for all frequencies (kHz). (B) Whole exome data reflect exactly the phenotypic data, with two homozygous cases (F7 and G8), three heterozygous normal subjects (C4, H9, and M12), and two normal subjects with the homozygous reference allele (D5 and K11).
Parental consanguinity has been associated with an increased risk of autosomal recessive disorders. Consanguineous marriages with adverse effects have previously been reported from the Muslim populations of Jammu and Kashmir (Afzal, 2014a,b, 2016). In the present study, the prevalence of prelingual hearing loss was 1 in 30, considerably higher than the global prevalence (Duman and Tekin, 2012), which points to the consequences of inbreeding in the population. It is possible that these groups settled in the northernmost region of India over evolutionary time frames and unknowingly married within their groups, which ultimately increased the prevalence of the disorder.
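The prevalence comparison can be reproduced from the figures given in the text. The sketch below assumes, as stated in the Study Population section, roughly 100 affected individuals in a village of about 3,000 inhabitants, and uses the global figure of 1 in 500 newborns; the variable names are illustrative.

```python
from fractions import Fraction

village_prevalence = Fraction(100, 3000)  # about 100 affected among about 3,000 inhabitants
global_prevalence = Fraction(1, 500)      # global prelingual hearing loss, per the Introduction

fold_difference = village_prevalence / global_prevalence
print(village_prevalence, float(fold_difference))
```

On these approximate counts, the village prevalence reduces to 1 in 30, about 17 times the global figure.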

CONCLUSION
The present study evaluated ARNSHL in a tribal family of the Jammu and Kashmir region. The clinical audiometric evaluation and the information gathered from village representatives indicate that profound prelingual NSHL is prevalent in the area. WES identified the rare pathogenic mutation p.(Arg708*) in the OTOF gene, which segregated well with the disease. OTOF plays a significant role in the neuronal synapse in the IHCs of the cochlea. The pathogenic mutation (NC_000002.11:g.2:26702224G>A) is predicted to result in the loss of the most conserved and functional domains.
The global allele frequency of this OTOF variant (rs80356590) is 0.00004284 (source: gnomAD), which confirms that it is extremely rare. However, in the present study population, this globally rare variant of OTOF has become common because consanguineous marriages are highly prevalent in the region, ultimately increasing the risk of such autosomal recessive disorders. In summary, our study provides a roadmap and practical advice for clinicians and health care providers to familiarize people with the genetic cause and its increased risk in the context of inbreeding. Identification of such mutations supports better management of such disorders through genetic counseling. Establishing genetic testing facilities will aid appropriate diagnosis and open new gateways for precision medicine. Familial and premarital screening will help in controlling the hearing disability in future generations. Further population-wide screening is needed to explore the OTOF penetrance and other possible genetic hideouts in NSHL among North Indian populations.

FIGURE 5 | (A) The human OTOF protein domains show high resemblance to those of mouse and rat, whereas chicken and zebrafish lack the C2E domain. Ferlin1, FerlinB, and transmembrane domains are common among all species. (B) The comparison of OTOF for 8 vertebrates shows a highly conserved region (at the Arg708 residue) spanning between the C2C and FerB domains across different vertebrates. (C) Network showing the functional association of OTOF using STRING v11.0 (https://string-db.org/). The proximity of the proteins via threads predicts the association and their role in sensorineural activities. In the present study, exome data have also identified additional genomic variants in CDH23, GJB2, MYO15A, OTOG, and SLC26A4 for the current STRING model.

FIGURE 6 | Model presenting the role of OTOF in hearing physiology. The otoferlin expressed in the inner hair cells (IHCs) of the cochlea typically mediates the neuronal synapse and exocytosis. A non-sense mutation introduces a stop codon (at p.708*) between the C2C and C2D domains, resulting in a truncated version of the protein, which disrupts normal function and hearing physiology. The current model was adapted from previously well-established experimental studies (Ramakrishnan et al., 2014; Shin, 2014; Chatterjee et al., 2015; Hams et al., 2017).
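The contrast drawn in the Conclusion between a globally rare allele and its local impact can be illustrated with the standard population-genetics expression for homozygote frequency under inbreeding, q² + Fq(1 − q). This is a hedged sketch and not an analysis performed in the study: the inbreeding coefficient F = 1/16 (the textbook value for offspring of first cousins) is an assumption, and the global gnomAD allele frequency is used only for illustration, since the local allele frequency is evidently far higher.

```python
def homozygote_frequency(q: float, f: float = 0.0) -> float:
    """Expected homozygote frequency for allele frequency q and inbreeding coefficient f."""
    return q * q + f * q * (1.0 - q)


Q_GLOBAL = 0.00004284      # gnomAD allele frequency of rs80356590, quoted in the text
F_FIRST_COUSINS = 1 / 16   # assumed: offspring of a first-cousin marriage

random_mating = homozygote_frequency(Q_GLOBAL)
first_cousin = homozygote_frequency(Q_GLOBAL, F_FIRST_COUSINS)
fold_increase = first_cousin / random_mating
print(random_mating, first_cousin, fold_increase)
```

Under these assumptions, the expected homozygote frequency is on the order of 10⁻⁹ with random mating and rises by more than a thousandfold for the offspring of first cousins, which is the qualitative point the Conclusion makes about consanguinity.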

DATA AVAILABILITY STATEMENT
The rare pathogenic and the novel variants observed in this study have been submitted to the ClinVar database and are publicly available online with the accessions SCV001448680.1, SCV001448682.1, and SCV001448681.1.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Institutional Ethical Committee (IEC), Jawaharlal Nehru Medical College (JNMC), Aligarh Muslim University, India. The patients/participants provided their written informed consent to participate in this study. All methods were performed following the guidelines and regulations of the Indian Council of Medical Research (ICMR), New Delhi.

AUTHOR CONTRIBUTIONS
MF, VS, IS, SUR, GS, and MA conceived and designed the study. MF analyzed and interpreted the data. MF, VS, SUR, GS, and MA contributed to the drafting and critical review of the manuscript. All authors contributed to the article and approved the submitted version.