Genomic Studies in a Large Cohort of Hearing Impaired Italian Patients Revealed Several New Alleles, a Rare Case of Uniparental Disomy (UPD) and the Importance to Search for Copy Number Variations

Hereditary hearing loss (HHL) is a common disorder characterized by a huge genetic heterogeneity. The definition of a correct molecular diagnosis is essential for proper genetic counseling, recurrence risk estimation, and therapeutic options. From 20 to 40% of patients carry mutations in GJB2 gene, thus, in more than half of cases it is necessary to look for causative variants in the other genes so far identified (~100). In this light, the use of next-generation sequencing technologies has proved to be the best solution for mutational screening, even though it is not always conclusive. Here we describe a combined approach, based on targeted re-sequencing (TRS) of 96 HHL genes followed by high-density SNP arrays, aimed at the identification of the molecular causes of non-syndromic HHL (NSHL). This strategy has been applied to study 103 Italian unrelated cases, negative for mutations in GJB2, and led to the characterization of 31% of them (i.e., 37% of familial and 26.3% of sporadic cases). In particular, TRS revealed TECTA and ACTG1 genes as major players in the Italian population. Furthermore, two de novo missense variants in ACTG1 have been identified and investigated through protein modeling and molecular dynamics simulations, confirming their likely pathogenic effect. Among the selected patients analyzed by SNP arrays (negative to TRS, or with a single variant in a recessive gene) a molecular diagnosis was reached in ~36% of cases, highlighting the importance to look for large insertions/deletions. Moreover, copy number variants analysis led to the identification of the first case of uniparental disomy involving LOXHD1 gene. Overall, taking into account the contribution of GJB2, plus the results from TRS and SNP arrays, it was possible to reach a molecular diagnosis in ~51% of NSHL cases. These data proved the usefulness of a combined approach for the analysis of NSHL and for the definition of the epidemiological picture of HHL in the Italian population.

Hereditary hearing loss (HHL) is a common disorder characterized by a huge genetic heterogeneity. The definition of a correct molecular diagnosis is essential for proper genetic counseling, recurrence risk estimation, and therapeutic options. From 20 to 40% of patients carry mutations in GJB2 gene, thus, in more than half of cases it is necessary to look for causative variants in the other genes so far identified (∼100). In this light, the use of next-generation sequencing technologies has proved to be the best solution for mutational screening, even though it is not always conclusive. Here we describe a combined approach, based on targeted re-sequencing (TRS) of 96 HHL genes followed by high-density SNP arrays, aimed at the identification of the molecular causes of non-syndromic HHL (NSHL). This strategy has been applied to study 103 Italian unrelated cases, negative for mutations in GJB2, and led to the characterization of 31% of them (i.e., 37% of familial and 26.3% of sporadic cases). In particular, TRS revealed TECTA and ACTG1 genes as major players in the Italian population. Furthermore, two de novo missense variants in ACTG1 have been identified and investigated through protein modeling and molecular dynamics simulations, confirming their likely pathogenic effect. Among the selected patients analyzed by SNP arrays (negative to TRS, or with a single variant in a recessive gene) a molecular diagnosis was reached in ∼36% of cases, highlighting the importance to look for large insertions/deletions. Moreover, copy number variants analysis led to the identification of the first case of uniparental disomy

INTRODUCTION
Hearing loss (HL) is a frequent disease affecting 1-3 in every 1,000 live births (Ječmenica et al., 2015). It is typically described based on its clinical presentation and can be classified by a number of different factors, such as the age of onset, severity, etiology, and pathobiology (Rehm and Morton, 1999;Dror and Avraham, 2009;Alford et al., 2014;Smith et al., 2014;Parker and Bitner-Glindzicz, 2015).
Overall, 60-70% of cases have a genetic etiology (Nishio et al., 2015) and can be further categorized as to whether the gene causes only hearing loss (non-syndromic or NSHL) or multiple clinical features (syndromic). NSHL accounts for the vast majority of hereditary HL cases and includes, according to the pattern of inheritance, autosomal recessive cases (∼80%, labeled as "DFNB"), autosomal dominant (∼20%, labeled as "DFNA") X-linked or mitochondrial cases (<1%) (Stelma and Bhutta, 2014). In Italy, from 20 to 40% of cases are caused by mutations in GJB2 gene (Cama et al., 2009;Primignani et al., 2009;Salvago et al., 2014), making it the major player. However, as expected, considering the unique and complex structure of the inner ear, many other genes have been found to be involved in the hearing phenotype. To date, 158 NSHL loci (60 DFNA loci, 88 DFNB loci, 6 X-linked loci, 2 modifier loci, 1 Y-linked locus, and 1 locus for auditory neuropathy), and 95 genes (27 DFNA genes, 56 DFNB genes, 8 DFNA/DFNB genes, and 4 X-linked genes) have been reported as causative (Hereditary Hearing Loss Homepage; http://hereditaryhearingloss.org/). In this light, a combined approach based on an accurate clinical characterization and on different analytical technologies represents the most effective strategy for the identification of the molecular cause of NSHL, which is essential for proper genetic counseling, recurrence risk estimation, prognosis, and therapeutic options.
Here, we report the results obtained on a large cohort of unrelated patients (N = 103), negative for GJB2 mutations, applying a targeted re-sequencing panel of 96 HHL genes followed by SNP arrays in negative cases. This approach allowed: (1) the characterization of 31% of all NSHL cases, leading to an overall detection rate of ∼51% (together with GJB2), (2) the identification of an extremely rare case of uniparental disomy (UPD) in HHL (i.e., the first one in LOXHD1 gene), (3) the demonstration of the importance of copy number variants (CNVs), and (4) the identification of 17 new alleles.

Ethical Statement
All patients provided written informed consent forms for both genetic counseling and molecular genetic testing prior to enrolment. Written informed consent was obtained from the next of kin on behalf of the minors/children involved in this study. The study was approved by the Institutional Review Board of IRCCS Burlo Garofolo, Trieste, Italy. All research was conducted according to the ethical standard as defined by the Helsinki Declaration. (1) absence of vestibular signs, (2) no obvious syndromic features, (3) absence of diabetes, cardiovascular diseases, visual problems, and neurological disorders, (4) absence of mutations in GJB2, GJB6, and MTRNR1 genes.

Patients: Clinical Evaluation and Sample Collection
All participants underwent pure tone audiometric testing (PTA) or auditory brainstem response (ABR) in order to characterize the severity of HL according to the International guidelines described by Clark (1981).
At least three to four individuals per family have been analyzed by sequencing (both affected and healthy), and for the sporadic cases both the proband and the parents have been sequenced.
A comparison between the identified genetic variants and data reported in NCBI dbSNP build150 (http://www.ncbi.nlm.nih. gov/SNP/) as well as in gnomAD (http://gnomad.broadinstitute. org/), and NHLBI Exome Sequencing Project (ESP) Exome Variant Server 1 led to the exclusion of those variants previously reported as polymorphism. In particular, a Minor Allele Frequency (MAF) cut off of 0.01 and one of 0.001 were used for recessive and dominant cases, respectively.
On the other hand, for novel variants, several in silico tools, such as PolyPhen-2 (Adzhubei et al., 2013), SIFT (Ng and Henikoff, 2003), MutationTaster (Schwarz et al., 2010), LRT (Chun and Fay, 2009), CADD score (Kircher et al., 2014) were used to evaluate the pathogenicity of the variant identified. Moreover, the evolutionary conservation of residues across FIGURE 1 | Causative genes identified in our cohort of Italian patients. The graph shows the distribution of the causative genes detected by TRS and SNP arrays. species was evaluated by PhyloP (Pollard et al., 2010) and GERP (Cooper et al., 2005) scores.
We manually investigated the raw sequence reads for all the candidate pathogenic variants using the Integrative Genomics Viewer (IGV) (Thorvaldsdottir et al., 2013) with the purpose of excluding likely false positive calls due to read misalignment.
Finally, on a patient-by-patient basis, identified variants were discussed in the context of phenotypic data at interdisciplinary meetings and the most likely disease-causing SNVs/INDELs were analyzed by direct Sanger sequencing on a 3500 Dx Genetic Analyzer (Applied Biosystems), using ABI PRISM 3.1 Big Dye terminator chemistry (Applied Biosystems).
Sanger sequencing was employed also to perform the segregation analysis within the family.

SNP Arrays Analysis
SNP arrays analysis was performed using the Human OmniExpress Exome-8 Bead Chip (Illumina Inc., San Diego, CA, United States) containing 960,919 loci derived from phases I, II and III of the International HapMap project. The array includes over 274,000 functional exonic markers, delivering unparalleled coverage of putative functional exonic variant selected from 12,000 individual exome and whole-genome sequences. A total of 200 ng of gDNA (50 ng/µl) for each sample was processed according to Illumina's Infinium HD Assay Super protocol. Normalization of raw image intensity data, genotype clustering, and individual sample genotype calls were performed using Illumina's Genome Studio software v2011.1 (cnvpartition 3.2.0). The CNVs were mapped to the human reference genome hg19 and annotated with UCSC RefGene. Allele detection and genotype calling were performed with Genome Studio software, B allele frequencies (BAFs) and log R ratios were exported as text files for PennCNV analysis (Wang et al., 2007).

Protein Modeling and Molecular Dynamics Simulation
The protein structure for the modeling of two de novo variants of ACTG1 was collected from the protein data bank (PDB, http://www.rcsb.org/pdb/) (PDB ID: 5JLH) (von der Ecken et al., 2016). The wild-type (WT) structure was used to generate the mutational models p.(Thr66Ile) and p.(Arg183Gln) with PyMOL (www.pymol.org). The WT and mutant structures were used in GROMACS (Groningen Machine for Chemical Simulation V 4.5.4 package) (Van Der Spoel et al., 2005;Hess et al., 2008) for molecular dynamics simulations. The GROMOS96 (van Gunsteren., 1996) force field was applied to the structures. Energy minimization was carried out using the steepest descent algorithm with a tolerance of 2,000 kJ/mol/nm. The energyminimized structures were used for the molecular dynamics simulations. The SPC3 (Berendsen et al., 1981) water model was used for solvation in a cubic box (0.8 nm) with periodic boundary conditions applied in all directions. The simulation systems were neutralized by adding counter ions. A twin range cut-off was used for long-range interactions: 0.8 nm for van der Waals interactions and 1.4 nm for electrostatic interactions. All bond lengths were constrained with the LINCS algorithm (Hess et al., 1997). The SETTLE algorithm (Miyamoto and Kollman, 1992) was applied to constrain the geometry of the water molecules. The energy minimized system was subjected to 100 ps equilibration followed by 10 ns production molecular dynamics simulations with a timestep of 2 fs at constant temperature (300 K), pressure (1 atm) and number of particles, without any position restraints (Berendsen et al., 1984). The collected trajectories were analyzed using the tools within GROMACS and the structures were analyzed using PyMOL. In addition, the representative structures were selected from molecular dynamics simulations for the structural analyses using the cluster analysis (Krishnamoorthy et al., 2018) and surface mapping calculations.

Targeted Re-sequencing (TRS)
For the 96 genes under investigation an average 95% of the targeted region was covered at least 20X, with a 337 meandepth. A total of 170 Mbp of raw sequence data was produced per individual. We identified an average of 468 genetic variants (SNVs/INDELs) per subject. After applying the filtering pipeline described in the "Methods" section, an average of 17 residual SNVs/INDELs for each subject were obtained.
A detailed summary of pathogenic variants identified in our screening is reported in Table 1. Overall, 23 already known mutations and 16 novel variants have been detected as here detailed: 30 missense (two de novo in ACTG1, and one in MYO1A), four frameshift deletions, one frameshift insertion, one non-sense, three variants affecting splice-sites.
Moreover, since 14 familiar and sporadic cases were carrying only one causative allele (i.e., an already known pathogenic mutation or a variant predicted as highly damaging) in an autosomal recessive gene (Table 3), we hypothesized that a second mutation, or a large CNV, has been missed in trans. In this light, these individuals have been further investigated with SNP arrays and the most interesting results are reported below.

TECTA
Combining previous data (Vozzi et al., 2014) with the results of the present study, TECTA (NM_005422.2) has been identified as the major NSHL gene in the Italian population (after GJB2), characterizing 13% of positive cases (with both autosomal dominant and recessive pattern of inheritance; Figure 2) Two different splicing variants affecting the same nucleotide have been identified in Familial case_13 and Sporadic case_8. Since TECTA gene is expressed at low levels in peripheral blood, it was not possible to test their effect on mRNA processing directly on the patients samples, however, according to the prediction of Human Splicing Finder software (http://www.umd.be/HSF3/) the c.6000-1G>T allele causes the total loss of the acceptor splice site, while the c.6000-1G>A leads to the loss of the acceptor splice site, together with the creation of a new site one nucleotide after.
Since ACTG1 de novo mutations are a rare cause of NSHL (Wang et al., 2018), protein modeling and molecular dynamics simulations of these two variants have been performed.
The structural stability of the WT and of the ACTG1 mutants was dynamically calculated using root mean square deviation (RMSD). Results showed that the deviation pattern was different for the mutants compared to the WT (Figure 3C-a); in particular, the p.(Thr66Ile) mutant deviated less than the WT, while the p.(Arg183Gln) mutant deviated more than the WT. This behavior correlates with the average number of hydrogen bonds, which are needed for protein's structural stability (Figure 3C-b). NA TRS led to the identification of four known causative, and ten potentially causative, heterozygous alleles in recessive genes. For all variants the affected gene, cDNA and protein change are indicated. All mutations have been classified based on their frequency reported in Genome Aggregation Database (gnomAD, http://gnomad.broadinstitute.org/) and their pathogenicity. In particular, the following tools have been used: GERP++ (higher number is more conserved), Polyphen-2 (D, Probably damaging; P, possibly damaging; B, benign), SIFT (D, deleterious; T, tolerated), MutationTaster "A" ("disease_causing_automatic"); "D" ("disease_causing"); "N" ("polymorphism"); "P" ("polymorphism_automatic"), and CADD (Pathogenicity score: > 10 predicted to be deleterious). NA, not available.  (Figure 3C-a).
The representative structures from molecular dynamics simulations showed that the mutational spots for both Thr66 and Arg183 are located in the binding region of the tropomyosin (Figure 3D). Results showed that the mutant p.(Thr66Ile), which is very close to the D-loop, causes a conformational change of the loop itself that alters the binding surface, while in the case of p.(Arg183Gln), both the tropomyosin-binding region and the D-loop are involved leading to a potential cavity on the binding surface (Figures 3E,F).
These results proved that both variants affect a key-binding region, introducing few critical changes in the structure.

POU4F3 and SMPX
Three families with an X-linked pattern of inheritance have been detected (Figure 4).
Familial case_15 carries a novel frameshift deletion [c.162delG; p.(Lys55Serfs * 25)] in SMPX gene (NM_014332.2), known for causing dominant X-linked NSHL ( Table 1), The proband presents post-lingual bilateral symmetric severe to profound medium-high frequencies HL, while the mother shows a monolateral profound to severe HL ( Figure 4A). As expected, no preferential X-inactivation has been detected in subject I:2.

SNP Arrays
Cases negative to TRS, or those showing a single variant in recessive genes, were analyzed with SNP arrays.

LOXHD1
In the present study we identified the second case of UPD ever described associated with HL, in a patient (Sporadic case_4) showing an early-onset bilateral symmetric severe to profound NSHL ( Figure 5A). Briefly, TRS revealed a novel homozygous mutation in LOXHD1 (NM_144612.6) [c.3071A>G; p.(Tyr1024Cys)], apparently segregating only from the father. The variant was predicted as damaging by all in silico predictor tools and affected the LH2 domain of the protein. SNP array analysis identified one run of homozygosity bigger than 8 Mb in length, spanning LOXHD1 gene on chromosome 18. Analysis of informative SNPs in parental samples and their comparison with the patient's genotype confirmed the presence of a paternal UPD. Moreover, a deep analysis of SNPs on the whole chromosome 18 confirmed the presence of both the small isodisomy segment spanning the LOXHD1 gene plus the presence of heterodisomy on the remaining parts of Chr18 ( Figure 5B).

OTOA
In Sporadic case_1, showing early-onset bilateral symmetric severe to profound medium-high frequencies hearing loss, a homozygous missense variant [c.1865T>A; p.(Leu622His), rs750007142] in OTOA gene (NM_144672.3) was detected. The variant, predicted as damaging by all in silico predictor tools, apparently segregated only from the father, thus the clinical case was further investigated by SNP array. Data analysis led to the identification of a large (∼228.5 Kb) heterozygous submicroscopic 16p12.2 deletion inherited from the mother (Fontana et al., 2017).

STRC and CATSPER2
In Familial Case_6 and Sporadic Case_15 two different deletion involving STRC gene (NM_153700.2) have been identified (Figures 5C-F). The first one belongs to an AR family composed of 4 members, 2 affected siblings (a 5-year old girl and a 3-year old boy) and their normal hearing parents. Both children showed a bilateral moderate symmetric SNHL characterized by a prelingual onset (Figure 5C). A 49.23 Kb deletion on chromosome 15 spanning through STRC and CATSPER2 (NM_172095.2) genes was identified ( Figure 5D). In the second case, a 10 y.o. girl affected by early-onset bilateral symmetric moderate NSHL, with no familiarity for HL (Figure 5E), an homozygous deletion of 49 Kb involving only STRC gene has been detected ( Figure 5F).

MYO6
In Familial Case_16 SNP arrays led to the identification of a novel heterozygous deletion affecting MYO6 gene, known for causing autosomal dominant NSHL. The family is composed of two affected siblings (37 and 35 y.o., respectively), the healthy mother (64 y.o.) and the affected father (80 y.o.). All affected individuals display adult onset bilateral moderate symmetric SNHL ( Figure 5G) and carry a deletion of 75.8 Kb in MYO6 gene ( Figure 5H).
Overall, the combination of an accurate clinical characterization, TRS and SNP arrays, led to the identification of the molecular cause of hearing loss in 31% of cases (37% of familial cases and 26.3% of sporadic cases). Moreover, among the 14 patients analyzed with SNP array (negative to TRS, or with a single variant in a recessive gene) we reached a molecular diagnosis in ∼36% of cases (5 out of 14).

DISCUSSION
Nowadays, the use of targeted re-sequencing in routine clinical diagnosis seems to be one of the most accurate approaches for the molecular diagnosis of highly heterogeneous genetic diseases, such as inherited deafness. Nevertheless, this approach has some limitations, not being able to accurately detect rearrangements such as deletions and duplications (i.e., CNVs) which might be involved in causing a significant proportion of genetic disorders (Yang et al., 2013). Moreover, recent studies have also suggested that runs of homozygosity (ROH) are much more frequent than previously recognized and, in some cases, can unravel UPD (Wang et al., 2015).
To overcome this limitation and to further increase the detection rate of HHL cases, we refined our strategy performing high-density SNP arrays in TRS-negative cases. Results confirmed that a multi-step integrated approach based on TRS followed by SNP arrays, is extremely powerful in advancing the molecular characterization of HHL. In particular, the largest study of Italian HHL patients so far carried out, demonstrates the importance of SNP arrays analysis in detecting the first case of UPD in LOXHD1 gene and confirming the importance of genomic rearrangements in the etiopathogenesis of hearing loss.
Combining TRS and SNP arrays, our strategy allowed to characterize 31% of the ∼65% of cases negative to GJB2 mutations (leading to an overall detection rate -including GJB2of ∼51%). These results seem to be in agreement with previous data reported by Shearer and Smith (2015) and Sloan-Heggen et al. (2015) in which a detection rate of 39% (including GJB2 gene) in a much larger cohort is described. Moreover, results of the present study demonstrated a higher detection rate in familial cases (37%), mainly characterized by autosomal dominant inheritance, compared to sporadic cases (26.3%). Also in this case, these data were in agreement with those reported in the literature (71% for familial cases and 37% for sporadic cases) (Shearer et al., 2011;Sloan-Heggen et al., 2016).
Our study highlights the importance of TECTA as the second most frequently mutated gene in the Italian population, following GJB2. Patients affected by TECTA mutations display an early onset symmetrical NSHL, with different degrees of severity (from mild-moderate, to severe-profound). According to literature data, we identified different genotype-phenotype correlations depending on the inheritance pattern and on which of the functional domain of TECTA was affected . Thus, as expected, Sporadic Case_8, affected by autosomal recessive NSHL, carried a splicing and a non-sense variant, and displayed a moderately severe to severe hearing phenotype (Asgharzade et al., 2017). As regards dominant families, the phenotype of the patients was much more variable and not always reflecting data of previous studies leading to new genotypephenotype links to be further investigated (Choi et al., 2014).
Another gene frequently mutated in our cohort of patients turned out to be ACTG1. Two out of the three mutations identified in this gene were detected as "de novo." So far, several mutations in ACTG1 gene have been described and analyzed by protein modeling (van Wijk et al., 2003;Rendtorff et al., 2006) although de novo ACTG1 variants seem to be a rare cause of NSHL (Wang et al., 2018). In this light, we further investigated their potential role by protein modeling and molecular dynamics simulations demonstrating a significant modification of two functional regions of the protein, the tropomyosin, and myosin binding sites and the D-loop. Considering that the binding of tropomyosin and myosin to actin is a key mechanism for regulating the normal function of the complex (Rayment et al., 1993;Behrmann et al., 2012) and that an alteration in actin filament regulation is an important factor in deafness caused by ACTG1 mutations, these de novo variants likely alter the protein activity, leading to deleterious effects (Lee et al., 2018).
Furthermore, another relevant finding of this study is the high prevalence of X-linked forms, which are expected to account for an inconsiderable part of the genetic forms (Smith et al., 2005) and that have been detected in three cases of both dominant and recessive X-linked NSHL, involving SMPX and POU3F4 genes. Among them, a careful attention should be directed to SMPX gene whose mutations have been recently reported to lead to a mild bilateral HL phenotype in females and a severe to profound early-onset HL in the affected males (Niu et al., 2017). Alternatively, as in the case of our family, the mother shows a monolateral severe to profound (sky-slope) HL while the proband displays a post-lingual bilateral symmetric severe to profound medium-high frequencies NSHL, as described in only one case worldwide (Weegerink et al., 2011). For this reason, during genetic counseling, it would be very important to carefully verify the monolateral clinical manifestations in affected females as an indicator of an X-linked form of HL.
Finally, our data highlight the huge importance of CNVs discovery in TRS negative cases, or in patients carrying only one mutated allele in recessive genes. In fact, among the 14 families analyzed by SNP arrays, a causative CNV was identified in 5 of them, explaining ∼36% of cases.
In particular, this approach allowed the detection of a paternal UPD in chromosome 18, involving LOXHD1 gene. This represents the second example of UPD associated to NSHL, in addition to that one described in GJB2, the most common mutated HL gene. In fact, the only cases of UPD so far described as causative of hearing loss, affect GJB2 gene (Yan et al., 2007).
Furthermore, CNVs analysis confirmed the significant contribution of STRC deletions in hearing loss as recently described in other populations (Plevova et al., 2017). STRC deletions are reported in approximately 1% of mixed deafness populations (Francey et al., 2012;Hoppman et al., 2013), making it as a major contributor to congenital hearing impairment, and can cause autosomal recessive NSHL (Verpy et al., 2001) or Deafness-infertility syndrome (DIS) in males if the adjacent CATSPER2 gene is also involved in the deletion (Zhang et al., 2009). Considering that in Familial Case_6 the deletion also involved CATSPER2 gene, it will be important to evaluate future fertility problems in the male sibling (since now he is only 3 years old).
Overall, our results proved that the combination of an accurate clinical evaluation, TRS of known NSHL genes and SNP arrays can effectively enhance, in a cost-effective way, the genetic characterization of NSHL, leading to the identification of new mutations/CNVs and thus helping in the clinical management of patients.

AUTHOR CONTRIBUTIONS
AM: samples quality control, data production and analysis, study design assessment; SL: data production and analysis; SC: Copy Number Variation (CNV) analysis; VP: Copy Number Variation (CNV) analysis; MM: data production and analysis; EO: samples recruitment, patients' clinical evaluation; SG: samples recruitment, patients' clinical evaluation; UA: samples recruitment, patients' clinical evaluation; MB: data analysis; PoG: protein modeling and molecular dynamics simulation; MB: data production; FF: patients' clinical evaluation; EG: samples recruitment, patients' clinical evaluation; FS: patients' clinical evaluation; AS: samples recruitment, patients' clinical evaluation; CG: samples recruitment, patients' clinical evaluation; MS: samples recruitment, patients' clinical evaluation; PaG: patients' clinical evaluation, data analysis, study design assessment; GG: data analysis, patients' clinical evaluation, study design assessment. All authors contributed in writing or revising the manuscript.