Structural variations causing inherited peripheral neuropathies: A paradigm for understanding genomic organization, chromatin interactions, and gene dysregulation

Abstract Inherited peripheral neuropathies (IPNs) are a clinically and genetically heterogeneous group of diseases affecting the motor and sensory peripheral nerves. IPNs have benefited from gene discovery and genetic diagnosis using next‐generation sequencing with over 80 causative genes available for testing. Despite this success, up to 50% of cases remain genetically unsolved. In the absence of protein coding mutations, noncoding DNA or structural variation (SV) mutations are a possible explanation. The most common IPN, Charcot‐Marie‐Tooth neuropathy type 1A (CMT1A), is caused by a 1.5 Mb duplication causing trisomy of the dosage sensitive gene PMP22. Using genome sequencing, we recently identified two large genomic rearrangements causing IPN subtypes X‐linked CMT (CMTX3) and distal hereditary motor neuropathy (DHMN1), thereby expanding the spectrum of SV mutations causing IPN. Understanding how newly discovered SVs can cause IPN may serve as a useful paradigm to examine the role of topologically associated domains (TADs), chromatin interactions, and gene dysregulation in disease. This review will describe the growing role of SV in the pathogenesis of IPN and the importance of considering this type of mutation in Mendelian diseases where protein coding mutations cannot be identified.

discoveries, numerous whole-exome sequencing studies to screen IPN cohorts have shown 18%-50% of cases (or higher) remain unsolved, creating a significant burden for IPN diagnosis (Drew et al., 2015; Hartley et al., 2017; Lupo et al., 2016; Schabhuttl et al., 2014). In these cases where exome sequencing has failed to identify mutations in specific genes, it is likely that a proportion of the neuropathy cases are due to noncoding DNA or structural variation (SV) mutations. We have recently identified novel SV mutations causing two forms of IPN: X-linked Charcot-Marie-Tooth neuropathy (CMTX3) and distal hereditary motor neuropathy (DHMN1) (Drew, Cutrupi, Brewer, Nicholson, & Kennerson, 2016). Our findings have added to the spectrum of mutations causing IPN and highlight the importance of interrogating the noncoding genome for SV mutations in IPN families in which genome-wide protein coding mutations have been excluded. We also propose that the CMTX3 and DHMN1 disease-causing genomic rearrangements may provide a suitable paradigm for studying the growing area of 3D genomic organization and its underlying role in mechanisms of gene dysregulation.

STRUCTURAL VARIATION
Genetic research in IPNs has been pioneering in the discovery of genomic disorders. Perhaps most importantly, it has expanded our understanding of the effects of copy number variation SVs in disease pathogenesis. Advances in sequencing and genomic technologies, including high-throughput massively parallel sequencing and array-based techniques such as array comparative genomic hybridization (aCGH), have led to the development of diverse approaches for SV discovery as well as for studying the functional and phenotypic consequences of SV rearrangements (Hoyle, Isfort, Roggenbuck, & Arnold, 2015; Hurles, Dermitzakis, & Tyler-Smith, 2008; Weischenfeldt, Symmons, Spitz, & Korbel, 2013).
Structural variation is a broad term that encompasses genomic rearrangements that disrupt chromosomal organization and architecture. There are many types of structural variation including duplications and deletions (also referred to as copy number variation [CNV]), insertions, inversions, and translocations. These have been reviewed in detail elsewhere (Alkan, Coe, & Eichler, 2011; Guan & Sung, 2016; Stankiewicz & Lupski, 2010). The size of SV DNA rearrangements can range from thousands to millions of base pairs (Gu, Zhang, & Lupski, 2008). SV can also be complex. A simple SV event may be part of a more complex rearrangement involving two or more types of SV. Such examples include translocation insertions (Antonarakis, Kazazian, & Tuddenham, 1995; Brewer et al., 2016), translocation deletions (Barber, Ford, Harris, Harrison, & Moorman, 2004), and inversion duplications (Kostiner, Nguyen, Cox, & Cotter, 2002). The scale and complexity of SV can therefore cause major reorganization of the regulatory landscape of genomic loci. SV accounts for a considerable proportion of genetic variability, phenotypic diversity, and human disease (Feuk, Carson, & Scherer, 2006; Guan & Sung, 2016; Redon et al., 2006; Weischenfeldt et al., 2013), with SVs accounting for more heritable differences between individuals than SNPs (Baker, 2012; Weischenfeldt et al., 2013). Reference to structural variation in both the scientific literature and in curated databases such as the Database of Genomic Variants (DGV; http://dgv.tcag.ca/dgv/app/home) has grown significantly (Baker, 2012; J. R. Lupski et al., 1991). SV represents an excellent candidate for disease-causing mutations in IPN, especially when genome-wide protein coding point mutations have been excluded in unsolved families.

Known structural variation causing IPN
The most common IPN subtype, CMT1A is caused by a 1.5-Mb tandem duplication of chromosome 17p11.2 (Lupski et al., 1991) which results in trisomy of the gene encoding peripheral myelin protein 22 (PMP22, OMIM:*106907) (Lupski et al., 1992;Patel et al., 1992;Timmerman et al., 1992;Valentijn et al., 1992). The CMT1A duplication represents the first and most common IPN mutation (Inoue et al., 2001), accounting for approximately 50% of all CMT cases (Katona et al., 2009). Reciprocal deletion of the same chromosome 17p11.2 region causes hereditary neuropathy with liability to pressure palsies (HNPP) which results in a mild, episodic peripheral neuropathy (Chance et al., 1993). At the time of reporting the CMT1A duplication/HNPP deletion, this represented a seminal discovery for structural variation (CNV) causing IPN and demonstrated the sensitivity of nervous tissue to gene dosage changes due to the gain or loss of a copy of the PMP22 gene.
Atypical genomic rearrangements have also been identified for the CMT1A/HNPP locus and other IPN loci (Table 1). For CMT1A these include duplications that differ in size from the 1.5 Mb genomic rearrangement (Valentijn et al., 1993) and small exonic deletions involving all or part of the PMP22 gene coding exons (Zhang et al., 2010). Two novel duplications that exclude the PMP22 coding region have also been identified in cases of CMT1A (Zhang et al., 2010). Weterman et al. reported a 186 kb duplication located upstream of the PMP22 gene in six unrelated families. Work by Zhang et al.,  in 2 of 21 subjects. The study also identified a novel 194 kb duplication upstream of the PMP22 gene in one patient (Zhang et al., 2010). Both the 186 kb and 194 kb duplication mutations shared a common overlapping region of approximately 168 kb (Zhang et al., 2010). This region has since been shown to contain several putative enhancers of the PMP22 gene (Jones et al., 2011, 2012) which when duplicated cause CMT1A. Cases describing whole or partial gene duplications or deletions at other CMT loci include MPZ (OMIM:*159440) (Hoyer, Braathen, Eek, Skjelbred, & Russell, 2011; Maeda et al., 2012), MFN2 (OMIM:*608507) (Carr et al., 2015), GJB1 (OMIM:*304040) (Ainsworth, Bolton, Murphy, Stuart, & Hahn, 1998; Lin et al., 1999; Nakagawa et al., 2001), and NDRG1 (OMIM:*605262) (Okamoto et al., 2014). Apart from these reports, however, CNV disrupting other IPN loci appears to be rare. Three independent studies used aCGH to examine the contribution of CNV to IPN and all concluded that apart from the CMT1A duplication/HNPP deletion, CNV as a disease mechanism in IPN is rare (Hoyer et al., 2015; Huang et al., 2010; Pehlivan et al., 2016).
Complex structural variations that disrupt known IPN loci have been described and include rearrangements for CMTX1 involving the entire coding region (Rouger et al., 1997), as well as a reciprocal translocation t(16;17)(q12;p11.2) causing HNPP (Nadal et al., 2000). The involvement and contribution of non-CNV structural variations as a pathomechanism in IPN therefore remains largely understudied and poorly understood.

Discovery of two complex insertions expands the mutation spectrum of IPN
Our laboratory has recently identified two novel SVs as the underlying genetic causes of an X-linked form of CMT (CMTX3, OMIM:%302802) and an autosomal dominant form of distal HMN (DHMN1, OMIM:%182960). Linkage analyses in large families mapped the respective disease loci to chromosome Xq26.3-q27.3 for CMTX3 (Huttner, Kennerson, Reddel, Radovanovic, & Nicholson, 2006) and chromosome 7q34-q36.2 for DHMN1 (Gopinath, Blair, Kennerson, Jennifer, & Nicholson, 2007). Following identification of the respective disease loci, whole-exome sequencing, CNV, and karyotype analyses failed to identify any candidate gene mutations for CMTX3 and DHMN1. Paired-end whole-genome sequencing (WGS) was then performed on multiple affected individuals from the CMTX3 and DHMN1 families and analyzed for split and discordantly mapped reads. The analysis revealed a large complex insertion within each of the CMTX3 and DHMN1 disease loci. For CMTX3, sequence from chromosome 8q24.3 was inserted into chromosome Xq27.1; for DHMN1, sequence from chromosome 7q36.3 was inserted into the DHMN1 locus at chromosome 7q36.2 in the reverse orientation (Figure 2). Both SVs segregated with the disease in their respective families. The SVs were absent in the unaffected individuals sent for WGS, 1054 neurologically normal control chromosomes, and the DGV database.
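The idea behind screening paired-end data for split and discordantly mapped reads can be illustrated with a minimal Python sketch. It is not the pipeline used in the studies cited above: the `ReadPair` record, the `classify_pair` function, the category names, and the `max_insert` threshold are all hypothetical, and a real analysis would read alignments from a BAM file with a dedicated SV caller.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Simplified, hypothetical record of one aligned read pair."""
    chrom1: str
    pos1: int
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int
    strand2: str
    soft_clipped: bool = False  # True if either mate aligns only in part


def classify_pair(pair: ReadPair, max_insert: int = 1000) -> str:
    """Label a read pair as concordant, split, or one kind of discordant.

    max_insert is an assumed upper bound on the expected distance between
    mates; in practice it is estimated from the sequencing library.
    """
    if pair.soft_clipped:
        # A partially aligned mate may span a breakpoint directly.
        return "split"
    if pair.chrom1 != pair.chrom2:
        # Mates on different chromosomes, as expected for an
        # interchromosomal insertion or a translocation.
        return "discordant_interchromosomal"
    if pair.strand1 == pair.strand2:
        # Same-strand mates suggest an inverted segment.
        return "discordant_orientation"
    if abs(pair.pos2 - pair.pos1) > max_insert:
        # Mates too far apart suggest a deletion or distant insertion.
        return "discordant_distance"
    return "concordant"
```

Under these assumptions, a cluster of pairs labelled `discordant_interchromosomal` near one position would be the kind of signal that flags an interchromosomal insertion, whereas same-chromosome pairs with unexpected orientation would be consistent with an inverted segment.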
Our two SV discoveries expand the spectrum of mutations known to cause IPN (Figure 3) and raise some significant questions with respect to the biological impact of SV mutations in peripheral nerve and their role in gene dysregulation as a disease mechanism for IPN.

PATHOGENIC MECHANISMS OF SV MUTATIONS
Insights into the mechanisms that underpin the relationship between SV and disease pathogenesis have been garnered from studies on a range of diseases including Hemophilia A (reviewed in Tuddenham et al., 1991), Parkinson's Disease (Marongiu et al., 2007; Singleton et al., 2003), and Aniridia (Fantes et al., 1995; Fukushima et al., 1993; Simola, Knuutila, Kaitila, Pirkola, & Pohja, 1983). These studies have revealed that SVs can cause disease by several mechanisms including (i) physical disruption of the DNA structure of genes, (ii) the unmasking of recessive alleles or functional polymorphisms, (iii) alteration of gene dosage of dosage sensitive genes via copy number change, and (iv) disrupting/altering the spatiotemporal control of gene expression causing gene dysregulation/ectopic expression (Kleinjan & Lettice, 2008; Kleinjan & Van Heyningen, 2005; Spielmann & Mundlos, 2013). Given that the CMTX3 and DHMN1 SVs are complex insertions, we propose two possible mechanisms that could lead to the different neuropathies in our families: (i) altered gene dosage resulting from trisomy of the complex inserted regions; or (ii) transcriptional dysregulation of one or more genes mapping within the CMTX3 and DHMN1 disease loci and flanking regions.

Gene dosage
The CMT1A/HNPP duplication/deletion sets a precedent for an SV disease mechanism in IPN and shows that peripheral nerves are sensitive to gene dosage changes. For both CMTX3 and DHMN1, trisomy of any gene transcripts (whole or partial) within the inserted DNA may result in gene dosage effects and produce neuropathy. For CMTX3, qPCR showed no differential expression of ARHGAP39 in lymphoblasts between patients and controls, suggesting that trisomy of the partial transcript was not likely to be the mechanism underlying CMTX3 pathology. When testing genes for DHMN1, the expression of the partial transcript UBE3C was higher in patients when compared to controls, which may suggest a gene dosage mechanism for DHMN1 (unpublished data). Interestingly, the inserted sequence from chromosome 7q36.3 contains the fully intact MNX1/HB9 gene along with associated promoter and flanking sequences. MNX1/HB9 encodes a motor neuron transcription factor which is crucial for the consolidation of motor neuron identity (Arber et al., 1999). Models of ALS (Peviani et al., 2012) and CMT2A (Detmer, Vande Velde, Cleveland, & Chan, 2008) have used the MNX1/HB9 gene promoter to express genes of interest in mouse motor neurons. Although a gene dosage effect cannot currently be ruled out for DHMN1, these studies suggest that it is highly plausible that the MNX1/HB9 promoter or flanking regulatory sequences could be driving expression of nearby genes in DHMN1 motor neurons. To address this question and the relevance of gene expression changes in peripheral nerve of DHMN1 patients, these findings will need to be reproduced in patient-derived neural tissue.
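To make the comparison of qPCR expression between patients and controls concrete, the following Python sketch computes a relative expression ratio. It assumes the widely used 2^(-ΔΔCt) calculation with a single reference gene; the text does not state which quantification method was used, and the function name, its parameters, and the Ct values in the usage note are illustrative only, not data from the studies discussed.

```python
def relative_expression(ct_target_patient: float,
                        ct_ref_patient: float,
                        ct_target_control: float,
                        ct_ref_control: float) -> float:
    """Fold change of a target gene in patient versus control samples.

    Assumes the 2^(-ddCt) method: each target Ct is first normalised to a
    reference (housekeeping) gene, then the patient value is compared with
    the control value. Amplification efficiency is assumed to be 100%.
    """
    delta_ct_patient = ct_target_patient - ct_ref_patient
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_patient - delta_ct_control
    return 2 ** (-delta_delta_ct)
```

For example, `relative_expression(24.0, 18.0, 25.0, 18.0)` returns `2.0`, meaning the hypothetical target transcript is twice as abundant in the patient sample. For comparison, a simple dosage model with three gene copies instead of two would predict a ratio of about 1.5.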

Disruption to transcriptional regulation
Gene dysregulation through aberrant transcriptional regulation is not unprecedented as a disease mechanism in IPN as point mutations have previously been reported in noncoding regulatory sequences (Tomaselli et al., 2017). Point mutations in the 5′ untranslated region (UTR), 3′ UTR (Ionasescu, Searby, Ionasescu, Neuhaus, & Werner, 1996) and the neural-specific promoter (P2) sequences of the GJB1 gene (Houlden et al., 2004) have been reported to cause CMTX1. Similarly, point mutations in the 5′ UTR of the SEPT9 (OMIM:*604061) gene cause hereditary neuralgic amyotrophy (HNA) (Kuhlenbaumer et al., 2005). These mutations highlight the importance of noncoding sequence mutations in IPN as well as the broader context of human disease.
Structural variations like those causing CMTX3 and DHMN1 may disrupt transcriptional regulation of one or more genes by altering the cis-acting regulatory environment. The cis-acting regulatory environment contains a number of different sequences or "regulatory elements" that bind trans-acting regulatory proteins (transcription factor binding proteins, TFBPs) and modulate the activity of genes and other transcribed regions of the genome at distances of up to 2-3 Mb. Large genomic rearrangements have been shown to disrupt noncoding regulatory DNA sequences and thereby structurally alter the regulatory environment. Significant changes in the activity of one or more genes resulting in disease can occur by disrupting the interaction between genes and regulatory sequences (such as promoters, enhancers/repressors) or by introducing new or altered chromatin interactions (Dixon et al., 2012; Lupianez et al., 2015; Spielmann & Mundlos, 2013).

Quantitative gene expression analysis showed that the CMTX3 candidate gene FGF13 (OMIM:*300070) had increased expression in patient lymphoblasts compared to controls and confirmed the complex insertion of chromosome 8q24.3 sequence into chromosome Xq27.1 can dysregulate genes in the CMTX3 linkage region. Further gene expression studies in disease-relevant tissues for candidate genes mapping to the CMTX3 and DHMN1 loci are therefore necessary to fully elucidate the pathogenic consequences of the CMTX3 and DHMN1 complex insertions.
Interestingly, several other phenotypes in addition to CMTX3 have been reported in which different large chromosomal sequences have been inserted into the same region of chromosome Xq27.1. These include ptosis (Bunyan et al., 2014), hyperthyroidism (Bowl et al., 2005), hypertrichosis (De Stefano et al., 2013; Zhu et al., 2011), and XX male sex reversal (Haines et al., 2015). Although a precise disease mechanism is yet to be elucidated, the multiple phenotypes observed due to insertions at chromosome Xq27.1 raise two possibilities: (i) juxtaposition of a candidate gene with a regulatory element that would not normally regulate that gene; (ii) the inserted sequence could contain gene regulatory elements which, when placed in proximity to genes in the CMTX3 linkage region, may result in ectopic neural-specific expression.

Altered chromatin environment
Another important consideration for pathogenic mechanisms underlying the CMTX3 and DHMN1 SV mutations is the effect of structural changes at the chromatin level of organization. Changes in the chromatin structure at a given locus can mediate the interaction between genes and their associated regulatory sequences (Kleinjan & Lettice, 2008; Schluth-Bolard, Ottaviani, Gilson, & Magdiner, 2011). It is becoming increasingly recognized that these interactions take place within the context of a "3D genome" in which multiple, complex genome interactions and configurations determine important biological functions such as DNA replication, DNA repair, and transcription (Bonev & Cavalli, 2016). Chromosome conformation capture studies examining the frequency of chromatin interactions have demonstrated that functions of the genome such as control of gene expression are moderated by chromosomes being linearly partitioned into topologically associated domains (TADs) (Bonev & Cavalli, 2016; Dixon et al., 2012; Krijger & De Laat, 2016; Olivares-Chauvet et al., 2016). TADs are regions that demarcate chromosomal microenvironments in which sequences far apart in the genome preferentially come into close proximity to make contact with each other (Bouwman & De Laat, 2015). The contact between these linearly disparate sequences occurs through the 3D conformation of the genome. Little is known about the extent to which pathogenic SVs alter the spatial organization of the genome to cause disease. However, work in developmental and congenital disorders and cancer has provided some useful insights into the potential mechanisms that underpin the gene dysregulation resulting from altered spatial organization of the genome. These include (i) disruption to TAD boundaries, resulting in aberrant enhancer-promoter interactions; and (ii) altering the contents of TADs, thereby introducing novel interactions (reviewed in Kaiser & Semple, 2017).
Interestingly, studies have shown that the architecture of TADs remains relatively conserved at the megabase scale between tissue types (Berlivet et al., 2013; Dixon et al., 2012). However, at the sub-megabase scale there is evidence suggesting that "sub-TADs" (Phillips-Cremins et al., 2013) can vary among different tissues (Berlivet et al., 2013; Phillips-Cremins et al., 2013), particularly for differentially expressed genes (Dixon et al., 2012). Therefore, it is plausible that CMTX3 and DHMN1 could represent 'TADopathies' (Matharu & Ahituv, 2015) in which the disruption of sub-megabase scale interactions within peripheral nerve sub-TADs could produce neuropathy. The CMTX3 and DHMN1 genomic rearrangements therefore represent an ideal naturally occurring paradigm to study the disruption of chromatin organization and the impact on gene regulation.
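As a rough illustration of how an insertion can alter the contents of a TAD, the Python sketch below takes a list of TAD boundary positions on one chromosome and reports which genes lie in the same domain as an insertion site, and so could in principle be exposed to regulatory elements carried by the inserted sequence. The function names, the boundary representation, and every coordinate and gene name in the usage note are hypothetical placeholders; they are not measured TAD boundaries or real gene positions for the CMTX3 or DHMN1 loci.

```python
from bisect import bisect_right


def tad_index(boundaries: list[int], pos: int) -> int:
    """Return i such that boundaries[i] <= pos < boundaries[i + 1].

    boundaries must be sorted; consecutive values delimit one TAD.
    """
    i = bisect_right(boundaries, pos) - 1
    if i < 0 or i >= len(boundaries) - 1:
        raise ValueError("position lies outside the annotated TADs")
    return i


def genes_sharing_tad(boundaries: list[int],
                      genes: dict[str, int],
                      insertion_pos: int) -> list[str]:
    """List genes located in the same TAD as the insertion site.

    genes maps a gene name to a single representative position,
    for example its transcription start site.
    """
    i = tad_index(boundaries, insertion_pos)
    start, end = boundaries[i], boundaries[i + 1]
    return sorted(name for name, pos in genes.items() if start <= pos < end)
```

With placeholder boundaries `[0, 1_000_000, 2_500_000, 4_000_000]` and placeholder genes `{"geneA": 500_000, "geneB": 1_200_000, "geneC": 2_400_000, "geneD": 3_000_000}`, an insertion at position 1,800,000 shares a domain with `geneB` and `geneC` only. The sketch models the TAD-content mechanism; an SV that removes or relocates a boundary would instead require recomputing the boundary list itself.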

STRATEGIES TO STUDY SV CAUSING IPNS
Unlike protein coding mutations, SV mutations such as those causing CMTX3 and DHMN1 may not immediately reveal a causative gene. This raises some important considerations and challenges within this exciting area of IPN research.
Large SVs causing pathogenic genomic rearrangements will increase the number of candidate genes to consider as causative. In some diseases, multiple genes disrupted by SV events may contribute to disease pathology (reviewed in Iyer & Girirajan, 2015). For CMT1A/HNPP, although the 1.5 Mb region encompasses 8 gene transcripts and several noncoding RNAs, disease is specifically caused by altered dosage of the PMP22 gene. Given that CMTX3 and DHMN1 are Mendelian diseases, it is likely that the neuropathy will be caused by mutations affecting a single gene. In the event that multiple genes are dysregulated, identifying point mutations in these candidate genes that segregate in unsolved families could provide further evidence for the gene being causative. This was the case for CMT1A and HNPP where point mutations in the PMP22 gene were identified in families lacking the 1.5 Mb duplication/deletion (Nicholson et al., 1994; Roa et al., 1993).
To assess the effects of SV mutations, appropriate models are required. Modelling large genomic rearrangements like the CMTX3 and DHMN1 complex insertions is difficult as they exceed the size limit and cloning accuracy of current technologies. In addition, obtaining disease-relevant tissue is not possible prior to postmortem. Many studies for IPN have examined gene expression in alternative tissues such as lymphoblast (Zimon et al., 2012) and fibroblast cells (Echaniz-Laguna et al., 2013; Kennerson et al., 2010). For CMTX3 and DHMN1, observable differences in the expression of FGF13 and UBE3C have been identified in lymphoblasts. However, given that the regulation of gene transcription is likely to occur in a neural tissue-specific manner, it is not yet clear whether these observed differences are relevant to the CMTX3 and DHMN1 phenotype. To examine gene expression in neural tissue, motor neurons derived from patient-induced pluripotent stem cells (iPSCs) are now being used. Several studies have used iPSC-derived motor neurons (iPSC-MNs) to model neuromuscular diseases including ALS (Chen et al., 2014; Ichiyanagi et al., 2016; S. Lee & Huang, 2017), SMA (Fuller et al., 2015; Nizzardo et al., 2015) and various IPNs (G. Lee et al., 2009; Saporta et al., 2015). iPSC-MNs have the potential to model neural-specific phenotypic changes, gene expression, and chromatin interactions in disease-relevant tissue. This will help to elucidate the disease pathology and identify biological pathways to target for the development of appropriate therapies.

CONCLUSION
Inherited peripheral neuropathies and other Mendelian diseases have benefited from WES to interrogate protein coding regions of the genome. However, despite the advances in gene discovery and diagnostic testing many cases still remain unsolved. With WGS becoming more cost-effective, and the improvements in technologies to generate longer sequencing reads, looking beyond the exome for SV mutations is feasible and an important consideration for unsolved cases of IPN and other Mendelian diseases. The current active research to understand the role of genomic interactions in gene regulation will be facilitated by the use of patient derived neural tissue from CMTX3 and DHMN1 patients through iPSC technologies. This review highlights the contribution of SV mutations and gene dysregulation as an important development in understanding the pathogenesis of IPNs.

ACKNOWLEDGMENTS
This work was supported by the National Health and Medical Research Council project grant (APP1046680) awarded to M.L.K. and G.A.N. A postgraduate scholarship from Sydney Medical School (Pamela Jeanne Elizabeth Churm Postgraduate Research Scholarship) supported A.N.C. We thank the CMT Association of Australia and the CMT families who have participated in the research.

CONFLICT OF INTEREST
None declared.