Metabolic, fibrotic and splicing pathways are all altered in Emery-Dreifuss muscular dystrophy spectrum patients to differing degrees

Abstract Emery-Dreifuss muscular dystrophy (EDMD) is a genetically and clinically variable disorder. Previous attempts to use gene expression changes to find its pathomechanism were unavailing, so we engaged a functional pathway analysis. RNA-Seq was performed on cells from 10 patients diagnosed with an EDMD spectrum disease with different mutations in seven genes. Upon comparing to controls, the pathway analysis revealed that multiple genes involved in fibrosis, metabolism, myogenic signaling and splicing were affected in all patients. Splice variant analysis revealed alterations of muscle-specific variants for several important muscle genes. Deeper analysis of metabolic pathways revealed a reduction in glycolytic and oxidative metabolism and reduced numbers of mitochondria across a larger set of 14 EDMD spectrum patients and 7 controls. Intriguingly, the gene expression signatures segregated the patients into three subgroups whose distinctions could potentially relate to differences in clinical presentation. Finally, differential expression analysis of miRNAs changing in the patients similarly highlighted fibrosis, metabolism and myogenic signaling pathways. This pathway approach revealed a transcriptome profile that can both be used as a template for establishing a biomarker panel for EDMD and direct further investigation into its pathomechanism. Furthermore, the segregation of specific gene changes into distinct groups that appear to correlate with clinical presentation may template development of prognostic biomarkers, though this will first require their testing in a wider set of patients with more clinical information.


Introduction
Emery-Dreifuss muscular dystrophy (EDMD) is a genetically heterogeneous neuromuscular orphan spectrum disease affecting ∼0.3-0.4 in 100 000 people (1), with clinical variability presenting even in family members carrying the same mutation (2)(3)(4)(5). EDMD patients present typically in mid to late childhood with early contractures of elbows and Achilles' tendons and progressive wasting of the lower leg and upper arm muscles. Cardiac involvement is also highly characteristic but tends to appear later in development and quite variably in time, though it tends to be reasonably uniform in the form it takes of cardiac conduction defects and dilated cardiomyopathy (6). Other features vary considerably in clinical presentation, leading to the usage of 'Emery-Dreifuss-like syndromes' (7,8): patients from the same pedigree can show remarkable phenotypic variation (2). The genetic variability is underscored by several confirmed linked genes and several additional candidate genes, although there are still some cases where no confirmed or candidate disease allele has been identified (9)(10)(11). The lack of large pedigrees in combination with its genetic heterogeneity, clinical variability, already some known modifier genes and limited patient numbers makes solving its pathomechanism difficult.
The original genes linked to EDMD, EMD encoding emerin and LMNA encoding lamin A/C, have both cytoskeletal and gene regulation roles leading to strong arguments for either function being responsible for the EDMD pathomechanism (12,13). The subsequent linking of nesprin and Sun proteins to EDMD (14,15) failed to lend clarity since they function in mechanosignal transduction (16). However, several recently linked genes have clear roles in genome organization and regulation (10), suggesting that this is the pathomechanism. These genes encode proteins that, such as emerin, are nuclear envelope transmembrane proteins (NETs) and seem to function by fine-tuning muscle gene expression by promoting the release of pro-myogenic genes from the nuclear periphery to enhance their activation while concomitantly recruiting metabolism genes (many from the alternative differentiation pathway of adipogenesis) to the nuclear envelope to better repress them (17)(18)(19). EDMD mutations were found in five muscle-specific NETs with this genome organization function, PLPP7 (also known as NET39), WFS1, TMEM38A, TMEM201 and TMEM214, and each tested had some specificity in the sets of genes that they target, though there was also some overlap (17). These studies together with the wide range of lamin and emerin gene regulatory activities led us to the non-traditional hypothesis for the EDMD pathomechanism whereby moderate reductions in many genes could yield the same phenotype as shutting down a central gene of a particular pathway. Accordingly, we considered that searching for uniformity among patients in altered pathways might be more revealing than searching for uniformity in expression changes of particular genes.
The only previous study, to our knowledge, using gene expression changes to identify critical misregulated genes underlying EDMD pathophysiology, focused on the identification of genes altered specifically in EDMD compared with a set of 10 other muscular dystrophies (20). This study only considered eight total LMNA-or EMD-linked cases of EDMD, but EDMD now has many more genes and modifiers linked to it and, moreover, there is a wider clinical spectrum of EDMD-like phenotypes (11). Their analysis indicated potential abnormalities in the regulation of cell cycle and myogenic differentiation, associated with perturbations in the pRB/MYOD/LMNA hub, which were consistent with changes in an Emd −/− mouse model (21). Roughly a fifth each of EDMD mutations occurs in LMNA and EMD while another 5-6% are collectively caused by four other widely expressed nuclear envelope proteins nesprin 1 (encoded by SYNE1), nesprin 2 (encoded by SYNE2), Sun1 (encoded by SUN1) and FHL1 (encoded by FHL1) (14,15,(22)(23)(24)). Another approximately 20% of EDMD mutations were accounted for by muscle-specific NETs that regulate musclespecific genome organization (10). These include NET39 (encoded by PLPP7), TMEM38A (encoded by TMEM38A), WFS1 (encoded by WFS1), NET5 (encoded by TMEM201) and TMEM214 (encoded by TMEM214) that affect 3D gene positioning with corresponding effects on expression (17,19). Accordingly, we sought to search for commonly affected pathways from a much wider range of EDMDlinked genes including LMNA, EMD, FHL1, SUN1, SYNE1, PLPP7 and TMEM214 alleles on the expectation that, covering an even wider genetic and clinical spectrum, the most important pathways for EDMD pathophysiology would be highlighted.

RNA-Seq analysis of EDMD patient cells
We performed RNA-Seq on myotubes differentiated in vitro from myoblasts isolated from 10 unrelated clinically diagnosed EDMD patients with distinct mutations in seven different genes to sample the genetic diversity of EDMD (Fig. 1). Patient mutations were TMEM214 p.R179H, PLPP7/NET39 p.M92K, Sun1 p.G68D/G388S, Nesprin 1 p.S6869 * , Emerin p.S58Sfs * 1, FHL1 mutations c.688 + 1G > A, p.C224W and p.V280M, Lamin A/C mutation p.T528K and Lamin A mutation p.R571S (this last mutation occurs in an exon absent from the lamin C splice variant). These mutations covered a wide range of clinical phenotypes with the age of onset ranging from early childhood to adult life and associated pathology ranging from no reported contractures to rigid spine (Table 1). Myoblast isolation followed by in vitro differentiation was chosen over isolating mRNA directly from the tissue samples in order to try to capture the earliest changes in gene expression due to the disease mutations i.e. the most likely to initiate the pathomechanism. Differentiating isolated myoblasts cells also reduces tertiary effects and variation from the age of the patients, range in time from onset to when biopsies were taken and differences in biopsy site (Table 1). These patient variables could also affect the efficiency of myotube differentiation; so, to ensure that different percentages of undifferentiated cells in the population did not impact measuring gene expression changes, the myotubes were specifically isolated by short trypsinization thus removing all myoblast contamination ( Fig. 1A and Supplementary Material, Fig. S1). To define the baseline for comparison, two age-matched healthy controls were similarly analyzed. Samples from all patients and controls all yielded high-quality reads ranging between 56 and 94 million paired-end reads (Supplementary Material, Table S1).
First, we compared each individual patient against the controls. Compared with the controls, each individual patient had between 310 and 2651 upregulated genes and between 429 and 2384 downregulated genes with a false discovery rate (FDR) of 5% (Fig. 1B). The large difference in the number of differentially expressed (DE) genes between patients suggested large heterogeneity. When we calculated the intersection of DE genes in all patients, only three genes were similarly downregulated (MTCO1P12, HLA-H, HLA-C), and one upregulated (MYH14) at 5% FDR, indicating a high degree of variation between patients (Supplementary Material, Fig. S2A). MTCO1P12 is a mitochondrially encoded pseudogene that has been reported to be severely downregulated in inf lammatory bowel disease, associated with reduced mitochondrial energy production (25). HLA-C is a member of the MHC class I and is involved in interferon gamma signaling, while HLA-H is a pseudogene derived from HLA-A which may function in autophagy (26)(27)(28). Mutations in non-muscle myosin gene MYH14 appear to be associated with hearing loss rather than muscle defects (29,30), although it has also been recently linked to mitochondrial fission defects (31).
The small number of intersecting genes was expected, given the heterogeneity and the number of samples. However, this kind of comparison underestimates the underlying similarity since all it takes is one patient for any given gene to miss the arbitrary 5% FDR cutoff and the gene would not be selected. Because the goal of this study is to identify common features among a sample of EDMD spectrum patients, instead of focusing on individual patients we next compared all patients together as a single group against the controls, and we denoted this analysis as G1 (one single group) throughout the text (Fig. 1). The G1 analysis revealed a set of 1127 DE genes (894 upregulated and 233 downregulated) with a 5% FDR cutoff. Preliminary examination of this set of genes across the 10 patients suggested that they might fall into two or three broadly distinct profiles with a main group comprising half of the patients that includes the patient with the classical emerin mutation. Around 60% of the genes were altered in the same direction in all 10 patients ( Fig. 1 and Supplementary Material, Fig.  S2B and C). Thus, while the majority of the 1127 genes behave in a similar fashion (albeit with differences in gene expression levels), there are substantial underlying differences and apparent subgroups of patients with more similar expression profiles that might ref lect clinical variation in EDMD.
Hierarchical clustering identified one main subgroup comprising half of the patients that includes the EMD mutation, with the remaining patients falling more loosely into two smaller groups ( Fig. 1C and D). While it is tempting to suggest these groups may represent separate EDMD spectrum subcategories, the small cohort employed precludes this conclusion. However, we reasoned that the distinction may bear clinical relevance and provide proof of principle that transcriptome analysis could assist clinicians in diagnosis and prognosis of EDMD spectrum patients. The three groupings were: group 1-Emerin p.S58Sfs * 1, Tmem214 p.R179H, NET39 p.M92K, Lamin A p.R571S, FHL1 p.C224W; group 2-Sun1 p.G68D/G388S, FHL1 c.688 + 1G > A, FHL1 p.V280M; group 3-Nesprin 1 p.S6869 * , Lamin A/C p.T528K (Fig. 1D). The same groups were independently identified using a principal component analysis (PCA) (Supplementary Material, Fig. S2D). Of particular note, both PCA and t-distributed stochastic neighbor embedding   Table 1 and myoblasts recovered. These were then differentiated in vitro into myotubes, and myotubes selectively recovered by partial trypsinization. Myotube RNA was extracted and used for sequencing. analyses revealed that clustering was independent of parameters such as patient gender or age or myotube enrichment differences (Supplementary Material, Fig. S3). Moreover, the several FHL1 and lamin mutations tested segregated into different expression subgroups. At the same time, the more recently identified EDMD mutations in TMEM214 and NET39 segregated with more classic emerin, FHL1 and lamin A mutations, further indicating the likelihood that their genome organizing functions could mediate core EDMD pathophysiology. The hypothesis that EDMD is a disease of genome organization misregulation is underscored by the fact that 15% of the genes changing expression in EDMD patient cells were altered by knockdown of at least one of the four muscle-specific genomeorganizing NETs that we previously tested (17) (Fig. 1E). Interestingly, in that study most of the genes altered by knockdown of NET39, TMEM38A and WFS1 were non-overlapping, while those altered by knockdown of TMEM214 exhibited considerable overlap with the sets altered by each of the other NETs (17). However, here there were roughly 70 DE genes overlapping with the sets of genes altered by knockdown of each individual NET while the total number of DE genes under the regulation of any of the four NETs was 165, indicating an enrichment in the EDMD DE set for genes inf luenced by multiple NETs (Fig. 1E). Thus, it is not surprising that NET39 and TMEM214 were both segregated together. Another interesting observation is that the number of NET-regulated genes overlapping with group 1 and group 3 was similar, but much fewer were overlapping for group 2. This suggests that gene misregulation in groups 1 and 3 might be more strongly mediated by the muscle-specific NET-gene tethering complexes than in group 2.

Functional pathway analysis of gene expression changes in EDMD patient cells
The primary aims of this study were to determine whether a functional pathway analysis would be more effective at revealing the likely underlying EDMD pathomechanism than just looking for uniformly altered genes and, if so, to identify candidate biomarkers from the affected pathways, though these would require subsequent validation due to the limited number of patient cells available for analysis. Before using this approach with our wider set of EDMD alleles, we applied a pathway analysis to the data from the previous microarray study by Bakay and colleagues where just LMNA and EMD mutations were considered (20). We reanalyzed Bakay's EDMD data and extracted the subset of DE genes with FDR of 5% (1349 and 1452 upregulated and downregulated genes, respectively). In order to identify enriched functional categories within each set of DE genes, we used g:Profiler (32). This tool calculates the expected number of genes to be identified for any given functional category by chance and compares it to the number of genes observed. We selected categories that were significantly enriched with an FDR of 5%. The resulting list was then summarized by selecting representative classes using a similar approach to Revigo (33), but extended to other functional category databases in addition to gene ontology (GO) terms. Brief ly, similarity matrices were generated by calculating pairwise Jaccard similarity indices between categories and applying hierarchical clustering to group together similar functional categories based on the genes identified. Redundancy was then reduced by choosing a representative category from each group.
The functional categories enriched in the set of EDMDupregulated genes revealed defects in cytokine signaling, organization of the extracellular matrix (ECM), and various signaling pathways important for muscle differentiation and function (e.g. PI3K-Akt, TGF-beta, SMADs). In addition, there was an aberrant upregulation of alternative differentiation pathways, notably adipogenesis but also angiogenesis and osteogenesis. The functions highlighted among the downregulated genes were largely related to metabolism, mitochondrial especially, as well as ribosome biogenesis, muscle contraction and myofibril assembly ( Fig. 2A). Applying the same methodology to our wider set of patient alleles highlighted fewer pathways than what we observed in the Bakay EDMD data, an expected outcome for the hypothesis that sampling the wider genetic variation in the disorder might hone in on the most central pathways for the pathomechanism. Among the upregulated categories, neurogenesis and ECM-related functions stood out, as well as MAPK signaling, lipid transport and TAP binding which are linked to interferon-gamma signaling (34). One category stands out among the downregulated genes: RNA splicing (Fig. 2B). The data above used g:Profiler which is very sensitive to the number of DE genes identified because it looks for statistical overrepresentation of genes belonging to specific functional categories among a set of previously identified DE genes. By contrast, gene set enrichment analysis (GSEA) (35) does not prefilter the data and instead ranks all genes according to the difference in expression between the two conditions tested: controls and EDMD. Next, it determines whether the distribution in the ranked list for any given functional category is random or significantly enriched statistically at either end of the ranked list. This method is especially sensitive for detecting functional categories where many genes are altered by a small amount and does not consider individual gene Pvalues. Therefore, we also applied GSEA to our data, querying several functional genesets within the Reactome, KEGG and WikiPathways databases, found in the Molecular Signatures Database (MSigDB) (36). This approach identified a larger set of functional categories that generally expanded on those identified by g:Profiler and matched better what we observed from the Bakay EDMD geneset, with strong links to ECM organization and cytokine signaling that may be relevant to fibrosis, differentiation, metabolism and splicing (Fig. 2C).
An expansion of categories for metabolic functions included specific categories for diabetes mellitus, adipogenesis, white and brown fat differentiation, nitrogen metabolism fatty acid metabolism, retinol metabolism and many others. Similarly, there was an expansion of cytokines supporting inf lammation for the fibrotic pathways and proteoglycans and elastin adding to the previous emphasis on collagens for ECM defects. Among the differences between our data and Bakay's EDMD data, two categories stand out: RNA splicing and calcium signaling, which were only observed in our data. It is unclear how much this ref lects using terminally differentiated muscle material versus early stages of differentiation in vitro, or a factor of microarray versus RNAseq analysis. In some cases, this is most likely due to the different transcriptome platform used. For example, applying GSEA to genomic positional genesets revealed near uniform upregulation of all mitochondrially encoded genes (Figs 2D and 3A). This could not be observed on Bakay's data because the microarrays did not contain probes for mitochondrially encoded genes. The upregulation of mitochondrial transcripts could lead to increased oxidative stress (37). This finding provides yet another mechanism that could lead to metabolic dysregulation on top of the alterations already indicated by the nuclear genome transcript changes. This further underscored the need to test for actual metabolic deficits in the patient cells themselves as well as to further investigate the other functional pathways highlighted by this analysis.  Table S3 for further details.

Detailed analysis of metabolic pathways uniformly altered in EDMD patients
Since metabolic disruption has been previously reported to affect muscle differentiation/myoblast fusion (38), we decided to investigate this further. While we identified a general upregulation of mitochondrially encoded genes, Bakay's data showed a downregulation of several classes related to mitochondrial function ( Fig. 2A) which was due entirely to nuclear-encoded genes, as there were no mitochondrial genes represented in the microarrays. When we checked the behavior of those genes in our data, we did not observe the same downregulation when considering all 10 patients as a single group (Fig. 3B). However, this is largely due to variability among the patient subgroups identified earlier, suggesting a mechanistic breakdown between them. Group 1 which contained half of our patients, including emerin and lamin A mutations, exhibited the same general downregulation of the nuclear-encoded mitochondrial genes. In contrast, group 2 displayed no alteration in gene expression, while group 3 showed upregulation although this was driven mostly by patient 9 (LMNA) with the other patient in the group, patient 4 (SYNE1), displaying very few changes. While no single gene was uniformly altered in the same direction for all patients, several genes from glycolytic and oxidative metabolism pathways, typically encoding components of mitochondrial complexes, were altered in all tested patients. Other non-mitochondrial metabolic pathways were also altered such as lipid translocation ( Fig. 2C and Supplementary Material, Table S3). Interestingly, downregulation of nuclear-encoded mitochondrial genes was also generally observed in other muscular dystrophies included in the study by Bakay and colleagues (Supplementary Material, Fig. S4).
To investigate the relevance of these gene changes to cellular metabolism, we performed real-time metabolic analysis using the Seahorse XFp Extracellular Flux Analyzer. Myoblasts isolated from the above patients plus several additional EDMD patients and controls were tested, so that we had a total of 14 EDMD patients and 8 controls for this analysis (Table 1). Probing for glycolysis, a significant reduction of the extracellular acidification rate in the EDMD samples was observed (Fig. 3C). Next, we investigated mitochondrial function. When testing for basal respiration there was also a significant reduction of the oxygen consumption rate in the EDMD samples (Fig. 3C). There were no significant differences in fuel dependency, but ATP production was considerably reduced in the EDMD samples ( Fig. 3D and E). The significant reduction in mitochondrial respiration raised another possibility to investigate that the absolute number of mitochondria might also be down due to problems in mitochondria biogenesis. Therefore, we quantified relative mitochondria numbers by qPCR. This revealed a clear reduction in mitochondria numbers (Fig. 3F), which with the generally elevated mitochondrial genome transcripts would suggest that a reduction in mitochondria numbers resulted in an overcompensation of expression which in turn could have resulted in inhibiting mitochondrial fission and repair. Thus, we also investigated whether genes in pathways associated with mitophagy were altered in the patients. Indeed, multiple mitophagy pathway genes were altered in all patients (Fig. 3G). Although no one individual gene was altered in all the patients, it is worth noting CISD2 is significantly downregulated in most patients. Reduction of CISD2 has been linked to degeneration of skeletal muscles, misregulated Ca 2+ homeostasis and abnormalities in mitochondrial morphology in mouse (39), as well as cardiac dysfunction in humans (40).

Detailed analysis of other pathways uniformly altered in EDMD patients
Several studies suggest that the timing of several aspects of myotube fusion could underlie some of the aberrancies observed in patient muscle (41) and, though it is unclear whether fibrosis drives the pathology or is a consequence of the pathology, fibrosis has been generally observed in EDMD patient biopsies. Contributing to these processes could be several subpathways that fall variously under the larger pathways for ECM/fibrosis, cell cycle regulation and signaling/differentiation (Fig. 4A). As for the metabolic analysis, no individual genes were altered in cells from all patients, but every patient had some genes altered that could affect ECM through changes in collagen deposition (Fig. 4B). For example, 35 out of 46 collagen genes exhibited changes in at least one comparison (Bakay EDMD, G1 or one of the subgroups gp1, gp2 and gp3) and all patients had multiple of these genes altered (Supplementary Material, Fig. S5). Note that it often appears visually that the Bakay data in the first column has little change when viewing the cluster analysis, but when looking at the full set of genes listed in the matching supplemental figures there are definitely some genes strongly changing, just not necessarily the same ones. This may be due to differences in the myogenic state of the material studied: while Bakay and colleagues used muscle biopsies containing terminally differentiated muscle fibers, we focused on the earlier stages of myogenesis by in vitro differentiating cultured myoblasts obtained from muscle biopsies. Despite this, it is important to note that while different genes may be affected, most of the same pathways were highlighted in both Bakay's and our study. Collagens COL6A1, COL6A2, COL6A3 and COL12A1 are linked to Bethlem muscular dystrophy (42)(43)(44)(45) and, interestingly, all these collagens were upregulated in group 1 patient cells and downregulated in group 3 patient cells (Supplementary Material, Fig. S5). Matrix metalloproteinases, which participate in the degradation and remodeling of the ECM, were also altered with 13 out of 28 matrix metalloproteinases exhibiting changes in at least one of the comparisons and all patients had multiple of these genes altered ( Fig. 4C and Supplementary Material, Fig. S5). Notably the metalloproteinase MMP1 (collagenase I), which has been proposed to resolve fibrotic tissue (46), was downregulated in all but one patient, as well as in Bakay EDMD samples. Likewise, multiple genes associated with fibrosis from FibroAtlas (  Figures S5-11. A few genes that stand out for their functions within the INF-gamma signaling pathway include IRF4 that is a regulator of exercise capacity through the PTG/glycogen pathway (47) and ILB1 that helps maintain muscle glucose homeostasis (48) such that both could also feed into the metabolic pathways altered.
Another subpathway critical for myogenesis and the timing and integrity of myotube fusion is cell cycle regulation. Cell cycle defects could lead to spontaneous differentiation and were previously reported in myoblasts from EDMD patients and in tissue culture cell lines expressing emerin carrying EDMD mutations which could lead to depletion of the stem cell population (41,49). All tested EDMD patients exhibited downregulation of multiple genes Other pathways in addition to ECM deposition directly associated with myogenic differentiation, myoblast fusion and muscle regeneration were also altered in all patients (Fig. 4H-J and Supplementary Material, Figs S9 and S10), though, again, no single gene in these pathways was altered in the same way in all patients' cells. Poor differentiation and myotubes with nuclear clustering were observed in differentiated EDMD myoblast cultures (14) and in the mouse C2C12 differentiation system when EDMD-linked NETs were knocked down (17).
Previous work using C2C12 cells identified six genes whose products are required in the early differentiation stages and were under the regulation of muscle-specific genome-organizing NETs (17). These genes (NID1, VCAM1, PTN, HGF, EFNA5 and BDNF) are critical for the timing and integrity of myotube fusion and need to be expressed early in myoblast differentiation but shut down later or they inhibit myogenesis (51)(52)(53)(54)(55). All six genes were misregulated in at least five but none were affected in all patients (Fig. 4K). In general terms, these genes were upregulated in group 1, downregulated in group 3 and mixed in group 2. All six genes were upregulated in Bakay's EDMD data, although only NID1 and HGF were statistically significant at 5% FDR. Both were upregulated only in group 1 and downregulated in group 3. PTN showed a similar pattern of expression as HGF although the only statistically significant changes were for upregulation in group 1.
Several myogenic signaling pathways were altered such as MAPK, PI3K, BMP and Notch signaling, and several alternate differentiation pathways were de-repressed such as adipogenesis that could disrupt myotube formation and function ( Fig. 2 and Supplementary Material, Table S3). Myogenesis and adipogenesis are two distinct differentiation routes from the same progenitor cells and whichever route is taken the other becomes repressed during normal differentiation (56,57). We previously showed that knockout of fat-or muscle-specific genome organizing NETs yield derepression of the alternate differentiation pathway (17,58) and the Collas lab showed that Lamin A/C lipodystrophy point mutations yield de-repression of muscle differentiation genes in adipocytes (59). We now find here that adipogenesis genes are upregulated in both Bakay's EDMD data and our data (Fig. 2). This is especially prominent for the five patients in group 1 while group 3 showing strong downregulation of a subset of the same genes and group 2 broadly looking like an intermediate of the other two groups ( Fig. 4L and Supplementary Material, Figs S11 and S12), and thus could also contribute to the metabolic defect differences between patients.

Splicing pathways uniformly altered in EDMD patients yield loss of muscle-specific splice variants
Among the downregulated functional categories, mRNA splicing stood out with many genes uniformly downregulated in all patient samples ( Fig. 2B and C and Supplementary Material, Fig. S14). Because of that we decided to investigate various subcategories and we found that there was a striking and uniform upregulation of factors supporting alternative splicing (AS) while constitutive splicing factors involved in spliceosome assembly and cis splicing are downregulated (Fig. 5A and Supplementary Material, Fig.  S15). Expression changes of as little as 10% (log2FC > 0.1) have been shown to result in biologically relevant changes for vital proteins like kinases and splicing factors (60). We thus assume that upregulation and downregulation of whole spliceosome subcomplexes even in low log2FC ranges lead to significant splicing misregulation. Notably, snRNAs of the U1 spliceosomal subcomplex, responsible for 5' SS recognition, constitute as much as 20% of all downregulated splicing factors (|log2FC| > 0.1, RNU1s and RNVU1s). Interestingly, a similar sharp cut-off between alternative and constitutive splicing has been reported in myotonic dystrophy (DM1/DM2) with similar genes being affected, namely CELFs, MBNLs, NOVA, SMN1/2 and SF3A1, among others (61). DM1 is one of the best studied splicing diseases and shares typical muscular dystrophy symptomatology with EDMD, namely progressive muscle weakness and wasting, cardiac arrhythmia and contractures. Moreover, a number of splicing changes in DM1 and DM2 also occur in other muscular dystrophies (62). Of note, MBNL3 is 4-fold transcriptionally upregulated in EDMD compared with controls. Its protein product impairs muscle cell differentiation in healthy muscle and thus needs to be downregulated upon differentiation onset (63).
Next, we performed splicing analysis to determine whether mis-splicing could drive some of the pathway alterations observed in the EDMD samples. For this purpose, we used three different methods: DEXSeq analyses exon usage, rMATS provides information about the five most common AS events and isoform Switch Analyzer (ISA) indicates which splicing events lead to annotated isoform switches. This revealed varying amounts of alternatively spliced genes in all samples and the three subgroups (Fig. 5B). Since every method focuses on a different event type/aspect of splicing, a higher amount of unique than overlapping genes is to be expected. Accordingly, an overlap of all three methods indicates genes with exon skipping events that lead to annotated isoform switches. The number of mis-spliced genes overlapping between the three algorithms was only 1 gene, ZNF880, in G1. In contrast, when analyzing group 1, 2 and 3 separately, each patient grouping had many mis-spliced genes identified by all three algorithms with 18 mis-spliced genes in the intersect for group 1, the group including half of the patients, and as much as 95 in group 3 (Supplementary Material, Table S4). These genes include Nesprin 3 (SYNE3), the splicing factor kinase CLK1 and the chromatin regulator HMGN3, all of which are potentially contributing to EDMD, given their functions. All results can be found in Supplementary Material, Table S5. The rMATS analysis includes five AS events: exon skipping (SE), intron retention (RI), mutually exclusive exons (MXE), alternative 3 splice site (A3SS) and alternative 5 splice site (A5SS) usage. Using this comprehensive dataset, we searched for AS events that are significantly differentially used |(percent-spliced-in-(psi)-value| > 0.1 and Pvalue < 0.05, Supplementary Material, Table S5). Comparing AS event inclusion between control and EDMD samples, we find thrice as much intron retention in EDMD, while all other events are similarly included as excluded. We hypothesize that this could be a result of downregulated U1 snRNAs which are necessary for proper spliceosome assembly. Supporting the likely importance of the splicing pathway to the EDMD pathomechanism, pathway analysis on these genes revealed a strong enrichment for pathways associated with metabolism, gene expression and the cytoskeleton (Fig. 5C). Moreover, group 1 and group 3 display an enrichment for myogenesis and muscle contraction. Using a custom-made set of genes either specific or relevant for muscle development and structure (see Materials and Methods section), we then scanned all significant and differential AS events (Fig. 5D and Supplementary Material, Fig. S16). Notably, mis-splicing led to the absence of many muscle-specific splice variants (Fig. 5E),  , gp2 and gp3). Overlap between all three methods indicates genes with exon skipping events that lead to annotated isoform switches. (C) Bar chart for functional pathways reveals an enrichment in altered splice variants for pathways associated with metabolism, gene expression, cytoskeleton organization, DNA repair, proliferation/differentiation, stress, ECM/fibrosis/and sarcomere structure. Number of genes detected displayed as log2. (E) Pie chart of significant AS events with psi > |0.1| in gp1 detected with rMATS, which were used to scan for muscle specific-splice variants, shown as heatmap. AS events, alternative splicing events, SE = exon skipping, MXE = mutually exclusive exons, RI = intron retention, A3SS = alternative 3 splice site, A5SS = alternative 5 splice site. (F) Isoform switches as analyzed using isoformSwitchAnalyzer for ZNF880 and TMEM38A. ZNF880 shows the same isoform switch in all groups with preferential use of the shorter isoform that contains the KRAB domain (black), but not the zinc finger domain (blue). TMEM38A shows a clear switch from the muscle isoform to a shorter isoform. among them vital muscle structural genes like TTN, TNNT3, NEB, ACTA4 and OBSCN as well as developmental regulators of the MEF2 family. Importantly, many of these genes mis-spliced in the EDMD patients are linked to a variety of other muscular dystrophies. For example, TTN, CAPN3, PLEC and SGCA are linked to Limb-Girdle muscular dystrophy (64)(65)(66)(67)(68)(69), DMD is linked to Duchenne muscular dystrophy and Becker muscular dystrophy (70,71), and BIN1, TNNT2/3 and MBNL1 are mis-spliced in myotonic dystrophy (72) and all of these are mis-spliced and/or have missing muscle-specific splice variants in many of the patients in our cohort.
Intriguingly, one of the mis-spliced genes that also displays an isoform switch in group 3 is TMEM38A that has been linked to EDMD (10). The altered splicing map for TMEM38A reveals that not only is its expression highly elevated in EDMD patients (log2FC = 2.9) but also that the protein-coding isoform displays a higher usage relative to abundance compared with the noncoding isoform (Fig. 5F) Many other notable mis-spliced genes are involved in myotube fusion such as the previously mentioned NID1 that is under spatial genome positioning control of NET39, another of the genome organizing NETs causative of EDMD. Most compellingly, three mis-spliced genes having to do with myogenesis/myotube fusion had muscle-specific splice variants absent in all patients (G1 rMATS). These were CLCC1 whose loss yields muscle myotonia (73), HLA-A/B that disappears during myogenesis and is linked as a risk factor for idiopathic inf lammatory myopathies (74,75), and SMAD2 that shuts down myoblast fusion (76). The above examples were found in all patients within a particular group, but not always amongst all patients from the study or determined by all algorithms; however, there were also some mis-spliced genes that are potentially even more interesting because they were mis-spliced in all patients and with all three algorithms yielding the same results. One of these was ZNF880. While overall transcript numbers remained similar, the isoform predominantly expressed in control cells, ENST00000422689, is strongly downregulated in group 1 while the shorter isoform, ENST00000600321, is strongly upregulated (Fig. 5F). Interestingly, the dominant isoform in EDMD loses the zinc finger domain (light and dark blue) and is left with the repressive KRAB domain (black). Little is known about ZNF880 except that it has an unclear role in breast and rectal cancer (77,78), and additional experiments are necessary to elucidate its role in EDMD.

miRNA-Seq analysis of EDMD patient cells
Changes in miRNA levels have been observed in a number of muscular dystrophies and are often used as biomarkers (79)(80)(81), but a comprehensive investigation of miRNA levels in EDMD has thus far not been engaged. Thus, the in vitro differentiated EDMD patient cells used for the preceding analysis were also analyzed by miRNA-Seq. We identified 28 differentially expressed miRNAs with some variation among patients (Fig. 6A). We extracted their putative targets from the miRDB database (http://mirdb.org) and selected those targets whose expression changed in the opposite direction of the miRNAs. Pathway analysis revealed misregulation of miRNAs largely associated with the same pathways that were misregulated from the RNA-Seq data, e.g. metabolism, ECM/fibrosis and signaling/differentiation (Fig. 6B and Supplementary Material, Table S6). More specifically, for metabolism 9 of the misregulated miRNA were linked to metabolic functions and with only partial overlap another 9 linked to mitochondria function, for ECM/fibrosis 19 of the misregulated miRNAs were linked to ECM and again with only partial overlap 10 to fibrosis and 13 to cytokines and inf lammation. As noted before the ECM category in addition to potentially contributing to fibrosis is also relevant for myotube fusion along with cell cycle regulation that was targeted by 13 misregulated miRNAs and myogenesis that was targeted by 5 miRNAs. Several misregulated miRNAs had functions relating to alternative differentiation pathways with 4 relating to adipogenesis, 12 to neurogenesis, 17 to angiogenesis and for signaling there were 9 misregulated miRNAs affecting MAPK pathways, 8 for Akt signaling, 1 for JAK-STAT signaling, 5 for TGF-beta signaling, 2 for Notch signaling and 3 for TLR signaling. Interestingly, some misregulated miRNAs were also reported as being linked to disease states such as miR-140-3p to dilated cardiomyopathy through its repressive effect on the integrin metalloproteinase gene ADAM17 (82). As well as working within cells, miRNAs are often detected within a circulating exosomal microvesicle population that can be harvested from blood serum. This makes them especially attractive as potential biomarkers when compared with more invasive biopsies, but a much larger sample size together with more clinical information will be required to clarify these as biomarkers.

EDMD gene expression signature suggests relationships to other muscular dystrophies
The earlier Bakay study analyzed patient samples from other muscular dystrophies for comparison to EDMD. Several of the disorders show a high degree of pairwise similarity from a transcriptome point of view (Supplementary Material, Fig. S12A). In fact, many of the functional categories misregulated in EDMD are also altered in the same direction in several other muscular dystrophies, although with some differences (Supplementary Material,  Fig. S12B). For example, collagens as a general category showed the highest correlations between EDMD and most other muscular disorders with the exception of hereditary spastic paraplegia, suggesting that the ECM is a major player in most muscular dystrophies. In contrast functional categories related to metabolism and mitochondrial function such as glycolysis, oxidative phosphorylation and mitophagy were more uniquely changed, i.e. weaker correlated between EDMD and other muscular disorders. Splicing also showed a weaker correlation with other muscular disorders compared with other muscle-specific functions (Supplementary Material, Fig. S12B). Because of the overall degree of similarity among neuromuscular diseases, and the fact that Bakay et al. used terminally differentiated muscle while our data come from a very early stage of differentiation in vitro, we reasoned that GSEA could be a sensitive method to compare the datasets. To this effect, we reanalyzed Bakay's data and extracted the DE genes for each of the neuromuscular diseases. We then checked our data for enrichment of those specific genesets. We then plotted the GSEA normalized enrichment score against the -log10(P-value). This way we can visualize positive or negative associations on the x-axis, and the higher on the y-axis the higher the confidence (lower P-values).
When we looked at all patients as a single group (G1), EDMD was the best match, with the highest score and lowest P-value (FDR 0.001), although unsurprisingly a few other diseases came very close, notably LGMD2A and facioscapulohumeral muscular dystrophy (FSHD) (Fig. 7A). This indicates that despite the differences in the individual DE genes between the mature muscle data from Bakay's EDMD geneset and our early in vitro differentiation geneset, a clear EDMD gene expression signature was displayed in our data. The next best match is Limb-Girdle muscular dystrophy 2A (LGMD2A), which shares some symptomatology with EDMD including muscle wasting, contractures and mild cardiac conduction defects although cardiac involvement is infrequent (83). The differences in gene signatures that broke down the 10 patients into three EDMD patient subgroups could ref lect an underlying cause of clinical disease spectrum or indicate that a group may not be adequately classified as EDMD. Therefore, we performed the same GSEA analysis on each subgroup separately. Group 1, which had both classic emerin and lamin A EDMD mutations, showed an even better match with the Bakay EDMD group which was again very close to LGMD2A but also to DMD, Becker muscular dystrophy (BMD), FSHD and Limb-Girdle muscular dystrophy 2I (LGMD2I) (Fig. 7B). LGMD2B was still separate and closer to juvenile dermatomyositis (JDM). For group 2, none of the diseases matched at 5% FDR, although the Bakay EDMD set remained the most like our set. Interestingly, two diseases exhibited an anticorrelation: DMD and BMD, which are both caused by mutations in the dystrophin gene DMD. In contrast, group 3 appeared to be the most distinct and in many ways opposite to group 1, which is a pattern that was often observed in the functional gene subsets analyzed (Supplementary Material, Figs S5-S11). Group 3 was anti-correlated with EDMD and most of the other muscular dystrophies, while the neurogenic amyotrophic lateral sclerosis appeared as the best match, possibly suggesting a neuronal bias in this group (Fig. 7B). This is further supported by the appearance of axonal neuropathy, ataxia, undergrowth and speech problems in one of the two patients from this group (patient 4, SYNE1; Table 1), while none of the others exhibited any signs of neuropathy.
The relationship of the patient groups segregated by gene signatures to potential differences in clinical presentation is underscored by the functional pathways enriched in each group over the others (Fig. 7C). Group 1 showed a strong enrichment of pathways associated with ECM and fibrosis, such as interferon signaling, TNF signaling, ECM organization, ECM proteoglycans, integrin cell surface interactions, collagen formation and signaling by PDGF all upregulated. Adipogenesis was also particularly promoted in group 1 compared with the others, and cardiac conduction defects were also highlighted. Group 2 was more uniquely associated with Hippo signaling and BMP2-WNT4-FOXO1 pathway and had fewer links to ECM and fibrosis. Group 3 was more uniquely associated with metabolism, particularly upregulation of oxidative phosphorylation, mitochondrial biogenesis, glucagon signaling pathway, gluconeogenesis, glycolysis and gluconeogenesis, metabolism, TP53 regulates metabolic genes and thermogenesis pathways. This would suggest that group 1 pathophysiology may have more characteristics of fibrosis and altered myofibers, while group 2 may have more differentiation or mechanosignaling defects and group 3 more metabolic defects (Fig. 7D).

Discussion
Attempts to identify the EDMD pathomechanism or clinical biomarkers purely through gene expression signatures are limited because there is too little uniformity in differential gene expression between all patients, although there have been some promising reports using miRNA profiling or detection of cytokines in serum (81,84). We therefore engaged a functional pathway analysis using in vitro differentiated myotubes derived from 10 unrelated EDMD patients with known mutations in seven EDMD-linked genes. While it is difficult to detect many individual genes that were uniformly changed in all patients, we found many pathways that were affected in all patients. Thus, although different genes may have been targeted in different patients, the same functional pathway would be disrupted and thus yield a pathology with similar clinical features. Many pathways were disrupted when we re-analyzed data from the previously published Bakay study (20) and we postulated that, as they just analyzed mutations in two of the over two dozen genes linked to EDMD, analyzing a larger set of linked genes might narrow down the number of pathways to highlight those most relevant to EDMD pathophysiology. Indeed, when we considered a wider set of patients with mutations in seven different genes the set of affected pathways narrowed to the point that we could identify four likely candidate umbrella pathways. (D) Each patient subgroup was relatively more enriched for certain functional pathways than other subgroups, suggesting that treatments for example targeting different metabolic pathways for groups 1 and 3 might partially ameliorate some patient difficulties.
These four umbrella pathways all make sense for contributing to or even driving the EDMD pathomechanism (85). Disruption of metabolism pathways was consistent with the significantly reduced glycolysis and mitochondrial respiration output we showed in patient myoblasts compared with controls and it makes sense that this could lead to fatigue, weakness and muscle atrophy. ECM changes and fibrosis pathways are consistent with pathology observed in EDMD and similarly could drive some of the initial pathology and, as fibrosis accumulates, contribute to disease progression. De-repression of genes from alternate differentiation pathways and defects in myogenesis through disrupted signaling pathways and cell cycle regulation could generate aberrant myotubes to yield pathology. Finally, the last disrupted pathway of splicing yields a loss of muscle-specific splice variants that could impact on all three preceding pathways.
There is much scope for intersection between the four highlighted pathways altered in all sampled EDMD patient cells. For example, amongst the de-repressed differentiation pathways was adipogenesis that could also have impact on the metabolism pathway. Even amongst the few genes that were uniformly altered in all patients sampled, though not originally obvious, a more detailed reading of the literature leads to intersections with these pathways. For example, while the MYH14 general upregulation did not make obvious sense for muscle defects since it is not part of the contractile machinery, it has been shown that a mutation in MYH14 disrupts mitochondrial fission in peripheral neuropathy (31). Thus, MYH14 could potentially feed into the mitochondrial deficits noted in the patient cells. Many of the miRNAs found to be altered in the patients feed into several of these pathways. For example, miR-2392 that is increased in all patients downregulates oxidative phosphorylation in mitochondria (86) but at the same time also is reported to promote inf lammation (87). miR-140 that is up in all groups has roles in fibrosis through collagen regulation (88), is pro-adipogenic (89) and inhibits skeletal muscle glycolysis (90). miRNAs could also be used potentially prognostically between the different groups as for example miR-146a is upregulated in group 1, unchanged in group 2 and downregulated in group 3. This miRNA has a strong effect on inf lammation and has been implicated in fibrosis in the heart (91). miRNAs show some promise as biomarkers, especially if they could be isolated from circulating exosome vesicles in serum. Several miRNAs have been proposed as markers for lamin A/C-associated muscular dystrophies, targeting functions such as muscle repair through TGF-beta and Wnt signaling. Some of those were also identified in our study, such as miR-335, which plays a role in muscle differentiation (81). Other miRNAs identified in that study appear to be misregulated only in one subgroup of patients but not others. For example, miR-100 and miR-127-3p are misregulated in group 1 only, while miR-136, miR-376c and miR-502-3p are only misregulated in group 3, and miR-148a is upregulated in group 1 but downregulated in group 3. Because there is so much functional overlap between miRNA targets and the pathways noted from the RNA-Seq analysis, it is unclear to what extent the gene expression changes observed could be indirect from the misregulated miRNAs. Nonetheless, there are four core functions targeted by multiple mechanisms that we argue are likely to be central to the core EDMD pathomechanism. Interestingly, the literature is filled with many examples of mutation or loss of different splicing factors causing muscle defects though no individual mis-spliced gene was identified as mediating these effects. Similarly, in myotonic dystrophy type 1 (DM1) there are many mis-spliced genes thought to contribute to the disease pathology (72). For example, the splicing factor SRSF1 that is down in most patients is important for neuromuscular junction formation in mice (92). It has to be noted that while the present study used in-vitro differentiated myotubes, these pathways may cross-talk not just at the level of gene expression in the myotubes themselves, but also through effects determined in vivo by the muscle microenvironment which may vary depending on the activation of inf lammatory pathways, for example.
How so many genes become misregulated has not been experimentally proven, but for lamin A/C, emerin, Sun1, nesprin, TMEM214 and PLPP7/NET39, the fact that mutations to all individually yield many hundreds of gene expression changes with considerable overlap strongly suggests that they function in a complex at the nuclear envelope to direct genome organization. Knockdown of Tmem214 and NET39 as well as several other muscle-specific NETs each alters the position and expression of hundreds of genes (17). Separately it was found that lamins and NETs, including emerin, function together in distinct complexes involved in tethering genes to the nuclear envelope in fibroblasts (93) and in muscle cells (94). Thus, disruption of emerin, lamin A/C or any other component of these tethering complexes could yield sufficiently similar gene/pathway expression changes to yield the core clinical features of EDMD. We propose that the different muscle-specific NETs give specificity to a complex containing lamin A/C and emerin and that Sun1 and nesprin proteins can indirectly impact on these complexes through mediating mechanosignal transduction and FHL1 in interpreting such signals. Since 15% of all genes changing here were affected by at least one of the muscle-specific genome-organizing NETs that were tested by knockdown, this would provide a core set of genome organization and expression changes to cause the core EDMD pathology. Since the majority of genes affected by each NET tested were unique to that NET with the exception of Tmem214, this could account for other gene expression changes that drive the segregation into subgroups which could in turn contribute to clinical variation. This interpretation is consistent with the numbers of genes changing for mutations in different nuclear envelope proteins (Fig. 1B). That the genome organizing NET mutations yielded fewer genes changing than the lamin A/C mutations may be because lamin A/C mutations disrupt multiple genome tethering complexes and thus affect more genes. The segregation of the two LMNA mutation gene signatures into separate groups might ref lect separate complexes for lamin C or each mutation disrupting different sets of complexes. In either case, the extreme differences in lamin mutations gene expression profiles is not entirely surprising as different lamin mutations also exhibited large differences in studies of nuclear mechanics (95); so this could also impact on mechanosignal transduction. The Sun1 mutation may have affected fewer genes because of redundancy with Sun2 in its mechanosignal transduction function while the Nesprin 1 mutation may have had more genes changing because it is more central to mechanosignal transduction. More work is needed to clarify on all these possibilities. The FHL1 mutations add another level of complexity to EDMD as there are several splice variants of FHL1 and only the B variant (ENST00000394155) targets to the nuclear envelope (96). That EDMD is a nuclear envelope disorder is underscored by the fact that none of the FHL1 mutations occur in exons found in the much shorter C variant (ENST00000618438) and the patient 8 mutation p.V280M is in an exon unique to FHL1B. Thus, the nuclear envelope splice variant is the only one that could yield pathology in all patients, though some of the variation could come from one of the patients also expressing the mutant A splice variant (ENST00000543669).
While further work is needed to validate the correlations between the gene expression profile subgroupings and their clinical presentation and disease progression, our finding of such distinct gene expression profiles amongst clinically diagnosed EDMD patients argues that the currently used clinical phenotype spectrum umbrella of the EDMD classification may be too broad and it might be reclassified in more precise subtypes. What is clear is that the original classifications of EDMD subtypes based just on the mutated gene often allow for cases with very dissimilar gene profiles to be classified together, while similar gene signature cases are classified as separate classes. The two mutations in LMNA yielded changes in gene expression profiles that were far more different from one another than the group 1 lamin A mutation gene profile was from the TMEM214, NET39, emerin and FHL1 mutation gene signatures. Similarly, the FHL1 p.C224W mutation yielded greater gene expression differences from the other two FHL1 mutations than it did for the other proteins in group 1. Thus, EDMD might be better classified by similarities in gene expression profiles than by the particular gene mutated and our study shows proof of principle for this, even if the groupings may change slightly once a larger patient cohort can be established and examined. Regardless, these groups have distinctive gene expression and miRNA signatures that could be used as biomarkers both diagnostically and perhaps prognostically. To get to that point will require a more comprehensive modern description of clinical Gestalt phenotypes including e.g. imaging datasets and disease progression timelines deciphering unique groups. Importantly, and regardless of the disease nomenclature, the different pathways we found enriched for in each subgroup could be converted to clinical recommendations based on the much more conserved individual gene expression changes for each subgroup (Fig. 7C). For example, EDMD patients have been considered by some clinicians to be at risk for malignant hyperthermia (97), though a consensus was never achieved. Our data show that the three genes currently associated with malignant hyperthermia (RYR1, CACNA1S and STAC3) are all misregulated in groups 1 and 3, but not group 2 ( Supplementary Material, Fig. S17). Thus, checking expression of these genes might indicate whether a patient is likely to be at risk or not. Finally, an additional new aspect coming up from our datasets is that it might be worth further investigating the role of splicing in muscle differentiation because it might be of wider relevance to muscular dystrophy beyond DM and EDMD.

Patient materials
The sources of patient samples were the Muscle Tissue Culture Collection (MTCC) at the Friedrich-Baur-Institute (Department of Neurology, Ludwig-Maximilians-University, Munich, Germany) and the MRC Centre for Neuromuscular Disorders Biobank (CNDB) in London.

Ethical approval and consent to participate
All materials were obtained with written informed consent of the donor at the CNDB or the MTCC. Ethical approval of the rare diseases biological samples biobank for research to facilitate pharmacological, gene and cell therapy trials in neuromuscular disorders is covered by REC reference 06/Q0406/33 with MTA reference CNMDBBL63 CT-2925/CT-1402, and for this particular study was obtained from the West of Scotland Research Ethics Service (WoSRES) with REC reference 15/WS/0069 and IRAS project ID 177946. The study conduct and design complied with the criteria set by the Declaration of Helsinki.

Myoblast culture and in vitro differentiation into myotubes
Myoblasts were grown in culture at 37 • C and 5% CO 2 using a ready to use formulation for skeletal muscle (PELOBiotech #PB-MH-272-0090) and maintained in subconf luent conditions. In order to induce differentiation, the cells were grown to conf luency and 24 h later the growth medium replaced with skeletal muscle differentiation medium (Cell Applications #151D-250). The differentiation medium was replaced every other day. Myotubes were selectively harvested after 6 days by partial trypsinization followed by gentle centrifugation (Supplementary Material, Fig.  S1). Each differentiation experiment was performed in triplicate. To avoid batch effects, experiments were distributed over several weeks, so that no two samples or replicates were differentiated at the same time. Myotubes were stored in Trizol at -80 • C until all samples were collected.

RNA extraction
Myotubes were stored in Trizol at -80 • C until all samples were available. Subsequent steps were performed simultaneously for all samples. Total RNA was extracted from each sample and separated into a high molecular weight fraction (>200 nt, for mRNA-Seq) and a low molecular weight fraction (<200 nt, for miRNA-Seq) with the Qiagen RNeasy (#74134) and miRNeasy (#1038703) kits, according to the manufacturer's instructions. RNA quality was assessed with a Bioanalyzer (Agilent Technologies), and all samples had an RIN > 7, with an average of 9.4 (Supplementary Material, Table S1).

miRNA-Seq analysis
miRNA was sent to RealSeq Biosciences Inc. (SC, USA) for sequencing using an Illumina NextSeq 500 v2 sequencer in single end mode, 1 × 75 nucleotides. The sequencing library was prepared with Somagenics' Low-bias RealSeq-AC miRNA library kit (#500-00012) and quality assessed by Tapestation (Lab901/Agilent). On average, 5 million good quality reads were obtained per sample. Mapping and quality trimming was performed using the NextFlow nf-core/smrnaseq pipeline (https://nf-co.re/smrnaseq) with default parameters, which summarizes the reads per miRNA using the annotations from mirTop (https://github.com/miRTop/ mirtop). Differential expression analysis was performed in R with DESeq2 v1.32.0. We used an FDR threshold of 0.2 for differential expression. Putative miRNA targets were extracted from miRDB (https://mirdb.org/) for each differentially expressed miRNA and their expression compared against the miRNA. We kept as potential targets those genes whose expression changed in the opposite direction of the miRNA.

Bakay muscular dystrophy dataset analysis
Normalized (MAS5.0) microarray transcriptome data for a panel of 11 muscular dystrophies and healthy controls were downloaded from the Gene Expression Omnibus database (GEO), accession GSE3307 (https://www.ncbi.nlm.nih.gov/geo/). Differential expression analysis comparing each disease to the controls was performed using Limma 3.48.1 (102). We used an FDR threshold of 5% for differential expression.

Functional analyses
Functional analyses were performed with g:Profiler (103) and Gene Set Enrichment Analysis (GSEA v4.1.0) (35) tools. g:Profiler was used to determine enriched categories within a set of DE genes, with an FDR of 5% as threshold. GSEA was performed with default parameters, in particular using 'Signal2Noise' as ranking metric and 'meandiv' normalization mode. Redundancy in category lists was reduced by comparing the similarity between each pair of enriched categories using Jaccard similarity coefficients. Hierarchical clustering (k-means) was then applied to the resulting matrix in order to identify groups of similar functional categories, and a representative from each group chosen. Full unfiltered results are shown in Supplementary Material, Table  S3. Tissue-specific gene enrichment analysis was evaluated with TissueEnrich (104).
For the miRNA-Seq experiments, functional analysis was first performed using g:profiler on the set of DE miRNA genes. Then, putative targets for each miRNA were extracted and their expression compared with the relevant miRNA. Putative targets whose expression was not altered in the opposite direction as the miRNA were removed from the list. Significant functions were displayed using Cytoscape v3.8.2 (105), with the size of the functional labels proportional to the number of miRNAs assigned to each function.

Real-time metabolic measurements
Metabolic measurements on primary human myoblast cultures were performed using a Seahorse XFp Extracellular Flux Analyzer (Agilent Technologies). For this, myoblasts of matched passage number were seeded in XFp Cell Culture Miniplates (103025-100, Agilent Technologies) at a density of 1.5 × 10 4 cells per well. Cell density was assessed using an automated cell counter (TC20, BioRad). Oxygen consumption rates and extracellular acidification rates were measured using the Mito Stress Test Kit and the Glycolysis Stress Test Kit (Agilent Technologies #103020-100), respectively, according to the manufacturer's instructions. Samples were measured in triplicates and each measurement was repeated between two and four times. Data were normalized to the number of cells and analyzed for each well.

Fuel dependency tests
Glucose dependency and fatty acid dependency were determined according the instruction of Agilent Seahorse XF Mito Fuel Flex Test kit (Agilent Technologies #103260-100). The glutamine dependency was determined from the glucose and fatty acid measurements.
DEXseq: Mapped reads were counted using the in-built DEXseq counting function in Python 3.9. Standard DEXseq workf low was followed. Exons with |logFCs| > 1 and P-values < 0.05 were set to be significantly different.
rMATS: Standard workf low was followed, and code was executed in Python 2.7. Results were analyzed in R and set to be significantly differentially spliced with |psi-values| > 0.1 and Pvalues < 0.05. Pie charts displaying the distribution of event usage were generated for all groups (Supps. something). GO term enrichment analysis was performed using g:profiler2 (113). All events were searched for muscle-specific genes using a set of 867 genes relevant for muscle system process, development, structure and contraction, combined from GO terms (GO:0003012, GO:0006936, GO:0055001 and GO:0061061).
ISA: Kallisto counts were read into R and standard ISA procedure was followed, including splicing analysis using DEXseq, coding potential using CPC 2.0 (114), domain annotation using HmmerWeb Pfam 35.0 (115), signal peptides using SignalP 5.0 (116) and prediction of intrinsically unstructured proteins using IUPred2A (117). In-built visualization tools were used for splicing maps.

Data availability
Bakay et al. muscular dystrophy dataset is available at NCBI GEO with accession GSE3307.
RNA-Seq and miRNA-Seq datasets have been deposited at NCBI GEO with accession GSE204804 and GSE204826, respectively.

Supplementary Material
Supplementary Material is available at HMG online.