The functional O-mannose glycan on α-dystroglycan contains a phospho-ribitol primed for matriglycan addition

Multiple glycosyltransferases are essential for the proper modification of alpha-dystroglycan, as mutations in the encoding genes cause congenital/limb-girdle muscular dystrophies. Here we elucidate further the structure of an O-mannose-initiated glycan on alpha-dystroglycan that is required to generate its extracellular matrix-binding polysaccharide. This functional glycan contains a novel ribitol structure that links a phosphotrisaccharide to xylose. ISPD is a CDP-ribitol (ribose) pyrophosphorylase that generates the reduced sugar nucleotide for the insertion of ribitol in a phosphodiester linkage to the glycoprotein. TMEM5 is a UDP-xylosyl transferase that elaborates the structure. We demonstrate in a zebrafish model as well as in a human patient that defects in TMEM5 result in muscular dystrophy in combination with abnormal brain development. Thus, we propose a novel structure—a ribitol in a phosphodiester linkage—for the moiety on which TMEM5, B4GAT1, and LARGE act to generate the functional receptor for ECM proteins having LG domains. DOI: http://dx.doi.org/10.7554/eLife.14473.001


Introduction
Twenty-five years ago it was proposed that alpha-dystroglycan (a-DG) plays a major role in bridging the extracellular matrix to the muscle plasma membrane and actin cytoskeleton as a component of the multi-protein dystrophin glycoprotein complex (Ervasti et al., 1990; Yoshida and Ozawa, 1990). Defects in the proper formation of this complex have been shown to be causal for various forms of muscular dystrophy (Carmignac and Durbeej, 2012). However, to date, only 3 patients have been found with primary defects in the coding sequence of a-DG (Dong et al., 2015; Geis et al., 2013). Over fifteen years ago, it was suggested that defects in the glycosyltransferases needed for proper glycosylation of a-DG were causal for a subset of muscular dystrophies, the so-called secondary dystroglycanopathies, which range in severity from mild Limb-Girdle Muscular Dystrophy (LGMD) to severe Walker-Warburg syndrome (WWS) (Chiba et al., 1997; Holt et al., 2000; Wells, 2013). In the early 2000s, it became clear that the defects causal for disease involved enzymes that initiate and elaborate the extended O-mannose (O-Man) glycan structures covalently attached to Ser/Thr residues of a-DG (Beltrán-Valero de Bernabé et al., 2002; Michele et al., 2002). Since then, steady progress has been made in elucidating the subset of mammalian O-Man structures that directly interact with extracellular matrix components and the candidate genes necessary for the functional glycosylation of a-DG. This progress includes the identification of a subset of phosphorylated O-Man structures containing extended, LARGE-dependent, repeating disaccharide polymers, structures that have recently been termed matriglycan (Yoshida-Moriguchi and Campbell, 2015).
O-Mannosylation begins in the endoplasmic reticulum (recently reviewed in Dobson et al., 2013; Endo, 2015). The addition of O-Man in an alpha linkage to serine and threonine residues of a select set of mammalian glycoproteins is catalyzed by the POMT1/2 complex using dolichol-phosphomannose as the donor (Beltrán-Valero de Bernabé et al., 2002). At this point, by mechanisms that have yet to be fully elucidated, there is a divergence in elaboration (Figure 1). The vast majority of O-Man sites are extended in the Golgi with N-acetylglucosamine (GlcNAc) in a beta-1,2 linkage by POMGNT1; this structure can then be further branched by GlcNAc and/or elaborated by galactose, fucose, and sialic acid to generate the M1 and M2 glycans (Lee et al., 2012; Yoshida et al., 2001). A small subset of O-Man modified sites, apparently exclusively on a-DG, are extended in the endoplasmic reticulum by a GlcNAc in a beta-1,4 linkage by POMGNT2 to generate the M3 glycans (Manzini et al., 2012; Yoshida-Moriguchi et al., 2013). This disaccharide is further elaborated into a trisaccharide by the action of a beta-1,3-N-acetylgalactosamine (GalNAc) transferase, B3GALNT2 (Yoshida-Moriguchi et al., 2013). This trisaccharide is a substrate for POMK, which phosphorylates the initiating O-Man residue at the 6-position (Yoshida-Moriguchi et al., 2013). After unknown elaboration of the phosphotrisaccharide in a phosphodiester linkage, presumably by one or more of the as-yet-unassigned congenital muscular dystrophy (CMD)-causing gene products (ISPD, TMEM5, Fukutin, and FKRP), B4GAT1 adds a glucuronic acid (GlcA) in a beta-1,4 linkage to an underlying beta-linked xylose (Xyl) (Willer et al., 2014). The addition of the xylose by an unspecified enzyme, followed by the action of B4GAT1, serves as a primer for LARGE to synthesize the repeating Xyl-GlcA disaccharide, matriglycan, which serves as the binding site for several extracellular matrix proteins (Willer et al., 2014).
Here we report an M3 glycan structure with a phosphodiester-linked ribitol that TMEM5, B4GAT1, and LARGE act on to generate the functional receptor for extracellular matrix (ECM) ligands characterized by Laminin G domain-like (LG) protein domains. We demonstrate that ISPD is a CDP-ribitol pyrophosphorylase employed for the synthesis of the sugar (alcohol) nucleotide needed for ribitol insertion into the M3 glycan. We establish TMEM5 as the candidate xylose transferase and demonstrate that its knockdown in a zebrafish model produces defects consistent with a CMD phenotype. We also identify and characterize a novel TMEM5 mutation in a WWS family. Finally, we present a model of the functional M3 glycan structure along with the more than a dozen assigned or proposed enzymes required for its synthesis.

Results
Ribitol-Xyl-GlcA is released from a-DG-340 following cleavage of the phosphodiester linkage
A truncated, secreted version of a-DG with a COOH-terminal GFP and His tag (a-DG-340, which retains only 28 amino acids (313-340) derived from a-DG following endogenous furin cleavage) was overexpressed in HEK293F cells, which express low levels of LARGE, and purified from the media. This 28 amino acid sequence contains only one putative M3 glycan consensus site (TPT, 317-319). We previously demonstrated that peptides bearing the phosphotrisaccharide could be isolated from a-DG constructs following cleavage of the phosphodiester linkage and that such treatment resulted in loss of binding by IIH6, an antibody that binds functionally glycosylated a-DG (figure supplement 2) (Yoshida-Moriguchi et al., 2010). Thus, we decided to interrogate the released glycan portion to further elucidate the functional glycan structure. Following aqueous HF treatment, which selectively cleaves phosphodiesters, released glycans were isolated from a-DG-340 and permethylated. Tandem mass spectrometry of the permethylated glycans revealed the presence of a linear pentitol-Xyl-GlcA (Figure 2) as well as further disaccharide-extended structures (pentitol-(Xyl-GlcA)n, Figure 2-figure supplement 1).
Figure 1. Following addition of Man to serine/threonine residues on protein substrates by POMT1/2 there is a divergence in elaboration. Known enzymes are displayed in purple and enzymes unknown prior to this manuscript are displayed in red. (A) POMGnT1 adds a beta-2 linked GlcNAc onto the underlying Man. Shown is just one potential extended M1 glycan structure. (B) Following the action of POMGnT1, GnT-Vb can add a beta-6 linked GlcNAc onto an M1 glycan structure to convert it to an M2 glycan structure. Shown is just one potential extended M2 glycan structure. (C) Instead of extension in a beta-2 linkage, M3 glycan structures get the core Man extended by POMGNT2 with a beta-4 linked GlcNAc. This disaccharide is further elaborated to contain a phosphodiester linkage to an unknown moiety (shown as (X)) that becomes modified with repeats of (Xyl-GlcA). DOI: 10.7554/eLife.14473.002

ISPD is a CDP-ribitol (ribose) pyrophosphorylase
Given the identification of pentitol in the glycan structure, we investigated whether ISPD might be able to generate CDP-ribitol. Our rationale was that mutations in ISPD are causal for CMD (Roscioli et al., 2012; Willer et al., 2012), ISPD is a putative nucleotidyltransferase found in the cytosol (Vuillaumier-Barrot et al., 2012), and homologs in bacterial systems are involved in the generation of the identical structure, CDP-ribitol (Baur et al., 2009), or a similar structure, 4-CDP-2-C-methyl-D-erythritol (Richard et al., 2001). ISPD was expressed as a His fusion protein in HEK293F cells and purified from whole cell extracts (Figure 3-figure supplement 1). The enzyme was capable of generating CDP-ribitol or CDP-ribose using CTP and ribitol-5-phosphate or ribose-5-phosphate, respectively, but was not able to generate the sugar (alcohol) nucleotides from unphosphorylated ribitol or ribose (Figure 3). Thus, ISPD is a CDP-ribitol (ribose) pyrophosphorylase that generates the reduced sugar (alcohol) nucleotide needed for integration of a ribitol into the functional O-Man glycan structure. During the preparation of our manuscript, a study was published demonstrating that CDP-ribose, CDP-ribitol, and CDP-ribulose could all be generated by mammalian ISPD, consistent with our present findings.
Figure 2. Ribitol-Xyl-GlcA is released from a-DG upon cleavage of the phosphodiester linkage. Following treatment of purified a-DG-340 with aqueous HF, released glycans were captured, permethylated and analyzed by tandem mass spectrometry. On the left (MS) is a zoomed view of the full MS scan demonstrating the intact mass, and on the right (MS/MS) is the fragmentation pattern leading to the identification of GlcA-Xyl-pentitol. Experiments were performed in triplicate with identical findings each time. DOI: 10.7554/eLife.14473.003 Figure supplements are available for Figure 2.
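The two reactions attributed to ISPD in this section can be written as a simple scheme. This is a sketch that assumes the usual nucleotidyltransferase stoichiometry, in which inorganic pyrophosphate (PP_i) is released; the pyrophosphate product is implied by the term "pyrophosphorylase" but was not measured explicitly in the experiments described above.

```latex
% Requires amsmath. PP_i as the by-product is an assumption based on
% the standard pyrophosphorylase (cytidylyltransferase) mechanism.
\begin{align*}
\mathrm{CTP} + \text{ribitol-5-phosphate} &\rightleftharpoons \text{CDP-ribitol} + \mathrm{PP_i} \\
\mathrm{CTP} + \text{ribose-5-phosphate} &\rightleftharpoons \text{CDP-ribose} + \mathrm{PP_i}
\end{align*}
```

Written this way, the scheme makes clear why unphosphorylated ribitol or ribose cannot serve as a substrate: the phosphate of the pentitol or pentose phosphate becomes one half of the CDP diphosphate bridge.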

Phosphodiester-linked ribitol is connected to the M3 Glycan
Having identified the ribitol directly attached to xylose, we sought to determine whether the ribitol was the bridge between xylose and the phosphotrisaccharide on the polypeptide. We also wanted to confirm the open-chain structure conferred by the addition of ribitol, a reduced sugar. In such a linear structure, mild periodate treatment would be expected to cleave between vicinal diols and release matriglycan from the polypeptide. Thus, we performed mild periodate cleavage of the purified a-DG-340 protein and demonstrated a loss of reactivity with the IIH6 antibody, which recognizes functional LARGE-modified glycoprotein (Figure 4-figure supplement 2). Furthermore, following tryptic cleavage, we examined the resulting a-DG peptides by tandem mass spectrometry (Figure 4-figure supplement 1). We were able to identify multiple glycopeptides in which Thr317 (a known site for M3 glycans) carried a cleaved ribitol fragment on the phosphotrisaccharide. We observed pairs of essentially identical glycopeptides that showed cleavage between C2 and C3 or between C3 and C4 (Figure 4). We also observed peptides with an additional phosphate, suggesting that there can be 2 phosphates on the trisaccharide, of which at least one is in a phosphodiester linkage to ribitol (Figure 4). Thus, the ribitol appears most likely to be connected at C1 to the oxygen of a phosphate group on the M3 glycopeptide and at C5 to xylose (note that ribitol C1 and C5 are indistinguishable due to the nature of the reduced sugar, and thus nomenclature for closed-ring standard sugar nucleotides is being used to describe the linkage).
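The linkage argument above can be sketched in a few lines of Python. This is a deliberately simplified model rather than the authors' analysis: it assumes that each ribitol carbon carries one hydroxyl that is either free or substituted, that exactly two positions are substituted (one by phosphate, one by xylose), and that periodate cleaves a C-C bond only when both flanking hydroxyls are free. The function names are hypothetical.

```python
from itertools import combinations

def cleavable_bonds(substituted, n_carbons=5):
    """Return the C-C bonds (i, i+1) whose two hydroxyls are both free,
    i.e. the vicinal diols that periodate could cleave in this model."""
    free = set(range(1, n_carbons + 1)) - set(substituted)
    return [(i, i + 1) for i in range(1, n_carbons)
            if i in free and (i + 1) in free]

def consistent_linkages(observed, n_carbons=5, n_links=2):
    """Enumerate every set of substituted positions that still allows
    all of the observed cleavages."""
    hits = []
    for positions in combinations(range(1, n_carbons + 1), n_links):
        if set(observed) <= set(cleavable_bonds(positions, n_carbons)):
            hits.append(positions)
    return hits

# Cleavage was observed between C2-C3 and between C3-C4.
observed = [(2, 3), (3, 4)]
print(consistent_linkages(observed))  # [(1, 5)]
```

Under these assumptions, substitution at C1 and C5 is the only arrangement that leaves C2, C3, and C4 free, which matches the conclusion drawn from the glycopeptide data.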

TMEM5 is a xylose transferase
In our previous work describing B4GAT1 (Willer et al., 2014), we provided evidence for an underlying xylose in the linker region of matriglycan between the phosphotrisaccharide and the LARGE-dependent Xyl-GlcA repeat disaccharide. Among the uncharacterized genes harboring mutations in patients with CMD, TMEM5 shows sequence similarity to glycosyltransferases (Vuillaumier-Barrot et al., 2012). TMEM5 was also among the genes uncovered in a screen for loss of IIH6 binding and Lassa virus entry, readouts that are both dependent on functional glycosylation of a-DG (Jae et al., 2013). Thus, we investigated this enzyme as a candidate xylose transferase. We overexpressed and purified the truncated catalytic domain fused to GFP and a His tag (Figure 5-figure supplement 1A). Using the UDP-Glo assay, we were able to show that transmembrane-deleted TMEM5 (dTM-TMEM5) can hydrolyze UDP-Xyl, but not other UDP-sugars (Figure 5A), demonstrating selective hydrolytic activity in the absence of acceptor glycans. Furthermore, we were able to show that recombinant full-length TMEM5 (Figure 5-figure supplement 1B) can transfer radiolabeled Xyl from UDP-Xyl [Xyl-14C] to a-DG-Fc340 (Fc-tagged a-DG-340) expressed in TMEM5-deficient patient cells, whereas a-DG-Fc340 mutated at the M3 site (TPT, 317-319, converted to APA) served as a negative control (Figure 5B). Also, we were able to show that dTM-TMEM5 can use UDP-Xyl to transfer Xyl to CDP-ribitol as the acceptor, but not to ribitol, ribitol-5-P, or CMP-Neu5Ac, suggesting that the ribitol must be in a phosphodiester linkage in order to be an acceptor (Figure 5C). In summary, these results provide strong, but not direct, evidence that TMEM5 is the xylosyl transferase that modifies ribitol in a phosphodiester linkage to the M3 glycan on a-DG.

TMEM5 knockdown in zebrafish recapitulates WWS phenotype
The highly conserved TMEM5 gene, which has previously been implicated in CMD, encodes a type II transmembrane protein with a predicted glycosyltransferase domain. To investigate the role of TMEM5 in vertebrates, we knocked down the zebrafish (Danio rerio) orthologue using antisense morpholino oligonucleotides (MO). The endogenous zebrafish tmem5 transcript was detected throughout early embryonic development (Figure 6-figure supplement 1). Injection of tmem5 MO specifically inhibited the expression of green fluorescent protein-tagged TMEM5 in a dose-dependent manner (Figure 6-figure supplement 1). Knockdown of tmem5 caused an increased percentage of embryos with mild to severe hydrocephalus (95% in total) and significantly reduced eye size, reminiscent of pathological defects in WWS (Figure 6A). To test whether the brain and eye abnormalities were caused by MO off-target effects mediated through a p53-dependent cell death pathway (Robu et al., 2007), we inhibited p53 activation by co-injection of p53 MO. Hydrocephalus was still displayed by 88% of embryos, and significantly reduced eye size was still observed in embryos co-injected with p53 and tmem5 MOs (Figure 6B), suggesting that the brain and eye abnormalities were not caused by MO off-target effects. As knockdown of tmem5 also caused reduced motility and lesions in the myotome (data not shown), we assessed sarcolemma integrity using Evans Blue dye (EBD), which does not penetrate into intact muscle fibers. Muscle fibers were infiltrated by EBD before undergoing degeneration (Figure 6C), suggesting a pathological mechanism in which knockdown of tmem5 leads to compromised sarcolemma integrity. As defective glycosylation of a-DG is a pathological hallmark of WWS, we tested whether knockdown of zebrafish tmem5 would affect the glycosylation of a-DG. Compared to control embryos, knockdown of tmem5 caused a 44% reduction of glycosylated a-dystroglycan (IIH6 epitope) on Western blots (Figure 6D). Together, these results illustrate a role for TMEM5 in functional glycosylation of a-DG and show that knockdown of this enzyme generates a CMD phenotype in vertebrates.
Figure 4. Mild periodate cleavage reveals that ribitol is connected to the M3 glycan. Peptide shown: (Pyro)QIHAT*PT#PVT#AIGPPT#T#AIQEPPS#R(Na); * = phosphotrisaccharide-ribitol fragment; # = potential sites of additional sugars. Following treatment of purified a-DG-340 under mild periodate conditions, the protein was reduced, alkylated, and digested with trypsin. The resulting peptides were analyzed by tandem mass spectrometry. The a-DG-340 tryptic peptide was detected with multiple glycoforms containing a phosphotrisaccharide glycopeptide. Specifically, at least 12 tryptic peptides were identified at less than 9 ppm mass accuracy that contained the phosphotrisaccharide with a ribitol fragment at threonine 317 (the red T*) in two separate analyses. These peptides differed in additional glycosylation by either hexose (Hex) or HexNAc sugars at additional hydroxyl amino acids (indicated by #). We also observed that the N-terminal glutamine was cyclized to pyroglutamic acid on some peptides (Pyro) and that some glycopeptides were sodiated (Na). Furthermore, we noticed the presence of a second phosphate on some phosphoglycopeptides. The 12 peptides observed are grouped by having the same number of additional Hex and HexNAc glycans (2,3 or 3,3 or 2,4). For all 3 sets of glycopeptides we observed ribitol fragments resulting from cleavage between C2 and C3 as well as between C3 and C4, indicating that C2, C3, and C4 are not involved in linkages to other moieties. We also observed for each set of glycopeptides the presence of an additional phosphate. DOI: 10.7554/eLife.14473.008 Figure supplements are available for Figure 4.
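The mass-accuracy criterion quoted in the Figure 4 legend (less than 9 ppm) follows the standard definition of parts-per-million mass error. A minimal Python sketch of that calculation is given here; the function names and the m/z values in the example are hypothetical illustrations, not values from this study.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(observed_mz, theoretical_mz, tol_ppm=9.0):
    """True if the absolute mass error is below the tolerance (in ppm)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tol_ppm

# Hypothetical example values (not taken from the paper).
theoretical = 1000.0000
print(within_tolerance(1000.0050, theoretical))  # about 5 ppm, accepted
print(within_tolerance(1000.0200, theoretical))  # about 20 ppm, rejected
```

Because the error is expressed relative to the theoretical m/z, the same ppm threshold corresponds to a larger absolute deviation for heavier glycopeptides.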

Identification of a new TMEM5 missense mutation in a family with WWS
Previously, it was reported that mutations in TMEM5 can cause WWS, a congenital form of muscular dystrophy with severe brain involvement (Vuillaumier-Barrot et al., 2012; Jae et al., 2013). To investigate whether one of the unidentified consanguineous WWS families was affected by a mutation in TMEM5, we performed linkage analysis and whole exome sequencing (WES) in three siblings (Figure 7-figure supplement 1A). Genomic DNAs from the three siblings were genotyped, and the call rates for the genotyping were 99.7%, 96.0% and 91.9% for 02243-d (P1), 02243-a (P2) and 02243-b, respectively. Using ~69K SNPs that overlapped between the two platforms, homozygosity-by-descent (HBD) analysis was performed. All three samples had multiple long (>10 cM) stretches of homozygous genotypes, confirming that they were descendants of a consanguineous marriage. Four regions that were homozygous in the two affected siblings, but heterozygous or homozygous for the other alleles in the unaffected sibling, were identified (Figure 7-figure supplement 1B). All 6,647 coding exons from these four intervals were subjected to targeted sequencing in all three samples. The genomic library was sequenced and variant filtering strategies were applied, retaining variants on chromosome 12 and chromosome X. A homozygous, non-reference c.997G>A (p.G333R) sequence variant was found in the TMEM5 gene (Chr. 12) of the two affected siblings (P1 and P2), while the unaffected sibling was homozygous for the wild type sequence (c.997G, p.G333).
The new p.G333R TMEM5 mutation was identified in a consanguineous family of Pakistani descent. Two out of three pregnancies resulted in fetuses with WWS (P1 and P2), while the third child is normal (Figure 7-figure supplement 1A). Both WWS fetuses had classical brain developmental abnormalities detected by prenatal ultrasound (see Figure 7-figure supplement 2A). The first pregnancy (P1) went to term, with the child passing away at 4 months of age. The second pregnancy was terminated at 18 weeks (P2); a quadriceps skeletal muscle sample utilized in our studies was obtained after termination. Severe dystrophic features were noted in cryosections of the muscle, and immunofluorescence studies showed an abnormal pattern of dystrophin-glycoprotein complex expression characteristic of a severe dystroglycanopathy (Figure 7-figure supplement 2B).
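The screen for long homozygous stretches used in the homozygosity-by-descent analysis above can be illustrated with a naive Python sketch. It is not the authors' pipeline: it assumes markers on a single chromosome sorted by genetic position, genotypes written as two-letter strings, and no tolerance for genotyping errors or missing calls, all of which a real HBD analysis would need to handle. The function name and the example data are hypothetical.

```python
def homozygous_runs(markers, min_length_cm=10.0):
    """Find stretches of consecutive homozygous genotypes.

    markers: list of (position_cM, genotype) tuples for one chromosome,
             sorted by position; genotype is e.g. 'AA', 'BB' or 'AB'.
    Returns (start_cM, end_cM) for each run longer than min_length_cm.
    """
    runs = []
    start = prev = None
    for pos, genotype in markers:
        is_hom = len(genotype) == 2 and genotype[0] == genotype[1]
        if is_hom:
            if start is None:
                start = pos          # open a new run
            prev = pos               # extend the current run
        else:
            if start is not None and prev - start > min_length_cm:
                runs.append((start, prev))
            start = prev = None      # a heterozygous call ends the run
    if start is not None and prev - start > min_length_cm:
        runs.append((start, prev))   # close a run that reaches the end
    return runs

# Hypothetical genotypes for one individual.
markers = [(0.0, "AA"), (4.0, "BB"), (9.0, "AA"), (12.5, "AA"),
           (13.0, "AB"), (14.0, "AA"), (18.0, "BB"), (30.0, "AB")]
print(homozygous_runs(markers))  # [(0.0, 12.5)]
```

Candidate disease intervals would then be those runs shared by both affected siblings that are not homozygous for the same alleles in the unaffected sibling, a comparison this sketch does not perform.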
Although patients with TMEM5 mutations have been reported before (Vuillaumier-Barrot et al., 2012; Jae et al., 2013), the a-DG glycosylation status in these patients has not been investigated. Immunofluorescence and western blot analysis of skeletal muscle from the 18 week fetus (P2) showed an a-DG glycosylation defect similar to previously described glycosylation-deficient WWS patients (Willer et al., 2012), with complete loss of both functional glycosylation and laminin binding, as well as in cultured skin fibroblast samples (P2, Figure 7A). To demonstrate the pathogenicity of the identified TMEM5 mutation, we conducted complementation assays on skin fibroblasts derived from the first child (P1). In the patient cells, expression of wildtype TMEM5 fully restored functional glycosylation while the p.G333R mutant protein did not (Figure 7B). Functional rescue of patient cells by wildtype but not mutant TMEM5 supports the interpretation that TMEM5 p.G333R has pathogenic relevance and causes WWS. Furthermore, we determined whether the identified TMEM5 p.G333R variant affects expression and localization of the mutant protein. HA-tagged wildtype and p.G333R TMEM5 constructs were transfected into HEK293 cells. Immunofluorescence and co-localization with the Golgi marker Giantin confirmed that both proteins are expressed and localize to the Golgi apparatus without significant ER mislocalization (Figure 7D). This result indicates that the loss of TMEM5 function in the WWS patients is not caused by a cellular processing defect; rather, the mutation likely affects the catalytic domain directly and abrogates the proposed xylose glycosyltransferase activity.
Model of the proposed a-DG functional glycan structure and enzymes that contribute to its synthesis
Collectively, our data support a model of a functional glycan structure on a-DG in which the phosphorylated trisaccharide, which we previously identified (Yoshida-Moriguchi et al., 2010), is extended by a ribitol in a phosphodiester linkage (by an unknown enzyme or enzymes, presumably Fukutin and/or FKRP), followed by the addition of a priming Xyl (added by TMEM5) and GlcA (added by B4GAT1) before extension with the repeating disaccharide matriglycan by LARGE (Figure 8A). Mutations in all of the enzymes in this pathway have been demonstrated to generate secondary dystroglycanopathies in patients (Wells, 2013; Yoshida-Moriguchi and Campbell, 2015; Dobson et al., 2013; Endo, 2015; Stalnaker et al., 2011). Further, ISPD, similar to the DPM1/2/3 enzyme complex that generates dolichol-phosphomannose (Figure 8B) for initial mannosylation, generates CDP-ribitol (Figure 8C) for addition of ribitol to the phosphotrisaccharide. Both of these processes generate donors for the enzymes involved in functional glycosylation of a-DG, and as such CMD resulting from defects in these enzymes should be referred to as tertiary dystroglycanopathies (Vuillaumier-Barrot et al., 2012; Lefeber et al., 2009).

Discussion
Initiation and further extension of O-Man glycans on a-DG are required for proper recognition by ECM proteins (Yoshida-Moriguchi and Campbell, 2015; Dobson et al., 2013; Endo, 2015; Endo and Manya, 2006). Failure of proper glycan elaboration is causal for a significant subset of congenital muscular dystrophies (CMD) referred to as dystroglycanopathies, which range from severe Walker-Warburg syndrome (WWS) to the much milder Limb-Girdle muscular dystrophy (LGMD), presumably depending on the severity of the effect of the mutation on enzyme expression, stability, localization, and/or activity (Wells, 2013; Stalnaker et al., 2011; Muntoni et al., 2011). We and others have worked for the last two decades on elucidating the functional O-Man glycan structures that are essential for effective interactions with extracellular matrix proteins and defective in CMD. Here, we have further elucidated the functional glycan structure and attempted to assign the enzymes responsible for each step in the biosynthetic pathway. This work extends our previous studies demonstrating a phosphodiester linkage bridging from the phosphotrisaccharide to the extended LARGE-dependent Xyl-GlcA repeat on a-DG (Yoshida-Moriguchi et al., 2010). We previously identified the M3 phosphotrisaccharide glycan structure following aqueous HF treatment and assigned its sites of modification on the a-DG polypeptide (Yoshida-Moriguchi et al., 2010). In the present study we examined the structure of the glycan released from the polypeptide upon HF cleavage of the phosphodiester linkage (Figure 8). To simplify the analysis, the length of the LARGE-dependent repeat was reduced by overexpression of a small IIH6-reactive fragment of a-DG (28 amino acids) containing only one M3 site (TPT, 317-319) as a fusion protein in HEK293F cells, which express low levels of LARGE. Following HF release, we were able to identify pentitol-Xyl-GlcA by tandem mass spectrometry, as well as further extended versions (Figure 2).
We were somewhat surprised to find pentitol (a reduced sugar polyol) attached to the xylose. Assuming that this reduced sugar was likely transferred as an activated sugar nucleotide, we attempted to identify a mammalian CDP-ribitol pyrophosphorylase. ISPD was a reasonable candidate since it is a putative cytosolic nucleotidyltransferase (Vuillaumier-Barrot et al., 2012), it is mutated in a subset of CMD and LGMD patients (Willer et al., 2012; Cirak et al., 2013), and bacterial homologs participate in the synthesis of CDP-activated alcohols (Baur et al., 2009; Richard et al., 2001). Purified recombinant enzyme was able to synthesize CDP-ribitol from ribitol-phosphate and CTP, demonstrating that it is indeed a CDP-ribitol pyrophosphorylase (Figure 3). During the preparation of our manuscript, a study was published demonstrating that CDP-ribose, CDP-ribitol, and CDP-ribulose could all be generated by mammalian ISPD, consistent with our present results. These findings also raise the question of the source of ribitol-P in mammals, which is likely generated from ribose-5-phosphate, a product of the pentose phosphate pathway, by the action of aldose reductase (Perl et al., 2011). Alternatively, given that ISPD can act on ribitol-phosphate and ribose-phosphate, it is possible that the reduction occurs after the formation of the sugar nucleotide CDP-ribose to form CDP-ribitol. Further investigation of kinetic constants and cellular abundances is needed to determine the order of events. Furthermore, no CDP-ribitol transporter into the secretory pathway has been identified yet. Given that mutations in SLC35A1, a proposed CMP-sialic acid transporter, have been implicated in decreased functional glycosylation of a-DG that is independent of sialic acid (Jae et al., 2013) and that sugar nucleotide transporters often have higher selectivity for the nucleotide than the sugar, it is tempting to speculate that SLC35A1 is a CDP-ribitol transporter, though this remains to be formally tested.
Given that ISPD could generate CDP-ribose and CDP-ribitol, we wanted to confirm that it was ribitol and not ribose that was transferred into the functional O-Man glycan on a-DG. Further, we wanted complementary experimental evidence, consistent with the phosphodiester cleavage study, that the ribitol was connected directly to the phosphotrisaccharide of a-DG. Both reduced and non-reduced glycans are susceptible to cleavage by mild periodate treatment between vicinal diols (Collins et al., 1997). If ribitol was indeed inserted between the phosphotrisaccharide and the priming xylose, one would expect loss of IIH6 reactivity from the a-DG-340 fusion protein, as we observed (Figure 4-figure supplement 2). Furthermore, mass spectrometry analysis of the resulting peptides revealed cleavage between carbons 2-3 and between carbons 3-4 of the ribitol (Figure 4). We also observed phosphotrisaccharide glycopeptides with an additional phosphate moiety, suggesting that the M3 trisaccharide can have 2 phosphates connected directly to it (one is at the 6-position of Man and the position of the other we could not resolve). These data are highly suggestive that an M3 phosphate oxygen is connected to the C1 carbon while the xylose is connected to the C5 carbon of the ribitol (note that given the structure of ribitol, C1 and C5 are equivalent and thus we have chosen to use a numbering system that is most consistent with other sugar nucleotides) (Figure 8).
Based on our proposed structure, we also predicted a xylose transferase that further elongates the structure to serve as a substrate for B4GAT1 extension with GlcA, which in turn primes LARGE addition of matriglycan. Given that mutations in TMEM5 have recently been described as being causal for CMD and that the putative enzyme shows sequence homology to glycosyltransferases (Vuillaumier-Barrot et al., 2012; Jae et al., 2013), we explored the possibility that TMEM5 was a xylose transferase (Figure 5). We were able to confirm that recombinant TMEM5 was able to hydrolyze UDP-Xyl but not other UDP-sugars. We also observed that while ribitol and ribitol-5-P were not acceptors in vitro, CDP-ribitol was an acceptor for TMEM5-catalyzed transfer of Xyl from UDP-Xyl, likely because it mimics the phosphodiester linkage of the ribitol in the glycopeptide structure. Further, recombinant TMEM5 was able to transfer radiolabeled Xyl from UDP-Xyl to a-DG-Fc340 expressed in a CMD patient cell line with a homozygous mutation in TMEM5, but was not able to effectively transfer to a-DG-340 when the putative M3 sites (TPT, 317-319, converted to APA) were eliminated. These data, taken together, strongly suggest that TMEM5 adds Xyl to ribitol when the ribitol is in a phosphodiester linkage. Given the scarcity of studies on TMEM5, we further characterized the functional relevance of TMEM5 in a zebrafish knockdown model and observed a severe CMD phenotype (Figure 6). Further, we identified a novel p.G333R TMEM5 mutation in a consanguineous family with 2 children affected by WWS. We demonstrated that wildtype TMEM5, but not the p.G333R TMEM5 mutant construct, could complement the mutation in the patient cell line with regard to functional a-DG (Figure 7B,C). Subcellular localization studies showed normal expression and localization of the mutant protein, suggesting that the p.G333R missense mutation in the predicted catalytic site likely interferes with enzymatic activity (Figure 7D). Thus, we propose that TMEM5 modifies the ribitol in a phosphodiester linkage with a xylose and B4GAT1 then extends the structure with a GlcA to serve as a primer for LARGE extension with the repeating Xyl-GlcA disaccharide, matriglycan.
In summary, it appears that synthesis of the functional glycan on a-DG requires nearly a dozen enzymes, many of which presumably are exclusively used on a select set of sites on a-DG (Figure 8). While we have further elucidated a functional glycan structure required for interaction with extracellular matrix proteins, several questions remain. For instance, while we have postulated that FKRP and/or Fukutin are involved in the transfer of ribitol using CDP-ribitol as the donor, we have not formally tested this activity, primarily because of technical difficulties with expressing these two proteins. Also, while we have confirmed that TMEM5 is a xylose transferase that can add to a ribitol that is in a phosphodiester linkage, we have not experimentally validated its transfer to a ribitol-phosphotrisaccharide modified peptide, for the same reasons. Furthermore, the functional glycan still needs to be generated in quantities sufficient to assign anomeric configurations and confirm linkages in the structure. During the revision of this manuscript, a study was published that addresses many of our outstanding questions (Kanagawa et al., 2016). In particular, Toda and colleagues were able to establish that Fukutin acts to add a ribitol-P in a phosphodiester linkage to the GalNAc of the phosphotrisaccharide, which is then extended with an additional ribitol-P by FKRP before elaboration with Xyl by an unknown enzyme (shown here to be TMEM5) (Kanagawa et al., 2016). All of the data presented here are completely consistent with the structure proposed by Toda and colleagues.
In future studies, we would like to determine if there are other sites of functional glycosylation beyond the 317/319 and 379/381 sites on a-DG that we have previously identified (Yoshida-Moriguchi et al., 2013;Yoshida-Moriguchi et al., 2010). It is also puzzling that while many sites on a-DG and even on other proteins, such as cadherins (Vester-Christensen et al., 2013), are O-Man modified, only a select few sites on a-DG appear to become modified with M3 glycans containing matriglycan for interaction with ECM proteins. The evolution of such a complex biosynthetic pathway for a few sites on a single protein is an enigma. Understanding this exclusivity as well as the role of M1 and M2 glycans in biology is a major future challenge. In closing, we have defined the role of additional enzymes in the O-Mannosylation pathway and further elucidated a functional glycan structure on a-DG for binding to ECM proteins that is lacking in many secondary and tertiary dystroglycanopathies.

Materials
Chemical reagents were primarily purchased from Sigma-Aldrich (St. Louis, MO) at reagent grade or better. Microcon centrifugal filters were purchased from EMD Millipore (Billerica, MA) as were IIH6C4 antibody stocks. Liquid chromatography and mass spectrometry systems were purchased from Thermo Scientific (Waltham, MA). Glycosidases were purchased from ProZyme (Hayward, CA). Spin columns were purchased from Nest Group (Southborough, MA). Software was from Thermo Scientific, Protein Metrics (San Carlos, CA) and EuroCarbDB (Damerell et al., 2015).

a-DG-340 preparation and purification
A truncated form of recombinant a-DG (residues 1-340 of human DAG1, UniProt Q14118) was expressed by transient transfection of suspension culture HEK293F cells as a soluble secreted fusion protein (lacking amino acids 1-312 following furin cleavage in the secretory pathway). The fusion protein coding region was designed, codon optimized, and chemically synthesized by Life Technologies (Thermo Fisher Scientific, Waltham, MA) and subcloned into a mammalian expression vector (Barb et al., 2012) that provided a COOH-terminal fusion of the 7 amino acid recognition sequence of the tobacco etch virus (TEV) protease (Tözsér et al., 2005), the 'superfolder' GFP coding region (Pédelacq et al., 2006), an AviTag recognition site for in vitro biotinylation (Beckett et al., 1999), followed by an 8×His tag (construct designated a-DG340-pGEc2). The vector employs a CMV-based promoter and enhancer sequences to drive recombinant protein expression and the NH2-terminal signal sequence of a-dystroglycan to target entry into the secretory pathway and secretion from the cell. Recombinant expression of the a-DG340 fusion protein in HEK293F cells (Freestyle 293-F cells, ThermoFisher Scientific, verified by RNA-Seq and tested routinely for mycoplasma contamination by PCR, not on the commonly misidentified cell line registry) and purification of the protein from the conditioned medium by Ni2+-NTA chromatography were carried out as previously described (Meng et al., 2013). Briefly, cells were pelleted by centrifugation at 1000 x g for 10 min, 5 days post-transfection, and the cell culture medium containing GFP-a-DG340 was subjected to Ni-NTA chromatography. GFP-a-DG340 was eluted with 300 mM imidazole. Fractions containing GFP-a-DG340 were pooled, buffer exchanged into phosphate-buffered saline (PBS, pH 7.2) and concentrated to 1 mg/mL using an Amicon Ultra-15 Centrifugal Filter Unit equipped with a 30,000 NMWL membrane (EMD Millipore).

a-DG post-phospho moiety release and purification
Purified a-DG-340 (~100 µg), which contains only amino acids 313-340 of a-DG, was buffer exchanged into Milli-Q water using an EMD Millipore 10 kDa NMWL Microcon centrifugal filter according to the manufacturer's instructions (20 min centrifugation, 13,500 x g, 25˚C, final ratio 1:250). Final imidazole concentration and protein concentration were assessed by NanoDrop ND-1000 spectroscopy (A230 and A280). The purified protein preparation was dried down in a SpeedVac and treated with cold aqueous 48% hydrofluoric acid (Sigma) on ice at 4˚C overnight to cleave phosphodiester bonds. HF was removed by drying with N2 gas on ice. Residual trace HF was removed by resuspending in 100 µl Milli-Q H2O several times and SpeedVac drying (Yoshida-Moriguchi et al., 2010). Released glycans were separated from protein by C18 reverse-phase desalting using Nest Group Macrospin columns with 0.1% formic acid (flow-through and washes containing released glycans).

Glycan permethylation
Permethylation was carried out as described (Kumagai et al., 2013). The dichloromethane and aqueous fractions were each analyzed separately.

Full MS and total ion monitoring of permethylated glycans
Permethylated glycans were resuspended in ~30 µl of 50% MeOH with 1 mM NaOH and loaded into a Hamilton syringe for direct infusion at 0.5 µl/min into an Orbitrap XL (Thermo Scientific). Orbitrap full MS scans and total ion monitoring (TIM) data were acquired for the organic fraction and separately for the aqueous fraction from the permethylation procedure. Full MS scans were acquired for 30 s in m/z range 300-2000 and separately in m/z range 600-2000 with AGC target 2e5, spray voltage 2 kV in positive mode. The TIM method consisted of ion trap scans in positive mode from 400-2000 with parent mass step 2.0, CID activation, isolation width 2.2, 38 normalized collision energy and ITMS MSn AGC target 1e4. Data were analyzed and annotated using a combination of ChemDraw Professional 15.0 with additional structure information from pubchem.ncbi.nlm.nih.gov and GlycoWorkbench 2.1 (Damerell et al., 2015).

Mild periodate treatment
Mild periodate treatment was carried out according to protocols published previously (Collins et al., 1997). Purified a-DG-340 (~200 µg) was buffer exchanged into PBS pH 7.2 using an EMD Millipore 10 kDa NMWL Microcon centrifugal filter as above. An aliquot was removed for mock treatment (omission of periodate only). NaIO4 in Milli-Q H2O was added to achieve a final concentration of 2 mM and the sample was incubated in the dark at 4˚C for 90 min. Ethylene glycol was added to a final concentration of 10 mM to both samples (for consumption of excess periodate in the periodate-treated sample) and incubated for 5 min at room temperature. NaBH4 (5 M in 2.5% NaOH) was then added to both samples to a final concentration of 250 mM (pH>13) and the samples were incubated at room temperature for 1 hr. Samples were neutralized with 10% acetic acid and then concentrated and buffer exchanged into 40 mM NH4HCO3 using EMD Millipore 10 kDa NMWL Microcon centrifugal filters (final ratio 1:100). Flow-through was saved for glycan analysis. Aliquots of buffer exchanged protein were saved for dot blot analysis.

Trypsin digestion
Prior to glycosidase treatment, samples for MS analysis were digested according to standard protocols (Fakhouri et al., 2006;Stalnaker et al., 2010).

Dot blots
a-DG-340 and processed samples were analyzed by dot blot on PVDF membranes. IIH6C4 antibody (Millipore) was used as primary. Dot blots were carried out with serial dilutions and secondary controls alone (not shown).

Glycopeptide LC-MSn and data analysis
Glycopeptides were analyzed on Orbitrap Fusion and Orbitrap Fusion Lumos instruments with liquid chromatography carried out using an Acclaim PepMap RSLC C18 2 µm particle, 15 cm column on an Ultimate 3000 (Thermo/Dionex). Buffer A was 0.1% formic acid, buffer B was 80% acetonitrile and 0.1% formic acid. The column was heated to 45˚C, equilibrated for 5 min, and a linear gradient from 5%B to 45%B was run over the course of 120 min with a 300 ml/min flow rate. The column was cleaned after each run by ramping to 99%B for 10 min and then returning to 5%B to re-equilibrate. Spray was via a stainless steel emitter with spray voltage set to 2200 V, ion transfer tube temperature 280˚C. MS methods consisted of full MS scans in the Orbitrap, generally from 500-1700 m/z with quadrupole isolation. Peptide MIPs, charge state selection allowing for states 2-7, dynamic exclusion for 30 s after 2 selections with tolerance of 15 ppm on each side, and precursor priority of MostIntense were used. All MS2 and MS3 scans were analyzed in the ion trap. One branch consisted of CID scans with pseudo-loss triggered CID and HCD for phosphate, hexose and hexnac combinations with either no charge loss or a charge loss of +1. A second branch consisted of an HCD node leading to product ion triggered ETD given the observation of at least 3 oxonium ions generated by either hexose or hexnac with at least 20% relative intensity. ETD nodes were typically set for reaction times of 100-200 ms, ETD reagent targets of 2e5 or 4e5 and supplemental activation (primarily EThcD) depending on m/z. Higher m/z ions were subjected to 40%-45% supplemental activation to disrupt low charge density clusters. Data were analyzed with Preview and Byonic versions 2.6 and 2.7 as well as by manual interpretation using Thermo Xcalibur.

Expression and purification of human ISPD
The DNA coding sequence for human ISPD (residues 43-451, UniProt A4D126) was generated by gene synthesis (Life Technologies, ThermoFisher Scientific) with sequences appended to the NH2-terminus comprised of a Kozak sequence followed by an initiating methionine, an 8×His-tag and a TEV-protease cleavage site and the ISPD synthetic gene followed by a termination codon at the end of the ISPD coding region. The resulting sequence was subcloned into the pGEc2 vector as described for the a-DG340-pGEc2 construct except the presence of the termination codon precluded the inclusion of the vector encoded COOH-terminal fusion sequences. Suspension culture HEK293 cells (FreeStyle 293-F cells, ThermoFisher Scientific) were transiently transfected with the ISPD-pGEc2 plasmid as described for the a-DG340-pGEc2 construct. Cells were harvested by centrifugation at 1000 x g for 10 min, 5 days post-transfection. The cell pellet was resuspended in lysis buffer [25 mM HEPES-NaOH pH 7.2, 400 mM NaCl, 20 mM imidazole, 0.3% Triton X-100, and 1× Protease Inhibitor Cocktail Set V, EDTA-Free (Calbiochem)] on ice, lysed by probe sonication for 3 cycles (15 s on, 15 s off at 40% intensity), and the cell lysate was centrifuged at 18,000 x g at 4˚C for 30 min. The supernatant was subjected to Ni-NTA chromatography, and His8-hISPD was eluted using 300 mM imidazole. Fractions containing His8-hISPD were pooled, buffer exchanged into Tris-buffered saline (TBS, pH 8.0) and concentrated to 0.3 mg/mL using an Amicon Ultra-15 Centrifugal Filter Unit equipped with a 30,000 NMWL (nominal molecular weight limit) membrane (EMD Millipore).

Expression and purification of human dTM-TMEM5
The DNA coding sequence for human transmembrane deleted dTM-TMEM5 (residues 33-443, UniProt Q9Y2B1) was generated by gene synthesis (Life Technologies, ThermoFisher Scientific) and subcloned into a mammalian expression vector (pGEn2) containing an amino-terminal signal sequence, 8×His-tag, AviTag, and 'superfolder' GFP followed by a TEV-protease cleavage site (Meng et al., 2013). Suspension culture HEK293 cells (FreeStyle 293-F cells, Invitrogen) were transiently transfected with the TMEM5-pGEn2 plasmid to generate a soluble secreted GFP-dTM-hTMEM5 and the protein was harvested, purified, and concentrated as described for a-DG340.

TMEM5 sugar-nucleotide specificity assay
Ultra Pure UDP-Glc, UDP-GlcNAc, UDP-Gal, UDP-GalNAc, and UDP-GlcA were purchased from Promega. Ultra Pure UDP-Xyl was prepared by incubating 10 mmol UDP-Xylose (Carbosource, University of Georgia) with 3 units of Calf Intestinal Alkaline Phosphatase (CIAP, Promega) in 1× CIAP Buffer (50 mM Tris-HCl pH 9.3, 1 mM MgCl2, 0.1 mM ZnCl2 and 1 mM spermidine) at 37˚C for 16 hr. CIAP was used to degrade any contaminating nucleotide diphosphates that may contribute to background levels in the downstream UDP-Glo Glycosyltransferase Assay (Promega). Reactions were allowed to incubate for 16 hr and were stopped by removal of CIAP by filter centrifugation through a Microcon-10 kDa Centrifugal Filter Unit with Ultracel-10 membrane (EMD Millipore).
Specificity of GFP-hTMEM5 sugar-nucleotide hydrolysis was assessed by incubation of 3 µM GFP-hTMEM5 with 50 mM of UDP-sugar (Ultra Pure UDP-Glc, UDP-GlcNAc, UDP-Gal, UDP-GalNAc, UDP-GlcA, or UDP-Xyl) in the absence of an acceptor substrate in a 20 µL reaction containing 0.1 M MES pH 6.0 and 10 mM MgCl2 at 37˚C for 18 hr. Detection of free UDP after hydrolysis of the sugar-nucleotide was performed using the UDP-Glo Glycosyltransferase Assay Kit (Promega), which detects released UDP by converting UDP to ATP and then to light in a luciferase reaction that can be measured by a luminometer. Luminescence detected is directly proportional to UDP concentration, as determined by a UDP standard curve from 0 to 25 µM. Each sugar-nucleotide hydrolysis reaction was combined in a ratio of 1:1 (5 µL:5 µL) with the UDP-Glo Detection Reagent from the assay kit in separate wells of a white, flat bottom 384-well assay plate (Corning) and allowed to incubate at room temperature for 1 hr. Luminescence was measured in triplicate using a Promega GloMax-Multi+ Microplate Luminometer.
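
The conversion from luminescence to UDP concentration described above can be sketched as a linear standard-curve fit. This is a minimal illustration under stated assumptions, not the analysis used in this study: the function names (`fit_line`, `udp_from_luminescence`) are hypothetical, all readings are invented for the example, and concentrations are in whatever units the standard curve uses.

```python
# Minimal sketch: linear UDP standard curve and back-calculation.
# All numbers are illustrative placeholders, not data from this study.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def udp_from_luminescence(rlu, slope, intercept):
    """Invert the standard curve to estimate UDP from a luminescence reading."""
    return (rlu - intercept) / slope


# Hypothetical standard curve: UDP standards (0 to 25) and their readings.
standard_udp = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
standard_rlu = [100.0, 1100.0, 2100.0, 3100.0, 4100.0, 5100.0]
slope, intercept = fit_line(standard_udp, standard_rlu)

# Hypothetical triplicate readings for one UDP-sugar reaction.
triplicate_rlu = [2050.0, 2100.0, 2150.0]
mean_rlu = sum(triplicate_rlu) / len(triplicate_rlu)
udp_released = udp_from_luminescence(mean_rlu, slope, intercept)
print(f"estimated UDP released: {udp_released:.2f}")
```

Readings that fall outside the range spanned by the standards should not be extrapolated with such a fit.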

TMEM5 glycosyltransferase assay
GFP-dTM-hTMEM5 (from 0 to 2500 nM) was incubated with 1 mM acceptor substrate and 50 mM UDP-Xyl donor substrate at 37˚C in a reaction containing 0.1 M MES pH 6.0 and 10 mM MgCl2 for 18 hr. Detection of free UDP after hydrolysis of the sugar-nucleotide was performed using the UDP-Glo Glycosyltransferase Assay (Promega) as described above. CMP-Neu5Ac was purchased from Sigma-Aldrich.

Measuring pyrophosphorylase activity
To prepare cytidine diphosphate ribitol (CDP-ribitol) and cytidine diphosphate ribose (CDP-ribose), 2 µM His8-hISPD was incubated at 37˚C with either 1 mM ribitol-5-phosphate or 1 mM ribose-5-phosphate as the acceptor substrate in a reaction containing 50 mM Tris-HCl pH 7.4, 1 mM MgCl2, 1 mM DTT, and 1 mM CTP. Reactions were allowed to incubate for 16-18 hr and were stopped by removal of His8-hISPD by filter centrifugation through a Microcon-10 kDa Centrifugal Filter Unit with Ultracel-10 membrane (EMD Millipore). Reaction products were confirmed using a linear ion trap-Fourier transform mass spectrometer (LTQ-Orbitrap Discovery, Thermo-Fisher, San Jose, CA). Reaction products were mixed with an equal volume of 80% acetonitrile and 0.1% formic acid and analyzed by direct infusion in negative ion mode using a nanospray ion source with a fused-silica emitter (360 × 75 × 30 µm, SilicaTip, New Objective) at 1.5 kV capillary voltage, 200˚C capillary temperature, and a syringe flow rate of 1 µL/min. All products were confirmed by MS/MS ion trap mass spectrometry (ITMS) acquired at 45% collision-induced dissociation (CID) and 2 m/z isolation width.
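
As a consistency check on the expected reaction products, the singly deprotonated m/z values in negative ion mode can be estimated from elemental compositions. The sketch below rests on assumptions not stated in the text: the compositions C14H25N3O15P2 for CDP-ribitol and C14H23N3O15P2 for CDP-ribose (obtained by adding the pentose 5-phosphate to CTP and subtracting pyrophosphate), standard monoisotopic masses, and hypothetical helper names.

```python
# Minimal sketch: theoretical [M-H]- m/z values for the two sugar nucleotides.
# Compositions are assumptions derived from CTP + pentose-5-phosphate - PPi.

MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
}
PROTON_MASS = 1.00727647


def monoisotopic_mass(composition):
    """Sum monoisotopic atomic masses for a composition such as {'C': 14, ...}."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in composition.items())


def mz_deprotonated(composition):
    """m/z of the singly charged [M-H]- ion."""
    return monoisotopic_mass(composition) - PROTON_MASS


CDP_RIBITOL = {"C": 14, "H": 25, "N": 3, "O": 15, "P": 2}  # assumed composition
CDP_RIBOSE = {"C": 14, "H": 23, "N": 3, "O": 15, "P": 2}   # assumed composition

mz_cdp_ribitol = mz_deprotonated(CDP_RIBITOL)
mz_cdp_ribose = mz_deprotonated(CDP_RIBOSE)
print(f"CDP-ribitol [M-H]-: {mz_cdp_ribitol:.4f}")
print(f"CDP-ribose  [M-H]-: {mz_cdp_ribose:.4f}")
```

The two products differ by two hydrogens, so their [M-H]- ions should be separated by about 2.016 m/z units.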

DGFc4 glycosidase and IIH6 assays
Aliquots (5 µg) of DGFc4 previously purified from the media of HEK293H cells co-transfected with LARGE (Yoshida-Moriguchi et al., 2010) were incubated overnight with each of the following glycosidases: chondroitinase A,B,C, sialidase A, heparinase I, heparinase II, and b-N-acetylhexosaminidase. Additionally, 5 µg of DGFc4 was treated with HF as reported previously (Yoshida-Moriguchi et al., 2010). All samples were later separated by SDS-PAGE, transferred, and immunoblotted with IIH6 for assessment of functionally active a-DG.

Cell cultures
Cells were maintained at 37˚C and 5% CO2 in Dulbecco's modified Eagle's medium (DMEM) plus fetal bovine serum (FBS: 10% in the case of HEK293T cells, 20% in the case of fibroblasts from patient skin) and 2 mM glutamine, 0.5% penicillin-streptomycin (Invitrogen, Carlsbad, CA). Mycoplasma-free conditions were verified by PCR and cell lines used are not on the registry of commonly misidentified cells.

Cloning of C-terminal HA-tagged TMEM5 wildtype and TMEM5 p.G333R mutant constructs
The human TMEM5 coding sequence was PCR amplified and a C-terminal HA epitope-tag was introduced with PCR adapters using the following primer sequences: hTMEM5-HA wildtype (1.3 kb), pTW292: forward (5'-agactcgagaccATGcggctgacgcggaagcg-3', where the XhoI adapter is bolded and the start ATG codon is shown in capital letters) and reverse (5'-cttgcggccgcCTAAGCGTAGTCTGGGACGTCGTATGGGTAgctagccccacttttattattcattaaaaatg-3'; the NotI adapter is bold and the HA-tag is shown in capital letters). The hTMEM5-HA PCR fragment (TMEM5, NM_014254) was subcloned in pIRES-hygromycin.
hTMEM5-HA mutant (c.997 G>A, p.G333R) (1.3 kb), pTW293: To generate the human TMEM5-HA p.G333R mutant expression construct, we introduced the missense mutation in the wildtype template pTW292 using a QuikChange site-directed mutagenesis kit (Agilent Technologies, Santa Clara, CA) with overlapping primers that included the respective mutation: forward 5'-gtgcccggtcAgagtaaacacagaatg-3' and reverse 5'-gtgtttactcTgaccgggcacaatgtg-3'. The introduced mutation is highlighted (capital bold letter). The sequence of the insert DNA was confirmed by Sanger sequencing.
We also used nucleofection as a non-viral method for gene transfer into cells. Nucleofection of fibroblasts was performed using the Human Dermal Fibroblast Nucleofector Kit, according to an optimized protocol provided by the manufacturer (Amaxa Biosystems, Germany).

Glycoprotein enrichment and biochemical analysis
Zebrafish embryos (48 h.p.f.) were deyolked, followed by microsome preparation and western blot analysis as previously described (Roscioli et al., 2012;Link et al., 2006). Relative signal intensity of western blot was quantified using ImageJ software.
WGA-enriched glycoproteins from frozen samples and cultured cells were processed as previously described (Michele et al., 2002). Immunoblotting was carried out on polyvinylidene difluoride (PVDF) membranes as previously described (Michele et al., 2002). Blots were developed with IR-conjugated secondary antibodies (Pierce Biotechnology, Rockford, IL) and scanned with an Odyssey infrared imaging system (LI-COR Bioscience, Lincoln, NE).

Laminin overlay assays
Laminin overlay assays were performed on PVDF membranes using standard protocols (Michele et al., 2002).

On-Cell complementation and western blot assay
The On-Cell complementation assay was performed as described previously (Willer et al., 2012). In brief, 2 × 10^5 cells were seeded into a 48-well dish. The next day the cells were co-infected with 200 MOI of Ad5RSV-DAG1 (Barresi et al., 2004) and Ad5CMV-TMEM5-Myc/RSVeGFP in growth medium. Three days later, the cells were washed in TBS and fixed with 4% paraformaldehyde in TBS for 10 min. After blocking with 3% dry milk in TBS + 0.1% Tween (TBS-T), the cells were incubated with primary antibody (glyco a-DG, IIH6) in blocking buffer overnight. To develop the On-Cell Western blots we conjugated goat anti-mouse IgM (Millipore, Billerica, MA) with IR800CW dye (LI-COR Bioscience, Lincoln, NE), subjected the sample to gel filtration, and isolated the labeled antibody fraction. After staining with IR800CW secondary antibody in blocking buffer, we washed the cells in TBS and scanned the 48-well plate using an Odyssey infrared imaging system (LI-COR Bioscience, Lincoln, NE). For cell normalization, DRAQ5 cell DNA dye (Biostatus Limited, United Kingdom) was added to the secondary antibody solution.

TMEM5 [Xyl-14C] radioactive sugar donor in vitro assay
To generate a-DG-Fc340 and a-DG-Fc340-mut (TPT, 317-319 to APA) acceptor proteins, we infected control and glycosylation-deficient TMEM5-WWS patient skin fibroblasts with Ad5-CMV a-DG-Fc340 adenoviral vectors at an MOI of 400. At 4 days post-infection the secreted proteins were isolated from the culture medium using Protein A-agarose beads (Santa Cruz, Dallas, TX). a-DG-Fc340 bound Protein A-agarose beads were washed three times with TBS and Protein A slurry prebound with ~25 µg a-DG-Fc340 was added to the in vitro TMEM5 assay. Enzyme reactions (25 µl) were carried out at 37˚C for 6 hr, with 0.1 µCi UDP-Xyl [Xyl-14C] (final conc. 15.8 µM), in 0.1 M MOPS buffer (pH 6.5) supplemented with 10 mM MnCl2, 10 mM MgCl2, 0.2% BSA and 1 µg purified TMEM5 protein (Origene, Rockville, MD). The reaction was terminated by adding 25 µl of 0.1 M EDTA. After four washes with TBS the Protein A-agarose-bound a-DG-Fc340 samples were analyzed by scintillation counting. [14C]-labeled sugar nucleotides were purchased from ARC (American Radiolabeled Chemicals, St. Louis, MO).

Antisense morpholino oligonucleotides (MO)
An antisense tmem5 MO targeting the translation start site was designed and ordered from Gene Tools (Philomath, OR). The sequence of tmem5 MO is: 5'-CCGGCGAAAAAATCT(CAT)GTTGGAT-3' (start codon in brackets with the 5'-UTR sequence underlined). Sequences of p53 and dag1 MOs have been described (Robu et al., 2007;Parsons et al., 2002). MOs were injected into zebrafish embryos by the 2-cell stage with concentration specified in the figures.
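
Because the MO is antisense, its reverse complement gives the sense-strand sequence it is expected to pair with, which makes the position of the start codon easy to inspect. The following is a small sketch using only the MO sequence quoted above; the function name `reverse_complement` is a hypothetical helper and the target is written as DNA (T rather than U).

```python
# Minimal sketch: derive the sense-strand target of the antisense tmem5 MO.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercase A/C/G/T)."""
    return seq.translate(COMPLEMENT)[::-1]


# MO sequence from the text, with the bracketed antisense start codon (CAT)
# written inline.
tmem5_mo = "CCGGCGAAAAAATCT" + "CAT" + "GTTGGAT"

target = reverse_complement(tmem5_mo)
start_codon_index = target.find("ATG")

print(target)             # sense-strand target, 5' to 3'
print(start_codon_index)  # number of 5'-UTR bases preceding the ATG
```

In the sense orientation the ATG is preceded by seven bases of 5'-UTR, so the MO spans the translation start site.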

Molecular cloning of zebrafish tmem5 and mRNA synthesis
Full length tmem5 coding sequence was amplified from IMAGE cDNA clone (7450642) using PCR primers to obtain a PCR product including 7 bases before start codon, yet excluding the stop codon. The PCR product was subsequently cloned into pCS2+_EGFP expression vector using Gateway clonase system (Invitrogen, Carlsbad, CA). Sense tmem5-egfp mRNA containing full tmem5 MO binding site was synthesized using mMESSAGE mMACHINE SP6 kit (Ambion, Austin, TX).

Reverse transcription, cDNA synthesis and PCR
Wildtype or MO-injected zebrafish embryos were collected at specific stages and homogenized to extract total RNA using TRIzol (Invitrogen, Carlsbad, CA). First-strand cDNA was synthesized using SuperScript III (Invitrogen, Carlsbad, CA) with either oligo dT or random primers.

Statistical analysis in zebrafish
Eye width measurements were plotted as mean ± s.d. and statistical significance was determined using one-way ANOVA, followed by Tukey HSD test. A p value smaller than 0.01 was considered statistically significant.
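
The summary statistics and the omnibus test named above can be sketched as follows. This is an illustration only: the group labels and measurements are invented, the function name `one_way_anova_f` is hypothetical, and the sketch stops at the F statistic. Converting F to a p value and running the Tukey HSD post hoc comparisons require an F distribution and a studentized range distribution, which are assumed to come from a statistics package and are not implemented here.

```python
# Minimal sketch: group mean +/- s.d. and the one-way ANOVA F statistic.
# Measurements are illustrative placeholders, not data from this study.
from statistics import mean, stdev


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical eye width measurements (arbitrary units) for three groups.
eye_width = {
    "group_1": [10.0, 11.0, 12.0],
    "group_2": [11.0, 12.0, 13.0],
    "group_3": [14.0, 15.0, 16.0],
}

for label, values in eye_width.items():
    print(f"{label}: {mean(values):.2f} +/- {stdev(values):.2f}")

f_stat, df_between, df_within = one_way_anova_f(list(eye_width.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The sample standard deviation (n - 1 denominator) is used here; whether the plotted s.d. used that convention is not stated in the text.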

Evans blue dye (EBD) injection
0.1% EBD (Sigma) was injected into the blood circulation of zebrafish embryos at least 1 hr before analysis. The sarcolemma integrity was then assessed using confocal and differential interference contrast (DIC) microscopy at 48 h.p.f.

Human subjects and samples
All tissues and patient cells were obtained and tested according to the guidelines set out by the Human Subjects Institutional Review Board of the University of Iowa; informed consent was obtained from all subjects or their legal guardians.

Genotyping and IBD/HBD analysis
High molecular weight genomic DNA samples from three cases were genotyped on Illumina Omni-1 Quad BeadChip at the Southern California Genotyping Consortium (SCGC, http://scgc.genetics.ucla.edu/) or Affymetrix Human Mapping 250K NSP at the UCLA Genome Sequencing Center (http://gsc.ucla.edu/). Homozygosity-by-descent (HBD) analysis was performed using a custom Mathematica (Wolfram Research) script by B. Merriman, available at http://genome.ucla.edu/~hlee/script_public/HBD_IBD/HBD_IBD_Script.nb; the interval file used is available as Source code. The HBD analysis simply searches for long stretches of homozygous calls within each individual (Lee et al., 2008). A conservative error rate of 1% was used to allow the algorithm to tolerate possible genotyping errors. Intervals over 10 cM with stretches of homozygous genotypes are indications that the individual is a descendant of a consanguineous marriage. To evaluate the sharing between the siblings, SNP positions where the genotypes were not only homozygous in each individual but also carried the same allele were noted as shared.
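
The idea of searching for long homozygous stretches while tolerating a small genotyping error rate can be illustrated with a simplified scan. This sketch is not the Mathematica script used in the study: the greedy extension rule, the function names and the marker data are all assumptions made for illustration, with the 10 cM length and 1% error rate taken from the text as defaults.

```python
# Simplified sketch of a homozygosity scan with an error tolerance.
# Not the authors' script; data and extension rule are illustrative.

def is_homozygous(genotype):
    """True for calls such as 'AA'; False for heterozygous calls such as 'AG'."""
    return len(genotype) == 2 and genotype[0] == genotype[1]


def find_hbd_intervals(markers, min_length_cm=10.0, error_rate=0.01):
    """Return (start_cM, end_cM) stretches of homozygous calls.

    markers: list of (position_cM, genotype) sorted by position.
    A heterozygous call is tolerated while heterozygous calls stay at or
    below error_rate of the calls in the current stretch.
    """
    intervals = []
    n = len(markers)
    i = 0
    while i < n:
        if not is_homozygous(markers[i][1]):
            i += 1
            continue
        start = end = i  # end tracks the last homozygous call in the stretch
        hets = 0
        for j in range(i + 1, n):
            if is_homozygous(markers[j][1]):
                end = j
            elif (hets + 1) / (j - start + 1) > error_rate:
                break
            else:
                hets += 1
        if markers[end][0] - markers[start][0] >= min_length_cm:
            intervals.append((markers[start][0], markers[end][0]))
        i = end + 1
    return intervals


# Illustrative data: 300 markers spaced 0.1 cM apart, with one isolated
# heterozygous call (tolerated) and a cluster of ten (ends the stretch).
markers = [(k / 10, "AA") for k in range(300)]
markers[150] = (15.0, "AG")
for k in range(250, 260):
    markers[k] = (k / 10, "AG")

hbd_intervals = find_hbd_intervals(markers)
print(hbd_intervals)
```

In this toy example the first stretch survives the single heterozygous call and is reported, whereas the short homozygous stretch after the heterozygous cluster falls below the length threshold and is dropped.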

Capture array design and targeted sequencing
Regions over 3 cM that were homozygous in each sample were identified and pairwise comparisons were performed to find the subset of the regions where the two affected individuals, TR and 02243-a, were homozygous with the same allele and the unaffected individual, 02243-b, was heterozygous or homozygous with an alternate allele. All coding exon regions within those regions were retrieved using six different exon/gene prediction models (RefSeq (refGene), UCSC (knownGene), Vertebrate Genome Annotation genes (vegaGene), Ensembl genes (ensGene), consensus coding sequence (ccdsGene), and Mammalian Gene Collection genes (mgcGene)). In total, 6647 coding exons plus the regions extending 10 bp on each side of each exon across 4 intervals were subject to capture probe design using Agilent eArray (www.agilent.com). The Agilent custom CGH array platform was chosen, requiring the probes to have a melting temperature (Tm) around 80˚C (default) with probe trimming allowed. Repeat regions were excluded from probe design by turning on the 'Avoid Standard Repeat Masked Regions' option.
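
The pairwise comparison rule stated at the start of this section can be written as a per-SNP predicate. The sketch below is illustrative only: the genotype encoding (two-letter strings), the SNP identifiers and the function names are assumptions, and missing calls are not handled.

```python
# Minimal sketch: keep positions where both affected individuals are
# homozygous for the same allele and the unaffected individual is not.
# Genotypes and SNP names are illustrative placeholders.

def normalize(genotype):
    """Sort alleles so that 'GA' and 'AG' compare equal."""
    return "".join(sorted(genotype))


def is_homozygous(genotype):
    return len(genotype) == 2 and genotype[0] == genotype[1]


def is_candidate_position(affected_1, affected_2, unaffected):
    a1, a2, u = (normalize(g) for g in (affected_1, affected_2, unaffected))
    # u != a1 covers both cases named in the text: heterozygous, or
    # homozygous for an alternate allele.
    return is_homozygous(a1) and a1 == a2 and u != a1


# Columns: TR, 02243-a (affected), 02243-b (unaffected).
calls = {
    "snp1": ("AA", "AA", "AG"),  # kept: unaffected is heterozygous
    "snp2": ("AA", "AA", "AA"),  # dropped: unaffected shares the genotype
    "snp3": ("GG", "GG", "AA"),  # kept: unaffected homozygous, other allele
    "snp4": ("AA", "GG", "AG"),  # dropped: affected carry different alleles
    "snp5": ("AG", "AG", "AA"),  # dropped: affected are heterozygous
}

candidates = [snp for snp, (tr, sib_a, sib_b) in calls.items()
              if is_candidate_position(tr, sib_a, sib_b)]
print(candidates)
```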
Genomic DNA was extracted from skin fibroblasts using a Qiagen (Hilden, Germany) DNeasy Blood & Tissue Kit and was run on a Qubit Fluorometer (Invitrogen) and Bioanalyzer (Agilent) for quality check. For each sample, 3 µg of high molecular weight genomic DNA was used as starting material, the sequencing library was prepared following Agilent SureSelect SureSelect Target  After amplification, samples were pooled at equal molar concentration, captured on one array following an in-house protocol (Lee et al., 2009) and sequenced on approximately 3/11th lane of Illumina HiSeq2000 as 50 bp paired-end reads, following the manufacturer's protocol. The base-calling was performed by the real time analysis (RTA) software provided by Illumina.

Sequence read alignment
The sequence reads were first de-barcoded using Novobarcode from Novocraft Short Read Alignment Package (http://www.novocraft.com/index.html) and aligned to the Human reference genome, human_g1k_v37.fasta using Novoalign. The reference genome downloaded from the GATK (The Genome Analysis Toolkit) resource bundle (http://www.broadinstitute.org/gsa/wiki/index.php/Main_Page) in November, 2010 was indexed using novoindex (-k 14 -s 3). The output format was set to SAM and default options for alignment were applied except for the adaptor stripping option (-a) and base quality calibration option (-k). Using SAMtools (http://samtools.sourceforge.net/) version 0.1.15, the SAM file of each sample was converted to BAM file and sorted, and potential PCR duplicates were removed (rmdup) using Picard (http://picard.sourceforge.net/). Local realignment was performed using GATK 'IndelRealigner' tool per sample. First, the 'RealignerTargetCreator' tool was used to determine the locations that are potentially in need of realignment. The post-rmdup bam file and the known SNP positions (Single Nucleotide Polymorphisms) in dbSNP132 were included as inputs and '-mismatchFraction 0.10' and '-realignReadsWithBadMates' options were used. Using the intervals created, reads were realigned and the mates were fixed using Picard's 'FixMateInformation' tool. Base qualities were recalibrated using GATK 'TableRecalibration' tool by analyzing the covariates for the reported base quality score of a base (QualityScoreCovariate), the combination of a base and the previous base (DinucCovariate) and the machine cycle for a base (CycleCovariate).

Variant calling
Variants were called using GATK 'Unified Genotyper' tool simultaneously for all 8 samples. Small indels were called with the '-glm DINDEL' option. The dbSNP132 file downloaded from the GATK resource bundle was used so that the known SNP positions are annotated in the output VCF (variant call format) file. Variants with phred-scale Qscore of 50.0 or greater were reported as 'PASS'-ed calls and those with Qscore of 10.0 or greater and less than 50.0 were reported as 'Low Qual' calls. Variants with Qscore less than 10.0 were not reported. Only the variants found within the protein coding regions of the captured exons were reported by using the -L option. The interval file used is available as Source code. Using GATK 'VariantFiltrationWalker' tool, both the SNPs and INDELs were hard-filtered to filter out low quality variants. The following parameters were used as suggested by GATK as standard filtration: 1) clusterWindowSize 10; 2) MAPQ0 (mapping quality of zero) >40; 3) QD (Quality-by-depth) < 5.0; 4) SB (Strand Bias) > -0.10.
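
The reporting thresholds and the per-variant hard-filter criteria listed above can be restated as two small predicates. This sketch only restates the numeric rules from the text and is not GATK code: the function names and example values are hypothetical, the annotation values are assumed to have been read from the VCF already, and the cluster-window criterion (clusterWindowSize 10) is omitted because it depends on neighboring variants.

```python
# Minimal sketch: quality labels and hard-filter checks as described in the text.
# Function names and example values are illustrative, not part of GATK.

def quality_label(qscore):
    """Map a phred-scaled Qscore to the reporting category used in the text."""
    if qscore >= 50.0:
        return "PASS"
    if qscore >= 10.0:
        return "Low Qual"
    return None  # variants below 10.0 were not reported


def fails_hard_filter(qd, sb, mapq0):
    """True if any listed per-variant criterion flags the call as low quality."""
    return qd < 5.0 or sb > -0.10 or mapq0 > 40


# Hypothetical variants: (Qscore, QD, SB, MAPQ0).
examples = {
    "variant_a": (120.0, 12.0, -5.0, 0),
    "variant_b": (35.0, 8.0, -2.0, 0),
    "variant_c": (80.0, 3.0, -2.0, 0),
    "variant_d": (6.0, 9.0, -2.0, 0),
}

for name, (qscore, qd, sb, mapq0) in examples.items():
    label = quality_label(qscore)
    filtered = fails_hard_filter(qd, sb, mapq0)
    print(name, label, "filtered" if filtered else "kept")
```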

Variant annotation
The 'PASS'-ed variants that are not found at dbSNP132 positions were annotated using SeattleSeqAnnotation version 6.16 (http://snp.gs.washington.edu/SeattleSeqAnnotation131/), separately for SNPs and INDELs. Both NCBI full genes and CCDS 2010 gene models were used for the annotation. For variants with multiple annotations, if the 'functionGVS' (Genome Variation Server class of function) was different while the 'genelist' value was the same, the one with protein level change was retained. Variants present in the 1000 Genomes database (March 2010 release) or dbSNP131 as well as those resulting in coding-synonymous changes or found outside the coding region were removed from further analysis.

Sanger sequencing
Genomic DNA was extracted from dermal fibroblast cell lines using a Qiagen (Hilden, Germany) DNeasy Blood & Tissue Kit. The coding regions (6 exons) and exon-intron boundaries of TMEM5 were amplified using PCR (primer sequences and PCR conditions are shown below). Primer sets for PCR were designed using the web-based design tool ExonPrimer. After PCR amplification the purified products were evaluated by Sanger sequencing using standard protocols.

experiments, Conception and design, Acquisition of data, Analysis and interpretation of data; HL, SFN, Performed next generation sequencing and data filtering, Acquisition of data, Analysis and interpretation of data; SHS, SW, Carried out experimental work, Analyzed and interpreted the data; PKP, Carried out experimental work, Analyzed and interpreted the data, Acquisition of data, Analysis and interpretation of data; DLS, Supervised zebrafish experiments, Conception and design, Analysis and interpretation of data; SAM, Performed muscle histology and clinical data interpretation, Co-wrote the manuscript; KWM, Designed protein expression, Analyzed and interpreted the data, Co-wrote the manuscript, Supervised the protein expression research; KPC, LW, Co-designed the project, Analyzed and interpreted the data, Co-wrote the manuscript, Supervised the research