Direct Mapping of Additional Modifications on Phosphorylated O-glycans of α-Dystroglycan by Mass Spectrometry Analysis in Conjunction with Knocking Out of Causative Genes for Dystroglycanopathy*

Dystroglycanopathy is a major class of congenital muscular dystrophy caused by a deficiency of functional glycans on α-dystroglycan (αDG) with laminin-binding activity. Recent advances have led to identification of several causative gene products of dystroglycanopathy and characterization of their in vitro enzymatic activities. However, the in vivo functional roles remain equivocal for enzymes such as ISPD, FKTN, FKRP, and TMEM5 that are supposed to be involved in post-phosphoryl modifications linking the GalNAc-β3-GlcNAc-β4-Man-6-phosphate core and the outer laminin-binding glycans. Herein, by direct nano-LC-MS2/MS3 analysis of tryptic glycopeptides derived from a truncated recombinant αDG expressed in the wild-type and a panel of mutated cells deficient in one of these enzymes, we sought to define the full extent of variable modifications on this phosphorylated core O-glycan at the functional Thr317/Thr319 sites. We showed that the most abundant glycoforms carried a phosphorylated core at each of the two sites, with and without a single ribitol phosphate (RboP) extending from terminal HexNAc. At much lower signal intensity, a novel substituent tentatively assigned as glycerol phosphate (GroP) was additionally detected. As expected, tandem RboP extended with a GlcA-Xyl unit was only identified in wild type, whereas knocking out of either ISPD or FKTN prevented formation of RboP. In the absence of FKRP, glycoforms with single but not tandem RboP accumulated, consistent with the suggested role of this enzyme in transferring the second RboP. Intriguingly, the single GroP modification also required functional FKTN whereas absence of TMEM5 significantly hindered only the addition of RboP. Our findings thus revealed additional levels of complexity associated with the core structures, suggesting functional interplay among these enzymes through their interactions. 
The simplified analytical workflow developed here should facilitate rapid mapping across a wider range of cell types to gain better insights into its physiological relevance.

Dystroglycanopathy is a group of congenital muscular dystrophies that arise from glycosylation defects of α-dystroglycan (αDG). The hallmark and cause of these diseases, which include Fukuyama-type congenital muscular dystrophy, muscle-eye-brain diseases, Walker-Warburg syndrome, and congenital muscular dystrophy type 1D, are a lack of functional O-mannosyl glycans of αDG capable of binding laminin (1-3). To date, dozens of causative gene products of dystroglycanopathy have been identified, all of which have been demonstrated or are assumed to be involved in the synthesis of the laminin-binding glycans. A plethora of O-glycans including the normal O-GalNAc mucin types and all three core types of O-mannosyl glycans, namely, M1 (Gal-β4-GlcNAc-β2-Man), M2 [Gal-β4-GlcNAc-β2-(Gal-β4-GlcNAc-β6-)Man], and M3 (GalNAc-β3-GlcNAc-β4-Man) (4), have been identified on various sites of αDG. Remarkably, the functional laminin-binding glycans were found to be exclusively carried on the core M3 at specific Thr317/Thr319 sites (5). The formation of this innermost base structure involves at least three enzymes, namely, protein O-mannosyltransferase 1/2 heterodimer (POMT1/2), protein O-mannose β-1,4-N-acetylglucosaminyltransferase 2 (POMGNT2/AGO61), and β-1,3-N-acetylgalactosaminyltransferase 2 (B3GALNT2) (6-8). Notably, this trisaccharide core can be further phosphorylated at the 6-position of O-mannose by protein-O-mannose kinase (POMK/SGK196) (6). On the other hand, functional laminin binding is known to require a polymeric Xyl-GlcA repeat sequence, the elongation of which is catalyzed by LARGE (9). Because the Xyl-GlcA repeat is released from the core upon chemical hydrolysis of phosphoester linkages by hydrogen fluoride treatment, the laminin-binding glycan synthesized by LARGE was inferred to be extended from the phosphorylated core M3 via the phosphate added by POMK.
However, the exact structural element bridging this missing link is lost by such chemical treatment and thus remained undescribed until very recently (10-13).
A major technical problem in identifying the elusive structural module linking the phosphorylated core M3 and the polymeric GlcA-Xyl units is the limiting mass spectrometry (MS) sensitivity in detecting a very large glycan or glycopeptide carrying multiple negative charges in the form of HexA and phosphate. Most of the analytical work to date has used recombinant αDG truncated at the C terminus to different extents but all containing at least the first 10 amino acid residues following the putative endogenous furin cleavage site R312 that releases the N-terminal domain (5,14). For convenience in the purification, the truncated C terminus is also commonly fused to antibody Fc or other tags, which may or may not be further removed during sample preparation. Such a strategy typically yielded very large and heterogeneous tryptic glycopeptides that contain additional O-glycosylated serine/threonine downstream of the critical Thr317/Thr319 attachment sites (Fig. 1), which hampered high-quality MS/MS sequencing and unambiguous data interpretation (10). Alternatively, a tryptic site can be engineered by converting the first threonine after T319 to lysine (T322K), which would not only further truncate the target tryptic glycopeptide carrying Thr317/Thr319, but at the same time abolish the extra O-glycosylation. After further glycosidase treatment to remove mucin-type O-glycans, a recent study using such truncated αDG constructs expressed in and purified from mouse NIH 3T3 cells succeeded for the first time in detecting the further modified phosphorylated core by MS analysis in negative ion mode (11).
It was shown that a tandem ribitol phosphate unit (RboP-RboP) can be attached to the terminal GalNAc of phosphorylated core M3, which can then be extended further by repeating HexA-Pent units, consistent with the proposed model that is supported by in vitro enzymatic assays of the involved causative gene products including isoprenoid synthase domain containing (ISPD), fukutin (FKTN), and fukutin-related protein (FKRP) (11). It also indicated that ISPD is a cytidine diphosphate ribitol (CDP-Rbo) synthase and FKTN transfers a Rbo5P from CDP-Rbo to phosphorylated core M3, thereby providing an acceptor site for FKRP to form the RboP-RboP tandem repeat. The possibility has also been raised that TMEM5 serves as a xylosyltransferase using the RboP-RboP structure as an acceptor site to initiate the first step of Xyl-GlcA repeat sequence synthesis (10). However, this rather unique glycosylation structural motif has yet to be detected intact on αDG expressed in other cells, including the widely used HEK293T cells or indeed any human cell type, owing to the aforementioned technical difficulties. It is not known whether there is structural variation associated with core M3 modifications that contributes to the tissue-specific glycosylation status of αDG (4) and impacts on its functions as an extracellular matrix receptor in the brain, heart, skeletal muscle, and kidney, or correlates with its reduced expression in human breast and colon cancers in relation to tumor progression (15).
By adopting similar truncation and tryptic site engineering of recombinant αDG, we demonstrate a highly sensitive analytical workflow from in-gel digestion to direct nano-LC-MS2/MS3 analysis without additional chemo-enzymatic treatments, for unambiguous identification of the target glycopeptides carrying a further modified phosphorylated core M3. Diagnostic fragment ions were afforded by complementary modes of MS2 in positive ion mode, which can be programmed for target MS3 and/or used for rapid filtering of large spectral data sets to allow meaningful manual data interpretation in anticipation of novel substituents. We provide the requisite MS evidence for a similar occurrence of the tandem RboP-Xyl-GlcA substituent and its incompletely extended forms on the phosphorylated core M3 in HEK293T cells. Moreover, we performed CRISPR/Cas9 genome editing to create a panel of mutated HCT116 colon cancer cells lacking ISPD, FKTN, FKRP, or TMEM5 and mapped the glycosylation variants of recombinant αDG expressed by these cells to address the in vivo functions of these causative gene products.
cDNA Construction-For the expression of Fc-fused αDG recombinant proteins, amino acid substitutions and deletion mutants of αDG (αDG373(T322R)) (Fig. 1) were made by standard PCR and genetic engineering techniques using the expression plasmids as constructed previously (7,16). For Flag-tagged expression vectors of human FKTN, FKRP, TMEM5, and ISPD, the encoding sequence of individual open reading frames was amplified by PCR and cloned into pCMV14-3×FLAG vector (Sigma). For the human FKTN-myc expression vector, the FKTN coding sequence amplified by PCR was cloned into pSecTag2 (Invitrogen).
Generation of Knockout Cells Using CRISPR/Cas9 Genome Editing-The construction of vector (pKO) for TMEM5-KO cells was performed basically in accordance with previous reports (17,18). The 1.5-kb fragment of the human TMEM5 gene used for the 5′ arm was amplified by PCR from HCT116 cell genomic DNA using the primers 5′-ggggtacCAAATTATGCAGTCATTTGC-3′ and 5′-ctagctagcGATAAGAAACGAGCAGAGCC-3′, and then subcloned between the KpnI and NheI sites to create the pKO 5′ arm (hTMEM5). The 1.5-kb fragment of the TMEM5 gene used for the 3′ arm was amplified similarly using the primers 5′-ataagaatgcggccGCCCTGTACTGCCTATTCTC-3′ and 5′-ataagaatgcggccgcGTTGACATATGCATTGCAATC-3′, and then subcloned into the NotI site of the pKO 5′ arm (hTMEM5) to create pKO-hTMEM5. The target DNA sequence of the guide RNA (gRNA) was as follows: GCCCGAAGAAGACGTGGTAGG, which was inserted into the pX330 vector (Addgene) to create pX330-guide-TMEM5. To generate the KO cells of TMEM5, pX330-guide-TMEM5 and pKO-TMEM5 were transiently transfected into HCT116 cells in a six-well plate by Lipofectamine 2000 (Invitrogen). At 96 h post-transfection, cells were incubated in puromycin (0.5 μg/ml). Approximately 40 colonies were obtained 20 days later. Homologous recombination in HCT116 cells was confirmed by genomic PCR and sequencing.
For the preparation of ISPD-, FKTN-, or FKRP-KO cells, the gRNAs flanking the target enhancer regions were designed by ToolGen (Seoul, South Korea) dRGEN synthesis services. The target DNA sequences of the gRNAs used in this study were as follows: for ISPD deletion, ATTGAAAATTGACCTGTGGCGGG; for FKTN deletion, GAGTAGAATCAATAAGAACGTGG; and for FKRP deletion, CATGCGGCTCACCCGCTGCCAGG. The gRNA expression plasmids (pRGEN-U6-sgRNA) were purchased from ToolGen. To generate the KO cells lacking ISPD, FKTN, or FKRP, individual gRNA plasmids and a plasmid expressing Cas9-GFP (Addgene, Cambridge, MA; catalog #44719) were transiently transfected into HCT116 cells in a six-well plate with a Cas9:gRNA molar ratio of 1:5 using Polyethylenimine Max (Polysciences, Inc., Warrington, PA), in accordance with a previous report (19). At 48 h post-transfection, cells with high green fluorescent protein (GFP) expression were identified using fluorescence-activated cell sorting with the Aria II cell sorter (BD Biosciences). Sorted cells were plated into individual wells of a 96-well plate and then re-plated as single cells in 6-cm dishes. The deletions in the clones were confirmed by sequencing.
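As a quick sanity check on the gRNA target sequences listed above, the following minimal Python sketch splits each 23-nt sequence into a protospacer and a trailing 3-nt motif and tests whether that motif matches NGG. Reading the last three nucleotides as a protospacer-adjacent motif is an assumption made here for illustration; the text only lists the target sequences. The helper names (`split_target`, `has_ngg_pam`) are hypothetical, and any hyphen inside a sequence is treated as a line-break artifact and stripped.

```python
# Target DNA sequences as listed in the text (5'->3').
# Assumption (not stated in the source): the final 3 nt are an NGG motif
# and the preceding bases are the protospacer.
TARGETS = {
    "ISPD": "ATTGAAAATTGACCTGTGGCGGG",
    "FKTN": "GAGTAGAATCAATAAGAACGTGG",
    "FKRP": "CATGCGGCTCACCCGCTGCCAGG",
}


def split_target(seq, pam_len=3):
    """Return (protospacer, trailing motif) after stripping hyphens and spaces."""
    cleaned = seq.replace("-", "").replace(" ", "").upper()
    return cleaned[:-pam_len], cleaned[-pam_len:]


def has_ngg_pam(seq):
    """True if the last three nucleotides match the pattern NGG."""
    _, pam = split_target(seq)
    return len(pam) == 3 and pam[1:] == "GG"


if __name__ == "__main__":
    for gene, seq in TARGETS.items():
        protospacer, pam = split_target(seq)
        print(gene, protospacer, pam, len(protospacer), has_ngg_pam(seq))
```

Under this reading, each of the three sequences resolves into a 20-nt protospacer followed by an NGG motif.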
Expression of Fc-fused αDG Recombinant Proteins-HEK293T as well as HCT116 and its mutated cells were maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum (FBS) in 5% CO2 at 37°C. For the expression of αDG373(T322R)-Fc, cells were grown overnight and transfected using Polyethylenimine Max (Polysciences, Inc.), in accordance with a previous report (19). Protein expression was performed in DMEM supplemented with 10% FBS that had been passed through a Protein G affinity column (GE Healthcare) to remove immunoglobulin G. The secreted Fc-fused αDG fragments in culture medium were purified using a Protein G affinity column.
Identification of O-glycopeptides-The gel band containing αDG was excised and subjected to in-gel digestion by sequential steps of reduction with 10 mM dithiothreitol at 37°C for 1 h, alkylation with 50 mM iodoacetamide in 25 mM ammonium bicarbonate buffer for 1 h in the dark at room temperature, destaining with 50% acetonitrile in 25 mM ammonium bicarbonate buffer, and then overnight digestion with sequencing-grade trypsin (Promega) at 37°C. The digested products were sequentially extracted with distilled water, 1% formic acid, and 50% acetonitrile/1% formic acid, dried down, redissolved in 0.1% formic acid, and further cleaned up by ZipTip C18 (Millipore) before analysis. The peptide mixtures were analyzed by nanospray LC-MS/MS on an Orbitrap Fusion Tribrid (Thermo Scientific) coupled to an UltiMate 3000 RSLCnano System (Dionex, Sunnyvale, CA). Peptide mixtures were loaded onto an Acclaim PepMap RSLC 25 cm × 75 μm i.d. column (Dionex) and separated at a flow rate of 500 nL/min using a gradient of 5% to 35% solvent B (100% acetonitrile with 0.1% formic acid) in 60 min. Solvent A was 0.1% formic acid in water. The parameters used for MS and MS/MS data acquisition under the HCD product ion-triggered CID mode were: top speed mode with 3-s cycle time; FTMS: scan range (m/z) = 550-2000; resolution = 120 K; AGC target = 2 × 10⁵; maximum injection time = 60 ms; monoisotopic precursor selection on; included charge states 2-6; dynamic exclusion after two times within 10 s and then exclusion for 40 s with 10-ppm tolerance. FTMSn (HCD): isolation mode = quadrupole; isolation window = 1.6; collision energy 28% with stepped collision energy 5%; resolution = 30 K; AGC target = 1 × 10⁵; maximum injection time = 75 ms; HCD product ions at m/z 204.0867 or 366.1396 (z = 1) within the top 20 product ions were used to trigger CID; ITMSn (CID): isolation mode = quadrupole; isolation window = 1.6; collision energy = 30%; AGC target = 1 × 10⁴; ion trap scan rate = rapid.
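The product ion-triggered acquisition described above can be illustrated with a minimal Python sketch. It is not the instrument's implementation: it only mimics the stated rule that a CID scan is triggered when m/z 204.0867 or 366.1396 appears among the 20 most intense HCD product ions. The function names and the 10-ppm matching tolerance applied to the trigger ions are assumptions introduced here for illustration.

```python
# Trigger ions (z = 1) as given in the text.
OXONIUM_TRIGGERS = (204.0867, 366.1396)


def within_ppm(observed, target, tol_ppm):
    """True if observed m/z lies within tol_ppm of target m/z."""
    return abs(observed - target) / target * 1e6 <= tol_ppm


def should_trigger_cid(peaks, top_n=20, tol_ppm=10.0):
    """Decide whether an HCD MS2 scan would trigger a follow-up CID scan.

    peaks   : list of (mz, intensity) tuples from one HCD scan
    top_n   : number of most intense product ions inspected (20 in the text)
    tol_ppm : matching tolerance; 10 ppm is an assumption, not from the source
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    return any(
        within_ppm(mz, target, tol_ppm)
        for mz, _ in top
        for target in OXONIUM_TRIGGERS
    )


if __name__ == "__main__":
    # Hypothetical toy scan: an intense HexNAc oxonium ion plus one other peak.
    scan = [(204.0866, 1.0e5), (500.2500, 1.0e4)]
    print(should_trigger_cid(scan))
```

The same matching function can be reused offline to filter acquired spectra for any other diagnostic ion, since it depends only on an m/z list and a tolerance.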
HCD and CID MS2 data sets were filtered for candidate glycopeptide spectra based on the presence of MS2 ions corresponding to the expected tryptic peptide core containing the target O-glycosylation sites and/or peptide fragment ions, and then manually interpreted and assigned as described in the Results. Further rounds of searching through the acquired data set were based on the diagnostic RboP/GroP-HexNAc+ oxonium ions thus identified. For MS3 analysis, HCD MS2 ions at m/z 358.0895 and 418.1105 were targeted for CID MS3 using the inclusion list feature. The parameters used for FTMS3 (CID) were: isolation mode = ion trap; MS isolation window = 1.6; MS2 isolation window = 3.0; scan range mode: auto normal; collision energy = 30%; detector type: orbitrap; resolution = 30 K; AGC target = 1 × 10⁵; maximum injection time = 120 ms.

RESULTS AND DISCUSSION
The Identification of Additional Modifications on Phosphorylated O-glycans of αDG Expressed in HEK293T Cells-To facilitate direct detection and sequencing of the αDG glycopeptides carrying the target O-glycans, we opted for an analytical strategy that would preserve as much of the native O-glycosylation and other modifications as possible without introducing any chemical cleavages or glycosidase digestions prior to the MS analysis. The minimal manipulation we chose was truncation of αDG at R373 and fusion with Fc for secretion and ease of purification (Fig. 1 and supplemental Fig. S1). The resulting Fc-tagged and purified proteins appeared as two major bands on SDS-PAGE at positions corresponding to ~35 kDa and just above 50 kDa (Fig. 1), neither of which was stained by IIH6, an anti-αDG antibody recognizing laminin-binding glycans, as would be expected from the apparent size. Following in-gel tryptic digestion and recovery of peptides/glycopeptides from the gel, we initially searched for the target glycopeptide Q313-R337, but were unsuccessful. Hence, we decided to introduce the T322R mutation to shorten the resulting tryptic glycopeptide further and to confine the O-glycosylation to only Thr317 and Thr319, which have been implicated as sites carrying the functional laminin-binding glycans based on a mutagenesis study (5). Although the T322R substitution may affect the overall O-glycosylation occupancy and heterogeneity, a previous study has shown that functional laminin-binding glycans could still be formed at Thr317 and Thr319 of the T322R mutant expressed in mouse NIH 3T3 cells (17). We then anticipated finding the resulting Q313-R322 and its companion A323-R337 tryptic glycopeptides by searching among the MS2 spectra for the presence of respective peptide core ions.
However, we succeeded in identifying only the A323-R337 glycopeptides, all of which were O-glycosylated to various extents, including glycoforms that carried an additional phosphate, which could be assigned as phosphorylated core M3 (supplemental Fig. S2). No evidence could be found for the presence of any additional glycan modifications in this tryptic glycopeptide that contains T328, T329, and S336.
We suspected that the failure to detect the target Q313-R337 peptide core ion may have been caused by an unknown modification of the peptide core itself, particularly at the nascent N terminus generated by an endogenous cleavage. Our search strategy was modified accordingly and extended to use several of the expected peptide y ions, which led to the identification of a peptide core with a mass discrepancy of minus 17 u localized to the N terminus, likely because of pyroglutamylation of the terminal glutamine residue. In fact, a total of over 8000 MS2 spectra containing the modified peptide core ion at m/z 551.8038 (2+) were filtered out using this strategy, which comprised 673 redundant spectra that could be collectively assigned to 65 unique precursor glycoforms (supplemental Table S1). From the compiled data, it was clear that there were glycoforms carrying only the core M1/M2 without an additional phosphate, some of which might correspond instead to mucin-type O-GalNAc cores. These nonphosphorylated core-type structures were difficult to distinguish unambiguously by current HCD/CID MS2 analyses alone and were not pursued further. We focused our efforts instead on glycoforms that contain at least a phosphorylated core M3 structure. Among the major structures satisfying this criterion were the glycoforms carrying two phosphorylated core M3 moieties, with and without additional modifications (supplemental Table S1, Fig. 2).
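The minus 17 u shift and the peptide core ion at m/z 551.8038 can be reproduced with a short calculation. The sketch below is illustrative only: it takes the tryptic peptide sequence QIHATPTPVR (residues 313-322, as given in the figure legends), uses standard monoisotopic residue masses, and models N-terminal pyroglutamylation as the loss of NH3 from the terminal glutamine. The function name `peptide_mz` is hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues in QIHATPTPVR.
RESIDUE_MASS = {
    "Q": 128.05858, "I": 113.08406, "H": 137.05891, "A": 71.03711,
    "T": 101.04768, "P": 97.05276, "V": 99.06841, "R": 156.10111,
}
WATER = 18.010565    # H2O
AMMONIA = 17.026549  # NH3, lost on cyclization of N-terminal Gln to pyroGlu
PROTON = 1.007276


def peptide_mz(sequence, charge, n_term_pyro_glu=False):
    """Monoisotopic m/z of an unglycosylated peptide at the given charge."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if n_term_pyro_glu:
        if sequence[0] != "Q":
            raise ValueError("pyroGlu from Gln requires an N-terminal Q")
        mass -= AMMONIA
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    for z in (1, 2):
        print(z, round(peptide_mz("QIHATPTPVR", z, n_term_pyro_glu=True), 4))
```

With the pyroGlu option enabled, the singly and doubly charged values agree with the peptide core ions reported in this work (m/z 1102.6003 and 551.8038) to within about 1 mDa, and the difference from the unmodified peptide is the 17.03 Da corresponding to NH3.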
FIG. 1. [...] (14), yielding a mature product starting at Q313. The inclusion of this domain in the recombinant αDG construct is, however, necessary as it is required for proper laminin-binding glycan formation at Thr317/Thr319 (5) (marked with asterisks). For the αDG373(T322R)-Fc product used in this study, an Fc domain is fused to R373, in place of the mucin-like and C-terminal domains. A tryptic cleavage site was introduced via T322R mutation, which allowed formation of a 10-amino acid tryptic peptide carrying only two threonine sites for O-glycosylation. The recombinant products, expressed and purified by protein G Sepharose from HEK293T cells, ran as two major bands on SDS-PAGE corresponding to ~35 and 50-60 kDa. Only the upper band, excised, in-gel digested, and analyzed as bands 1 and 2 yielded similar tryptic peptides corresponding to the αDG mucin domain, whereas the lower band only gave peptides derived from the Fc domain. The entire amino acid sequence of αDG373(T322R)-Fc is shown in supplemental Fig. S1.
FIG. 2. [...] Additional representative spectra for glycoforms carrying only one phosphorylated core M3, two phosphorylated core M3 moieties with two RboP, and two phosphorylated core M3 moieties with one RboP + one GroP are shown in supplemental Fig. S3. The pentitol phosphate substituent was assumed and annotated as ribitol phosphate based on recent reports (11), whereas the analogous three-carbon substituent was tentatively assigned as glycerol phosphate (GroP) based on accurate mass increment alone, without further analysis to define its stereochemistry. Similarly, the identities of GlcA, Xyl, and GalNAc versus GlcNAc and Man were inferred from the literature and are represented by symbols following the SNFG (Symbol Nomenclature for Glycans) system (26), the details of which can be found at NCBI [http://www.ncbi.nlm.nih.gov/books/NBK310273/]. Their linkages and anomericity were not further established in this work. In (D), the RboP, P-RboP, RboP-RboP, and P-RboP-HexNAc+ ions as annotated were each accompanied by further loss of a H2O moiety, giving an additional ion at 18 u lower.
All of the spectra of this series contain the same sets of ions corresponding to (1) HexNAc+ and HexNAc2+ at m/z 204.0866 and 407.1660, respectively; (2) singly and doubly charged peptide core at m/z 1102.6003 and 551.8038, respectively; (3) peptide b3, b4, b5, y3, and y5 ions confirming the core sequence; and (4) the peptide core extended by a string of phosphorylated Hex and HexNAc that can be mapped to two phosphorylated core M3 moieties attached at two separate sites (annotated in Fig. 2A). These common sets of ions indicate that the MS2 spectra filtered out for further manual assignment here all derived from the same glycopeptide core, differing only in additional modifications. Importantly, the prominent ion at m/z 418.110, representing a mass increment of 214.024 Da from the HexNAc+ oxonium ion at m/z 204.086, identifies the presence of a recently reported ribitol phosphate (RboP) substituent on HexNAc (Fig. 2B). This is further supported by ions at m/z 233.042 (RboP), 284.052 (phospho-HexNAc+), and 621.189 (RboP-HexNAc2+). More intriguingly, based on similar sets of ions, we identified another analogous substituent represented by a mass increment of 154.002 Da from HexNAc+, which can be tentatively assigned as glycerol phosphate (GroP) and supported by ions at m/z 173.021 (GroP), 358.089 (GroP-HexNAc+), and 561.169 (GroP-HexNAc2+) (Fig. 2C). By virtue of these diagnostic ions, the GroP-HexNAc+ and RboP-HexNAc+ substituents were found on several but not all glycoforms of αDG373(T322R)-Fc expressed in HEK293T cells (supplemental Table S1).
Notably, a tandem RboP-RboP unit was only found in the context of a fully extended HexA-Pent-RboP-RboP-HexNAc substituent, identified by the corresponding oxonium ion at m/z 940.213, further supported by ions at m/z 632.134 (RboP-RboP-HexNAc+), 498.077 (P-RboP-HexNAc+), 447.066 (RboP-RboP), 294.997 (P-RboP), 284.053 (P-HexNAc+), and 233.042 (RboP) (Fig. 2D). This critical oxonium ion at m/z 940.2 was more prominent in the corresponding trap CID MS2 spectrum (Fig. 3D).
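The mass increments and oxonium ions quoted above follow from simple residue arithmetic, which the sketch below makes explicit. It is an illustration under stated assumptions rather than the authors' assignment procedure: each substituent is modeled as a dehydrated residue (for example, RboP as ribitol phosphate minus H2O, C5H11O7P), and an oxonium ion is taken as the sum of its residues plus one proton. The dictionary and function names are hypothetical.

```python
# Monoisotopic atomic masses (Da).
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "P": 30.97376200}
PROTON = 1.00727647

# Residue (dehydrated) compositions. These are assumptions for illustration,
# chosen to be consistent with the assignments in the text.
RESIDUE = {
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},
    "RboP":   {"C": 5, "H": 11, "O": 7, "P": 1},  # ribitol phosphate - H2O
    "GroP":   {"C": 3, "H": 7, "O": 5, "P": 1},   # glycerol phosphate - H2O
    "P":      {"H": 1, "O": 3, "P": 1},           # HPO3
    "Pent":   {"C": 5, "H": 8, "O": 4},
    "HexA":   {"C": 6, "H": 8, "O": 6},
}


def residue_mass(name):
    """Monoisotopic mass of one residue from its elemental composition."""
    return sum(ATOM[el] * n for el, n in RESIDUE[name].items())


def oxonium_mz(residues):
    """m/z of a singly charged oxonium ion built from the listed residues."""
    return sum(residue_mass(r) for r in residues) + PROTON


if __name__ == "__main__":
    print("RboP increment:", round(residue_mass("RboP"), 4))
    print("GroP increment:", round(residue_mass("GroP"), 4))
    for name, parts in {
        "HexNAc+": ["HexNAc"],
        "RboP-HexNAc+": ["RboP", "HexNAc"],
        "GroP-HexNAc+": ["GroP", "HexNAc"],
        "RboP-RboP-HexNAc+": ["RboP", "RboP", "HexNAc"],
        "HexA-Pent-RboP-RboP-HexNAc+": ["HexA", "Pent", "RboP", "RboP", "HexNAc"],
    }.items():
        print(name, round(oxonium_mz(parts), 4))
```

Under these assumptions the calculated values fall within a few millidaltons of the ions reported in the text (m/z 204.087, 418.110, 358.089, 632.134, and 940.213), which is the level of agreement the accurate-mass assignments rely on.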
The complementary trap CID MS2 spectra were in general characterized by prominent losses of terminal substituted and unsubstituted HexNAc, thereby producing the common doubly and singly charged peptide core ions retaining two phospho-Hex and 1-3 HexNAc (as annotated in Fig. 3A-3D). In particular, the common base peak ion was derived from the loss of terminal ±R-HexNAc, which corroborated the identification of terminal substituents (R = RboP, GroP, or HexA-Xyl-RboP-RboP) by the oxonium ions. Glycoforms carrying two phosphorylated core M3 moieties, of which only one terminal HexNAc was further substituted, would thus afford the common doubly charged ion at m/z 1098.8 after losing ±R-HexNAc, accompanied by the one derived from losing an unsubstituted HexNAc. Such characteristics suggest that the R substituents are all carried on the terminal and not the penultimate HexNAc. We further subjected the prominent RboP-HexNAc+ and GroP-HexNAc+ ions at m/z 418.1 and 358.1, respectively, to MS3 (Fig. 3E and 3F), either by targeted MS2/MS3 or by product ion-dependent MS3 in additional LC-MS/MS runs. As expected, the RboP-HexNAc+ afforded MS3 ions at m/z 233.043 (RboP) and 284.053 (P-HexNAc+), which were also present in the MS2 spectrum (Fig. 2B and 2D). Likewise, GroP-HexNAc+ yielded MS3 ions at m/z 173.021 (GroP) and 284.053, albeit very weakly, consistent with the overall lower intensity of the precursor glycopeptide carrying this novel substituent. Although this is not an accurate quantitative profile, given the lack of knowledge of the response factors for each of the glycoforms carrying different modifications on 1-2 phosphorylated core M3, an extracted ion chromatogram overlay of select glycoforms indicates that the glycoform with two phosphorylated core M3 moieties, and the one further substituted by a single RboP, were by far more abundant than those substituted with GroP or with the fully extended HexA-Xyl-RboP-RboP (Fig. 4). Moreover, the glycoforms with both Thr317 and Thr319 occupied were also more prominent than those with only single-site occupancy.
Establishment of KO Cells of Individual Gene Products Causative of Dystroglycanopathy-ISPD-, TMEM5-, FKTN-, or FKRP-deficient cells were established by CRISPR/Cas9 genome editing of HCT116 cells, a colon cancer cell line known to express the laminin-binding glycans on αDG (15). As expected, individual KO cells exhibited no laminin-binding glycans (Fig. 5A), whereas expression of the target product rescued the defect (supplemental Fig. S4). It has been reported that ISPD is a cytosolic protein for the synthesis of CDP-Rbo (11,13), whereas FKRP, FKTN, and TMEM5 are located at the Golgi apparatus (21-24). These three Golgi enzymes did not functionally compensate for one another (supplemental Fig. S4). Because FKRP exists as a homodimer and in multimeric protein complexes (21), we analyzed the interactions among these causative gene products by a pull-down assay (Fig. 6). FKRP-flag, TMEM5-flag, and/or FKTN-myc were transiently coexpressed in HEK293T cells. The immunoprecipitates obtained with anti-FKTN antibody contained FKRP-flag and TMEM5-flag, individually (Fig. 6A). Lysates of HEK293T cells expressing FKRP-flag and TMEM5-TAP were pulled down with IgG-Sepharose. The pulled-down proteins were found to contain FKRP-flag as well as TMEM5-TAP (Fig. 6B). The results indicate that FKRP, FKTN, and TMEM5 form a hetero-oligomeric complex, suggesting that their cooperative interplay is indispensable for the formation of the laminin-binding glycans. To address the functional roles of these causative gene products in cells, glycopeptides derived from in-gel digestion of αDG373(T322R)-Fc expressed by the individual KO cells (Fig. 5B) were subjected to similar LC-MS/MS analyses.
FIG. 4. Overlays of the extracted ion chromatograms of select pyrQIHATPTPVR (residues 313-322) glycoforms carrying phosphorylated core M3 with and without additional substituents. The glycoforms with two phosphorylated core M3 moieties but carrying a single RboP, GroP, or GlcA-Xyl-RboP-RboP substituent were each clearly eluted as two major peaks, likely because of the substituent being carried on either of the two sites. For simplicity, not all of the glycoforms identified from the tryptic digest of αDG373(T322R)-Fc expressed in HEK293T cells (see supplemental Table S1) are plotted here.
The Additional Modifications on Phosphorylated O-glycans of αDG Expressed by HCT116 Cells Lacking Individual Causative Gene Products of Dystroglycanopathy-Taking advantage of the established fragmentation pattern, particularly the diagnostic oxonium ions, we were able to rapidly filter out relevant glycopeptide spectra for manual assignment and confirmed the similar presence of RboP/GroP modifications on phosphorylated core M3 attached to Thr317/Thr319 of αDG373(T322R)-Fc expressed in HCT116 cells (supplemental Table S1). However, unlike HEK293T cells, fully extended HexA-Pent-RboP-RboP-HexNAc could not be found among the αDG glycoforms identified for WT HCT116 cells or any of its derived mutants. This probably reflects the observation that HCT116 cells express a higher level of endogenous laminin-binding glycans relative to HEK293T cells and thus the HexA-Pent-primed core would be readily extended, albeit not fully on the truncated αDG373(T322R)-Fc constructs, rendering their detection difficult. Moreover, in HCT116 cells, the glycoforms with single RboP, two RboP, and one RboP + one GroP modifications were mostly detected in the FKRP mutant, or at much higher abundance in the FKRP mutant relative to the WT (Fig. 7), consistent with the lack of further extension and consequent accumulation because of impaired FKRP activity.
It should also be pointed out that, in cases where two RboP could be found without HexA-Pent, they were not found as a tandem repeat unit but rather as a single RboP on each phosphorylated core M3 (supplemental Fig. S3B). Likewise, no supporting evidence could be found for a tandem RboP-GroP in either direction (supplemental Fig. S3C).
The absence of any RboP unit in ΔISPD and ΔFKTN mutants is expected because the former is required to synthesize the CDP-ribitol, whereas the latter is the transferase suggested to be responsible for the addition of RboP from CDP-ribitol onto phosphorylated core M3 (11-13). More puzzling is the near absence of this substituent also in the ΔTMEM5 mutant. TMEM5 was shown to be able to transfer Xyl from UDP-Xyl to CDP-ribitol and thus the true donor substrate of FKRP may be the presynthesized CDP-ribitol-Xyl instead of CDP-ribitol. It follows that both TMEM5 and FKTN may use and compete for the same CDP-ribitol pool. Proper in vivo transfer of RboP from CDP-ribitol to phosphorylated core M3 by FKTN would thus require its functional and coordinated interactions with TMEM5 facilitated by their physical association into an enzyme complex, as has been demonstrated (Fig. 6). Our data showing convergent glycosylation defects between ΔFKTN and ΔTMEM5 mutants are supportive of this model in which impairment of either would hinder the activity of the other. A different scenario may apply to GroP modification as we observed that this novel modification was prevented in the ΔFKTN but not the ΔTMEM5 mutant. This suggests that FKTN can equally transfer a GroP unit in place of RboP to the phosphorylated core M3 acceptor, independent of TMEM5.
Indeed, RboP and GroP modifications have been found in the structures of teichoic acids from Gram-positive bacteria cell walls (25). A series of enzymes are involved in the formation of RboP-RboP, GroP-GroP and GroP-RboP structures using CDP-ribitol and CDP-glycerol as donor substrates in the biosynthesis of teichoic acids. Although mammals do not synthesize teichoic acids, the evolutionary relationship between the biosynthesis pathways of teichoic acids and laminin-binding glycans is of interest. It is currently unclear whether GroP may substitute for RboP in linking the laminin-binding glycans to phosphorylated core M3 under unspecified physiological stages or in specific cell types, as we have not been able to detect such a functional structural unit, nor any tandem GroP or GroP-RboP modification.

CONCLUSION
In this study, using our nano-LC-MS/MS strategy that allows for the parallel acquisition of both HCD and trap CID MS2/MS3 data, we directly identified the previously reported HexA-Pent-RboP-RboP- and RboP-substituents, along with another unique modification tentatively assigned as GroP, on phosphorylated core M3 attached to the truncated αDG expressed in HEK293T cells. Moreover, many other glycoforms of Thr317/Thr319 carrying a combination of single RboP, GroP, or none of these modifications, on the two phosphorylated core M3 structures could be detected but, intriguingly, not tandem RboP modifications lacking further HexA-Xyl extension. The facile formation of ±X-RboP-HexNAc+ and ±X-GroP-HexNAc+ oxonium ions via HCD and their corroborative losses from precursors in trap CID can both be detected at high mass accuracy in the Orbitrap and be used as diagnostic ions to facilitate future rapid screening and identification of these unusual glycosyl modifications in a wide range of sample sources.
The modifications identified in a panel of mutated HCT116 cells lacking ISPD, FKTN, FKRP, or TMEM5 not only confirmed the enzymatic activities of these causative gene products but also provided new insights into their functional roles during the formation of the laminin-binding glycans on αDG.
FIG. 6. [...] Lysates of HEK293T cells transiently expressing FKRP-flag and TMEM5-TAP were pulled down with IgG-Sepharose, and subjected to SDS-PAGE and Western blots (Blot) with anti-myc or anti-flag antibodies. To examine the levels of TMEM5-TAP and FKRP-flag, the cell lysates were directly subjected to SDS-PAGE and Western blots with anti-myc or anti-Flag antibodies. The tandem affinity purification (TAP) tag consists of two immunoglobulin (IgG)-binding domains of Staphylococcus aureus protein A, a tandem myc epitope, and a calmodulin-binding peptide, separated by a cleavage site for the tobacco etch virus (TEV) protease.
FIG. 7. Graphical representation of the identified phosphorylated core M3-carrying glycoforms of pyrQIHATPTPVR (residues 313-322) derived from αDG373(T322R)-Fc expressed in HCT116 and its various KO mutants. Each of the detected glycopeptides was quantified based on XIC peak intensity, normalized to the protein abundance of αDG373(T322R)-Fc recovered from each mutant, and graphically represented here as "bands" with thickness corresponding to their calculated relative abundance (see supplemental Table S1 for details and actual data). The red bands indicate the glycoforms with compositions consistent with carrying 1-2 phosphorylated core M3, with and without additional glycosyl residues. Glycoforms of the same composition with additional RboP and GroP substituents are represented by green and blue bands, respectively. Only one glycoform with both GroP and RboP was detected and is represented here as a green band. The glycoforms with RboP substituents (green bands) are notably identified only in samples from the WT and the FKRP mutant. FKTN-KO apparently led to an inability to produce any glycoforms with either RboP or GroP substituent.