Targeted Analysis of Lysosomal Directed Proteins and Their Sites of Mannose-6-phosphate Modification

Mannose-6-phosphate (M6P) is a distinctive post-translational modification critical for trafficking of lysosomal acid hydrolases into the lysosome. Improper trafficking into the lysosome, and/or lack of certain hydrolases, results in a toxic accumulation of their substrates within the lysosomes. To gain insight into the enzymes destined for the lysosome, these glycoproteins can be selectively enriched and studied using their unique M6P tag. Here we demonstrate, by adapting a protocol optimized for the enrichment of phosphopeptides using Fe³⁺-IMAC chromatography, that proteome-wide M6P glycopeptides can be selectively enriched and subsequently analyzed by mass spectrometry, taking advantage of exclusive phosphomannose oxonium fragment marker ions. As proof-of-concept, applying this protocol to HeLa cells, we identified hundreds of M6P-modified glycopeptides on 35 M6P-modified glycoproteins. We next targeted CHO cells, either wild-type or cells deficient in Acp2 and Acp5, which are acid phosphatases targeting M6P. In the KO CHO cells we observed a 20-fold increase in the abundance of the M6P modification on endogenous CHO glycoproteins and also on the recombinantly overexpressed lysosomal human alpha-galactosidase. We conclude that our approach could be of general interest for characterization of M6P glycoproteomes as well as characterization of lysosomal enzymes used in enzyme replacement therapies targeting lysosomal storage diseases. Molecular & Cellular Proteomics 18.


In Brief
Here an Fe³⁺-IMAC-based enrichment method is presented that targets mannose-6-phosphate (M6P)-modified glycopeptides, enabling their detailed profiling in both HeLa and CHO cells. A gene engineering strategy was used to remove the Acp2/5 phosphatases, which increased the abundance and coverage of the M6P-modified glycoproteome. We demonstrate that such a gene engineering strategy can be beneficial for the expression of alpha-galactosidase and other enzymes used in the treatment of lysosomal storage diseases.
Lysosomes and lysosome-associated organelles are cytosolic acidic compartments responsible for the degradation and digestion of a variety of biological macromolecules (1). Their degradative activity originates from the presence of over 60 lysosomal acid hydrolases. Most of the lysosomal enzymes are directed to the organelle by a specific post-translational modification involving a unique mannose-6-phosphate (M6P)¹ moiety. This sugar moiety is recognized by two distinct mannose-6-phosphate receptors in the Golgi complex, which subsequently transport their cargo toward the lysosome (2,3).

Biosynthesis of the M6P tag starts in the rough endoplasmic reticulum (ER), where co-translational modification results in the N-glycosidic linkage of a 14-sugar glycan to selected asparagine residues in targeted proteins (Fig. 1.1 to 1.4). Subsequent removal of three glucose residues by glucosidases and of a terminal mannose by mannosidase produces uniform glycoproteins prepared for export from the ER to the Golgi apparatus (Fig. 1.5 to 1.8) (4). In the cis-Golgi network, selective phosphorylation of the proteins is achieved through recognition of a lysine surface patch by the GlcNAc-phosphotransferase (5). The enzyme initiates a two-step reaction by catalyzing the addition of a GlcNAc-1-phosphate group to the outermost mannose residues (Fig. 1.9). The terminal GlcNAc group is subsequently removed by GlcNAc-1-phosphodiester-N-acetylglucosaminidase (the "uncovering enzyme", UCE), which produces the M6P that is recognized by the mannose-6-phosphate receptors. In the trans-Golgi network, the mannose-6-phosphate receptors recognize and bind the M6P moiety, initiating the incorporation of newly synthesized lysosomal acid hydrolases into clathrin-coated vesicles for their ultimate delivery to the lysosomes. There, the cargo dissociates from the receptors because of the substantial drop in pH, upon which many proteins undergo additional modifications to produce a fully functional enzyme. Some of these modifications are performed by the phosphatases Acp2 and Acp5, which remove phosphate groups from lysosomal acid hydrolases, triggering them to execute their function in the degradation of macromolecules (6).
Providing lysosomes with the correct array of enzymes is of utmost importance for cellular function. Absence or improper activity of one of the lysosomal acid hydrolases results in diverse phenotypes, ranging from improper antigen processing caused by impaired cathepsin function (7-9) all the way to lysosomal storage diseases. The most prominent phenotype of lysosomal storage diseases is the accumulation of glycoproteins, unprocessed lipids, mucopolysaccharides, and combinations thereof. Two well-known examples are Fabry's and Gaucher disease, resulting from the absence of the alpha-galactosidase A and glucocerebrosidase enzymes, respectively, causing sphingolipid accumulation inside the lysosome (10,11). Currently, the only therapeutic option for such disorders is enzyme replacement therapy, in which a recombinant version of the enzyme is provided. One of the critical attributes of such replacement therapies is the presence of the M6P moiety on the enzyme, ensuring proper targeting to the lysosome where it can exert its therapeutic function.
To understand the origin of lysosomal storage diseases, it is essential to obtain an overview of the wide range of proteins that are contained within the lysosome. Currently, the most widely used technique in the proteomic analysis of the lysosome is affinity purification from tissue samples using columns with immobilized receptors (12-15). In these earlier methods, M6P-modified proteins were extracted, deglycosylated with peptide N-glycosidase F and subsequently subjected to mass spectrometry (MS) analysis. However, these approaches can result in false positive identifications because the M6P tag is removed. A few studies focused on intact M6P glycopeptides, although they mainly analyzed a single lysosomal protein (16,17).

¹ The abbreviations used are: M6P, mannose-6-phosphate; PSM, peptide to spectrum match; Man, mannose; IMAC, immobilized metal affinity chromatography; ER, endoplasmic reticulum; GlcNAc, N-acetylglucosamine; UCE, uncovering enzyme; Acp2, acid phosphatase 2; Acp5, acid phosphatase 5; MS, mass spectrometry; HCD, higher-energy C-trap dissociation; EThcD, electron-transfer higher-energy collision dissociation; KO, knock out.

Fig. 1 (panels: Endoplasmic reticulum; Golgi). Biosynthesis starts with a co-translational transfer of a 14-sugar glycan precursor onto an asparagine residue of a nascent polypeptide (steps 1-4). Next, the modified protein undergoes sequential trimming by glucosidases and mannosidase that remove glucose and mannose moieties, respectively (steps 4-7). Loss of one mannose residue in step 7 serves as a checkpoint of the proper folding state of a protein and enables transfer to the cis-Golgi (steps 7-8). As the protein traverses the Golgi it undergoes further modification by a GlcNAc-phosphotransferase to obtain the M6P tag, enabling recognition of the M6P-modified proteins and subsequent transfer into the lysosome. Note that steps 1-7 occur in the endoplasmic reticulum, whereas steps 8-11 occur in the Golgi compartment.
Here we use a well-known enrichment method for phosphopeptides, based on iron immobilized metal ion affinity chromatography (Fe³⁺-IMAC), and show that it enables the co-enrichment of M6P-modified glycopeptides from a complex cell lysate. To distinguish the M6P-modified glycopeptides from phosphopeptides we utilize a specific signature fragment ion of M6P (m/z 243.026) (18) observed during higher-energy C-trap dissociation (HCD). Observation of this ion in HCD triggers electron-transfer/higher-energy collision dissociation (EThcD), which results in a more confident characterization of the exact sites of M6P modification and provides signature glycan fragments enabling confident glycoform identification. Initial analysis of a HeLa cell lysate with our modified enrichment and targeted MS approach resulted in the identification of 35 M6P-modified glycoproteins and 46 M6P glycosites from hundreds of detected M6P glycopeptides. Next, we applied our approach to wild-type CHO cells and CHO cells carrying a gene knockout (KO) of the acid phosphatases Acp2 and Acp5, which dephosphorylate lysosomal proteins. KO of these phosphatases resulted in a >20-fold increase in abundance of M6P glycopeptides, enabling a deeper coverage of the CHO M6P glycoproteome. Finally, we demonstrate that recombinant expression of lysosomal human alpha-galactosidase in CHO cells carrying the Acp2 and Acp5 KO, compared with WT CHO cells, results in a drastic increase of the M6P moiety on alpha-galactosidase. We conclude that Fe³⁺-IMAC enables facile enrichment of intact M6P glycopeptides and could be used to extend biological insight gained from standard phosphoproteomic studies. Additionally, when coupled with glycoengineering strategies, it can be used for characterization of improved therapeutic enzymes used for the treatment of lysosomal storage diseases.

EXPERIMENTAL PROCEDURES
HeLa Cell Culture-HeLa cells were cultured in Dulbecco's modified Eagle's medium containing 10% fetal bovine serum (Invitrogen, Landsmeer, the Netherlands) and 0.05 mg/ml penicillin/streptomycin (Invitrogen) at 37°C in 5% CO₂ in 15 cm plates. Cells were collected at 80% confluence by centrifugation and washed three times with ice-cold PBS.
CHO Cell Culture and Generation of Acp2/5 Knockout-A CRISPR/Cas9-based approach was used for the gene knockout (KO) in CHO cells as described previously (19). CHOZN GS-/- cells with stable expression of alpha-galactosidase were used as the parental clone (referred to below as wild type) for the Acp2/5 KO. Cells were maintained as suspension cultures in EX-CELL CHO CD Fusion serum-free media (Sigma-Aldrich, Brøndbyvester, Denmark) in 50 ml TPP TubeSpin® Bioreactors with 180 rpm shaking speed at 37°C and 5% CO₂. Cells were seeded at 0.5 × 10⁶ cells/ml in a T25 flask (NUNC, Hvidovre, Denmark) 1 day prior to transfection. Electroporation was conducted with 2 × 10⁶ cells and a DNA mixture of 1 μg of Cas9-GFP plasmid and 1 μg of gRNA plasmid (U6GRNA, Addgene Plasmid #68370) using an Amaxa kit V and program U24 with an Amaxa Nucleofector 2B (Lonza, Copenhagen, Denmark). Forty-eight hours after nucleofection, the pool of cells with the 10-15% highest GFP expression was enriched by FACS, and after 1 week in culture the cells were single-cell sorted by FACS into 96-well plates. KO clones were identified by Indel Detection by Amplicon Analysis (IDAA) as described (20). Selected clones were further verified by Sanger sequencing. Around 1.5 × 10⁸ wild type and KO cells were collected by centrifugation and washed three times with ice-cold PBS.
Experimental Design and Statistical Rationale-For both HeLa and CHO cells, three technical replicates were performed. The peptide to spectrum matches (PSMs) for each detected glycan composition were summed across all M6P-modified peptides, and the mean ± standard deviation of the three technical replicates was then calculated.
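The summary statistic described above can be sketched as follows. This is a minimal illustration rather than the authors' actual script: the function name `summarize_psms`, the peptide labels, and all PSM counts are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_psms(replicates):
    """Sum PSMs per glycan composition within each replicate, then
    return {glycan: (mean, sample SD)} across the replicates."""
    glycans = sorted({glycan for rep in replicates for _, glycan, _ in rep})
    summary = {}
    for glycan in glycans:
        # A glycan that is absent from a replicate contributes a total of 0.
        totals = [sum(n for _, g, n in rep if g == glycan) for rep in replicates]
        summary[glycan] = (mean(totals), stdev(totals))
    return summary


# Hypothetical input: one list per technical replicate, each entry being
# (peptide, glycan composition, number of PSMs).
replicates = [
    [("PEPTIDE_A", "Man7 PP", 4), ("PEPTIDE_B", "Man7 PP", 2), ("PEPTIDE_A", "Man6 P", 1)],
    [("PEPTIDE_A", "Man7 PP", 5), ("PEPTIDE_B", "Man7 PP", 3), ("PEPTIDE_A", "Man6 P", 1)],
    [("PEPTIDE_A", "Man7 PP", 3), ("PEPTIDE_B", "Man7 PP", 4)],
]

summary = summarize_psms(replicates)
for glycan, (avg, sd) in summary.items():
    print(f"{glycan}: {avg:.2f} ± {sd:.2f} PSMs")
```

Treating a missing glycoform as zero PSMs in that replicate is a design choice of this sketch; excluding such replicates instead would change both the mean and the standard deviation.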
Fe³⁺-IMAC Enrichment-Fe³⁺-IMAC was performed in technical triplicates for each biological group as described previously (21). Tryptic peptides (2 mg) were resuspended in ice-cold buffer A and the pH was adjusted to 2.3 using 10% TFA before injection onto the Fe³⁺-IMAC column (Propac IMAC-10 4 × 50 mm column, Thermo-Fisher Scientific, Landsmeer, the Netherlands). Mobile-phase solvent A consisted of 30% acetonitrile and 0.07% TFA, and mobile-phase solvent B consisted of 0.3% NH₄OH in water. Loading was performed at a flow rate of 0.1 ml/min for 7 min with 0% B, and nonphosphorylated or nonphosphomannosylated peptides were washed out using a flow rate of 1 ml/min for 5 min with 0% B. Phosphopeptides and phosphomannose peptides were eluted at a flow rate of 1 ml/min with 50% B for 1.5 min, followed by 0.5 ml/min for 2.5 min with 50% B, and the column was finally held at 1 ml/min with 0% B for 9 min. Collected phosphopeptides and phosphomannose peptides were dried down using a lyophilizer.
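As a bookkeeping aid, the step program above can be written down as data and checked for total run time and solvent consumption. The sketch below is only illustrative: the `Step` class, the step labels, and the assumption that composition changes are instantaneous (no ramps or dwell volume) are ours, while the flow rates, durations, and %B values are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str          # descriptive label (our naming, not from the protocol)
    flow_ml_min: float  # flow rate in ml/min
    minutes: float      # step duration in min
    percent_b: float    # percentage of solvent B


PROGRAM = [
    Step("load", 0.1, 7.0, 0.0),
    Step("wash", 1.0, 5.0, 0.0),
    Step("elute (fast)", 1.0, 1.5, 50.0),
    Step("elute (slow)", 0.5, 2.5, 50.0),
    Step("hold at 0% B", 1.0, 9.0, 0.0),
]


def total_minutes(program):
    """Total duration of the step program in minutes."""
    return sum(step.minutes for step in program)


def solvent_volumes_ml(program):
    """Return (volume of A, volume of B) in ml, assuming step changes."""
    vol_total = sum(s.flow_ml_min * s.minutes for s in program)
    vol_b = sum(s.flow_ml_min * s.minutes * s.percent_b / 100 for s in program)
    return vol_total - vol_b, vol_b


minutes = total_minutes(PROGRAM)
vol_a, vol_b = solvent_volumes_ml(PROGRAM)
print(f"run time: {minutes:.1f} min, solvent A: {vol_a:.3f} ml, solvent B: {vol_b:.3f} ml")
```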
LC-MS/MS Analysis-Nanoflow LC-MS/MS was performed by coupling an Agilent 1290 (Agilent Technologies, Middelburg, Netherlands) to an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Scientific, Bremen, Germany) for the analysis of peptides enriched by Fe³⁺-IMAC. After resuspension in 0.1% TFA, peptides were separated using a 100 μm inner diameter 2 cm trap column (in-house packed with ReproSil-Pur C18-AQ, 3 μm) (Dr. Maisch GmbH, Ammerbuch-Entringen, Germany) coupled to a 50 μm inner diameter 50 cm analytical column (in-house packed with Poroshell 120 EC-C18, 2.7 μm) (Agilent Technologies, Amstelveen, The Netherlands). Mobile-phase solvent A consisted of 0.1% FA in water, and mobile-phase solvent B consisted of 0.1% FA in ACN. Trapping was performed at a flow rate of 5 μl/min for 5 min with 0% B, and peptides were eluted using a passively split flow of 300 nl/min for 85 min with 8% to 40% B over 75 min, 40% to 100% B over 3 min, 100% B for 1 min, 100% to 0% B over 1 min, and finally held at 0% B for 10 min. Peptides were ionized using a 2.0 kV spray voltage and a capillary temperature of 320°C. The mass spectrometer was set to acquire full-scan MS spectra (375-1700 m/z) with a maximum injection time of 50 ms at a mass resolution of 60,000 and an automated gain control (AGC) target value of 4e5. The dynamic exclusion was set to 20 s with an exclusion window of 10 ppm and a cycle time of 3 s. Charge-state screening was enabled, and precursors with +2 to +6 charge states and intensities >1e5 were selected for tandem mass spectrometry (MS/MS). HCD MS/MS (150-1800 m/z) acquisition was performed in the HCD cell, with the readout in the Orbitrap mass analyzer at a resolution of 30,000 (isolation window of 1.6 Th), an AGC target value of 5e4 or a maximum injection time of 75 ms, and a normalized collision energy of 30%. When the oxonium ion of phosphomannose (Man P, 243.026+) was observed, EThcD MS/MS on the same precursor was triggered (isolation window of 1.6 Th) and fragment ions (150-2100 m/z) were analyzed in the Orbitrap mass analyzer at a resolution of 30,000, with an AGC target value of 2e5 or a maximum injection time of 200 ms, using ETD with supplemental activation at a normalized collision energy of 27%.
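The origin of the trigger mass, and the kind of check the product-ion trigger performs, can be illustrated with a short sketch. It is not the instrument method itself: the function names and example peak lists are hypothetical, and the 20 ppm matching tolerance is an assumption (the text does not state the tolerance used for triggering). The monoisotopic masses are standard values for a hexose residue, HPO3, and the proton.

```python
PROTON = 1.007276   # mass of a proton (Da)
HEX = 162.052823    # hexose residue, C6H10O5 (Da)
HPO3 = 79.966331    # phosphate addition, HPO3 (Da)

# Singly charged phosphohexose (Man P) oxonium ion, ~243.026 m/z.
MAN_P_OXONIUM = HEX + HPO3 + PROTON


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def has_signature_ion(peaks, target_mz=MAN_P_OXONIUM, tol_ppm=20.0):
    """True if any (m/z, intensity) peak lies within tol_ppm of target_mz."""
    return any(
        abs(ppm_error(mz, target_mz)) <= tol_ppm and intensity > 0
        for mz, intensity in peaks
    )


# Hypothetical HCD peak lists as (m/z, intensity) pairs.
m6p_like_scan = [(204.0867, 1.0e4), (243.0265, 5.0e3)]
other_scan = [(204.0867, 1.0e4), (243.0400, 5.0e3)]

print(f"theoretical Man P oxonium m/z: {MAN_P_OXONIUM:.4f}")
print("trigger EThcD:", has_signature_ion(m6p_like_scan))
print("trigger EThcD:", has_signature_ion(other_scan))
```

The same function can be reused after acquisition to confirm that an identified spectrum contains the signature ion, which mirrors the manual validation step described under Data Analysis.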
Data Analysis-Raw data files of peptides enriched by Fe³⁺-IMAC were processed using Byonic software (ver 2.15.10) (Protein Metrics Inc.) with the following search parameters: trypsin digestion with a maximum of 2 missed cleavages; unreviewed CHO database (Uniprot, 34,962 entries, March 2018) and reviewed human database for HeLa samples (Uniprot, 26,339 entries, March 2018); precursor ion mass tolerance, 10 ppm; fragmentation type, both HCD and EThcD; product ion mass tolerance for HCD, 20 ppm; product ion mass tolerance for EThcD, 20 ppm; carbamidomethylation of cysteines as a fixed modification; variable modifications: methionine oxidation and phosphorylation on serine, threonine and tyrosine residues. Glycans were searched against the Byonic database of 50 common biantennary N-glycans and 20 M6P-modified N-glycans (the full list of M6P compositions is provided in supplemental Table S1). A Byonic cut-off score of 100 was used and all identified M6P-modified glycopeptide spectra were further manually validated for the presence of the signature ion (243.026 m/z). Skyline (Skyline-daily, Version 4.0.9.11664) was used to build a spectral library from the search results and to extract ion chromatograms (XICs) of precursor ions, with the first five isotopes being used to calculate peak areas of the selected peptides shown in Figs. 5 and 6. Raw files of a previously published study (22) comparing three standard approaches for phosphopeptide enrichment were downloaded from the PRIDE partner repository (identifier: PXD005366) and searched in Byonic with the parameters described above, except that the fragmentation type was HCD only.
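To make the glycan modifications in the search more concrete, the mass that an M6P glycan adds to a peptide can be computed from its composition. The sketch below uses standard monoisotopic residue masses; the helper names and the example peptide mass of 1000.0 Da are hypothetical and serve only to show how a precursor m/z follows from peptide mass, glycan mass, and charge.

```python
PROTON = 1.007276  # Da

# Monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "HexNAc": 203.079373,  # N-acetylhexosamine, e.g. GlcNAc
    "Hex": 162.052823,     # hexose, e.g. mannose
    "Phospho": 79.966331,  # HPO3
    "NeuAc": 291.095417,   # N-acetylneuraminic acid
}


def glycan_mass(composition):
    """Mass added to the peptide by a glycan, given {residue: count}."""
    return sum(RESIDUE_MASS[residue] * count for residue, count in composition.items())


def precursor_mz(peptide_mass, composition, charge):
    """m/z of the protonated glycopeptide at the given charge state."""
    return (peptide_mass + glycan_mass(composition) + charge * PROTON) / charge


# GlcNAc2Man7Phospho2 (Man7 PP), the dominant glycoform reported for HeLa cells.
MAN7_PP = {"HexNAc": 2, "Hex": 7, "Phospho": 2}

print(f"Man7 PP glycan mass: {glycan_mass(MAN7_PP):.4f} Da")
# Hypothetical peptide of neutral monoisotopic mass 1000.0 Da at charge 3+.
print(f"example precursor m/z: {precursor_mz(1000.0, MAN7_PP, 3):.4f}")
```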

Evaluation of the HeLa Cell Mannose-6-phosphate Modified Glycoproteome-Inspired by studies (24-26) demonstrating that common phosphopeptide enrichment protocols also enrich sialylated glycopeptides, we investigated whether M6P-modified glycopeptides could be enriched as well. To this end we first reanalyzed previously published data (22) in which three standard approaches for phosphopeptide enrichment were compared: Ti⁴⁺- and Fe³⁺-based IMAC, and TiO₂-based enrichment. In this reanalysis we allowed variable M6P modifications, GlcNAc2Man3-8Phospho1-2 and GlcNAc3-4Man6-8Phospho1NeuAc0-1 (supplemental Table S1), on Asn residues in addition to phosphorylation on Ser, Thr and Tyr. Our reanalysis of these data sets (Fig. 2) revealed that, next to several thousand phosphopeptides, tens of M6P-modified glycopeptides were also co-enriched. As previously noted (22), the three enrichment approaches performed equally well for the phosphopeptides, with a large overlap in identified unique phosphopeptides. In the co-enrichment of M6P-modified glycopeptides, however, we did seemingly observe a bias: the Fe³⁺-IMAC-based enrichment provided the highest number of identifications, with 25 identified M6P-modified glycopeptides. Ti⁴⁺-IMAC was slightly less efficient, with 14 identified M6P glycopeptides, and, more surprisingly, only 3 M6P glycopeptides were identified using the TiO₂-based enrichment. Although Fe³⁺-IMAC and Ti⁴⁺-IMAC performed slightly differently in this experiment, we identified the same glycoforms, GlcNAc2Man6-7Phospho1-2, albeit on different peptide backbones, indicating that the difference is likely due to the stochastic nature of precursor ion selection during MS analysis. Based on these results, we decided to perform the remaining experiments using Fe³⁺-IMAC and sought to further optimize the enrichment and analysis protocol to extend the coverage of the M6P-modified glycoproteome. We hypothesized that there are two main reasons for the low numbers of M6P-modified glycopeptides. First, an MS fragmentation scheme based only on HCD is not optimal for glycopeptide identification because of the preferential cleavage of the glycan moiety (27,28). Second, the combination of the low amount of starting material used in (22) and the inherently low abundance of M6P glycoproteins probably precluded their detection during MS analysis. To test this hypothesis, we increased the amount of starting material to 2 mg of protein and optimized the MS fragmentation scheme to combine HCD with EThcD fragmentation, the latter triggered upon observation in the HCD spectra of the signature fragment ion for M6P-modified glycopeptides, namely the phosphomannose oxonium ion (m/z 243.026) (18). 
Using this approach, we could now identify almost a hundred unique M6P-modified glycopeptides (supplemental Table S2) in HeLa cells, corresponding to 35 M6P-modified glycoproteins and 46 M6P glycosites (Table I), which is significantly more than what we could identify from the published dataset (22) (Fig. 2C). We observed that M6P glycopeptides make up around 2-3% of the total identified peptide pool (M6P glycopeptides and phosphopeptides). Manual inspection of the spectra showed that the prime reason for the observed increase in the number of M6P-modified glycopeptide identifications is the inclusion of EThcD as a fragmentation technique. Additionally, we observed a cleavage between the first and second GlcNAc residues of M6P glycans, albeit only during EThcD fragmentation, which allows unambiguous assignment of the glycan form attached to the peptide backbone (Fig. 3B and supplemental Fig. S1). This, combined with the observation that the M6P modification increases missed cleavage rates, resulting in larger peptides carrying more charges, makes EThcD fragmentation attractive for characterization of M6P-modified glycopeptides (29). However, as demonstrated in Fig. 3A and supplemental Fig. S1, it is also possible to successfully identify M6P-modified glycopeptides with HCD fragmentation alone, but greater care must be devoted to manual validation, as a large fraction of the identifications is based solely on precursor mass. Inspection of all the M6P-modified glycopeptide spectrum matches revealed that in HeLa cells each of the glycosites is predominantly modified with a GlcNAc2Man7Phospho2 (Man7 PP) glycoform, but in total we observed at least 7 different glycan compositions ranging from Man3 P to Man8 PP (Fig. 4A). However, we did not observe any glycans corresponding to hybrid N-glycans harboring the M6P tag. 
Considering that hybrid N-glycans are the least abundant class of N-glycans, it is most likely that glycopeptides harboring hybrid M6P glycans are simply too low in abundance to be detected. We then further checked the specificity of our enrichment by counting scans containing the N-glycan signature oxonium ion (m/z 204). In total, we detected over 2000 scans containing both the 204 and 243 oxonium ions, specific for N-glycans in general and M6P-modified glycans, respectively (supplemental Table S3). Compared with the approximately 200 scans that contained only the 204 ion and no 243 ion, we calculated that over 90% of all glycopeptide scans in fact originate from M6P-harboring glycopeptides. The predominance of the Man7 PP glycoform supports in vitro studies, which revealed that this glycoform exhibits the highest binding affinity toward mannose-6-phosphate receptors (30,31). We further validated our results by performing a pathway enrichment analysis (32), which showed, as expected, significant enrichment for lysosomal proteins (Fig. 4B). Additionally, the Cell Atlas (33) was used to validate protein hits not previously reported to be lysosomal proteins, such as CREG1, which verified that it can be a bona fide lysosomal protein.
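The specificity estimate above is a simple proportion, which the following sketch reproduces from per-scan oxonium ion flags. The function names and the synthetic scan list are illustrative assumptions; the rounded counts of 2000 and 200 scans stand in for the "over 2000" and "around 200" reported in the text.

```python
def count_glycopeptide_scans(scans):
    """scans: list of (has_204, has_243) booleans, one pair per MS/MS scan.
    Returns (scans with both ions, scans with the 204 ion only)."""
    both = sum(1 for has_204, has_243 in scans if has_204 and has_243)
    only_204 = sum(1 for has_204, has_243 in scans if has_204 and not has_243)
    return both, only_204


def m6p_fraction(both, only_204):
    """Fraction of glycopeptide scans (204 ion present) that also carry the 243 ion."""
    total = both + only_204
    if total == 0:
        raise ValueError("no glycopeptide scans to evaluate")
    return both / total


# Synthetic example using the approximate counts quoted in the text;
# scans lacking the 204 ion are ignored by the calculation.
scans = [(True, True)] * 2000 + [(True, False)] * 200 + [(False, False)] * 50
both, only_204 = count_glycopeptide_scans(scans)
fraction = m6p_fraction(both, only_204)
print(f"M6P-containing glycopeptide scans: {fraction:.1%}")
```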

The Chinese hamster ovary cell line (CHO) is often used in biological and medical research and commercially in the production of therapeutic proteins (34,35), including enzymes. It has been used to produce recombinant lysosomal human acid alpha-galactosidase for the treatment of the lysosomal storage disease Fabry's disease (36,37). Therefore, we were interested in applying our method to reveal the M6P landscape in CHO cells. We used a CHO cell line with stable expression of recombinant human acid alpha-galactosidase (referred to as CHO WT) and subjected it to the same protocol as described above for the HeLa cells.
To our surprise, starting with similar amounts of input we identified considerably fewer (i.e. 15) M6P-modified glycoproteins (Table II and supplemental Table S4), with a very limited number of detected M6P-modified glycopeptide PSMs (Fig. 5A). Because we processed all CHO and HeLa samples simultaneously in triplicate, we could rule out experimental variation as the cause of such drastic differences in the numbers of identified M6P-modified glycoproteins between the HeLa and CHO cell lines. Another potential reason for such a low number of identifications is that, under the applied conditions, fewer M6P-modified glycoproteins are synthesized and transported into the lysosome in these CHO cells.
To test this hypothesis, and motivated by previous studies (15,38) demonstrating increased selectivity of M6P affinity purification in Acp5- or Acp2/5-deficient mice, we created a CHO cell line with a double knock-out (KO) of the Acp2 and Acp5 phosphatases, which are responsible for the dephosphorylation of M6P-modified glycoproteins upon their entry into the lysosome. As expected, when using the same input and an identical protocol, but now applied to the CHO Acp2/5 KO cell line, we were able to almost double the number of identified M6P-modified glycoproteins (28 in CHO KO versus 15 in CHO WT) and we observed a 5-fold increase in scans containing the signature M6P ion (supplemental Table S3). We also identified a couple of hundred M6P-modified glycopeptide PSMs mapping to 160 unique M6P glycopeptides in the Acp2/5 KO cells, which represents a 4-fold increase in M6P glycopeptide identifications when compared with the 50 M6P-modified glycopeptide PSMs corresponding to 42 unique M6P glycopeptides detected in CHO WT (Fig. 5A and supplemental Table S4). Furthermore, even for M6P-modified proteins observed in both WT and Acp2/5 KO CHO cells, such as Cathepsin Z (Fig. 5B), we observed an increase in glycoform coverage from 4 distinct glycoforms identified in CHO WT cells to 8 glycoforms identified in CHO Acp2/5 KO cells. To confirm that the increase in identifications is caused by an increase in the M6P modification itself and not by up-regulation of M6P-modified glycoproteins, we compared the intensities of unmodified peptides and M6P-modified peptides in both cell lines. We observed that the abundance of the unmodified tryptic peptides of Cathepsin Z remains similar in the CHO WT and CHO Acp2/5 KO samples (Fig. 5C and 5D), whereas the M6P-modified peptides from the same protein (Fig. 5C and 5D) showed a ~20-fold increase in abundance in the CHO Acp2/5 KO cells, demonstrating that Acp2/5 KO results in increased abundance of the M6P modifications. 
A similar trend was also observed when comparing the entire ion chromatogram traces between the CHO WT and CHO Acp2/5 KO experiments (supplemental Fig. S2). Interestingly, we also observed a GlcNAc2Man3Phospho1 (Man3 P) modification on multiple proteins from the CHO Acp2/5 KO cell line. A representative spectrum is shown in Fig. 5E. Although we cannot pinpoint the exact mannose residue to which the phosphate is attached, the ion signal observed at m/z 1051.45 (S/N > 5) indicates that it is likely attached to either an α-3 or α-6 mannose, representing a previously undescribed M6P modification. However, this phosphate could also be located on an elongated mannose branch, which would be more in line with the known biosynthetic pathways. It remains to be answered whether these more processed high-mannose M6P structures originate during biosynthesis or are products of mannosidase action inside the lysosomes.
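The reasoning in this comparison, namely that the M6P-modified peptides increase while the unmodified peptides of the same protein stay constant, can be expressed as a fold change normalized to protein level. The sketch below is one possible way to frame it, not the authors' calculation: the function name and all peak areas are hypothetical placeholders, whereas the identification counts (28 versus 15 glycoproteins, 160 versus 42 unique glycopeptides) are taken from the text.

```python
def normalized_fold_change(m6p_ko, m6p_wt, unmod_ko, unmod_wt):
    """KO/WT ratio of M6P-peptide peak area, divided by the KO/WT ratio of
    unmodified-peptide peak area from the same protein."""
    if min(m6p_ko, m6p_wt, unmod_ko, unmod_wt) <= 0:
        raise ValueError("peak areas must be positive")
    return (m6p_ko / m6p_wt) / (unmod_ko / unmod_wt)


# Hypothetical XIC peak areas (arbitrary units) for one protein.
fold = normalized_fold_change(m6p_ko=2.0e7, m6p_wt=1.0e6, unmod_ko=5.0e8, unmod_wt=5.0e8)
print(f"normalized M6P fold change (KO/WT): {fold:.1f}")

# Identification-level ratios from the counts reported in the text.
protein_ratio = 28 / 15           # M6P-modified glycoproteins, KO vs WT
unique_peptide_ratio = 160 / 42   # unique M6P glycopeptides, KO vs WT
print(f"glycoproteins: {protein_ratio:.1f}-fold, unique glycopeptides: {unique_peptide_ratio:.1f}-fold")
```

A normalized value close to the raw M6P ratio indicates that the change is driven by the modification rather than by protein expression, which is the conclusion drawn for Cathepsin Z.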
When we compared the characteristics of the M6P-modified glycoproteomes of the HeLa and CHO cells, we observed very similar distributions of glycoforms, with one notable exception being the absence of the GlcNAc2Man8Phospho1 (Man8 P) glycoform in CHO cells, for which we currently have no plausible explanation. Notably, we detected the protein CREG1 as an M6P-modified glycoprotein in CHO cells as well as in HeLa cells, lending further credence to the notion that this is a "novel" genuine lysosomal protein.
Expression of Alpha-galactosidase in CHO Acp2/5 KO for Improved Lysosomal Targeting-Encouraged by the observed increase in frequency and abundance of M6P-modifications in the CHO Acp2/5 KO cells we hypothesized that the same could hold for a human protein recombinantly expressed in this CHO KO cell line. Therefore, we overexpressed human acid alpha-galactosidase, which is a 47 kDa N-glycosylated lysosomal hydrolase involved in sphingolipid metabolism. Deficiency of this alpha-galactosidase enzyme results in a lysosomal storage disorder known as Fabry disease and the current treatment option relies on enzyme replacement therapy with recombinant human alpha-galactosidase enzyme. The presence of the M6P modification on recombinant human alpha-galactosidase used for therapy is critical for its efficacy and targeting to the lysosomes.
Analysis of the recombinant human alpha-galactosidase enzyme expressed in CHO WT cells resulted in around 70 PSMs, whereas expression in the CHO Acp2/5 KO cell line increased the number of PSMs to over 300 (Fig. 6A). Next, the M6P glycoform analysis revealed a higher than expected heterogeneity, with 6 dominant glycoforms (GlcNAc2Man5-8Phospho1-2). This was in notable contrast with the glycoform distributions we observed in both the CHO and HeLa M6P glycoproteome analyses, which were dominated by the Man7 PP glycoform. Interestingly, we also noticed that in both sets of CHO cells the human alpha-galactosidase enzyme carried a GlcNAc2Man8Phospho1 (Man8 P) glycoform, which was absent from all endogenous M6P glycoproteins detected in CHO cells.
Finally, we also wanted to confirm that this increase is caused by the increased abundance of the M6P modifications rather than by up-regulation of the expressed protein. To this end we extracted ion chromatograms for both unmodified and M6P-modified tryptic peptides of human alpha-galactosidase (Fig. 6B). We observed that in the CHO Acp2/5 KO cells the abundance of the M6P-modified peptides was ~20-fold higher, whereas the abundance of unmodified peptides showed very little variation. This suggests that glycoengineering efforts could represent a promising future avenue for

DISCUSSION
In this study we demonstrated that Fe³⁺-IMAC enrichment, typically used for phosphopeptides, can also be used to enrich M6P-modified glycopeptides in a single step. This enrichment, combined with optimized LC-MS/MS fragmentation schemes combining HCD and EThcD, triggered by specific M6P-glycopeptide oxonium fragment ions, enabled the identification of hundreds of M6P-modified glycopeptides in HeLa and CHO cells. Compared with other studies aimed at profiling the M6P glycoproteome (14,15,39), which included a deglycosylation step and consequently lost all information on the actual glycan compositions that bear the M6P tag, our method provides deeper insights into the complexity of the glycan forms attached to the M6P glycopeptides. However, this deeper insight comes at the price of fewer protein identifications, as we detected 30-50% fewer M6P glycoproteins than described in the above-mentioned studies. A plausible cause of this decrease in M6P glycoprotein identifications may be the low ionization efficiency of glycosylated peptides (40) [...] signal into multiple lower abundant signals. This was perhaps best demonstrated in our comparison of CHO WT versus CHO Acp2/5 KO, where we could detect twice as many M6P glycoproteins and observe a 700% increase in PSMs in the CHO Acp2/5 KO cells, demonstrating that the main challenge in identification of intact M6P glycopeptides is their low abundance (supplemental Fig. S2). However, we feel that this tradeoff in M6P glycoprotein identifications is well compensated by the additional information obtained on the glycan structures attached to each modified site, which will enable further studies of glycan-protein functions. Another challenge in intact glycopeptide profiling of complex mixtures is the high proportion of false positive identifications stemming from the inability to control the FDR for both the peptide and glycan moieties simultaneously (41). 
To address this issue, we relied on the known biosynthetic pathways that describe possible M6P glycan compositions (Fig. 1), the presence of signature oxonium ions, the improved sequence coverage and unambiguous glycoform identification obtained with the EThcD method, and the known subcellular location of M6P-modified proteins. Taken together, these criteria enabled us to conclude that we achieved confident identification of M6P-modified glycopeptides.
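The signature-ion criterion mentioned above can be illustrated with a minimal sketch that checks whether an MS/MS peak list contains a phosphohexose oxonium ion. This is not the acquisition or search logic used in this work: the function name `has_marker_ion`, the 20 ppm tolerance, and the peak-list layout are hypothetical, and the m/z values are standard monoisotopic masses supplied here as assumptions rather than taken from the text.

```python
# Illustrative monoisotopic m/z values (assumptions, not taken from the text).
HEX_OXONIUM = 163.0601              # singly charged hexose oxonium ion, C6H11O5+
HPO3 = 79.9663                      # mass added by one phosphate group
PHEX_OXONIUM = HEX_OXONIUM + HPO3   # phosphohexose oxonium ion, about 243.0264


def has_marker_ion(peaks, target_mz=PHEX_OXONIUM, tol_ppm=20.0):
    """Return True if any peak lies within tol_ppm of target_mz.

    peaks: iterable of (mz, intensity) tuples from one fragmentation spectrum.
    The tolerance is a hypothetical default and should be matched to the
    mass accuracy of the instrument.
    """
    for mz, _intensity in peaks:
        # Mass error of this peak relative to the target, in parts per million.
        error_ppm = abs(mz - target_mz) / target_mz * 1e6
        if error_ppm <= tol_ppm:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical peak list: (m/z, intensity).
    spectrum = [(138.0550, 1000.0), (243.0265, 500.0), (366.1395, 200.0)]
    print(has_marker_ion(spectrum))
```

A check of this kind only flags candidate spectra; as described above, confident assignment also relies on the allowed glycan compositions, EThcD sequence coverage, and the known subcellular location of the protein.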
Additionally, we demonstrated a new gene editing strategy involving the knock-out of the Acp2 and Acp5 phosphatases in CHO cells, which results in an ~20-fold increase in the abundance of M6P-modified peptides and enabled us to gain deeper insight into the lysosomal M6P glycoproteome. Finally, we also established that the CHO Acp2/5 KO cells can be used to produce improved glycoengineered lysosomal enzymes carrying a higher degree of the required M6P modifications, which could potentially be beneficial for lysosomal enzyme replacement therapies. However, it must be noted that in this work we focused on the characterization of lysosomal human alpha-galactosidase, whereas the enzyme used for the treatment of Fabry's disease is secreted human alpha-galactosidase.
The main difference is that secreted human alpha-galactosidase does not traffic to the lysosome, where the membrane-bound Acp2 and soluble Acp5 phosphatases reside (38) and exert their dephosphorylating function. It is known, however, that Acp5 can be actively secreted (42). Further studies will be necessary to elucidate the effects of its KO on the M6P levels of secreted human alpha-galactosidase.
In conclusion, the straightforward method described in this paper will enable further characterization of M6P glycoproteomes and provide further biological insight into areas where lysosomal proteins have critical functions, such as modulation of the immune response during infection (43). Additionally, this method is applicable to the characterization and quality control of biological therapeutics in which M6P is a critical quality attribute of the product, and in combination with genetic glycoengineering strategies it could pave the way for better biological products.

DATA AVAILABILITY
The mass spectrometry data have been deposited to the ProteomeXchange Consortium via the PRIDE (23) partner repository with the data set identifier PXD010333 and 10.6019/PXD010333. For review purposes, a review account has been made available. As PRIDE does not fully support glycopeptide data, Byonic results containing all identified and assigned spectra, in the form of freely available Byonic Preview files, are available at the following link: https://figshare.com/s/ce1596a84a2fe0927b1e.