Proteomic identification of mammalian cell surface derived glycosylphosphatidylinositol-anchored proteins through selective glycan enrichment

Glycosylphosphatidylinositol-anchored proteins (GPI-APs) are an important class of glycoproteins that are tethered to the surface of mammalian cells via the lipid GPI. GPI-APs have been implicated in many important cellular functions including cell adhesion, cell signaling, and immune regulation. Proteomic identification of mammalian GPI-APs en masse has been limited technically by poor sensitivity for these low abundance proteins and the use of methods that destroy cell integrity. Here, we present methodology that permits identification of GPI-APs liberated directly from the surface of intact mammalian cells through exploitation of their appended glycans to enrich for these proteins ahead of LC-MS/MS analyses. We validate our approach in HeLa cells, identifying a greater number of GPI-APs from intact cells than has been previously identified from isolated HeLa membranes and a lipid raft preparation. We further apply our approach to define the cohort of endogenous GPI-APs that populate the distinct apical and basolateral membrane surfaces of polarized epithelial cell monolayers. Our approach provides a new method to achieve greater sensitivity in the identification of low abundance GPI-APs from the surface of live cells and the nondestructive nature of the method provides new opportunities for the temporal or spatial analysis of cellular GPI-AP expression and dynamics.

of cells that perform or mediate a variety of critical cellular functions including signal transduction, immune recognition, complement regulation, and cell adhesion. The GPI anchor consists of a conserved core glycan linked on its reducing end to the lipid phosphatidylinositol and covalently attached to protein via phosphoethanolamine on its nonreducing end (Fig. 1A). GPIs are assembled and transferred en bloc to the C-termini of various secretory glycoproteins in the ER. GPI-APs are then transported to the cell surface via the secretory pathway. In addition to their GPI anchor, most characterized GPI-APs also possess additional carbohydrate modifications such as N- and/or O-linked glycans (Fig. 1A) [1][2][3][4].
GPI anchors are essential for the correct cell surface localization and function of their appended proteins (for reviews see [5][6][7]). While mammalian cells can survive in vitro without GPI anchoring, a complete loss of GPI anchor biosynthesis is embryonic lethal during mammalian development [8] and a clonal loss of GPI anchoring results in the acquired hemolytic disease paroxysmal nocturnal hemoglobinuria [9]. Furthermore, cell surface GPI-APs serve as important biomarkers in cellular differentiation and disease. For example, the diagnosis of paroxysmal nocturnal hemoglobinuria involves measuring decreased surface expression of GPI-APs [10]. Additionally, increased expression of certain GPI-APs has been observed in various types of cancer, with some being used as prognostic indicators [11][12][13]. Finally, GPI-APs such as CD73, CD106, Sca-1, and CD90 are important stromal cell-associated markers that have been used for the identification of mesenchymal stem cells [14]. Thus, the ability to identify the cohort of GPI-APs present on the surface of cells may uncover novel markers for cell differentiation and disease.
Polarized epithelial cells contain discrete apical (AP) and basolateral (BL) plasma membrane domains that have unique protein and lipid compositions. Many studies have reported preferential localization of GPI-APs to the apical membrane [15][16][17][18][19][20]. Additionally, correct membrane localization of certain GPI-APs has been shown to be critical for specific processes such as viral infection [21] and cell signaling [22]. Since the first report of apical trafficking of GPI-APs almost 25 years ago [23], the mechanism of polarized localization of GPI-APs has been the subject of much work and debate (see [24] for a review). Many of these studies used heterologous GPI-anchored reporter proteins due to low expression of endogenous GPI-APs and challenges in visualizing specific GPI-APs on the surface of live cells. However, little is known about the population of endogenous GPI-APs present in apical and BL membranes. Defining the cohort of endogenous GPI-APs within each membrane domain using proteomics could ultimately contribute to a better understanding of their trafficking.
Computational prediction of GPI-APs from genomic sequence information has suggested that the number of GPI-APs encoded by the human genome is potentially in the hundreds (this study and [25,26]). However, experimentally, the human GPI proteome is still poorly defined. One factor that complicates proteomic analyses of mammalian GPI-APs is their low abundance compared to other cell surface proteins. Thus, prior GPI-AP proteomic studies employed enrichment methods to increase the density of GPI-APs in a sample. Studies in HeLa cells used the detergent-based two-phase partitioning of membrane proteins to concentrate GPI-APs followed by enzymatic release of GPI-APs from these membranes with phosphatidylinositol-specific phospholipase C (PI-PLC) or GPI-specific phospholipase D [27,28]. More recently, a study investigating the GPI-APs present in membranes isolated from breast cancer cells used the binding of PI-PLC-released GPI-APs to bacterial alpha-toxin, a protein that binds specifically to the glycan core of the GPI anchor, as a means to further enrich samples for GPI-APs prior to MS analysis [29]. Each of these GPI-AP enrichment strategies had limited success in identifying GPI-APs. Other studies addressing the whole plasma membrane proteome or the proteome of lipid rafts also identified some GPI-APs [30][31][32][33]. Importantly, all of these studies used sample preparation methods that destroyed the cells being analyzed, thus eliminating the ability to identify mature GPI-APs that dynamically populate the surface of live cells.
In the present study, we report methodology that permits the identification of GPI-APs en masse directly from the surface of intact mammalian cells. Our approach uses either of two methods that selectively capture and concentrate GPI-APs enzymatically released from the surface via their ubiquitous appended carbohydrates (N-linked, O-linked, and GPI glycans) followed by LC-MS/MS analysis. In the first enrichment scheme, in vivo metabolic labeling of GPI-APs with an azido sugar analog was performed and PI-PLC-released proteins were enriched via capture on alkyne agarose resin. The second enrichment scheme used lectins to capture and enrich for PI-PLC-released GPI-APs. Using these approaches, we performed analysis of three different mammalian cell lines and demonstrated a significant increase in sensitivity of mammalian GPI-AP identification. Furthermore, our ability to identify GPI-APs without first disrupting cellular structure permitted us to separately identify GPI-APs present in discrete membrane domains (apical and BL surfaces) of polarized epithelial cells. Our study advances the use of proteomic methods to define the mammalian GPI-AP proteome and should further enable discovery of novel GPI-AP biomarkers associated with cellular differentiation or disease.

Chemical synthesis of tetraacetylated N-azidoacetylgalactosamine (GalNAz) and reconstitution
Per-O-acetylated GalNAz was prepared in four steps and 59% overall yield according to a protocol described by Laughlin and Bertozzi [34]. In short, bromoacetic acid (Sigma-Aldrich) was converted into azidoacetic acid N-succinimidyl ester and subsequently reacted with galactosamine hydrochloride (Carbosynth). The resulting GalNAz was peracetylated in the presence of acetic acid and pyridine, and purified by flash chromatography, eluting with 7:3 hexanes/ethyl acetate. GalNAz was resuspended in 100% ethanol to give a 50 mM stock solution.

Alkyne agarose purification of GalNAz-labeled proteins
Purification of GalNAz-labeled proteins was performed using a ClickIT Protein Enrichment Kit (Invitrogen) per manufacturer's instructions. On-bead tryptic digestion and LC-MS/MS analysis are described below. Biological triplicate samples were prepared and analyzed for all sugar analog enrichment (SAE) experiments.

LC-MS/MS
Solution or bead-immobilized samples were digested overnight with 25 ng/µL trypsin (Promega) at 37°C. Peptides were cleaned and separated from beads using a C18 ZipTip (Millipore), concentrated to 10 µL using a SpeedVac, and analyzed by positive ion Top 7 data-dependent acquisition mode LC-MS/MS using a linear ion trap mass spectrometer (LTQ, Thermo Fisher Scientific

2D LC-MS/MS
Bead-based samples were digested overnight with 50 ng/µL Trypsin Ultra Mass Spectrometry Grade (New England Biolabs) at 37°C. Beads were spun to the bottom of the tube and only the supernatant was removed, aliquoted, and stored at −80°C until further analysis. After sample preparation, individual aliquots of the complex peptide mixture were loaded onto a split-phase 2D RP-strong cation exchange (SCX) back column. The SCX phase was 150 µm × ~3-5 cm (Luna SCX, 5 µm particle size, 100 Å pore size, Phenomenex, CA, USA) and the reverse phase was 150 µm × ~3-5 cm (AQUA C18, 3 µm particle size, 300 Å pore size, Phenomenex). The column was packed using a PicoView Pressure Injection Cell (New Objective). After loading, the RP-SCX column was connected to the HPLC and washed with 100% aqueous solvent for 5 min and then ramped up to 100% organic solvent (70% ACN, 0.1% formic acid) over 10 min. This step migrates peptides from the RP phase onto the SCX phase, effectively desalting the peptide samples and removing other nonpeptide contaminants that do not bind to the SCX. The back column was then connected to a 100 µm × 15 cm RP resolving front column with an integrated Nanospray tip (AQUA C18, 3 µm particle size, 300 Å pore size, Phenomenex) resting on the Proxeon Nanospray source (Proxeon Biosystems, Odense, Denmark) attached to a Q Exactive mass spectrometer (Thermo Fisher Scientific). An automated 2D LC-MS/MS run was programmed into Xcalibur (Thermo Fisher Scientific) and each sample was analyzed with a three-salt step followed by 2 h C18 separation for a total of 6 h analysis time per sample [37]. During the entire 2D LC-MS/MS analysis, the Q Exactive operated in data-dependent mode with top ten MS/MS spectra (one microscan, 17 500 resolution) for every full scan (one microscan, 70 000 resolution). Dynamic exclusion was turned on with a 15-s interval and normalized collision energy was set at 28.0%.
RAW files from each 6 h 2D LC-MS/MS analysis were extracted into mzXML files using the MSConvert utility from the ProteoWizard suite of tools (http://proteowizard.sourceforge.net). The search database was constructed using a recent canine-predicted protein database (NCBI DogRefSeq, CanFam3.1, September 2013 assembly, containing 34 594 proteins), common contaminants (trypsin, keratin, etc.), and lab protein standards (BSA, hemoglobin, etc.). Searches were done with the MyriMatch search engine (Version 2.1.138) [38]. A parent ion tolerance of 20 ppm and a fragment ion tolerance of 30 ppm were specified. Carbamidomethylation of Cys (+57.0293 Da) was specified as a fixed modification. The maximum number of missed cleavages was set to 2. Resulting pepXML output files were analyzed with IDPicker [39] to assemble the raw peptide identifications from MyriMatch into confident protein identifications. The FDR for each sample run was calculated by IDPicker based on the reverse database target-decoy search strategy [40] with the maximum FDR parameter set to 2%. Calculated FDRs were <1.35% across all sample runs. All MyriMatch configuration information is available in Supporting Information Table 1. The MS proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository [41] with the dataset identifier PXD001130. All data from this analysis are also publicly available via http://proteomicsdata.neb.com/publications/GPIProteomics/.
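As a rough illustration of two quantities mentioned above, the following Python sketch checks whether an observed mass falls within a ppm tolerance and estimates an FDR from forward and reversed database hits. The function names, the example masses, and the hit counts are hypothetical. The estimator 2R/(F+R) is one commonly used target-decoy formula and is assumed here for illustration; it is not taken from the IDPicker or MyriMatch configuration used in this study.

```python
def within_ppm(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """Return True if the mass error, expressed in parts per million, is within tol_ppm."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


def target_decoy_fdr(n_forward: int, n_reverse: int) -> float:
    """Estimate FDR as 2R / (F + R) from forward (F) and reversed (R) hits.

    Assumed estimator for illustration only.
    """
    total = n_forward + n_reverse
    if total == 0:
        return 0.0
    return 2.0 * n_reverse / total


# Hypothetical numbers: a 10 ppm error passes a 20 ppm precursor window,
# and 10 reversed hits among 1000 accepted hits correspond to a 2% FDR estimate.
print(within_ppm(1000.01, 1000.0, tol_ppm=20))
print(target_decoy_fdr(n_forward=990, n_reverse=10))
```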

Analysis of MS data
For each list of proteins from the LC-MS/MS analyses, candidate GPI-APs were identified either through annotation in the UniProt database, prediction of a GPI attachment site by FragAnchor and/or PredGPI, or prior experimental evidence of a GPI anchor. Intracellular and multipass transmembrane proteins were eliminated from the analysis. In cases where multiple splice forms of a protein exist and only a subset of those splice forms were predicted to be GPI anchored, the peptide spectra were analyzed to ensure they matched the GPI-anchored splice form(s) of the candidate protein. A protein had to contain at least two unique peptide matches and two spectral counts in the PI-PLC-released fraction to be included in the list of identified GPI-APs.
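The inclusion criteria above can be summarized as a simple filter. The sketch below is a minimal illustration, not the authors' software: the `ProteinHit` record and its field names are hypothetical, and the manual inspection of spectra for splice forms is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    """Hypothetical record for one protein identified in the PI-PLC-released fraction."""
    accession: str
    unique_peptides: int
    spectral_counts: int
    uniprot_gpi_annotated: bool
    predicted_gpi: bool               # FragAnchor and/or PredGPI prediction
    prior_experimental_evidence: bool
    intracellular: bool
    multipass_transmembrane: bool


def is_candidate_gpi_ap(hit: ProteinHit) -> bool:
    """Apply the stated criteria: GPI evidence, no exclusion flag, and >= 2 peptides and spectra."""
    has_gpi_evidence = (
        hit.uniprot_gpi_annotated
        or hit.predicted_gpi
        or hit.prior_experimental_evidence
    )
    excluded = hit.intracellular or hit.multipass_transmembrane
    enough_support = hit.unique_peptides >= 2 and hit.spectral_counts >= 2
    return has_gpi_evidence and not excluded and enough_support


# Toy example with a made-up accession.
hit = ProteinHit("EXAMPLE1", 3, 5, False, True, False, False, False)
print(is_candidate_gpi_ap(hit))
```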

Glycosylation of mammalian GPI-APs
Prior reports have determined that several mammalian GPI-APs each possess experimentally verified combinations of N- and/or O-linked glycans [1][2][3][4]; however, the extent to which all GPI-APs may receive multiple types of glycosylation has not been systematically examined. Therefore, we computationally modeled and evaluated the human proteome for the presence of GPI-APs and their putative N- and O-linked glycan sites (Supporting Information Table 2). The programs FragAnchor and PredGPI were each used to predict the presence of a C-terminal GPI attachment site (ω site) in all UniProt human proteins [25,26]. The data were narrowed to include only proteins having an N-terminal secretion signal peptide (SignalP 4.1) [42], and checked by TMHMM 2.0 [43] to ensure proteins contained two or fewer putative transmembrane segments. While GPI-APs do not contain transmembrane domains, prediction algorithms occasionally interpret the N-terminal signal peptide and/or the C-terminal GPI signal sequence as transmembrane helices. The resulting list of 255 human candidate GPI-APs is presented in Supporting Information Table 3. The modeled human GPI-AP dataset was further evaluated for the presence of potential N- and/or O-linked glycan attachment sites using the programs NetNGlyc 1.0 and NetOGlyc 4.0, respectively [44]. Nearly all (99%) of the modeled GPI-APs contained predicted N- or O-linked glycan sites (Supporting Information Tables 2, 3). Most of the proteins (92%) possessed putative N-glycan sites, with 85% having potential O-glycan sites and 77% having both. While computational modeling alone cannot ensure that a protein will definitively possess N- or O-linked glycans, this analysis suggests that there is a very high potential for almost all human GPI-APs to harbor multiple forms of appended glycans (N- or O-linked and the GPI glycan).
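The reported percentages can be cross-checked with the inclusion-exclusion principle: the share of proteins with an N- or O-linked site equals the share with N sites plus the share with O sites minus the share with both. The sketch below is an illustrative check only; the function names are hypothetical, and the rounding interval assumes each reported value was rounded to the nearest whole percent.

```python
def percent_with_either(pct_n: float, pct_o: float, pct_both: float) -> float:
    """Inclusion-exclusion on percentages: |N or O| = |N| + |O| - |N and O|."""
    return pct_n + pct_o - pct_both


def rounding_interval(pct_n: float, pct_o: float, pct_both: float,
                      half_width: float = 0.5) -> tuple:
    """Range of the 'either' percentage if each input carries +/- half_width rounding error."""
    centre = percent_with_either(pct_n, pct_o, pct_both)
    spread = 3 * half_width  # three rounded inputs contribute to the error
    return max(0.0, centre - spread), min(100.0, centre + spread)


point = percent_with_either(92, 85, 77)
low, high = rounding_interval(92, 85, 77)
print(point, low, high)
```

With the values quoted in the text, the point estimate is 100% and the rounding interval spans 98.5% to 100%, so the stated 99% is consistent with the other three figures once rounding is taken into account.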

Enrichment of GPI-APs from complex protein mixtures
Based on the likelihood that the majority of human GPI-APs contain N- and/or O-glycans in addition to their GPI anchors, we sought to use glycans as "handles" to isolate GPI-APs from complex protein mixtures. In contrast to prior GPI-AP proteomic analyses that utilized concentrated membranes from lysed cells, we sought to identify GPI-APs liberated directly from the surface of intact live cells. Two workflows were established for the purification of GPI-APs from live cells via their appended glycans (Fig. 1B and C). The first method (sugar analog capture enrichment) consisted of metabolically incorporating the azido sugar analog GalNAz into cellular glycans prior to the PI-PLC-mediated release of GPI-APs (Fig. 1B). Previous studies had demonstrated that GalNAz (and its epimerized N-azidoacetylglucosamine form) becomes incorporated into N- and O-linked glycans as well as into some GPI anchors [36,45,46]. Hence, GPI-APs from labeled cells may contain the sugar analog in multiple glycans (Fig. 1A). Following labeling, click chemistry is used to covalently immobilize GPI-APs to an alkyne agarose resin via their azide-labeled glycans. The captured proteins are then stringently washed and subjected to on-resin trypsin digestion, and liberated peptides are analyzed by LC-MS/MS.
The second method (lectin affinity capture enrichment) involves using immobilized lectins to capture PI-PLC-released GPI-APs via their appended glycans (Fig. 1C). In this strategy, which is not dependent on metabolic labeling, PI-PLC-released GPI-APs are bound to a mixture of ConA and WGA lectin resins. ConA binds to α-D-glucose and α-D-mannose containing oligosaccharides, while WGA is thought to bind GlcNAc and sialic acid residues, common substituents of GPIs, N-, and O-linked glycans. The noncovalent interaction between captured GPI-APs and resin allows for competitive elution of bound proteins with α-methyl-mannopyranoside and GlcNAc. Eluted proteins are treated with trypsin and resulting peptides analyzed by LC-MS/MS.
Table legend: For each enrichment method, the average percent sequence coverage (Avg % Seq Cov) and average number of unique peptides (Avg Pep) across three biological replicates are indicated. The far right columns show the reproducibility of a protein identification using a given enrichment method (+PLC only) across three biological replicates. PLC, phospholipase C.
a) Theoretical molecular weight of precursor protein in kDa.
b) The minimum number of unique peptides required for a protein identification was 2, but due to averaging across triplicate samples, the average peptides may appear to be 1. A complete listing of the number of peptides for each replicate can be found in Supporting Information Table 5 and all peptides used in identifications are listed in Supporting Information Table 6.
c) Peptides match specific isoforms predicted to contain a GPI anchor though other isoforms cannot be excluded.
d) Experimentally proven to be GPI anchored in other studies but not annotated as a GPI-AP in the UniProt database.

The cell surface GPI-AP proteome of HeLa cells
To determine the reproducibility and effectiveness of our experimental workflows, GPI-APs from HeLa cells were isolated and captured using the sugar analog or lectin enrichment methods. Following trypsin digestion and LC-MS/MS, GPI-APs were extracted from the list of all identified proteins using the methods described in Section 2.8. Of the 33 identified GPI-APs, 27 proteins (82%) were annotated in the UniProt database as GPI-APs. The remaining six proteins have been either experimentally demonstrated to be a GPI-AP (but not annotated as such) or the identified peptides mapped to a protein isoform predicted by one or more algorithms to potentially be GPI anchored (Table 1, Supporting Information Tables 4, 5). Control experiments on HeLa cells using PI-PLC treatment alone without sugar analog or lectin affinity enrichment failed to identify as many GPI-APs. There was significant overlap in the GPI-APs identified by both glycan enrichment methods, with 16 of 33 proteins (48%) being commonly observed (Table 1). The protein identifications were reproducible, with 81% present in two or more biological replicates using sugar analog capture and 78% using lectin capture (Table 1, Supporting Information Fig. 1). Importantly, our method identified ten of the 11 GPI-APs previously observed in HeLa cells using PI-PLC and GPI-PLC release on concentrated detergent-resistant membrane preparations [27,28]. Consistent with previous reports, some release of GPI-APs in the non-PLC-treated control samples was observed [47][48][49][50]. However, treatment with PI-PLC enriched our samples for GPI-APs at least threefold over non-PLC-treated controls (Supporting Information Table 6). These data demonstrate that glycan-based enrichment ahead of LC-MS/MS is a viable approach for identification of cell surface GPI-APs. The described workflow is more sensitive than prior methods and can be executed without disruption of cellular membranes.
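The overlap and reproducibility figures quoted above are simple set statistics. The sketch below shows one way such numbers could be computed, assuming the overlap is expressed relative to the union of proteins found by either method (as in "16 of 33") and reproducibility as the fraction of proteins seen in at least two of three replicates. All identifiers and set sizes are toy placeholders, not data from this study.

```python
from collections import Counter


def overlap_fraction(method_a, method_b) -> float:
    """Fraction of all proteins (union of both methods) that were found by both methods."""
    a, b = set(method_a), set(method_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def fraction_reproducible(replicates, min_replicates: int = 2) -> float:
    """Fraction of identified proteins observed in at least min_replicates replicates."""
    counts = Counter(protein for rep in replicates for protein in set(rep))
    if not counts:
        return 0.0
    reproducible = sum(1 for n in counts.values() if n >= min_replicates)
    return reproducible / len(counts)


# Toy placeholder sets sized so that 16 proteins are shared out of 33 in total.
method_a = {f"P{i}" for i in range(0, 25)}
method_b = {f"P{i}" for i in range(9, 33)}
print(round(100 * overlap_fraction(method_a, method_b)))

# Toy replicates: A and B recur, C and D appear once.
print(fraction_reproducible([{"A", "B", "C"}, {"A", "B"}, {"A", "D"}]))
```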

GPI proteomics of polarized cells
To highlight the utility of our approach for isolation of cell surface GPI-APs from intact cells, we applied the SAE method to the discrete apical and BL membrane domains of polarized epithelial cells. GPI-APs have been shown to preferentially localize on the apical surface of polarized MDCK and ARPE-19 cells [15-18, 20, 36, 51]. However, most of these studies employed overexpressed recombinant GPI-anchored reporter proteins to determine GPI-AP localization. Meanwhile, studies examining the distribution of some endogenous GPI-APs have reported significant levels of GPI-APs on both the apical and BL surfaces [36,47]. To obtain a more complete picture of the distribution of endogenous GPI-APs in polarized cells, we examined the apical and BL GPI proteomes of polarized epithelial cell monolayers.
In a prior study, we established polarized ARPE-19 cells as a model system for in vivo incorporation of GalNAz into GPI anchors and N-glycans [36]. Thus, we used GalNAz-labeled ARPE-19 cells to concurrently determine the apical and BL GPI proteomes of a polarized monolayer. In this experiment, ARPE-19 cells were grown as a polarized monolayer on a permeable membrane and labeled with GalNAz (Fig. 2A). Apical and BL cell surfaces were separately treated with PI-PLC, and labeled GPI-APs were captured by the SAE method. The intactness of tight junctions of the polarized monolayer during the PI-PLC treatment has been verified previously by the measurement of protein diffusion across the cell monolayer [36].
Table legend: The average percent sequence coverage (Avg % Seq Cov) and the average number of unique peptides (Avg Pep) from three biological replicates are given. The reproducibility of a protein identification on a given cell surface in the +PLC samples across three biological replicates is in the far right columns. PLC: phospholipase C, AP: apical, BL: basolateral.
a) Theoretical molecular weight of precursor protein in kDa.
b) The minimum number of unique peptides required for a protein identification was 2, but due to averaging across triplicate samples, the average peptides may appear to be 1. A complete listing of the number of peptides for each replicate can be found in Supporting Information Table 7 and all peptides used in identifications are listed in Supporting Information Table 8.
c) Peptides match specific isoforms predicted to contain a GPI anchor though other isoforms cannot be excluded.
d) Experimentally proven to be GPI anchored in other studies but not annotated as a GPI-AP in the UniProt database.
We identified 29 GPI-APs from the apical surface and 24 GPI-APs from the BL surface (Table 2, Supporting Information Tables 7, 8). Protein identifications were highly reproducible, with 73% of apically identified GPI-APs and 71% of GPI-APs on the BL surface being observed in two or more biological replicates (Table 2, Supporting Information Fig. 1). Notably, 24 of the 29 proteins were observed on both membranes, with only five observed exclusively on the apical surface and no GPI-APs exclusively found on the BL surface (Fig. 2B). While relatively little is known about the polarized distribution of GPI-APs in ARPE-19 cells, the presence of CD73 on the apical and BL membrane domains correlated with its previously reported localization determined by Western blotting, immunofluorescence, and enzymatic activity [36,51]. These results demonstrate our ability to identify GPI-APs on discrete membrane domains of live polarized cells.
We next analyzed the GPI proteome of polarized MDCK cells, a cell line that has been extensively used as a model to study polarized GPI-AP trafficking [15,16,18,20,47,52-54]. To further improve the sensitivity of our method, 2D LC-MS/MS on a Q Exactive mass spectrometer [55] was employed for the analysis of apical and BL protein samples prepared by SAE. We observed 38 potential GPI-APs from both membranes, with 84% of apical protein identifications and 95% of BL protein identifications observed in two or more biological replicates (Table 3, Supporting Information Fig. 1). Detailed information on all the identified GPI-APs in the biological triplicate samples, including spectral counts and peptide assignments, can be found in Supporting Information Tables 9 and 10.
Notably, while MDCK GPI-APs are thought to be preferentially trafficked to the apical surface, we observed most detected GPI-APs on both cell surfaces (35 out of 38, or 92%), with only one GPI-AP being exclusively detected on the apical surface, and two GPI-APs present only on the BL surface (Fig. 2C; Table 3). Importantly, we identified carboxypeptidase M, a GPI-AP known to be present on both the apical and BL membranes of MDCK cells [47] (Table 3). Together, these data demonstrate that GPI-APs are clearly present on both surfaces of polarized epithelial cells.
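The partition of identifications into shared, apical-only, and BL-only groups amounts to a set comparison. The following sketch illustrates it with placeholder identifiers whose group sizes mirror the MDCK counts quoted above (35 shared, 1 apical only, 2 BL only); the names and the function are hypothetical and do not correspond to actual proteins in Table 3.

```python
def classify_by_surface(apical, basolateral) -> dict:
    """Split identified proteins into shared, apical-only, and basolateral-only sets."""
    apical, basolateral = set(apical), set(basolateral)
    return {
        "both": apical & basolateral,
        "apical_only": apical - basolateral,
        "basolateral_only": basolateral - apical,
    }


# Placeholder identifiers sized to match the counts reported for MDCK cells.
shared = {f"P{i}" for i in range(35)}
apical_ids = shared | {"AP_only_1"}
bl_ids = shared | {"BL_only_1", "BL_only_2"}

groups = classify_by_surface(apical_ids, bl_ids)
total = sum(len(members) for members in groups.values())
percent_shared = round(100 * len(groups["both"]) / total)
print({name: len(members) for name, members in groups.items()}, total, percent_shared)
```

Run on these placeholder sets, the three groups sum to 38 proteins and the shared fraction rounds to 92%, matching the arithmetic in the text.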

Discussion
We report methodology for identification of GPI-APs directly from the surface of intact mammalian cells using bottom-up proteomics. An enabling feature of our experimental approach was the use of a sample preparation method that selectively released GPI-APs (using PI-PLC) from the surface of intact cells coupled to the capture and enrichment of GPI-APs via their ubiquitous appended glycans. Using these methods upstream of LC-MS/MS, we have analyzed the cell surface GPI proteomes of three mammalian cell lines with significantly improved depth (see Supporting Information Table 11 for a summary of all identified GPI-APs across the three cell lines). Furthermore, the ability to identify cell surface GPI-APs without perturbing cellular membranes has permitted us to separately define the GPI proteomes of distinct plasma membrane domains of polarized epithelial cells.

Considerations in the choice of GPI-AP enrichment strategy
Variations in the abundance and/or composition of glycans on individual GPI-APs are expected to play a role in the sensitivity of the glycan enrichment method. Glycan-rich proteins may perform well in both enrichment schemes due to an increased likelihood of GalNAz incorporation or association with the lectin resin. In contrast, GPI-APs bearing fewer glycans and/or glycans that lack epitopes recognized by WGA or ConA may be more difficult to detect. However, in our HeLa samples, we do identify proteins with a single predicted site of N- or O-glycosylation, suggesting our method can detect GPI-APs even if they are not heavily glycosylated. Additionally, GPI-APs that have tertiary folds or other modifications that sterically hinder the ability of a glycan to interact with the capture resin may favor identification by only one method.
In support of this, we readily detected the heparan sulfate modified glypican family of GPI-APs (glypican 1, 4, and 5) [56] via sugar analog capture enrichment but not with lectin affinity capture enrichment. We speculate that heparan sulfate may sterically interfere with glycan interaction with the lectins used in this study. The performance of lectin affinity capture enrichment in this workflow may be improved by including additional immobilized lectins in the resin mixture whose specificities widen the array of recognized glycan epitopes [57] and/or through the use of lectin multimerization [58]. Unique features of the enrichment methods indicate their suitability for a particular application. We generally observed a greater number of identified GPI-APs with sugar analog capture enrichment than with lectin affinity capture enrichment, likely in part due to the covalent interaction between the incorporated sugar and the agarose resin. However, we have seen differences in the efficiency of GalNAz incorporation into glycans in different cell lines, thus necessitating optimization of labeling conditions for each cell type. In contrast, lectin affinity capture enrichment may be advantageous because metabolic labeling of glycans is not necessary to isolate GPI-APs and its specificity can be tuned to specific glycan epitopes. This method could be applied to the analysis of GPI-APs from cells or tissues directly extracted from humans or other animals. Furthermore, due to the noncovalent nature of the lectin-GPI-AP interaction, eluted GPI-APs could be further analyzed to determine the structure of GPI-linked glycans, a type of analysis that would not be possible using sugar analog capture because the glycans become irreversibly bound to the alkyne agarose column. Finally, lectin capture may also permit identification of proteins that specifically interact with GPI-APs.
Lastly, while both glycan enrichment methods are effective in capturing GPI-APs after PI-PLC release from the cell surface, factors that potentially limit the efficiency of PI-PLC digestion could cause some GPI-APs to be missed. It has been shown previously that residual carboxypeptidase M activity remains in the apical membrane of MDCK cells after PI-PLC treatment [47]. This suggests that the apical surface of MDCK cells may be less accessible to PI-PLC or that a subset of GPI-APs on this surface are modified, thus rendering them resistant to PI-PLC cleavage. Consistent with the latter, resistance to PI-PLC has been observed previously in erythrocytes and is due to acylation of the GPI inositol [59,60]. We did observe less enrichment upon PI-PLC treatment in the apical samples (1.4-fold) as compared to the BL surface (3.0-fold; Supporting Information Table 6), suggesting that perhaps some apical GPI-APs are resistant to PI-PLC cleavage. Oligomerization of GPI-APs during trafficking to the cell surface [52] or clustering of GPI-APs in lipid rafts [61] could also potentially limit access of PI-PLC to their GPIs. However, we identified 17 of the 19 GPI-APs previously found in the HeLa cell lipid raft proteome [33], suggesting that in that cell line, PI-PLC cleavage is relatively efficient and complete despite clustering of GPI-APs in lipid rafts.
Table legend: The average percent sequence coverage (Avg % Seq Cov) and average number of unique peptides (Avg Pep) across three biological replicates are indicated. The reproducibility of a protein identification on a given cell surface in the +PLC samples across three biological replicates is in the far right columns. PLC: phospholipase C, AP: apical, BL: basolateral.
a) Predicted molecular weight in kDa. When more than one isoform is listed, the predicted molecular weight of the largest isoform is given.
b) The minimum number of unique peptides required for a protein identification was 2, but due to averaging across triplicate samples, the average peptides may appear to be 1. A complete listing of the number of peptides for each replicate can be found in Supporting Information Table 9 and all peptides used in identifications are listed in Supporting Information

GPI-APs in polarized epithelial cells
Prior to this study, little was known about the cohort of GPI-APs that naturally populates the surface of polarized mammalian epithelial cells. We have advanced this knowledge by cataloging the GPI-APs present on the apical and BL surfaces of both ARPE-19 and MDCK cells. We identified a large number of GPI-APs having diverse molecular functions, the majority of which were not previously known to be produced in these cell lines. Several studies have indicated that GPI-APs are preferentially trafficked to the apical surface of polarized ARPE-19 and MDCK cells [15,17,20,51]. Based on these data, one would anticipate that most GPI-APs would be present predominantly in apically derived samples. However, GPI-APs were detected on both the apical and BL surfaces, with very few being present on only one membrane domain. This could be due to our use of sensitive MS technology to detect endogenous GPI-APs in comparison with previous reports that used exogenous reporter GPI-APs or activity assays. Importantly, while GPI-APs are present on both membrane domains of polarized cells, it remains to be determined if these proteins are active in both domains. The increased knowledge we now have of the endogenous GPI-APs that populate these membranes will permit a more rigorous exploration of their individual localization, activity, and surface abundance.

Future applications
We have developed two new methods to enrich and identify GPI-APs from the complex mixture of proteins present on the surface of living cells. Importantly, these approaches permit exploration of the mammalian GPI-anchored proteome without disturbing cellular integrity. We anticipate that these methods will further enable novel biomarker discovery, the monitoring of changes in the GPI proteome during cell differentiation or disease progression, and in-depth characterization of the glycan moieties appended to GPI-APs.