Cell Surface Labeling and Mass Spectrometry Reveal Diversity of Cell Surface Markers and Signaling Molecules Expressed in Undifferentiated Mouse Embryonic Stem Cells

Although interactions between cell surface proteins and extracellular ligands are key to initiating embryonic stem cell differentiation to specific cell lineages, the plasma membrane protein components of these cells are largely unknown. We describe here a group of proteins expressed on the surface of the undifferentiated mouse embryonic stem cell line D3. These proteins were identified using a combination of cell surface labeling with biotin, subcellular fractionation of plasma membranes, and mass spectrometry-based protein identification technology. From 965 unique peptides carrying biotin labels, we assigned 324 proteins including 235 proteins that have putative signal sequences and/or transmembrane segments. Receptors, transporters, and cell adhesion molecules were the major classes of proteins identified. Besides known cell surface markers of embryonic stem cells, such as alkaline phosphatase, the analysis identified 59 clusters of differentiation-related molecules and more than 80 components of multiple cell signaling pathways that are characteristic of a number of different cell lineages. We identified receptors for leukemia-inhibitory factor, interleukin 6, and bone morphogenetic protein, which play critical roles in the maintenance of undifferentiated mouse embryonic stem cells. We also identified receptors for growth factors/cytokines, such as fibroblast growth factor, platelet-derived growth factor, ephrin, Hedgehog, and Wnt, which transduce signals for cell differentiation and embryonic development. Finally, we identified a variety of integrins, cell adhesion molecules, and matrix metalloproteases. These results suggest that D3 cells express diverse cell surface proteins that function to maintain pluripotency, enabling cells to respond to various external signals that initiate differentiation into a variety of cell types. Molecular & Cellular Proteomics 4: 1968-1976, 2005.
Embryonic stem (ES) cells are a unique type of cultured cells defined by two functional properties, self-renewal and pluripotency. In cultured mouse ES cells, the soluble cytokine leukemia-inhibitory factor (LIF) can support the undifferentiated state and promote self-renewal, whereas the formation of embryoid bodies followed by the addition of growth factors induces differentiation of the cells to specific fates (1-4). Interactions between cell surface proteins and soluble factors or insoluble ligands play important roles in regulating ES cell functions. However, the molecular mechanisms involved in these cellular processes remain unclear because we lack a thorough understanding of the properties and functions of ES cell surface proteins. The study of ES cell surface proteins is also attractive because some of these proteins can be used as non-destructive markers to characterize and/or isolate specific cell types. Thus, a large scale identification of ES cell surface proteins is key to understanding the regulation of ES cell function and to developing new research tools.
Recent advances in MS-based proteomics have enabled us to identify a large number of proteins from a variety of membrane preparations (5-7). However, it is difficult to isolate plasma membranes in a pure form because the membranes lose their specific structure upon cell lysis, and a typical plasma membrane-rich fraction prepared by ultracentrifugation is heavily contaminated with other membrane components (8). Therefore, several methods have been developed to obtain relatively homogeneous preparations of plasma membranes, including coating of intact cells with silica derivatives (9). In this study, we describe an alternative approach for large scale and selective identification of ES cell surface proteins. First, cell surface proteins of intact cells were selectively labeled with a membrane-impermeable biotinylation reagent, and biotinylated plasma membrane proteins were then enriched via affinity capture using immobilized avidin (10-14). The biotinylated proteins can be separated by gel electrophoresis and identified by MS (10-13); alternatively, they can be proteolytically digested and identified by an MS-based "shotgun" approach (14).
We describe here a strategy for large scale and selective identification of cell surface proteins using cell surface labeling coupled with high resolution two-dimensional (2D) LC-MS/MS (Fig. 1). The method allowed us to identify more than 200 cell surface proteins expressed in mouse ES cells with simultaneous identification of the sites of biotinylation on each protein molecule. Analysis of the identified proteins indicated that ES cells express a wide variety of cell surface markers and signaling molecules (such as receptors, transporters, and cell adhesion molecules), which are characteristic not only of ES cells but also of differentiated cell types, such as hematopoietic or neural cells. Our previous study identified ~1,800 proteins expressed in undifferentiated ES cells, thereby revealing the diversity of the ES cell proteome (15).

EXPERIMENTAL PROCEDURES
Cell Surface Labeling-A mouse embryonic stem cell line, D3 (American Type Culture Collection, Manassas, VA), was maintained on 0.1% gelatin-coated tissue culture dishes in Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 15% heat-inactivated FCS (JRH Biosciences, Lenexa, KS), 0.1 mM β-mercaptoethanol, 100 units/ml penicillin, 100 µg/ml streptomycin, and 1,000 units/ml recombinant mouse LIF (ESGRO; Chemicon International, Temecula, CA). Under our culture conditions, ~95% of the cells were found to be undifferentiated as monitored by staining with alkaline phosphatase and stage-specific embryonic antigen-1 (SSEA-1), which are cell surface markers for undifferentiated ES cells. D3 cells grown to ~80% confluency on 150-mm tissue culture dishes were first incubated in serum-free Dulbecco's modified Eagle's medium for 1 h, rinsed twice with ice-cold PBS (10 mM NaH2PO4/Na2HPO4, pH 7.4, 138 mM NaCl, 2.7 mM KCl) supplemented with 0.1 mM CaCl2 and 1 mM MgCl2 (PBS+), and then incubated with 1 mg/ml EZ-Link™ sulfo-NHS-LC-biotin (Pierce) in PBS+ for 20 min at 4°C with gentle agitation. After removal of the supernatant, residual sulfo-NHS-LC-biotin was quenched with 100 mM glycine in PBS+, and the cells were harvested using a plastic scraper.
Tryptic Digestion and Avidin Affinity Enrichment of Biotinylated Peptides-The biotinylated membrane fractions were delipidated twice with cold acetone, dried, and solubilized with 8 M urea in 400 mM NH4HCO3, pH 8.5. Protein samples were reduced with 2 mM dithiothreitol at room temperature for 30 min, alkylated with 2 mM iodoacetamide for 30 min, and diluted 4-fold with distilled water. Protein samples were then digested with Nα-L-tosyl-L-phenylalanine chloromethyl ketone-treated porcine trypsin (Promega, Madison, WI) at an enzyme:substrate ratio of 1:100 (w/w) at 37°C for 16 h. Digestion was monitored by SDS-PAGE/Western blotting using alkaline phosphatase-conjugated avidin. Biotin-labeled peptides were enriched from the tryptic digest by avidin affinity chromatography. Approximately 1 mg of the digest was applied to a column packed with 1 ml of Immunopure immobilized monomeric avidin (Pierce) pretreated with 30% CH3CN in 0.4% trifluoroacetic acid and equilibrated with 2 M urea in 100 mM NH4HCO3, pH 8.5. After sequential washing with (i) 2 M urea in 100 mM NH4HCO3, (ii) 2 M urea in 100 mM NH4HCO3 containing 0.5 M NaCl, (iii) 2 M urea in 100 mM NH4HCO3 containing 30% CH3CN, and (iv) 100 mM NH4HCO3, the bound peptides were eluted with 30% CH3CN in 0.4% trifluoroacetic acid and concentrated using a vacuum concentrator. The amounts of peptides and biotin labels were determined using the BCA protein assay kit (Pierce) and 2-(4′-hydroxyazobenzene)-benzoic acid (Pierce), respectively, according to the manufacturer's instructions.
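The buffer arithmetic in this step can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the 1 mg substrate amount is borrowed from the amount of digest applied to the avidin column, not from a stated digestion scale.

```python
def dilute(concentration, fold):
    """Concentration after a `fold`-fold dilution."""
    return concentration / fold


def trypsin_amount_ug(substrate_ug, ratio=100):
    """Trypsin mass for an enzyme:substrate ratio of 1:`ratio` (w/w)."""
    return substrate_ug / ratio


# Solubilization buffer from the text: 8 M urea in 400 mM NH4HCO3, diluted 4-fold.
urea_after_dilution_m = dilute(8.0, 4)
bicarbonate_after_dilution_mm = dilute(400.0, 4)

# Assumed example: 1 mg (1,000 ug) of protein digested at 1:100 (w/w).
trypsin_ug = trypsin_amount_ug(1000.0)

print(urea_after_dilution_m, bicarbonate_after_dilution_mm, trypsin_ug)
```

The diluted sample (2 M urea in 100 mM NH4HCO3) has the same composition as the buffer used to equilibrate the avidin column.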
Automated Multidimensional LC-MS/MS Analysis-The peptide mixtures were analyzed on an automated 2D LC-MS/MS system (15, 18-20) using a combination of first dimensional cation exchange and second dimensional reverse-phase chromatography. The peptide mixture (50 µg) was separated on a SP-5PW column (1-mm internal diameter × 40 mm long, 20-µm particles; TOSOH, Tokyo, Japan) by 15-min stepwise elution (20 mM acetate buffer, pH 4.0, containing 0, 25, 50, 100, or 400 mM NaCl) at a flow rate of 10 µl/min. Eluted peptides in each step were captured on a trap column (Mightysil C18, 0.5-mm internal diameter × 1 mm long, 3-µm particles; Kanto Chemicals, Tokyo, Japan) for desalting and then separated on a Mightysil C18 column (0.15-mm internal diameter × 40 mm long, 3-µm particles; Kanto Chemicals) using a three-step linear gradient (0-30% CH3CN in 0.1% formic acid for 120 min, 30-70% CH3CN in 0.1% formic acid for 40 min, and 70% CH3CN in 0.1% formic acid for an additional 10 min) at a flow rate of 50 nl/min. The eluted peptides were directly sprayed into a high resolution quadrupole time-of-flight hybrid mass spectrometer (Q-Tof-2; Micromass, Manchester, UK). The total analysis time for a single 2D LC-MS/MS operation was 22.5 h.
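As a rough consistency check on the stated run time, the calculation below adds up the gradient segments for the five salt steps. It assumes that each salt step is followed by one complete reverse-phase gradient; the text does not say how the remaining time is divided between salt elution, trapping, and column re-equilibration, so the per-cycle "overhead" is only a derived figure.

```python
SALT_STEPS_MM = [0, 25, 50, 100, 400]   # NaCl steps of the cation exchange elution
GRADIENT_SEGMENTS_MIN = [120, 40, 10]   # three-step reverse-phase gradient

gradient_min = sum(GRADIENT_SEGMENTS_MIN)
all_gradients_min = gradient_min * len(SALT_STEPS_MM)

stated_total_min = 22.5 * 60            # total run time given in the text
per_cycle_min = stated_total_min / len(SALT_STEPS_MM)
overhead_per_cycle_min = per_cycle_min - gradient_min

print(gradient_min, all_gradients_min, per_cycle_min, overhead_per_cycle_min)
```

Under this assumption each salt step occupies 270 min, of which 170 min is gradient time.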
Protein Identification by Database Search-The MS/MS signals were acquired by MassLynx (Micromass) and converted to text files by ProteinLynx software (Micromass). The database search was performed in triplicate by MASCOT (Matrix Science Ltd., London, UK) against the Refseq mouse, human, and rat sequence databases with the following parameters: fixed modification, carbamoylmethylation (Cys); variable modifications, oxidation (Met) and sulfo-NHS-LC-biotin (Lys); maximum missed cleavages, 3; peptide mass tolerance, 150 ppm; and MS/MS tolerance, 0.5 Da. For peptide and protein identification, the search results were processed as follows. (i) The candidate peptide sequences were screened with the probability-based MOWSE scores that exceeded their thresholds (p < 0.05) and with MS/MS signals for y- or b-ions ≥3. (ii) Redundant peptide sequences were removed. (iii) Each peptide sequence was assigned to a protein that gave the maximal number of peptide assignments among the candidates. (iv) The mouse, human, and rat datasets were combined. (v) Interspecies redundancy of proteins was removed. Details of the methods of protein identification and data processing are described elsewhere (21). In this study, the following additional criteria were applied during visual inspection of individual MS/MS spectra for reliable identification of labeled peptide sequences: (vi) the presence of an MS/MS signal corresponding to the labeled lysine residue and/or (vii) the presence of one of the fragment ions (M+ = 227.1 and 340.2) derived from the labeling reagent (14). Peptide sequences identified without biotin modifications (~10%) were excluded regardless of their scores.
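The screening rules can be sketched as a small filter. This is a minimal illustration under stated assumptions, not the software used in the study (which is described in ref. 21): the `Hit` record, the accession names, and the example values are hypothetical, spectra were inspected visually in the study rather than by code, and reusing the 0.5-Da MS/MS tolerance for the reagent fragment ions is an assumption.

```python
from collections import Counter
from dataclasses import dataclass

BIOTIN_REPORTER_MZ = (227.1, 340.2)  # reagent-derived fragment ions given in the text


@dataclass(frozen=True)
class Hit:
    sequence: str       # peptide sequence
    candidates: tuple   # hypothetical accessions of proteins containing the peptide
    score: float        # probability-based MOWSE score
    threshold: float    # score threshold corresponding to p < 0.05
    n_yb_ions: int      # number of matched y- or b-ions
    biotinylated: bool  # carries the sulfo-NHS-LC-biotin (Lys) modification


def has_reporter_ion(peak_mzs, tolerance=0.5):
    """True if any peak lies within `tolerance` Da of a reagent fragment ion."""
    return any(abs(mz - ref) <= tolerance
               for mz in peak_mzs for ref in BIOTIN_REPORTER_MZ)


def filter_hits(hits):
    """Rules (i) and (ii) plus exclusion of unlabeled peptides."""
    best = {}
    for hit in hits:
        passes = (hit.score > hit.threshold
                  and hit.n_yb_ions >= 3
                  and hit.biotinylated)
        # Keep only the best-scoring hit for each peptide sequence.
        if passes and (hit.sequence not in best
                       or hit.score > best[hit.sequence].score):
            best[hit.sequence] = hit
    return list(best.values())


def assign_proteins(hits):
    """Rule (iii): assign each peptide to the candidate with the most peptides."""
    counts = Counter(acc for hit in hits for acc in hit.candidates)
    return {hit.sequence: max(sorted(hit.candidates), key=lambda acc: counts[acc])
            for hit in hits}


hits = [
    Hit("AAK", ("P1", "P2"), 50, 40, 5, True),
    Hit("CCK", ("P1",), 45, 40, 4, True),
    Hit("AAK", ("P1", "P2"), 42, 40, 3, True),   # redundant sequence
    Hit("DDK", ("P3",), 30, 40, 6, True),        # score below threshold
    Hit("EEK", ("P3",), 60, 40, 2, True),        # fewer than three y/b ions
    Hit("FFK", ("P4",), 70, 40, 8, False),       # no biotin label
]
kept = filter_hits(hits)
assignment = assign_proteins(kept)
print(assignment)
```

Ties between candidate proteins are broken alphabetically here purely to make the result deterministic; the source does not state a tie-breaking rule.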
Cell Staining-D3 cells grown on gelatin-coated coverglass were labeled as described above in the presence or absence of sulfo-NHS-LC-biotin. The labeled cells were washed with PBS, fixed with 4% paraformaldehyde for 20 min, treated with 0.5% (w/v) Triton X-100 for 10 min, and then treated with 3% BSA for 1 h. After being washed, cells were incubated with 20 µg/ml of FITC-conjugated avidin (Pierce) for 30 min. The cells were co-stained with 2 µg/ml of Hoechst 33342 (Invitrogen) and observed under a laser scanning confocal microscope (Radiance 2100, Bio-Rad).
Transfection of D3 Cells and Immunofluorescence Staining-D3 cells were transiently transfected by seeding cells on collagen-coated glass slides at a density of ~6 × 10^5 cells/ml 1 day before transfection. Constructs to express cell surface proteins (0.5 µg of each) were transfected into cells using Lipofectamine 2000 (Invitrogen) according to the manufacturer's recommendations and then incubated at 37°C for 24 h. After washing with PBS, cells were incubated in PBS containing 3.7% formaldehyde for 30 min at room temperature. Fixed cells were washed with PBS and then permeabilized by incubating in 0.2% (w/v) Triton X-100 at room temperature for 6 min. Cells were washed with PBS and blocked with PBS containing 1% BSA and 5% FCS for 30 min. A primary monoclonal antibody against FLAG (Sigma) was applied directly to coverslides at 1:1,000 dilution in PBS; the goat anti-mouse secondary antibody labeled with FITC (Invitrogen) was diluted 1:1,000. Hoechst 33342 was used at 2 µg/ml to detect nuclei. Cells were visualized with a Radiance 2100 confocal microscope (Bio-Rad).
Characterization of Identified Proteins-Protein annotations, such as subcellular localization and molecular function, were obtained using Expression Analysis Systematic Explorer (EASE) version 2.0 (david.niaid.nih.gov/david/ease.htm) (22). The signal sequence and transmembrane segment were predicted by the beta version of the SOSUI program (sosui.proteome.bio.tuat.ac.jp/sosuiframe0.html). Assignment of cluster of differentiation (CD) molecules was carried out using Protein Reviews On The Web (mpr.nci.nih.gov/prow/).

Cell Surface Labeling and Preparation of Plasma Membrane
The method presented here consisted of (i) in situ biotinylation of surface proteins on intact cells using the membrane-impermeable reagent sulfo-NHS-LC-biotin, (ii) cell lysis and subcellular fractionation of the biotinylated membranes on sucrose gradients, (iii) tryptic digestion of the protein mixture, (iv) affinity capture of the biotinylated peptides with avidin, and (v) 2D LC-MS/MS analysis of the peptide mixture (Fig. 1). First, D3 cells grown on culture dishes were labeled in situ with sulfo-NHS-LC-biotin. The majority (>95%) of D3 cells displayed clear outlines as visualized by FITC-conjugated avidin, suggesting that the cell surface was selectively biotinylated with sulfo-NHS-LC-biotin (Fig. 2A). Diffuse staining, however, was also observed in cells inside the colony, suggesting that cytoplasmic labeling occurred in non-viable cells (Fig. 2A). Although scraping cells prior to labeling increased the labeling efficiency, it caused undesirable labeling of cytoplasmic proteins (data not shown). After lysis of the labeled cells by nitrogen cavitation and centrifugation to remove large debris, the subcellular components in the cell lysate were fractionated on a discontinuous sucrose density gradient. Western blotting analyses of the fractions revealed that biotinylated proteins were recovered in fractions 3 (d = 1.14 g/ml) and 4 (d = 1.11 g/ml) exclusively (Fig. 2B). To characterize the contents of these subcellular fractions, all fractions were analyzed by Western blotting with antibodies against known molecular markers for several organelles: annexin II for the plasma membrane, GM130 for the Golgi apparatus, Bip/GRP78 for the endoplasmic reticulum, and nucleoporin p62 for the nucleus. The distribution of these molecular markers indicated that fractions 3 and 4 were enriched in plasma membrane as they were labeled with the annexin II antibody but showed little staining with the Bip/GRP78 and nucleoporin antibodies (Fig. 2B). These fractions were combined and digested to generate a tryptic peptide mixture for MS identification of cell surface proteins.

Protein Identification by 2D LC-MS/MS
The automated 2D LC-MS/MS analysis of the avidin-purified biotin-labeled peptide mixture (50 µg) generated 5,871 MS/MS spectra in a single 22.5-h analysis. These spectra were assigned to 608 unique peptides by the sequential MASCOT search of Refseq mouse, human, and rat sequence databases. Of those, 551 peptides were biotinylated. A typical MS/MS spectrum assigned to a peptide from the transferrin receptor is shown in Fig. 3. Subjecting biotinylated peptides to MS generated fragment ions at m/z = 227.1 and 340.2 due to collision-induced dissociation of the labeling reagent as indicated by arrows in Fig. 3. Thus, all MS/MS spectra were carefully inspected, and peptide assignments without biotin labels were excluded. The 551 peptides were attributed to 240 unique proteins (2.3 biotin-labeled peptides/protein on average). Finally, the 2D LC-MS/MS analysis was performed twice with different peptide preparations from two different cell labeling experiments, yielding a composite subproteome consisting of 324 proteins derived from 965 unique biotinylated peptide sequences (Supplemental Table 1). The number of peptides used to identify a single protein ranged from one to 37 with an average of 3.0 peptides/protein, and 151 proteins (47%) were identified by multiple peptide assignments.
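The summary statistics in this paragraph follow directly from the reported counts, as the short check below shows. All input numbers are taken from the text; the variable names are illustrative.

```python
# Single 22.5-h analysis
unique_peptides = 608
biotinylated_peptides = 551
proteins_single_run = 240

# Composite of two analyses
composite_peptides = 965
composite_proteins = 324
multi_peptide_proteins = 151

unlabeled_pct = 100 * (unique_peptides - biotinylated_peptides) / unique_peptides
peptides_per_protein_single = biotinylated_peptides / proteins_single_run
peptides_per_protein_composite = composite_peptides / composite_proteins
multi_peptide_pct = 100 * multi_peptide_proteins / composite_proteins

print(f"unlabeled peptides excluded: {unlabeled_pct:.1f}%")
print(f"peptides/protein (single run): {peptides_per_protein_single:.1f}")
print(f"peptides/protein (composite): {peptides_per_protein_composite:.1f}")
print(f"proteins with multiple peptides: {multi_peptide_pct:.0f}%")
```

The share of unlabeled peptides in the single run (57 of 608) agrees with the roughly 10% of peptide sequences excluded for lacking a biotin modification.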

Enrichment of Cell Surface Proteins by the Protein Labeling Reaction
The analysis of identified proteins by the SOSUI program predicted transmembrane (TM) segments for 84 proteins and both signal sequences and TM segments for 116 proteins (Fig. 4A). Thus, a total of 200 proteins among 324 identified proteins (62%) had molecular characteristics typical of integral membrane proteins. The number of TM domains in the molecules ranged from one (122 proteins) to 13 (four proteins) (Supplemental Table 1). Because 20-30% of all open reading frames encoded by the genome have been predicted to be integral membrane proteins (23, 24), the method presented here concentrated potential TM proteins efficiently. By including the 35 proteins with signal sequences but no detectable TM domains, suggestive of secreted proteins, potential cell surface membrane and secreted proteins accounted for ~73% of total identified proteins.
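The enrichment figures can be reproduced from the SOSUI category counts. The sketch below uses only numbers stated in the text; the dictionary keys are illustrative labels, and the count of proteins with neither feature is derived by subtraction.

```python
total_proteins = 324
sosui_counts = {
    "tm_only": 84,          # TM segment(s), no signal sequence
    "signal_and_tm": 116,   # signal sequence plus TM segment(s)
    "signal_only": 35,      # signal sequence, no detectable TM domain
}

integral_membrane = sosui_counts["tm_only"] + sosui_counts["signal_and_tm"]
surface_or_secreted = integral_membrane + sosui_counts["signal_only"]
neither = total_proteins - surface_or_secreted

integral_pct = 100 * integral_membrane / total_proteins
surface_or_secreted_pct = 100 * surface_or_secreted / total_proteins

print(integral_membrane, f"{integral_pct:.0f}%")
print(surface_or_secreted, f"{surface_or_secreted_pct:.0f}%")
print(neither)
```

Compared with the 20-30% of open reading frames predicted to encode integral membrane proteins, 62% corresponds to roughly a 2- to 3-fold enrichment.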

Cell Surface Selectivity of Protein Labeling Reaction
Because sulfo-NHS-LC-biotin is impermeable to the plasma membrane and labels primary amines in proteins mainly at the ε-amino group of lysine residues exposed to the extracellular space, the labeled peptides should reside in the extracellular domain of a transmembrane protein. However, if non-selective labeling reactions took place, labeled peptides might be found on both sides of the TM domain. Thus, we examined the biotinylation sites of the 122 single TM domain proteins found in this study. On the assumption that the SOSUI program predicts TM segments correctly, none of these proteins were biotinylated on both sides of the TM segment. Namely, among the 122 proteins assigned from 522 unique biotinylated peptides, 98 proteins (80%) had biotinylated lysine(s) only on the amino-terminal side of the predicted TM segment and were thereby defined as type I transmembrane proteins (Fig. 4B). Conversely, the remaining 24 proteins (20%) contained biotinylated lysine(s) only on the carboxyl-terminal side of the predicted TM domain and were defined as type II transmembrane proteins (Fig. 4B). Furthermore, most of the amino-terminally labeled TM proteins (87 proteins) had potential signal sequences at the amino terminus of the polypeptide as expected for a type I transmembrane protein (Supplemental Table 1). Thus, the results suggest that the labeling reaction took place rather specifically at the surface of D3 cells and that the labeled peptides represented the extracellular domains of cell surface transmembrane proteins.
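The topology rule applied to single-TM proteins can be written as a small function. This is a sketch of the decision logic only; the function name, the residue numbering, and the example coordinates are hypothetical, and the TM boundaries are assumed to come from a predictor such as SOSUI.

```python
def classify_single_tm(biotin_sites, tm_start, tm_end):
    """Classify a single-TM protein from the positions of its biotinylated lysines.

    biotin_sites: residue numbers of labeled lysines.
    tm_start, tm_end: first and last residue of the predicted TM segment.
    """
    n_side = [pos for pos in biotin_sites if pos < tm_start]
    c_side = [pos for pos in biotin_sites if pos > tm_end]
    if n_side and c_side:
        return "both sides"   # would indicate non-selective labeling
    if n_side:
        return "type I"       # amino-terminal side is extracellular
    if c_side:
        return "type II"      # carboxyl-terminal side is extracellular
    return "unassigned"       # no informative sites outside the TM segment


print(classify_single_tm([45, 120], tm_start=300, tm_end=322))
print(classify_single_tm([400], tm_start=20, tm_end=42))
print(classify_single_tm([10, 400], tm_start=200, tm_end=222))

# Proportions reported in the text for the 122 single-TM proteins.
type_i, type_ii = 98, 24
print(f"{100 * type_i / (type_i + type_ii):.0f}% type I, "
      f"{100 * type_ii / (type_i + type_ii):.0f}% type II")
```

In the study no protein fell into the "both sides" class, which is the basis for concluding that labeling was selective for the cell surface.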
To further validate the present procedure, the subcellular localization of several proteins with no annotated localization in the Gene Ontology (GO) Database was studied by transiently expressing individual FLAG-tagged proteins in D3 cells. We selected RIKEN cDNA B430119L13, a trophoblast plasma membrane glycoprotein, glycoprotein A33, and the hypothetical protein D7Ertd458e, all of which have a single predicted transmembrane domain and are present in pre- and/or postimplantation embryos or in neoplastic tissues. Immunofluorescence staining of transfected D3 cells revealed that all four proteins were localized extensively on the plasma membrane as judged by co-localization with CD9, a cell surface marker of ES cells (Fig. 5).

Characterization of ES Cell Surface Proteins
Functional Classification-The molecular functions of the 324 proteins identified in this study were classified according to the GO database and literature surveys (Fig. 6). Receptors were the largest subgroup, consisting of 63 proteins (19% of identified proteins) with cellular abundance ranging from several hundred copies to hundreds of thousands of copies. The group included various types of kinase- and phosphatase-associated receptors, endocytotic receptors for proteins and lipids, and seven G-protein-coupled receptors. The second largest group was the transporters, which consisted of 49 proteins (15%), including 16 solute carrier family proteins, five ATPase-type cation pumps, and three anion channel proteins. Adhesion molecules were a third major category, consisting of 44 proteins (14%) that included 10 cell adhesion molecules, eight integrins, and seven extracellular matrix proteins. Another major category was proteolysis proteins, which comprised 25 proteolytic enzymes or inhibitors (8%), including nine metalloproteases/peptidases and five protease inhibitors. This study also identified ribosomal constituents (21 proteins), structural molecules (15 proteins), histones (10 proteins), and chaperones (eight proteins) as well as 56 "other" proteins with various functions such as nucleic acid metabolism, protein metabolism, and sugar metabolism. Approximately 10% of identified proteins (33 proteins) had no annotated functions and, therefore, were classified as uncharacterized or hypothetical.
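As a consistency check, the category counts given above can be tabulated and summed. The counts come from the text; the category labels are shortened for illustration.

```python
functional_classes = {
    "receptors": 63,
    "transporters": 49,
    "adhesion molecules": 44,
    "proteolysis": 25,
    "ribosomal constituents": 21,
    "structural molecules": 15,
    "histones": 10,
    "chaperones": 8,
    "other": 56,
    "uncharacterized": 33,
}

total = sum(functional_classes.values())
percentages = {name: round(100 * count / total)
               for name, count in functional_classes.items()}

print(total)
for name, pct in percentages.items():
    print(f"{name}: {functional_classes[name]} ({pct}%)")
```

The ten categories account for all 324 identified proteins, and the rounded percentages match those quoted in the text.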
CD Antigens-CD molecules are protein or carbohydrate antigens recognized by specific antibodies, and they are utilized to distinguish particular cell types. Of the ~300 proteinaceous CD molecules that have been identified in a variety of cell types and are listed in Protein Reviews On The Web (mpr.nci.nih.gov/prow/), we found 59 molecules on the surface of D3 cells (Fig. 7A). We found CD9, which is down-regulated upon differentiation of ES cells (25); CD130 (gp130) and CD222, which have potential roles in embryo development (26, 27); and CD87 and CD146 (melanoma cell adhesion molecules), which are associated with cancer development (28, 29) (Fig. 7A). In addition to ubiquitous cell surface molecules such as CD55 and CD71, we found a number of CD molecules that serve as markers for neural cells (CD56; neural cell adhesion molecule), hematopoietic stem cells (CD117; c-Kit), epithelial cells (CD142; coagulation factor III), and endothelial cells (CD31; platelet endothelial cell adhesion molecule) (Fig. 7A). Interestingly, immunohistochemistry demonstrated that D3 cells expressing CD9 and SSEA-1, which have been used as cell surface markers for undifferentiated ES cells, co-expressed some of these CD markers of differentiated cells (Fig. 7B).
Cell Signaling Molecules-This study identified 82 proteins that have potential roles in a wide variety of cell signaling pathways, including 59 cell surface receptors and adhesion molecules, eight ligands, and 15 signal modulators in D3 cells (Table I). Among these components, we found Lifr and its co-receptor, interleukin 6 receptor (Il6st), which, together with ciliary neurotrophic factor receptor (Cntfr), transduce LIF signals required for the maintenance of undifferentiated mouse ES cells (26). We also identified BMP receptor type 1A (Bmpr1a), a component that transduces BMP-4 signals to sustain LIF-mediated self-renewal of mouse ES cells by inducing the expression of Id protein in serum-free cultures (30), and integrin α6β1, the major laminin receptor on mouse ES cells that promotes self-renewal in both mouse and human ES cells (31, 32). Besides these components, the D3 cell line expressed receptors for PDGF (Pdgfra), VEGF (Flt4; Nrp2), and FGF (Fgfr2), which direct the differentiation of ES cells to mural, hematopoietic, endothelial, epidermal, or neural cell lineages (33). We also found many cell signaling molecules that have roles in embryonic development, such as Notch (Notch3) and receptors for transforming growth factor β (Eng), Wnt (Fzd2 and Fzd10), Hedgehog (Ptch), and ephrin (Epha1, Epha2, Epha4, and Ephb4) and their proteinaceous ligand (Efnb1). The other receptors for growth factors or cytokines we identified were Erbb2, interferon α and β receptor 1 (Ifnar1), IGF 1 and 2 receptors (IGF1R and Igf2r), and IL1 receptor (Il1rap). Erbb2 is a co-receptor of epidermal growth factor that is up-regulated in certain types of cancer cells (34), whereas IGF1R plays a role in mesenchymal stem cell differentiation into adipocytes (35). This study also identified many
Fig. 7. A, cellular distribution of CD markers identified in this study. Many CD markers expressed in D3 cells are also found associated with endothelial cells (red), epithelial cells (orange), or neural cells (yellow-green) or are widely expressed in a variety of normal (blue) or cancer cells (green). The CD molecules localized in D3 cells by immunofluorescence microscopy are indicated in bold. B, cell surface localization of CD9, CD31, CD56, and CD98. D3 cells were stained with antibodies against each of the CD molecules (red) and an antibody against SSEA-1 (green), a cell surface marker for undifferentiated ES cells. Note that D3 cells co-express SSEA-1 and a number of cell surface markers for differentiated cell lineages. PECAM, platelet endothelial cell adhesion molecule.
surface subproteome. We were able to identify 200 known or putative cell surface proteins using this approach. The study also identified minor cellular components such as cytokine/growth factor receptors, Erbb2, Fgfr2, Flt4, and the LIF receptor, Lifr. Lifr has been reported to be present at only 250-300 copies/cell (36), suggesting that this labeling and isolation method enabled us to identify relatively low abundance plasma membrane proteins. In addition, the MS-based identification of labeled peptides allowed the determination of the sites of biotinylation in each protein. When biotinylation can be shown to specifically label only the amino-terminal or only the carboxyl-terminal side of the potential TM domain, the site of biotinylation can be used to assign the putative TM protein as a type I or type II TM protein. We identified 122 proteins for which we could make such assignments (Fig. 4). Furthermore, four proteins of unknown function and localization identified in this study localized to the plasma membrane when the FLAG-tagged versions of these proteins were expressed in D3 cells (Fig. 5).
In early attempts, we prepared biotinylated peptides by direct tryptic digestion of total cell lysates of biotin-labeled D3 cells without prior subcellular fractionation. Although the LC-MS/MS analysis of this peptide preparation identified a comparable number of biotinylated peptides and ~50% of identified proteins had potential signal sequences and/or TM domains, many peptides were derived from a number of highly abundant intracellular proteins, probably originating from non-viable cells (data not shown). To minimize contamination that interfered with the selective and sensitive identification of low abundance cell surface proteins, we incorporated subcellular fractionation by sucrose density centrifugation prior to the tryptic digestion of the protein samples.
In a previous study, we identified 1,790 proteins expressed in mouse ES cells by applying the MS-based protein identification technology to tryptic digests of whole-cell lysates (15). The identified proteins included many housekeeping proteins found in common with other cell types as well as a group of proteins unique to ES cell function, such as the transcription factors Oct-3/4 and Sox-2. The set of previously identified proteins also contained 260 potential membrane proteins, including cell surface antigens such as CD9 and CD81. This represented 15% of the total proteins identified. Eighty-two of these were again identified in this study. Thus, 118 of the 200 potential cell surface proteins identified with the present strategy, which couples specific cell surface labeling with MS, had not been identified in our previous study (Supplemental Table 1).
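The comparison with the earlier whole-cell dataset reduces to simple set arithmetic, sketched below with the counts given in the text. Only the counts are available here; the underlying protein lists are in Supplemental Table 1 and ref. 15.

```python
previous_total = 1790        # proteins identified in the whole-cell study
previous_membrane = 260      # potential membrane proteins among them
current_surface = 200        # potential cell surface proteins in this study
shared = 82                  # identified in both studies

previous_membrane_pct = 100 * previous_membrane / previous_total
newly_identified = current_surface - shared
newly_identified_pct = 100 * newly_identified / current_surface

print(f"membrane proteins in previous study: {previous_membrane_pct:.0f}%")
print(f"new in this study: {newly_identified} ({newly_identified_pct:.0f}%)")
```

By this count, 59% of the potential cell surface proteins found by surface labeling had not been detected in the whole-cell analysis.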
However, our labeling and isolation strategy does not yield plasma membrane proteins completely free of contamination. First, intracellular proteins from non-viable cells (usually ~5% of the cell culture) were labeled with the biotinylating reagent. In this study, about one-fourth of the identified proteins had no potential signal sequences or TM segments (Fig. 4A) and contained abundant housekeeping proteins, such as ribosomal constituents, structural molecules, histones, and chaperones (Fig. 6). Although some of these proteins might potentially be cell surface components (37), it was still difficult to distinguish them from the intracellular components of non-viable cells that might be labeled and retained in the membrane fractions. Second, the biotinylating reagent might fail to label proteins that have few reactive lysine residues, small extracellular regions, or many post-translational modifications, such as glycosylation.
To our knowledge, this study identified the largest subset of CD molecules on the surface of a single cell line to date. These CD molecules, as well as many cell surface proteins identified in this study, suggest that the D3 cell line expresses a wide variety of proteins on the plasma membrane. In our previous analysis of the ES cell proteome, we also found that these cells express a number of ES-specific proteins, such as Oct-3/4 and UTF-1, as well as many cell signaling molecules that are characteristic of differentiated cell lineages, such as hematopoietic and neural cells (15). Although we could not exclude the possibility that a small portion of cells had differentiated to a variety of cell lineages during culture, our present and previous results imply that ES cells express multiple proteins considered unique to a number of differentiated cell lineages, enabling cells to respond to a variety of stimuli leading to differentiation to specific cell lineages. Indeed, our present study shows that a number of cell surface markers for differentiated cells are co-expressed with known markers of undifferentiated ES cells in D3 cells (Fig. 7). Future applications of this method in combination with quantitative proteomic approaches, such as stable isotope labeling, should identify stage- and lineage-specific expression of ES cell surface proteins and provide a catalogue of candidates for molecular markers of ES cells and of potential targets for controlled differentiation of ES cells in tissue engineering.