Rare Cell Proteomic Reactor Applied to Stable Isotope Labeling by Amino Acids in Cell Culture (SILAC)-based Quantitative Proteomics Study of Human Embryonic Stem Cell Differentiation

The molecular basis governing the differentiation of human embryonic stem cells (hESCs) remains largely unknown. Systems-level analysis by proteomics provides a unique approach to tackle this question. However, the requirement of a large number of cells for proteomics analysis (i.e. 10⁶–10⁷ cells) makes this assay challenging, especially for the study of rare events during hESC lineage specification. Here, a fully integrated proteomics sample processing and analysis platform, termed rare cell proteomic reactor (RCPR), was developed for large scale quantitative proteomics analysis of hESCs with ∼50,000 cells. hESCs were completely extracted by a defined lysis buffer, and all of the proteomics sample processing procedures, including protein preconcentration, reduction, alkylation, and digestion, were integrated into one single capillary column with a strong cation exchange monolith matrix. Furthermore, on-line two-dimensional LC-MS/MS analysis was performed directly using RCPR as the first dimension strong cation exchange column. 2,281 unique proteins were identified on this system using only 50,000 hESCs. For stable isotope labeling by amino acids in cell culture (SILAC)-based quantitative study, a ready-to-use and chemically defined medium and an in situ differentiation procedure were developed for complete SILAC labeling of hESCs with well characterized self-renewal and differentiation properties. Mesoderm-enriched differentiation was studied by RCPR using 50,000 hESCs, and 1,086 proteins were quantified with a minimum of two peptides per protein. Of these, 56 proteins exhibited significant changes during mesoderm-enriched differentiation, and eight proteins were demonstrated for the first time to be overexpressed during early mesoderm development. This work provides a new platform for the study of rare cells and in particular for further elucidating proteins that govern the mesoderm lineage specification of human pluripotent stem cells.

hESCs provide a novel means to study early human development and also hold strong therapeutic potential (1). However, our understanding of the biological mechanisms involved in hESC developmental processes, such as differentiation into mesoderm and its progeny (blood, endothelial cells, bone, heart, and skeletal muscle), is limited. Mass spectrometry-based proteomics has been developed as a systems biology approach to explore gene functions at the post-transcriptional level (2, 3).
There are two main challenges in stable isotope labeling by amino acids in cell culture (SILAC)-based quantitative proteomics studies of hESCs. The first is the number of cells required for proteomics analysis, and the second is maintaining hESCs under the conditions needed for SILAC labeling. Recently, two research groups have reported SILAC protocols to study the self-renewal and differentiation of hESCs (4, 5). In these methods, up to 10 million hESCs per sample were cultured in modified mouse embryonic fibroblast-conditioned medium (MEF-CM). The large number of cells required per analysis hampers the study of lineage specification of hESCs. To put this in context, obtaining 10 million hESCs requires at least 15 days of culture, representing a culture medium cost of ∼$250 per sample, not including the cost of SILAC reagents. Moreover, MEF-CM used for SILAC labeling of hESCs contains numerous known and unknown factors (6, 7), which makes it difficult to attribute an effect on hESC differentiation to any given factor. It would be prohibitively expensive and difficult to simultaneously examine and compare multiple factors that impact lineage specification during hESC differentiation using current methodology. It would also be difficult to study a small number of ancestor cells during developmental processes, such as early precursors of blood, bone, or muscle. These ancestor cells can be <5% of the total cell population after fractionation for proteomics analysis. Therefore, new technologies that can handle a limited number of cells and that can provide suitable growth and SILAC labeling conditions are needed.
In recent years, we and others have developed novel analytical techniques that improve the processing of proteomics samples (8–11). We have previously shown that a device, called the proteomic reactor, can improve the processing of proteomics samples over conventional approaches (12–15). We have also shown that a smaller version of this device can be used to handle a limited number of cells (12). Here, we report the development of the rare cell proteomic reactor (RCPR) that integrates 1) a monolith-based proteomic reactor that allows the processing of the proteome of hESCs and 2) the on-line coupling of the proteomic reactor in a two-dimensional HPLC-ESI-MS/MS analysis for the identification and quantitation of proteins. Using this approach, we have shown that we can analyze as few as 500 hESCs introduced on the proteomic reactor. We also report that 2,281 unique proteins can be readily identified on this system using only 50,000 hESCs. In addition, based on previous studies (4, 5, 16–18), we developed a new SILAC medium with a chemically defined background using readily available reagents for hESC maintenance and differentiation. Furthermore, we report a new in situ differentiation procedure that generates more differentiated cells and offers a simple and efficient way to study cellular commitment of hESCs. We studied the mesoderm differentiation of hESCs using the RCPR and were able to quantify 1,086 proteins (excluding one-peptide results) with a false positive rate (FPR) of 0.45%. Altogether, the RCPR provides a new platform to allow simultaneous study of multiple factors and rare ancestor cells during different stages of hESC development. This work paves the way toward further elucidating crucial proteins controlling cellular commitment, lineage restriction, and terminal differentiation of human pluripotent stem cells.

EXPERIMENTAL PROCEDURES
Human Embryonic Stem Cell Culture, SILAC Labeling, BIO Treatment, and Harvest-H1, H9, and CA1 hESC lines were maintained under feeder-free culture conditions as described previously (16, 19, 20) (refer to supplemental experimental methods for details). For SILAC labeling, hESCs were cultured on Matrigel-coated plates in either heavy or light SILAC labeling medium for 8 days to achieve complete SILAC incorporation. Media were changed daily, and cells were passaged every 4–5 days.
A new chemically defined SILAC medium was developed and used in this study. The medium was prepared from commercially available reagents without MEF conditioning or dialysis. It is formulated with the advanced DMEM/F12-Flex medium supplemented with 100 mg/liter L-arginine, 200g/liter glucose, 1 mM nonessential amino acids, 2 mM L-glutamine, 0.1 mM 2-mercaptoethanol, 1× N2 supplements, 1× B27 supplements, and 100 mg/liter L-[¹³C₆]lysine HCl or 100 mg/liter non-labeled L-lysine HCl. All reagents were purchased from Invitrogen. The media were sterile filtered, and basic FGF (120 ng/ml; R&D Systems) was added immediately before use. The high concentration of basic FGF was important for hESC maintenance in the chemically defined SILAC medium.
To harvest hESCs, hESC colonies were dissociated into single cells with collagenase IV and cell dissociation buffer (Invitrogen), and the cells were counted using a hemocytometer. hESCs were then aliquoted at a defined cell number into different tubes. The cells were lysed immediately with RCPR lysis buffer (10 mM HEPES, pH 7.4, 150 mM NaCl, 2 mM CaCl₂, 2 mM MgCl₂, 0.6 M guanidine HCl, 1% (v/v) Triton X-100, protease inhibitor mixture) before proteomics analysis.
RT-PCR, Q-PCR, Alkaline Phosphatase Staining, Immunofluorescence Microscopy, and Quantitative Image Analysis-The assays were conducted as described previously (16, 19–21) and detailed in the supplemental experimental methods and supplemental Table 1.
Rare Cell Proteomic Reactor-As shown in Fig. 1, the RCPR consists of a capillary with an integrated strong cation exchange (SCX) monolith matrix and a custom-made vessel pressurized with nitrogen for parallel sample loading. The SCX monolith column was prepared by modifying a method reported previously (22). Briefly, a fused silica capillary (Polymicro Technologies, Phoenix, AZ) was pretreated with 0.1 M NaOH for 1 h, washed with water and methanol, and then dried by nitrogen. 3-(Trimethoxysilyl)propyl methacrylate mixed with an equal amount of methanol was introduced into the pretreated capillary and incubated at 60°C overnight. After the reaction, the capillary was washed with methanol and dried by nitrogen. The polymerization mixture containing 100 mg of 2-acrylamido-2-methyl-1-propanesulfonic acid, 60 mg of bisacrylamide, 270 µl of DMSO, 200 µl of dodecanol, 50 µl of N,N-dimethylformamide, and 2 mg of 1,1′-azobis(cyclohexanecarbonitrile) was sonicated on ice for 5 min. The resulting transparent solution was infused into the pretreated capillary (200-µm inner diameter × 15 cm) to fill 5 cm of the capillary. Then, the capillary was sealed at both ends and incubated at 60°C for 12 h. The prepared monolith column was extensively washed with methanol prior to use.
For RCPR operation, the monolith column was preconditioned with 10 mM potassium phosphate buffer, pH 3 before sample loading, and the flow rate was measured on the pressurized vessel. Cell lysate was loaded directly onto the preconditioned SCX monolith column at 62 p.s.i. The SCX monolith column was then washed with wash buffer (8 mM potassium phosphate buffer, 20% (v/v) ACN). For protein reduction, 100 mM DTT dissolved in 10 mM ammonium bicarbonate (ABC) was loaded onto the column and incubated for 30 min at room temperature. The monolith column was then washed briefly with 10 mM ABC. Finally, 10 mM iodoacetamide and 2 µg/µl trypsin (Promega) dissolved in 20 mM ABC, pH 8 were loaded for simultaneous alkylation and digestion. After 2 h of incubation at room temperature, the monolith column was disconnected from the pressurized vessel, sealed at both ends, and stored at 4°C for on-line two-dimensional LC-MS/MS analysis.
On-line Two-dimensional LC-MS/MS Analysis-The on-line two-dimensional LC-MS/MS analysis was performed on a standard LC-MS system consisting of an Agilent 1200 capillary pump, Agilent microautosampler (Agilent Technologies, Waldbronn, Germany), and LTQ (for method optimization) or LTQ-Orbitrap XL (for SILAC) mass spectrometers equipped with a nanospray source (Thermo, San Jose, CA).
For on-line two-dimensional LC separation (see Fig. 1), the RCPR was cut with a diamond scribe right after the monolith matrix to remove the blank capillary section and connected in tandem to a C₁₈ tip column by a PicoClear union (New Objective, Woburn, MA). When used with the LTQ mass spectrometer, the RCPR was connected to a 120-mm × 75-µm-inner diameter C₁₈ tip column packed with Magic C₁₈ AQ resins (5 µm, 200 Å; Michrom Bioresources, Auburn, CA), and the system flow rate was restricted to 200 nl/min. When used with the LTQ-Orbitrap XL mass spectrometer, the RCPR was connected to a 200-mm × 50-µm-inner diameter C₁₈ tip column packed with ReproSil-Pur C₁₈ resins (3 µm, 200 Å; Dr. Maisch GmbH, Ammerbuch, Germany), and the system flow rate was restricted to 100 nl/min. The peptides trapped on the RCPR were eluted onto the C₁₈ analytical column by 13 stepwise elutions with different concentrations of NH₄OAc, pH 2.7. After each elution, the tandem column system was washed with 0.1% acetic acid in water, and a 60- or 90-min ACN gradient elution from 5 to 35% was performed using 0.1% acetic acid in ACN for the reversed phase LC-MS analysis.
The LTQ and LTQ-Orbitrap XL mass spectrometers were operated in a positive ionization mode. A voltage of 1.8 kV was applied to generate the electrospray ionization. All MS and MS/MS spectra were acquired in a data-dependent mode. The instrument was set so that one full MS scan was followed by 10 MS/MS scans. For the LTQ-Orbitrap XL, the full-scan MS spectra (from m/z 400 to 2,000) were acquired with a resolution of 60,000 at m/z 400 after accumulation to a target value of 500,000. The 10 most intense ions at a threshold above 500 counts were selected for fragmentation by CID at a normalized collision energy of 35%.
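The precursor selection rule described above (the 10 most intense ions above a 500-count threshold) can be sketched as follows. This is only an illustration of the selection logic, not the instrument firmware; the function name, the peak representation, and the example survey scan are hypothetical, and real acquisition methods usually add rules such as dynamic exclusion that are not described in the text.

```python
from typing import List, Tuple

# A peak is represented as (m/z, intensity in counts); this layout is an assumption.
Peak = Tuple[float, float]


def select_precursors(peaks: List[Peak], top_n: int = 10,
                      min_counts: float = 500.0) -> List[Peak]:
    """Return up to top_n peaks with intensity above min_counts, most intense first."""
    eligible = [p for p in peaks if p[1] > min_counts]
    return sorted(eligible, key=lambda p: p[1], reverse=True)[:top_n]


# Hypothetical survey scan with four peaks, one of them below the threshold.
survey_scan = [(445.12, 1200.0), (512.30, 480.0), (623.84, 9800.0), (701.41, 650.0)]
selected = select_precursors(survey_scan)
```

In this toy scan the 512.30 peak is dropped because it falls below 500 counts, and the remaining three peaks are queued for CID in order of decreasing intensity.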
MS Data Analysis-Peak lists were generated from the raw file using Mascot Distiller (version 2.0.0.0, Matrix Science, London, UK) for LTQ data and DTASupercharge (version 2.0a7, SourceForge) for LTQ-Orbitrap XL data. The acquired MS/MS spectra were searched against the human International Protein Index (IPI) protein sequence database (version 3.66, 86,845 protein entries; European Bioinformatics Institute). A decoy database with the reversed sequence of each entry in the forward database was also searched for evaluation of the false positive rate. The database searching was performed using Mascot (version 2.2.02, Matrix Science) with the following parameters: trypsin as digestion enzyme, carbamidomethyl (Cys) as a fixed modification, and oxidation (Met) as a variable modification. The SILAC label ([¹³C₆]lysine) was set as an extra variable modification for Orbitrap XL data. The number of allowed missed cleavages was set to 2. The precursor and fragment mass tolerances were set at 2.0 and 0.8 Da for the LTQ data and 7 ppm and 0.5 Da for the Orbitrap XL data, respectively. The significance threshold was set to 0.05. A protein hit required at least one "bold red peptide," i.e. the most logical assignment of the peptide in the database selected, to be reported. Mascot cutoff scores were set to 30. All of the raw files were searched by Mascot separately. The false positive rate was controlled to be less than 1% using the equation FPR = Number of false peptides/(Number of true peptides + Number of false peptides) × 100 (23). MSQuant (version 2.0a81, SourceForge) was used for calculating the relative peptide abundance with a default setting. The exported files from MSQuant were further integrated, and results were normalized by StatQuant (version 1.2.2, GForge). Manual validation was performed to assign shared peptides belonging to multiple protein groups to the protein group with the highest number of identified peptides (24).
Molecular functions and protein networks were analyzed using the Ingenuity pathways analysis (IPA) software (version 8.5, Ingenuity® Systems).
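As a minimal sketch of the decoy strategy and the FPR equation given above, the following assumes that "false peptides" are peptide hits from the reversed (decoy) database and "true peptides" are hits from the forward database. The function names and the example counts are hypothetical and are not taken from the study.

```python
def make_decoy(sequence: str) -> str:
    """Build a decoy entry by reversing a protein sequence."""
    return sequence[::-1]


def false_positive_rate(n_true: int, n_false: int) -> float:
    """FPR (%) = false / (true + false) * 100, as in the equation in the text."""
    total = n_true + n_false
    if total == 0:
        raise ValueError("no peptide identifications to evaluate")
    return n_false / total * 100.0


# Hypothetical counts for illustration only: 9,950 forward and 50 decoy hits.
fpr = false_positive_rate(n_true=9950, n_false=50)  # approximately 0.5 (percent)
passes_threshold = fpr < 1.0  # the text controls the FPR to below 1%
```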

RESULTS AND DISCUSSION
We are interested in the study of the proteome of rare cells. In these instances, the amount of starting material is limited, and the analytical techniques need to be adapted accordingly. The recovery from every step in the process needs to be maximized with particular attention to cell lysis, the recovery and digestion of proteins, and the recovery and separation of peptides for identification by mass spectrometry. In this study, we report a fully integrated proteomics platform, termed RCPR, that minimizes sample loss and maximizes the mass spectrometry analysis efficiency. The RCPR allowed efficient capture and processing of proteins on a capillary proteomic reactor composed of an SCX monolith matrix. In addition, the direct coupling of the capillary proteomic reactor with a nanoflow reversed phase column on line with an ESI mass spectrometer allowed the efficient transfer of the analytes from the proteomic reactor to the nanoflow reversed phase column and the on-line analysis of the peptides by mass spectrometry. Here, we also describe the application of the RCPR to the analysis of hESCs and their differentiation.
Optimization of Lysis Buffer-First, we developed a well defined cell lysis buffer compatible with the RCPR to completely solubilize proteins from the rare cell sample in as little volume as possible. The typical procedure in proteomics analysis from cell culture is to add cell lysis buffer to a cell pellet and remove the unsolubilized fraction by centrifugation prior to protein digestion. Unfortunately, this causes protein loss, which is an issue when dealing with a limited amount of sample. Instead, we optimized the cell lysis buffer composition and volume according to the number of cells to maintain a cell-to-buffer volume ratio lower than 2,500 cells/µl. After adding the RCPR lysis buffer and pipetting several times, the cell pellet disappears completely, and the solution becomes transparent. Because a high concentration of guanidine HCl may affect the binding of proteins onto the SCX monolith column, we next optimized the concentration of guanidine HCl in the RCPR lysis buffer. As shown in supplemental Fig. 1, a concentration of guanidine HCl lower than 1 M does not cause protein loss. We set the guanidine HCl concentration of the RCPR lysis buffer to 0.6 M, which allows the extraction of 50% more protein in comparison with RCPR lysis buffer without guanidine HCl (data not shown).
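The cell-to-buffer ratio above translates directly into a minimum lysis buffer volume for a given cell count. The short sketch below shows that arithmetic; the function name is hypothetical, and the only input taken from the text is the 2,500 cells/µl limit.

```python
def min_lysis_buffer_volume_ul(n_cells: int, max_cells_per_ul: float = 2500.0) -> float:
    """Smallest buffer volume (in microliters) that keeps the ratio at or below the limit."""
    if n_cells < 0 or max_cells_per_ul <= 0:
        raise ValueError("cell count must be non-negative and the ratio positive")
    return n_cells / max_cells_per_ul


# Cell numbers analyzed in this study.
volumes = {n: min_lysis_buffer_volume_ul(n) for n in (500, 5_000, 50_000)}
```

For 50,000 cells this gives 20 µl, so any volume above that keeps the ratio below 2,500 cells/µl.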
Monolithic Proteomic Reactor Design-Second, we optimized the protein recovery and processing techniques to maximize recovery when dealing with a limited number of cells. We previously demonstrated that the proteomic reactor is an efficient approach to process proteomics samples (12). This technology greatly simplifies the handling of proteins, reduces the volumes required for analysis, and shortens the processing times. Here, we introduce an SCX monolith column for proteomic reactor operation to further reduce sample loss and dead volume. The SCX monolith column has several advantages over our previously reported proteomic reactor based on SCX beads. First, the monolith matrix is polymerized directly inside the capillary (see Fig. 1, inset photo), and the whole proteomic reactor operation is performed in one single capillary column. A frit, SCX beads, and a union are not needed to prepare the reactor column, and surface loss is reduced. Second, the binding capacity of the monolith column is ∼4 µg of protein/cm for a 200-µm-inner diameter column, which doubles the capacity of the previous proteomic reactor. This allows more proteins to be captured on a smaller volume of the column. Our comparison based on the previous off-line protocol indicates that 50% more proteins can be identified by the monolith proteomic reactor than by the bead proteomic reactor when as few as 220 and 4,200 cells are used (supplemental Table 2).
The separation ability of the analytical technique is also a critical factor for large scale proteomics. Here, to enhance the separation ability of the RCPR platform we coupled the proteomic reactor with an on-line two-dimensional LC-MS/MS analysis system. As shown in Fig. 1, after trypsin digestion, the SCX monolith proteomic reactor was connected directly with the LC-MS/MS system for on-line two-dimensional separation. To reduce the possible dead volume between the SCX monolith column and C₁₈ analytical column, the capillary columns were cleaved and carefully connected using a PicoClear union. As shown in Fig. 1, inset, the two columns are perfectly aligned. SCX-based peptide separation has been shown previously to be well suited as a first dimensional separation in on-line two-dimensional LC-MS/MS analysis because of good orthogonality in the separation and solvents utilized in SCX- and reversed phase-based separation (25). Here, the monolithic proteomic reactor plays a dual role as a bed for extraction and chemical/biochemical reactions and as the first dimension in two-dimensional LC separation. This key design makes the whole RCPR platform a fully integrated proteomics sample processing and mass spectrometry analysis system.
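As a rough worked example of the binding capacity quoted above, the total capacity of a reactor scales with the length of the monolith bed. The sketch assumes the capacity per centimeter is uniform along the bed and uses the 5-cm monolith length given under "Experimental Procedures"; the function name is hypothetical and the result is an estimate, not a measured value.

```python
def reactor_capacity_ug(monolith_length_cm: float, capacity_ug_per_cm: float = 4.0) -> float:
    """Estimated protein binding capacity (micrograms) of an SCX monolith bed."""
    if monolith_length_cm < 0:
        raise ValueError("monolith length must be non-negative")
    return monolith_length_cm * capacity_ug_per_cm


# 5 cm of monolith in a 200-µm-inner diameter capillary, at about 4 µg/cm.
estimated_capacity = reactor_capacity_ug(5.0)  # about 20 µg under these assumptions
```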
Minimum Number of hESCs That Can Be Processed by RCPR-To test the applicability of RCPR for large scale proteomics study with limited starting material, different numbers of hESCs were loaded, processed, and analyzed by RCPR. As an example, Fig. 2, A and B, shows the results for processing and analyzing 50,000 hESCs. To fully explore the complexity of the digested peptides, the on-line two-dimensional LC-MS/MS in the RCPR was performed with 13 successive salt step elutions from 50 mM to 1 M salt. After each salt step elution, peptides were eluted from the SCX proteomic reactor to the reversed phase column, separated by reversed phase chromatography, and analyzed by ESI-MS/MS. Fig. 2A illustrates the number of peptides and proteins identified from each salt step, and Fig. 2B illustrates the overlaps for peptides identified between the different fractions. The majority of the peptides were eluted at a salt concentration above 350 mM, indicating a strong retention of the peptides on the proteomic reactor. Moreover, the carryover between neighboring fractions and distal fractions is minimal. We analyzed 500, 5,000, and 50,000 hESCs on the RCPR, and 68, 409, and 2,281 unique proteins were identified, respectively. An excellent linearity between the number of unique proteins identified and the number of cells processed was obtained (Fig. 2C). This clearly demonstrates that the RCPR is compatible with the analysis of a limited number of cells, such as fractionated precursors from differentiated hESCs, and the distribution of the identified proteins is generally consistent with a previous report (26).
FIG. 1. Schematic overview of typical procedure for SILAC-based quantitative proteomics analysis of hESCs using RCPR platform. hESCs were cultured in chemically defined SILAC media first, limited amounts of cells were collected for RCPR operation on a nitrogen-pressurized vessel, and then the RCPR was connected directly with a C₁₈ analytical column for on-line two-dimensional LC-MS/MS analysis.
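The linearity statement above can be checked against the three reported data points with a Pearson correlation coefficient. The sketch below computes it on both the raw and the log-transformed values; the text does not say which scale Fig. 2C uses, so reporting both is an assumption made here, and with only three points the coefficient is a rough indicator at best.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


cells = [500, 5_000, 50_000]   # hESCs loaded on the RCPR (from the text)
proteins = [68, 409, 2_281]    # unique proteins identified (from the text)

r_linear = pearson_r(cells, proteins)
r_loglog = pearson_r([math.log10(c) for c in cells],
                     [math.log10(p) for p in proteins])
```

Both coefficients come out above 0.99 for these three points.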

Quantitative Proteomics Study of hESCs Using Novel Chemically Defined SILAC Media-We are interested in studying the quantitative changes that occur in the proteome of hESCs using a limited number of cells. It has been reported that SILAC labeling of hESCs is not a trivial task because of the specific conditions required for hESC maintenance (4, 5, 27). Recently, Blagoev and co-workers (4) developed a SILAC labeling protocol by cultivating and SILAC labeling hESCs in a dialyzed MEF-CM. Although they reported excellent results, the generation of the medium is not straightforward. Another recipe reported by Bendall et al. (5) is to prepare the SILAC medium first followed by MEF conditioning. Notably, MEF-conditioned medium contains numerous unidentified xenogenic factors from MEF cells that may complicate the data interpretation when studying a given factor that impacts the self-renewal and differentiation of hESCs. Instead, we report a new SILAC medium with a chemically defined background for the SILAC-based quantitative proteomics study of hESCs. It can be made directly from commercially available reagents without MEF conditioning (for details see "Experimental Procedures"). Importantly, this new chemically defined SILAC medium is capable of maintaining hESCs in an undifferentiated state for over five cell doublings (an average doubling time for hESCs is ∼36 h (28–30)), fulfilling the requirement for complete SILAC incorporation. As shown in Fig. 3, hESCs cultured over 8 days in both heavy and light SILAC labeling media maintained their undifferentiated characteristics and were indistinguishable from hESCs maintained in MEF-CM. Moreover, the hESCs cultured in both heavy and light SILAC labeling media exhibited typical colony morphology (Fig. 3A), expressed the OCT4 pluripotency gene (Fig. 3B), and stained positively for the pluripotency markers alkaline phosphatase, OCT4, and NANOG (Fig. 3, C-E). These results demonstrate that the simplified SILAC medium is capable of maintaining hESCs in an undifferentiated state.
Similar results were obtained from three different hESC lines (H1, H9, and CA1), suggesting that the chemically defined SILAC media used in this study are not cell line-specific.
We next examined the incorporation efficiency of isotope-labeled lysine using the SILAC labeling medium containing 100 mg/liter L-[¹³C₆]lysine HCl. The analysis was based on 578 peptides that were identified repeatedly (supplemental Table 3). The efficiency of labeling on lysine residue was estimated to be 97% after culture of hESCs for 8 days in the SILAC labeling medium (Fig. 3F). Based on the atom purity of the stable isotope-labeled lysine, this incorporation rate almost reaches the maximal achievable labeling efficiency.
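The relationship between culture time, doubling time, and expected label incorporation can be sketched with a simple dilution model. The model assumes that pre-existing unlabeled protein is only diluted by cell division (no protein turnover and no recycling of light lysine), and the per-peptide efficiency formula heavy/(heavy + light) is a common convention rather than the procedure documented in this study. All function names and the example intensities are hypothetical.

```python
def n_doublings(culture_hours: float, doubling_time_hours: float) -> float:
    """Number of population doublings during the labeling period."""
    return culture_hours / doubling_time_hours


def expected_heavy_fraction(doublings: float) -> float:
    """Fraction of labeled protein if light protein is halved at every doubling."""
    return 1.0 - 0.5 ** doublings


def labeling_efficiency(heavy_intensity: float, light_intensity: float) -> float:
    """Per-peptide incorporation estimated from heavy and light signal intensities."""
    return heavy_intensity / (heavy_intensity + light_intensity)


# 8 days of culture with a doubling time of about 36 h (values from the text).
doublings = n_doublings(culture_hours=8 * 24, doubling_time_hours=36.0)
predicted = expected_heavy_fraction(doublings)

# Hypothetical intensities for one lysine-containing peptide.
observed = labeling_efficiency(heavy_intensity=97.0, light_intensity=3.0)
```

Under these assumptions 8 days corresponds to about 5.3 doublings and a predicted incorporation of roughly 97.5%, which is in line with the 97% estimated experimentally.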
Application of RCPR in Study of hESC Differentiation-The conventional approach of forming cellular aggregates (embryoid bodies) for hESC differentiation sometimes produces a limited number of differentiated cells (31). Instead, we developed an in situ differentiation protocol to improve cell yields after hESC differentiation. Briefly, hESCs were cultured in the chemically defined SILAC media with the addition of a GSK-3 kinase inhibitor, BIO (0.8 µM), from days 4 to 8. GSK-3 kinase is one of the key components in the canonical Wnt signaling pathways (32). Inhibition of GSK-3 kinase leads to Wnt activation (32), which facilitates the mesoderm differentiation of hESCs (33, 34). Our results showed that the in situ differentiation procedure produced up to 5-fold greater numbers of live cells than embryoid body formation and drove hESCs to differentiate efficiently. In the absence of BIO, hESCs were maintained in an undifferentiated state as characterized by colony morphology and positive staining for pluripotency markers alkaline phosphatase (data not shown), OCT4, and NANOG (Fig. 4, B and C). In contrast, exposing hESCs to BIO for 4 days resulted in a marked change in colony morphology (Fig. 4A) and a significant decrease in OCT4 and NANOG gene expression (Fig. 4D). Consistently, the majority of cells lost OCT4 and NANOG protein expression as determined by fluorescence immunocytochemistry and quantitative image analysis (Fig. 4, B and C). Furthermore, as opposed to the undifferentiated hESCs, the differentiated hESCs expressed a higher level of genes known to be expressed during mesoderm differentiation, such as BRACHYURY (1,600-fold increase) and MIXL1 (600-fold increase), whereas three other genes representing ectoderm and endoderm differentiation were not significantly up-regulated (Fig. 4D).
These results demonstrate that undifferentiated and differentiated hESCs (mesoderm-enriched specification) can be obtained in the chemically defined SILAC media using vehicle or BIO treatment.
We next mixed the cell lysates of 25,000 cells from both undifferentiated hESCs and BIO-differentiated hESCs and applied them onto the RCPR for sample processing and on-line two-dimensional LC-MS/MS analysis with optimized salt fractions. Three replicate experiments with the same amount of starting material were performed. 2,347 proteins were identified, and 1,086 proteins with a minimum of two peptides per protein were kept for further quantitative analysis (FPR of 0.45% and SILAC ratio S.D. of less than 2; supplemental Table 4). Quantified peptide information and MS/MS spectra for one-peptide identifications were included in supplemental information. 95% of the proteins could be quantified in at least two experiments, and only 57 proteins were quantified in one experiment. The relative quantitation of the 57 proteins was manually validated by inspecting the raw files, and the S.D. of the quantitation was less than 0.76. The quantified proteins are involved in a variety of cellular and molecular functions (supplemental Fig. 2), suggesting that the RCPR can be used for a global quantitative study of the hESC proteome at great depth. Fig. 4E shows that the SILAC ratio (a log-transformed distribution) for most quantified proteins is nearly centered on 0 with an overall S.D. of 0.55, which demonstrates the high accuracy of the SILAC-based quantitation. Of these quantified proteins, 1,030 proteins fell within the 95% confidence interval, whereas 56 proteins exhibited a significant SILAC ratio change (supplemental Table 5). The cutoff ratios for proteins with significant changes are 2.62 and 0.58 for up-regulation and down-regulation, respectively. Of the 56 proteins, 45 proteins were up-regulated, whereas only 11 proteins were down-regulated after differentiation of hESCs into an early mesoderm-enriched lineage (Fig. 4), representing the possible contribution of mesoderm differentiation to significant protein up-regulation.
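One common way to derive such cutoff ratios from a log-transformed ratio distribution is to take the mean plus or minus 1.96 S.D. and transform back to the ratio scale. The sketch below illustrates that approach under two assumptions that are not stated in the text: that the transform is log2 and that the 95% interval is mean ± 1.96 S.D. The function names are hypothetical; only the two cutoff values are taken from the text.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple

Z_95 = 1.96  # two-sided 95% interval of a normal distribution


def ratio_cutoffs(log2_ratios: Sequence[float], z: float = Z_95) -> Tuple[float, float]:
    """Return (down_cutoff, up_cutoff) on the ratio scale from log2 SILAC ratios."""
    center = mean(log2_ratios)
    sd = stdev(log2_ratios)
    return 2.0 ** (center - z * sd), 2.0 ** (center + z * sd)


def classify(ratio: float, down_cutoff: float, up_cutoff: float) -> str:
    """Label a protein ratio as 'up', 'down', or 'unchanged'."""
    if ratio > up_cutoff:
        return "up"
    if ratio < down_cutoff:
        return "down"
    return "unchanged"


def implied_center_and_sd(up_cutoff: float, down_cutoff: float,
                          z: float = Z_95) -> Tuple[float, float]:
    """Back-calculate the log2 center and S.D. implied by a pair of cutoffs."""
    hi, lo = math.log2(up_cutoff), math.log2(down_cutoff)
    return (hi + lo) / 2.0, (hi - lo) / (2.0 * z)


# Cutoffs reported in the text; the result is roughly 0.30 and 0.55.
center, sd = implied_center_and_sd(up_cutoff=2.62, down_cutoff=0.58)
```

Under these assumptions the reported cutoffs imply a log2 S.D. of about 0.55, matching the overall S.D. quoted above, with a center slightly above zero (about 0.30).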
Interestingly, 84% of these proteins are either known to interact directly or are linked by one other protein in the Ingenuity Knowledge Base (supplemental Fig. 3). For example, the up-regulated proteins ARP2/3, ARPC3, ARPC1B, and ARPC2 are part of the ARP2/3 complex, whereas CORO1B is known to interact with the ARP2/3 complex. The ARP2/3 complex and CORO1B are involved in the regulation of the actin cytoskeleton (35). Furthermore, comparison with recent transcriptome analyses of hESC self-renewal and differentiation indicated that 52% of the matching entries (protein-RNA entries) have the same expression trend at the RNA and protein levels (supplemental Table 6) (36–40).
To further explore potential proteins associated with self-renewal and early mesoderm differentiation of hESCs, identified proteins showing significant change after treatment with BIO (a mesoderm inducer) were grouped and extracted for validation analysis. As shown in Table I, there were three down-regulated proteins; SOX2 and isoform 1 of DNMT3B are well known self-renewal markers (41, 42), and L1TD1 has been demonstrated to be enriched in undifferentiated hESCs in a recent transcriptome study (43). Our recent experiments suggest that L1TD1 is one of the most important factors during hESC self-renewal, consistent with our proteomics results. Notably, of the eight up-regulated proteins, two proteins are associated with hematopoiesis and angiogenesis (MEST and PTK7) (44–46), two are associated with chondrogenesis (PLOD2 and P4HA2) (47, 48), two are associated with myogenesis (ERO1L and CAMK2D) (49, 50), and one is associated with mesoderm commitment (PRTG) (51). SARS, which is associated with angiogenesis (52), was up-regulated after mesoderm-enriched differentiation and further validated by Q-PCR. Consistently, the three down-regulated proteins also exhibited a significant decrease in mRNA expression, whereas the eight up-regulated proteins showed an increase in mRNA expression as determined subsequently by Q-PCR assays (Table I).
This work demonstrates that the RCPR is suitable for the quantitative analysis of the proteome changes that occur in as few as 50,000 hESCs after SILAC labeling. It also demonstrates that during hESC differentiation into a mesoderm-enriched lineage 56 proteins change significantly and that eight proteins previously associated with late mesoderm progeny are shown for the first time to be expressed during early mesoderm commitment. The RCPR provides a new platform that allows us to study these molecular details further during different stages of hESC differentiation using rare ancestor cells.