Targeted Proteomics-Driven Computational Modeling of Macrophage S1P Chemosensing*

Osteoclasts are monocyte-derived multinuclear cells that directly attach to and resorb bone. Sphingosine-1-phosphate (S1P)1 regulates bone resorption by functioning as both a chemoattractant and chemorepellent of osteoclast precursors through two G-protein coupled receptors that antagonize each other in an S1P-concentration-dependent manner. To quantitatively explore the behavior of this chemosensing pathway, we applied targeted proteomics, transcriptomics, and rule-based pathway modeling using the Simmune toolset. RAW264.7 cells (a mouse monocyte/macrophage cell line) were used as model osteoclast precursors, RNA-seq was used to identify expressed target proteins, and selected reaction monitoring (SRM) mass spectrometry using internal peptide standards was used to perform absolute abundance measurements of pathway proteins. The resulting transcript and protein abundance values were strongly correlated. Measured protein abundance values, used as simulation input parameters, led to in silico pathway behavior matching in vitro measurements. Moreover, once model parameters were established, even simulated responses toward stimuli that were not used for parameterization were consistent with experimental findings. These findings demonstrate the feasibility and value of combining targeted mass spectrometry with pathway modeling for advancing biological insight.

Chemotaxis is defined as directed movement of a cell (or of an organism) resulting from stimulation by a chemokine or other chemotactic chemical. Eukaryotic cells employ intricate intracellular pathways to sense concentration differences of chemoattractants at their surface and move along such gradients using multiple synchronized cellular processes, including cell protrusion and adhesion at the leading end, de-adhesion at the trailing end, and mechanical force generation at both the leading and trailing ends (1)(2)(3)(4).
Activation of chemotactic receptors through chemoattractant concentration gradients results in nonuniform intracellular signaling responses that depend on numerous feedback mechanisms (5)(6)(7). Downstream, F-actin synthesis at the leading end causes the formation of cell protrusions (for example, filopodia, lamellipodia, and lamellae) that, in concert with actomyosin contraction, generates force at both the leading and trailing ends (8 -10). Chemotactic traction is produced by cell protrusions interacting with a confined environment and/or by adhesions that bind to the extracellular matrix and/or to cell adhesion molecules (11).
Chemotaxis plays a major role in a wide range of physiological and pathophysiological processes. Sphingosine-1phosphate (S1P), a phosphosphingolipid, mediates chemotaxis of many circulating cell types, including osteoclast precursors (OPs) (12)(13)(14). Osteoclasts are monocyte-derived multinuclear cells that directly attach to the bone matrix and resorb bone. They are solely responsible for bone resorption, and their misregulated activity has been implicated in numerous skeletal diseases, including osteoporosis, osteopetrosis, arthritic joint destruction, and bone metastasis (13). Recently, it was reported that S1P regulates bone resorption in mice by functioning as both a chemoattractant and chemorepellent of OPs via two G-protein coupled receptors (S1PR1 and S1PR2, respectively), which antagonize each other in an S1P-concentration-dependent manner (13,14). Circulating OPs are exposed to a relatively high S1P environment, and this can result in S1PR1 internalization and S1PR2-mediated chemorepulsion away from the circulation and into tissues. Within the bone lining tissue (an environment with low S1P levels), S1PR1 can be relocalized to the OP cell surface, resulting in chemoattraction toward the circulation. Alternatively, OPs can be chemoattracted to the bone matrix by chemokines se-creted by osteoblastic stromal and vascular endothelial cells (specifically, CXCL12 activation of CXCR4 and/or CX3CL1 activation of CX3CR1) and then can differentiate to become functional, multinuclear osteoclasts (14).
Even for single-ligand chemotactic stimulation, understanding how the responding signaling mechanisms allow a cell to sense what are often shallow and unstable chemoattractant concentration gradients represents a major challenge and requires detailed quantitative investigations. Computational studies, in addition to precise experimental measurements, have proven highly valuable in this regard (15)(16)(17). However, almost all existing models dealing with chemosensing represent the underlying biochemical interactions in greatly simplified or abstract terms to contend with the complexity of the relevant molecular reaction networks. To overcome this difficulty, rule-based approaches can be used to generate computational models based on the specification of bimolecular interactions, rather than through explicit specification of a complete system of coupled differential equations for the behavior of all multimolecular complexes resulting from those interactions (18 -22). Using the modeling platform Simmune, it is possible to apply such a rule-based approach to spatially resolved simulations that, in addition to the biochemical reactions, also take into account the diffusion of signaling components and the morphology of the simulated cells (23)(24)(25)(26). This capacity for spatially resolved rule-based modeling is especially appropriate for simulation of chemosensing, and it was used previously in a computational study of the cAMP-mediated chemotaxis pathway of the soil-dwelling amoeba Dictyostelium discoideum (24).
While such rule-based simulation tools have overcome many of the technical challenges involved in computational analysis of complex biochemical pathways, the lack of accurate measurements of important model parameters, such as intracellular protein concentrations, still represents a major bottleneck in using such methods to better understand biological processes (22). Among the experimental approaches that allow measuring protein abundances, mass spectrometry is establishing itself as the tool of choice in cases where more than just a handful of signaling components need to be quantified. In this investigation, advanced mass-spectrometric techniques were used to establish the quantitative basis for a detailed model of the OP S1P-chemotaxis pathway that could be used to simulate the highly dynamic spatiotemporal features of S1P-mediated chemoattraction and chemorepulsion in macrophages. RAW264.7 cell RNA-seq data were used to identify potentially expressed proteins (27)(28)(29), and RAW264.7 cell shotgun proteomics data were used to identify proteotypic peptides (30,31) for use as target peptides. Targeted proteomics assays using nanoflow liquid chromatography coupled to selected reaction monitoring mass spectrometry (LC-SRM) (32)(33)(34) were developed and used to measure the absolute abundance of chemotaxis pathway proteins within RAW264.7 cells. The target protein abundance values were then used as parameters of the pathway model, and manual refinement of the bimolecular association and dissociation reaction rates was guided by using two RAW264.7 cell in vitro microscopy datasets.
Combining these quantitative measurements of pathway component abundance with the Simmune tool set, many experimentally observed features of the modeled pathway were reproduced. These included aspects of the simulated cellular responses that emerged from the model structure, as opposed to having been input as desirable outcomes a priori. Our findings reveal how computational systems biology now allows high-throughput quantitative measurements to be used in an effective manner in efforts to understand complex aspects of cellular behavior.

EXPERIMENTAL PROCEDURES
RNA-seq-This investigation used RAW264.7 cell RNA-seq data that originated from a separate investigation (Narayanan et al., manuscript in preparation). Briefly, RAW264.7 Mus musculus monocyte/ macrophage cells were processed using the "Total RNA" protocol of the mirVana miRNA Isolation Kit (Thermo Fisher Scientific, Waltham, MA). Two sequencing libraries were prepared (the concentration of each was 6 pM): one using the standard Illumina protocol and the other using the DeLi-seq protocol (35). The resulting sequencing libraries were analyzed using a HiSeq 2000 sequencing system (Illumina, Inc., San Diego, CA) (60 -80 million paired-end, 100 bp reads per sample). CASAVA v1.7 (Illumina, Inc.) was used to generate FASTQ files of the resulting data (after applying quality control filters on the base calls made by the HiSeq 2000 image analysis software). The resulting data were analyzed using the Tuxedo suite of software tools (TopHat and Cufflinks) (36) and mapped onto the mouse genome (C57BL/6J strain of M. musculus, July 2007 NCBI37/mm9 assembly, http://www.ncbi.nlm.nih.gov/projects/genome/assembly/ grc/mouse/) (37). Each resulting FPKM value (fragments per kilobase of transcript per million mapped reads) was used as a measure of transcript absolute abundance (38,39). For each transcript, the mean FPKM value was calculated across the two RNA-seq analyses.
Protein and Peptide Target Selection for LC-SRM-A list of mouse chemotaxis pathway target proteins was manually compiled from an extensive literature review, and by using the Kyoto Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg/) (43). A small number of additional proteins were targeted (for example, housekeeping proteins for normalization across biological samples and Photinus pyralis (firefly) luciferase to serve as a quantified internal protein standard). The protein sequences of the target proteins and of the entire mouse proteome were retrieved from UniProt (release 2011_04 (April 5, 2011), downloaded on April 12, 2011 from http:// www.uniprot.org/) (41).
An ad hoc score was developed to rank tentative target peptides based on numerous criteria. The peptides were required to be fully tryptic, to have no missing trypsin cleavage sites, and to be five to twenty amino acids in length. Peptides were penalized if they were prone to chemical modification (specifically, Cys and Met oxidation, Gln and Asn deamidation, N-Term Gln formation of pyroglutamate, and protein N/C termini modification). Peptides were required to be unique to each target protein or to a set of close homologues (when determining peptide uniqueness, peptides differing only by Ile/Leu substitution were considered identical because these perform nearly identically during LC-SRM). Peptides with neighboring trypsin cleavage sites were penalized to avoid potentially incomplete trypsin digestion (44). Peptides known to be proteotypic were greatly preferred and were identified using data from a previous proteomics study of RAW264.7 cells (45) obtained via shotgun nanoflow liquid chromatography coupled to (tandem) mass spectrometry (LC-MS(/MS)). Additional peptide identifications were downloaded from the National Institute of Standards and Technology (downloaded on April 10, 2012 from http://peptide.nist.gov/) (46), and from The Global Proteome Machine (X! HUNTER spectral libraries were downloaded on March 24, 2012 from http://www.thegpm.org/HUNTER/index.html) (GPMDB peptide identifications were downloaded on April 11, 2012 from http://gpmdb.thegpm.org/) (47). Subsequently, the target peptide list was manually reviewed using additional criteria including mRNA transcript abundance, usefulness of the target peptide against the most abundant transcript splice isoform(s), additional posttranslational modifications annotated in the UniProt database (for example, phosphorylation) (http://www.uniprot.org/) (41), and if the target peptide could also be used for LC-SRM assays of human protein orthologs.
LC-SRM Assay Development-SPOT synthesis (48) was used to synthesize 557 light external peptide standards (not stable isotope labeled; crude; carbamidomethyl cysteine, which is the same chemical structure that is formed by iodoacetamide alkylation) representing 171 target proteins (JPT Peptide Technologies GmbH, Berlin, Germany). The absolute abundance of each external peptide standard was estimated based on the parameters of its synthesis (primarily, the quantity of the amino acid used for its synthesis). These external peptide standards were used to develop the LC-SRM assays. The lyophilized peptides were dissolved in 0.1% v/v formic acid, 20% v/v acetonitrile, and the samples were vortexed for 2 min and bath sonicated for 5 min at room temperature. Mixtures of ϳ30 peptides were prepared, concentrated using a SpeedVac vacuum concentrator (Thermo Fisher Scientific Inc.), and analyzed by shotgun LC-MS(/MS) using a 1200 series nanoflow HPLC (Agilent Technologies, Santa Clara, CA) coupled to a TSQ Vantage triple quadrupole mass spectrometer (Thermo Fisher Scientific Inc.). The LC-MS system was equipped with freshly prepared custom-made C-18 resin packed-tip columns (50 m inner diameter). The columns were packed with either Magic C18AQ resin (5 m diameter, 20 nm pores, 15 cm column length) or Halo ES C18 resin (2.7 m diameter, 16 nm pores, 10 cm column length) (Michrom Bioresources, Inc. (now part of Bruker Corp.), Auburn, CA). Each LC-MS(/MS) run included a 60 min linear gradient (0 -40% Solvent B), a 30 min column regeneration step (80% Solvent B), and a 30 min column re-equilibration step (0% Solvent B) (Solvent A ϭ 0.1% v/v formic acid in H 2 O, Solvent B ϭ 0.1% v/v formic acid in acetonitrile, flow rate ϭ 200 nl/min, electrospray voltage ϭ 1,800 V, Q1 isolation width ϭ 0.2 m/z or 0.7 m/z, q2 argon pressure ϭ 1.2 mTorr). Approximately 5 pmol of each peptide were analyzed during each LC-MS(/MS) run of the external peptide standards mixtures. For each precursor ion scan, the nine most intense ions were dynamically selected for tandem mass spectrometry (MS/MS). Two different instrument methods were used to identify ϩ2 and ϩ3 precursor ions (that is, each sample was run in duplicate; runs for ϩ2 precursors used collision energy ϭ 2.905 V ϩ (0.03 V per m/z)*(precursor m/z); runs for ϩ3 precursors used collision energy ϭ 2.281 V ϩ (0.038 V per m/z)*(precursor m/z)) (49).
The resulting shotgun LC-MS(/MS) data were analyzed using Proteome Discoverer (v1.2.0.208, Thermo Fisher Scientific Inc.) to perform database searches using Mascot (v2.3.0, Matrix Science Inc., Boston, MA) (50) against the list of external peptide standards. The resulting data were imported into Skyline (64-bit, v1.4.0.4421) (51) to construct a spectrum library, and the resulting peptide identifications were manually reviewed (52). An LC-SRM transition list was prepared using the ten most intense transitions per precursor ion. LC-SRM analyses of the external peptide standards (in two mixtures) were performed using mass spectrometry as described above (Q1 and Q3 isolation width ϭ 0.2 m/z or 0.7 m/z, dwell time ϭ 10 ms, LC-SRM scheduling was not used).
For each peptide, all of its transition chromatograms were required to have identical elution profiles. For each peptide, the relative LC-SRM transition intensities were required to match those of the corresponding shotgun LC-MS(/MS) spectrum library entry. Each transition chromatogram was required to have a roughly Gaussian elution profile with signal-to-noise Ն ϳ3. The uniqueness of this profile over the LC-SRM analysis was considered when estimating the peptide identification confidence. Overlapping transitions were deleted (these were pairs of transitions that had precursor and fragment ion m/z values that were within the LC-SRM mass measurement accuracy). The resulting assays consisted of 508 peptides, 688 precursor ions, and 5,682 transitions. Note that these assays were generally designed to use significantly more than three to five transitions per peptide, which is the broadly accepted minimum for LC-SRM (53). The Skyline data are available for download at the Panorama online LC-SRM database (in the Manes_RAW_Chemotaxis folder at https:// panoramaweb.org/labkey/project/NIH_NitaLazar/begin.view).
Biological Sample Preparation-RAW264.7 cells were cultured in Dulbecco's modified Eagle's medium (Lonza Ltd., Basel, Switzerland) supplemented with 10% v/v fetal bovine serum (Atlanta Biologicals, Flowery Branch, GA) (two 100 mm diameter dishes at ϳ80% confluency per biological replicate, ϳ30 million cells). The cells were detached from the dishes by incubating them for 3 min in 2 mM EDTA in phosphate-buffered saline (PBS) with gentle pipetting. For each biological replicate, the cells from two dishes were immediately transferred to a centrifuge tube containing 20 ml of Dulbecco's modified Eagle's medium (serum free). The cells were counted and assessed for viability using a trypan blue staining solution and a Cellometer Auto T4 (Nexcelom Bioscience, Lawrence, MA) (the cells were required to be ϳ100% viable). The cells were pelleted by centrifugation for 5 min at 400 ϫ g in a swinging bucket rotor.
Each cell pellet was lifted in 400 l of either urea lysis buffer or RapiGest lysis buffer: 100 mM HEPES⅐NaOH (pH 8), 10 M bestatin hydrochloride, 10 M pepstatin A, PhosStop phosphatase inhibitor mixture (F. Hoffmann-La Roche Ltd., Basel, Switzerland, 1 tablet/10 ml), and a protein denaturant (either 8 M urea or 1 mg/ml RapiGest SF acid-hydrolysable surfactant (Waters, Corp., Milford, MA), respectively) (both freshly prepared). Two orthogonal protein denaturation conditions were used to control for incomplete protein solubilization and denaturation and to control for chemical artifacts such as carbamylation (54). Each sample was immediately transferred to a 2 ml microcentrifuge tube (with a screw cap containing an o-ring) containing ϳ100 l of 0.1 mm zirconia/silica beads (BioSpec Products, Bartlesville, OK, the beads will cause tubes with snap-caps to leak). The cells were lysed by bead beating by vortexing at maximum intensity for 5 min. The RapiGest samples were incubated for 10 min at 90°C to denature proteins, and all of the samples were bath sonicated for 10 min at room temperature to assist homogenization and denaturation.
BCA protein concentration assays (Thermo Fisher Scientific Inc.) were performed, and then 200 g (protein mass) of each sample were transferred to a fresh microcentrifuge tube and used for the downstream analyses. Photinus pyralis (firefly) luciferase was spiked into each sample to serve as a quantified internal protein standard (4.765 pmol of luciferase per sample; Sigma-Aldrich Corp.; St. Louis, MO, 98% pure by SDS-PAGE; quantified spectrophotometrically at 280 nm). The absorptivity coefficient at 280 nm of luciferase was estimated using the online tool ProtParam (55), which used the Edelhoch method (56) but with revised values for the extinction coefficients for Trp and Tyr (57). Mixes of the internal heavy peptide standards (prepared from lyophilized aliquots as described below) were spiked into the samples (0, 0.2, 2, and 20 pmol of each peptide per sample). Additionally, samples without RAW264.7 cell lysate were prepared for control experiments (specifically, internal peptide standards alone and also internal peptide standards plus ϳ4 pmol of each external peptide standard).
Sample protein was reduced by adding 10 mM dithiothreitol (final concentration) to each sample and incubating the samples at 60°C for 30 min. Sample protein cysteine residues were alkylated by adding 50 mM iodoacetamide (final concentration) to each sample then incubating the samples at room temperature for 20 min in darkness. The urea-denatured samples were diluted with 100 mM HEPES⅐NaOH (pH 8) so that the final concentration of urea was 1 M. The samples were proteolytically digested by adding sequencing grade modified trypsin (1:50 (protein w:w) trypsin:sample; Promega Corp., Madison, WI) and incubating the samples at 37°C for 18 h. The urea-denatured samples were acidified by adding 1% v/v formic acid (final concentration). The RapiGest denatured samples were acidified by adding 0.5% v/v trifluoroacetic acid (final concentration) and incubated at 37°C for 1 h to hydrolyze the RapiGest surfactant. The samples were microcentrifuged at 21,000 ϫ g for 20 min at room temperature to pellet particulates, and each supernatant underwent solid-phase extraction using a Sep-Pak C-18 SPE column (Waters Corp., Solvent A ϭ 0.1% v/v formic acid, Solvent B ϭ 0.1% v/v formic acid, 80% v/v acetonitrile). The eluates were concentrated using a SpeedVac vacuum concentrator and were prepared for LC-SRM by adding 0.1% v/v formic acid, 2% v/v acetonitrile (final concentrations).
Qualitative LC-SRM Assays-Qualitative LC-SRM assays of RAW264.7 cell samples were performed to attempt to detect the target peptides prior to preparing heavy-labeled purified quantified internal peptide standards. The RAW264.7 cell samples were analyzed by LC-SRM as described above (508 peptides, 688 precursor ions, 5,682 transitions, 29 unscheduled LC-SRM transition lists, duty cycle ϭ ϳ2 s) using newly prepared packed-tip columns. Immediately after the LC-SRM analyses of the RAW264.7 cell samples, numerous blanks were run, and then mixtures of the external peptide standards were analyzed by LC-SRM (using the same column/tip and instrument method). The resulting data were manually analyzed using Skyline as described above and are available for download at the Panorama online LC-SRM database (in the Manes_RAW_Chemotaxis folder at https://panoramaweb.org/labkey/project/NIH_NitaLazar/begin.view).
In addition to the qualitative results, the data were used to estimate the absolute abundance of the RAW264.7 cell proteins by using a label-free approach (using the estimated absolute abundance values of the external peptide standards). The mean of the estimated abundance values of each RAW264.7 cell-derived peptide was calculated across the biological replicates. For each external peptide standard, an estimated protein abundance value was calculated. This step was trivial unless the target peptide was not unique to the target protein (this was avoided if possible during peptide selection, as described above; 26 of the 285 quantified target peptides were not unique; each of these 26 corresponded to a set of close protein homologues; deconvolution of the 26 peptide abundance values was performed manually using the RNA-seq data as well as the abundances of the remaining 259 quantified target peptides; eight target peptide abundance values were discarded because deconvolution was not possible). For each target protein, the mean of the estimated protein absolute abundance values was calculated across the external peptide standards.
Subsequent to the accurate quantification of the RAW264.7 cell target proteins using LC-SRM and internal peptide standards (described below), the protein absolute abundance estimates (made using the external peptide standards) and accurate measurements (made using the internal peptide standards) were found to be skewed. Specifically, the median ratio (accurate measurement/estimate) was found to be equal to 0.22. Therefore, the estimates were normalized using a central tendency normalization (58). Specifically, each unadjusted RAW264.7 cell protein abundance estimate (made using the external peptide standards) was normalized by multiplying it by 0.22.
Quantitative LC-SRM Assays-Key regulatory proteins of the mouse chemotaxis signaling pathway were targeted for accurate quantification using LC-SRM and internal peptide standards. Two internal peptide standards were prepared for each target protein if possible. SPOT synthesis (48) was used to prepare 65 purified quantified heavy-labeled internal peptide standards against 45 target proteins (JPT Peptide Technologies GmbH). Each internal peptide standard was stable isotope labeled at the residue that becomes carboxy-terminal upon trypsin digestion (97-99% pure [ 13 C 6 , 15 N 4 ]Arg; 97-99% pure [ 13 C 6 , 15 N 2 ]Lys). The internal peptide standards were synthesized using unmodified cysteines (the internal standards were spiked-into cell lysates prior to dithiothreitol reduction and iodoacetamide alkylation). These peptides were purified by HPLC (typically Ն80% pure by UV), and simultaneously quantified by UV spectrophotometry using an incorporated trypsin-cleavable carboxyterminal "JPT-Tag." Amino-terminal native flanking residues were included during peptide synthesis (length ϭ three residues) to control for trypsin digestion efficiency and chemical artifacts such as aminoterminal carbamylation (54). Carboxy-terminal native flanking residues could not be incorporated because the JPT SPOT synthesis platform only allows incorporation of heavy Arg and Lys residues if they are already chemically bound to the JPT-Tag.
Each internal peptide standard was delivered as five 1 nmol lyophilized aliquots. Two mixtures of all 65 internal peptide standards were prepared independently. For each mixture and for each internal peptide standard, one of the lyophilized aliquots was dissolved in 0.1% v/v formic acid, 20% v/v acetonitrile, and was vortexed for 2 min and bath sonicated for 5 min at room temperature. For each mixture, the peptides were pooled and concentrated using a SpeedVac vacuum concentrator, and 0.1% v/v formic acid, 20% v/v acetonitrile (final concentrations) were added to each sample. An equal volume of each of the two mixtures was combined and spiked-into the RAW264.7 cell lysates (described above). Additionally, samples absent RAW264.7 cell lysate were prepared (specifically, internal peptide standards alone and also internal peptide standards plus ϳ4 pmol of each external peptide standard).
Six biological replicates of RAW264.7 cells were prepared for LC-SRM as described above (three were prepared using urea and three using RapiGest). LC-SRM was performed as described above (172 precursor ions, 1,422 transitions, ten unscheduled LC-SRM transition lists, duty cycle ϭ ϳ1.4 s). The design of these assays conformed to the "Tier 1" criteria (the strictest requirements, intended for clinical assays) as described by a broadly accepted guidance for LC-SRM assay development (53), except that assays of a quantified purified protein standard were not performed for each target protein (only for the firefly luciferase internal protein standard), and the influence of all analytical variables (such as the commercial source of the trypsin) on each assay performance was not evaluated. Including LC-SRM assay development, ϳ1,000 mass spectrometry runs were performed for this investigation (ϳ2,500 h of instrument runtime).
The resulting absolute quantification data were manually analyzed using Skyline as described above and are available for download at the Panorama online SRM database (in the Manes_RAW_Chemotaxis folder at https://panoramaweb.org/labkey/project/NIH_NitaLazar/begin.view). The amount of light peptide contamination within each heavy internal peptide standard was determined from the LC-SRM analyses of the internal peptide standards alone. It was confirmed that the internal peptide standards were only slightly contaminated with light forms of the peptides (Fig. S1). For each RAW264.7 cell derived light peptide abundance value, it was discarded if Ͼ5% of the light peptide abundance originated from light peptide contamination within the corresponding heavy internal peptide standard. For each target peptide and biological replicate, a mean light peptide abundance value was calculated using the light peptide abundance values that satisfied: 0.02 Յ light/heavy abundance ratio Յ 2. For each target peptide and biological replicate, if none of the light peptide abundance values satisfied this criteria, then the light peptide abundance value closest to the corresponding heavy peptide abundance value (Minimum(AbsoluteValue(Log 10 (Light/Heavy))) was used. This resulted in a final range of tolerated light/heavy abundance ratios that spanned from 0.02 to 8.53.
The resulting peptide abundance values were normalized using a central tendency normalization (58) to eliminate systematic biases between the biological replicates (the normalization factors for the six biological replicates were 0.791, 0.679, 1.372, 0.942, 1.067, and 1.149). For each target peptide, the mean peptide abundance value was calculated across the biological replicates. For each internal peptide standard, a protein abundance value was calculated. This step was trivial unless the target peptide was not unique to the target protein (this was avoided if possible during peptide selection, as described above; five of the 60 quantified target peptides were not unique; each of these five corresponded to a set of close protein homologues; deconvolution of the five peptide abundance values was performed manually using the RNA-seq data as well as the abundances of the remaining 55 quantified target peptides; one target peptide abundance value was discarded because deconvolution was not possible). For each target protein, the mean of the protein abundance values was calculated across the target peptides. Correlation analyses and statistical tests were performed using GraphPad Prism v6.05 (GraphPad Software, Inc., La Jolla, CA).
Pathway Simulation-A mouse chemotaxis signal transduction pathway was manually constructed from an extensive literature review, and by using the Kyoto Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg/) (43). Proteins were grouped into the same pathway node if they had identical topology within the pathway. For example, the Gnai pathway node consists of the three G␣i proteins (GNAI1, GNAI2, GNAI3) because these three proteins have identical pathway functions (each interacts with the same binding-partners in the same way). Each node abundance value was simply the sum of the corresponding protein abundance values. If possible, the accurate LC-SRM measurements made using the internal peptide standards were used. Otherwise, the mean of the two estimates were used (the estimates made using a label-free approach using the external peptide standards, and the estimates made using the RNA-seq data). Pathway visualizations were prepared using Cytoscape (64- Simmune performs simulations by discretizing space into voxels (that is, cubes), and in this study the voxels each had a volume of 1 m 3 . The RAW264.7 cell was modeled as an approximate sphere within a 20 ϫ 20 ϫ 20 voxel compartment. The RAW264.7 cell had a diameter of 13 m and a volume of 1,189 m 3 , and it contained an approximately spherical nucleus that had a diameter of 7 m and a volume of 147 m 3 . This geometry was used for all simulations, unless explicitly specified otherwise. A low resolution spherical RAW264.7 cell geometry (each voxel was 2 m ϫ 2 m ϫ 2 m), and a RAW264.7 cell attached to a flat surface was used for some of the simulations. All of the in silico RAW264.7 cells had approximately equal volumes.
In general, Simmune typically required between 1 ms and 10 s of CPU core time per degree of freedom to complete a simulation. Here, a degree of freedom is an independent reacting form of a molecular complex at a specific point in space and assuming that the spatial geometry of the cell(s) encompasses up to 1,000 voxels. Model development generally requires calibration of the model parameters, and this typically requires 100 -10,000 simulations (some Simmune models required millions of simulations). Though not used in this project, if the model refinement procedure permits (at least partial) calibration using nonspatial simulations (for example, performing a simulation of a homogeneous pathway stimulus and restricting model parameters using the results), individual simulations can be run in significantly shorter time, and the maximum number of simulated molecular complex forms can typically be increased by an order of magnitude. Note that Simmune offers low level access to its command line and application programming interface, which permits distributing simulations on a computer cluster.
The full model was far too complex to simulate all at once, so two subpathways were simulated (designated "Subpathway1" and "Sub-pathway2"). Subpathway1 was constructed to fully account for the biphasic nature of S1P-mediated chemotaxis of RAW264.7 cells, and its nodes were mostly accurately quantified. Subpathway2 was constructed to include all of the nodes and edges of Subpathway1 and also almost all of the actin cytoskeleton and FA machinery, but it was mostly quantified using estimation. The in silico S1PR1 and S1PR2 knockdown cells were modeled as being 98% reduced in S1PR1 or S1PR2 protein abundance (respectively), but otherwise were identical to the wild-type cells. Each simulation required either ϳ5-20 h (Sub-pathway1) or ϳ5-20 days (Subpathway2) of computer runtime. Use of the low resolution geometry file increased simulation speed approximately threefold.
All of the Simmune complexes were limited to a maximum of six molecules per complex. Relaxing this to a maximum of seven molecules per complex caused the simulations to progress extremely slowly or to fail. The diffusion coefficients were estimated to be 100 m 2 /s (freely diffusing small molecule), 30 m 2 /s (freely diffusing protein), 0.5 m 2 /s (inner/outer membrane small molecule), 0.1 m 2 /s (inner/outer membrane protein), and 0.01 m 2 /s (transmembrane protein) (see the Simmune_Node_DiffCoeff column of Table S9 for more details). For each molecular player in the simulations, self-association (for example, homodimerization) and membrane association information was retrieved from the UniProt online protein database or from a literature review (see the Simmune_Node_SelfAssociation and Simmune_Node_CellMemAssoc columns of Table S9 for more details).
The initial (prerefinement) molecular reaction rates were estimated to be 1,000,000 L/(mol*s) (association), 1/s (dissociation, nonenzymatic), and 100/s (dissociation, enzymatic) (21). Simulations were performed using the initial (prerefinement) pathway model, and these produced grossly unphysiological results. For example, prior to any pathway stimulation (that is, receptor activation by blue photons or S1P), PI(3,4,5)P 3 abundance increased wildly until all of the PI3K substrate (PI(4,5)P 2 ) was consumed. Therefore, the molecular association, transformation, and dissociation rates were manually refined to produce simulation results that were consistent with data from previous investigations of in vitro RAW264.7 cell chemotaxis (60 -63).
Model refinement was performed iteratively. At each step, a small number (typically 1-4) of the molecular reaction rates were manually adjusted (typically by a factor between two and ten), a small number of simulations were performed (typically 1-4), and the results were used to design the next refinement step. Refinement started from the beginning of the signaling pathway (the blue photon and S1P receptors), and then proceeded stepwise downstream to the end of Sub-pathway1 (the small GTPases). Some Subpathway2 rates were also refined, but the complexity of Subpathway2 prevented a comprehensive refinement. Numerous refinement steps were required for each step down the pathway, and backtracking was often required. At each pathway step, the primary goals were to produce adequate cellular polarization of the activated form of the signaling components (that is, the component(s) at the specific pathway step, and also the component(s) upstream); to produce quantities of the activated component(s) that were sufficient for the downstream processes; and to avoid thermodynamically implausible molecular reaction rates (21). The blue opsin-PI(3,4,5)P 3 in vitro data (60,61) were used for model refinement first, and the S1P chemotaxis in vitro data (62,63) were used second. Activated Rac (guanosine-5Ј-triphosphate (GTP) bound RAC) polarization in silico was used as a measure of chemotactic direction. Approximately 200 simulations were performed during model refinement, and all of the resulting reaction rates were thermodynamically plausible (21). Subsequently, the final (refined) pathway model was used to perform all of the simulations described herein. All of the refinements are described in Table S11.
RAC1-GTP and RHOA-GTP Abundance Assays-Lyophilized S1P (Avanti Polar Lipids, Inc., Alabaster, AL) was dissolved in acidified dimethyl sulfoxide (95% v/v dimethyl sulfoxide, 5% v/v 1 M HCl) to prepare a 3.9 mM S1P stock solution. This solution was diluted using 4% w/v bovine serum albumin (fatty acid free; low endotoxin; MP Biomedicals, LLC, Santa Ana, CA) to make 150 M S1P aliquots. RAW264.7 cells were cultured in 12-well cell culture plates (Corning Inc., Corning, NY) at a density of 200,000 cells per well, and the cells were allowed to attach for ϳ24 h. The cells were cultured at 37°C with 5% CO 2 in 1 ml per well of Dulbecco's modified Eagle's medium (2 mM L-glutamine) supplemented with 10% fetal bovine serum and 20 mM HEPES. To switch to serum starvation conditions, the cells were washed once with PBS before adding 1 ml per well of OptiMEM (Thermo Fisher Scientific Inc.) and incubating at 37°C with 5% CO 2 for ϳ20 h. The 150 M S1P aliquots were used to make freshly prepared S1P dilutions in OptiMEM. Starved cells were treated with either 0.1 or 10 nM S1P for 1, 5, or 35 min (or were left untreated). Two biological replicates were prepared per treatment condition. After S1P treatment, the cells were immediately washed once with ice-cold PBS and assayed for RAC1-GTP and RHOA-GTP abundance using G-LISA Activation Assay kits (Cytoskeleton, Inc., Denver, CO) following the manufacturer's instructions and using the kit reagents.
For each G-LISA assay, the cells were lysed in ice-cold lysis buffer supplemented with protease inhibitor mixture, and the lysate was clarified by centrifugation at 10,000 ϫ g for 2 min at 4°C. Aliquots of the supernatant were flash-frozen in liquid nitrogen and stored at Ϫ80°C. A small aliquot was used for a protein concentration assay using the kit protein assay reagent. The lysate was thawed on ice and diluted to a total protein concentration of 0.8 mg/ml using ice-cold lysis buffer. For the RHOA-GTP assay, the lysate was diluted with an equal volume of binding buffer as directed. The lysate was loaded into a prepared well of a binding strip (specific to RAC-GTP or RHO-GTP), and it was incubated at 4°C for 30 min with shaking. The amount of bound GTPase was measured using a primary antibody (that recog-nizes RAC1 or RHOA), a horseradish peroxidase-conjugated secondary antibody, and a colorimetric horseradish peroxidase substrate. Absorbance at 490 nm was measured using a FLUOstar Omega microplate reader (BMG Labtech, Ortenberg, Germany). Blanks were performed and used for background correction.
Pathway Model Variance Analysis-Subpathway1 simulations using the refined pathway model were performed using Simmune v2.3.1.4433 for Linux installed on an Apple Power Mac (12 cores, 48 GB RAM). Custom-made scripts were used to automate 2,000 Simmune simulations and to compile the resulting output. These simulations used a low resolution spherical RAW264.7 cell geometry (each voxel was 2 m ϫ 2 m ϫ 2 m). For each simulation, all of the initial molecular complex abundances were independently altered from their original value (all of the other pathway model parameters were not altered). For each initial abundance, a scaling factor was drawn from a log-normal distribution with a median of one. Two different values were used for the Scale parameter. For the initial abundances that were solely the result of estimation, Scale ϭ 0.74. For all of the other initial abundance parameter values (Ն1 protein was quantified using Ն1 internal peptide standard), Scale ϭ 0.32. These Scale parameter values were designed to roughly approximate the accuracy of the quantification (estimation and accurate LC-SRM measurement) that was used during this project.

Absolute Quantification of RAW264.7 Cell Chemotaxis
Pathway Proteins-In this investigation, targeted proteomics, shotgun proteomics, transcriptomics, and pathway simulation were used to investigate S1P-mediated chemosensing and polarization of RAW264.7 cells (model OPs) (Fig. 1). A list of candidate mouse chemotaxis pathway target proteins was compiled from the literature (Table S1), and a RAW264.7 cell RNA-seq dataset was used to identify potentially expressed target proteins within this pathway set (Table S2). Shotgun mass spectrometry data were used to identify mouse proteotypic peptides from this candidate protein list. Selection of the target peptides was based on several criteria, including peptide proteotypic qualities, sequence uniqueness, and susceptibility to posttranslational modification (Table S1).
The full S1P chemotaxis pathway was far too large to target for comprehensive protein quantification using LC-SRM and internal peptide standards at this point in the development of the methods. Therefore, for modeling purposes a subpathway (Subpathway1) was constructed extending from the S1P receptors (S1PR1 and S1PR2) to the small GTPases (RAC and RHOA). An extension of Subpathway1 that included the core actin cytoskeleton and focal adhesion (FA) machinery was designated Subpathway2. Both of these subpathways are described in detail in the simulation Results section (below).
LC-SRM assays were developed using external peptide standards (Tables S3 and S4), and RAW264.7 cell lysates were analyzed by LC-SRM using the internal peptide standards (Fig. S1, Table S5). The peptide-level abundance data from each biological replicate are provided as Table S6 and  the fully analyzed data are provided as Table S7. Proteins with abundance values as low as ϳ10,000 copies/cell were confidently identified and accurately quantified (Fig. S2).
Coefficient of variation (CV) values were calculated for each target peptide across the six biological replicates ( Fig. 2A) and also for each target protein across the internal peptide standards (Fig. 2B). The spiked-in internal protein standard (firefly luciferase) was targeted using two internal peptide standards and was quantified with excellent accuracy and precision (Fig.  2C). Of the CV values calculated across the six biological replicates, the mean value was 16.65%, the median value was FIG. 1. Experimental workflow. A set of mouse chemotaxis pathway target proteins was compiled. RNA-seq data were used to identify expressed proteins and their specific splice isoforms, and shotgun mass spectrometry data were used to identify proteotypic peptides. Selection of the target peptides used numerous criteria, including sequence uniqueness and susceptibility to chemical modification. Crude unlabeled external peptide standards were SPOT synthesized and used to develop LC-SRM assays, and RAW264.7 cell samples were analyzed using qualitative LC-SRM. If the peptide was successfully detected, quantified purified heavy-labeled internal peptide standards were prepared and used for quantitative LC-SRM. The resulting protein absolute abundance values were used as parameters for Simmune pathway simulations.
14.14%, the minimum value was 2.26%, and the maximum value was 47.41%. Of these 60 LC-SRM assays, 47/60 achieved the "Tier 1" precision target (the strictest criteria, intended for clinical assays; defined as "typically Ͻ20 -25% CV achieved"), and 56/60 achieved the "Tier 2" precision target (defined as "typically Ͻ20 -35% CV achieved"), as described by a broadly accepted guidance for LC-SRM assay development (53). Although there was a significant correlation of CVs and corresponding mean abundance values for the target peptides across the six biological replicates ( Fig. 2A), the correlation was rather poor and not significant for each target protein across internal peptide standards (Fig. 2B). These results indicate that, for the target peptides that were confidently identified, the accuracy and precision of the quantification was largely independent of target abundance.
In general, the precision of peptide quantification measured across the biological replicates was better than the precision of protein quantification measured across the pairs of internal peptide standards. Of the CV values calculated across the pairs of internal peptide standards (Fig. 2B), the mean value was 37.83%, the median value was 37.54%, the minimum value was 1.84%, and the maximum value was 118.34%. Of most concern were the four largest CV values, so the design of the eight corresponding LC-SRM assays was re-examined. The largest CV resulted from the two assays of Arhgdia, a GDP-dissociation inhibitor of small GTPases. Quantification using the two corresponding internal peptide standards resulted in two very different protein abundance values (SIQEIQELDK resulted in 267,000 Arhgdia copies/cell, and VAVSADPNVPNVIVTR resulted in 3,000,000 Arhgdia copies/

FIG. 2. Protein quantification accuracy and precision, and transcriptome-proteome correlation. (A)
Quantification variance across the biological replicates. RAW264.7 cell proteins were quantified using internal peptide standards and each abundance mean and CV was calculated across the six biological replicates (mean CV ϭ 16.65%). Pearson R P 2 ϭ 0.0295, Pearson R P 2 ϭ 0.0821 (using Log 10 -transformed abundance values), and Spearman R S 2 ϭ 0.1314 (p ϭ .0044), indicating a very weak but statistically significant correlation. (B) Quantification variance across the internal peptide standards. Each RAW264.7 cell protein was quantified using two internal peptide standards if possible. After averaging across the biological replicates, each protein abundance mean and CV was calculated across the two target peptides (mean CV ϭ 37.83%). Selected individual data points are annotated with the two measured abundance values in units of copies/cell. Pearson R P 2 ϭ 0.0915, Pearson R P 2 ϭ 0.0428 (using Log 10 -transformed abundance values), and Spearman R S 2 ϭ 0.0073, again indicating a very weak correlation. (C) Quantification of the internal protein standard. Firefly luciferase (98% pure by SDS-PAGE and UV(280 nm) quantified) was spiked into each RAW264.7 cell lysate and accurately and precisely quantified by performing LC-SRM using two internal peptide standards (mean Ϯ S.E., n ϭ 6). (D) Correlation between the log-log-transformed transcript and protein abundance values. RNA-seq and LC-SRM using internal peptide standards were used to produce transcript and protein absolute abundance values (respectively), which were found to be correlated (FPKM ϭ fragments per kilobase of transcript per million mapped reads; Pearson R P 2 ϭ 0.7792 for a linear regression of the Log 10 -Log 10 -transformed data; Spearman R S 2 ϭ 0.6890).
cell). Upon re-examination of these two assays, it was determined that one of the target peptides corresponds to a previously reported phosphorylation site (S*IQEIQELDK) (http:// www.uniprot.org/uniprot/Q99PT1) (64). This is likely a major cause of the Arhgdia quantification disparity. No explanation for the three other high CV values could be determined. Nevertheless, it is likely that lack of peptide uniqueness, mRNA splicing, posttranslational modification, and/or proteolysis was/were the cause(s). This emphasizes the importance of carefully selecting the peptide targets and also of using at least two internal peptide standards per target protein.
The RAW264.7 cell transcript and protein absolute abundance values correlated strongly ( Fig. 2D and Fig. S3). Correlation between transcript and protein absolute abundance values has been reported previously for Escherichia coli (65,66), Saccharomyces cerevisiae (65,(67)(68)(69)(70), Schizosaccharomyces pombe (39), the HeLa (human epithelial) cell line (71,72), the NIH/3T3 (mouse fibroblast) cell line (73), and for 12 human tissues (74). These correlations are consistent with the observation that transcript abundance is the major determinant of protein abundance in mouse dendritic cells (75). Strong correlation was also found between the accurate measurements produced using the internal peptide standards and the abundance estimates produced using a label-free approach using the external peptide standards (Fig. S4). Therefore, for each protein that was not accurately quantified, an abundance value was estimated using the RNA-seq data (using the Fig. 2D equation) and/or using the LC-SRM data produced using the label-free approach using the external peptide standards (the mean of the two estimates was used; Table S8). It was noted that the two sets of abundance estimates were correlated (Fig. S5). The accuracy and precision of the accurate measurements and estimates were charted (Fig. S6).
Simulation of the RAW264.7 Cell Chemotaxis Signaling Pathway-Most of the proteins of Subpathway1 were accurately quantified, and of the ones that were not, most were estimated to be of relatively low abundance (Fig. S7A). In contrast to Subpathway1, most of the protein abundance values of Subpathway2 and of the full pathway were estimates (Figs. S7B and S7C). Many of the Simmune pathway nodes consisted of multiple proteins, so for each pathway node, the protein abundance values were summed (Table S9). These node abundance values were used to construct diagrams of Subpathway1 (Fig. 3), Subpathway2 (Fig. S8), and the full pathway (Fig. 4) that reflected the quantification results. Simmune models of a RAW264.7 cell (Fig. S9A) and of the full pathway (Tables S9 and S10) were constructed. Simmune permits specifying molecular interactions and modifications (such as phosphorylation) with the help of diagrammatic representations. Figs. S9B-S9E show the diagrams encoding heterotrimeric G-protein association, transformation, and dissociation reactions. A graph of the reaction net-work depicting the tiers of the signaling cascade of the full chemotaxis pathway was generated using the Simmune Network Viewer (Fig. S10).
The two subpathways (Subpathway1 and Subpathway2) were used for performing simulations. To this end, all of the full pathway nodes and edges were used to construct a single comprehensive Simmune pathway model file. Subsequently, for each of the two subpathways, Simmune simulation description files were constructed such that only the corresponding nodes were populated with nonzero initial concentration values. Refinement of the Simmune model parameters using two sets of previously reported in vitro data was performed manually within physiologically plausible ranges for the various parameters (Table S11). The Simmune model files are provided as supplemental material and can be used to reproduce the reported simulation results and as a starting point for models with modified parameters or pathway structure. The Simmune software package can be downloaded for free and without the need to register (http://www.niaid.nih. gov/LabsAndResources/labs/aboutlabs/lsb/Pages/simmune project.aspx).
A simulation of shifting, nonuniform stimulation of Subpath-way1 was performed corresponding to optical activation of RAW264.7 cells transfected with blue opsin (OPN1SW), which is a G-protein coupled receptor that is activated by blue photons and utilizes the same pathway components as S1P signaling. This experimental system provided direct measurements of an essential molecule in the S1P signaling pathway (PI(3,4,5)P 3 ). The temporal dynamics of PI(3,4,5)P 3 polarization in silico were tracked (Fig. 5). The resulting in silico PI(3,4,5)P 3 dynamics were consistent with previously reported in vitro dynamics in RAW264.7 cells (60,61).
Subsequently, simulations of S1P-stimulation of the RAW264.7 cell chemotaxis pathway were performed. The primary goal was to explore if the combination of a literatureretrieved pathway, and MS quantified protein abundances could be used to reproduce the concentration-related biphasic response to S1P stimulation that was observed in The in vitro data corresponding to panels I-P were not used to parameterize the model during model refinement. RAW264.7 cells (62,63). Polarization of the activated form of RAC (that is, the GTP-bound form) was used as a measure of RAW264.7 cell chemotactic direction. Activated RAC was selected because it has been shown to be a key chemotaxis pathway regulator that is localized to the leading end of chemotaxing mammalian cells (1,2,5,6,10). S1P-stimulation has been shown to activate RAC (76), and this was specifically observed in RAW264.7 cells in vitro (62). Seven simulations of S1P activation of Subpathway1 were performed (Fig. 6), and each resulting in silico activated Rac polarization was compared with previously reported in vitro RAW264.7 cell protrusion and movement observed by bright-field microscopy under the matched experimental condition (62,63,77). All of the Subpathway1 in silico results were consistent with all of the in vitro data ( Table I). The cell was polarized forward when the end S1P concentration was low: 40 pM (Fig. 6A), 400 pM (Fig.  6B), and 4 nM (Fig. 6C). When the end concentration of S1P was higher, the cells became polarized in a reverse direction both in shallow (Fig. 6D) and steep (Fig. 6E) gradients. In silico simulation of knockdowns of the S1PR1 did not change the reverse polarization of the cell when the end concentration of S1P was high, but simulated knockdown of S1PR2 did reverse the cell polarization in the same gradient, consistent with the experimental results describing RAW264.7 cell chemotactic movement (62,63,77). Note that cell movement, not RAC polarization, was measured in these in vitro studies. The in silico knockdowns were performed assuming a 98% reduction in protein abundance to be comparable to the siRNA knockdowns performed in the cited experiments. It is important to note that these effects of reducing S1PR1 and S1PR2 levels were not built into the model but emerged from the quantitatively parameterized simulation and the wiring of the underlying molecular interaction pathway.
Ideally, cross-validation would have been used to statistically estimate the accuracy of the Subpathway1 model. In a cross-validation analysis, the observation data are partitioned into two subsets, and model training is performed using one of the subsets, and model testing is performed using the other. Multiple rounds using different partitions are performed, and the overall model accuracy is estimated. Because the refinement of the chemotaxis pathway model was performed manually, cross-validation could not be used for examination of the refined model. Nevertheless, it should be noted that refinement of the chemotaxis pathway model actually only required the use of a subset of the in vitro data (corresponding to Figs. 5A-5H, 6A, and 6D) but resulted in a refined Subpathway1 model that was consistent with all of the in vitro data (Figs. 5 and 6).
Simulation of the RAW264.7 Cell Chemotactic Molecular Machinery-Simulations of the RAW264.7 cell chemotaxis pathway were performed with the aim of simulating the response of the actin cytoskeleton and FA machinery to S1Pstimulation. These simulations used Subpathway2 but were otherwise identical to the Subpathway1 simulations. First, a simulation of shifting, nonuniform stimulation of Subpathway2 was performed corresponding to optical activation of RAW264.7 cells transfected with blue opsin. The temporal dynamics of PI(3,4,5)P 3 polarization were tracked (Fig. S11), and the resulting in silico PI(3,4,5)P 3 dynamics were consistent with the corresponding Subpathway1 data (Fig. 5), as well as with the previously reported in vitro data (60,61).
In addition, seven simulations of S1P activation of Subpath-way2 were performed (Fig. 6), and each resulting polarization of activated Rac was compared with the corresponding Sub-pathway1 data (Fig. 6), as well as to the previously reported in vitro RAW264.7 cell movement data (62, 63, 77). As described above, all seven Subpathway1 simulations were consistent with all of the in vitro cell movement data. In contrast, only four of the seven Subpathway2 simulations were consistent with the Subpathway1 in silico data and the in vitro cell movement data (Fig. 6, Table I). For both Subpathway1 and Subpath-way2, the in silico cell was forward polarized when the S1P gradient was shallow, consistent with the in vitro data (Figs. 6A-6C). In addition, both Subpathway1 and Subpathway2 simulation of an S1pr2 knockdown cell in a steep S1P gradient resulted in forward polarization of activated Rac, again consistent with the in vitro data (Fig. 6G). Therefore, all of the Subpathway1 and Subpathway2 simulations of S1pr1mediated forward polarization of activated Rac were consistent with all of the corresponding in vitro data (Figs. 6A-6C and 6G).
In contrast, whereas three Subpathway1 simulations resulted in S1pr2-mediated reverse polarization of activated Rac (consistent with the in vitro data), the corresponding three Subpathway2 simulations failed to reproduce this result (Figs. 6D-6F). Numerous factors likely contributed to this discrepancy. Notably, sequestration in silico of activated Rock by highly abundant downstream targets (Ezr, Limk, Myl, Ppp1, and Slc9a1; these are absent from Subpathway1) reduced Rock phosphorylation and activation of Arhgap24, thereby reducing the rate of reverse polarization of activated Rac. Also, the Diaph1, Mcf2l, and Prkac nodes likely significantly affected Rhoa-mediated reverse polarization of activated Rac (Diaph1, Mcf2l, and Prkac are absent from Subpathway1). It is also possible that S1PR2-RHOA-ROCK-mediated reverse polarization of RAC-GTP in RAW264.7 cells requires more than 30 s of S1P stimulation in vitro and in silico. In addition, it is possible that no single set of refined molecular reaction rates can perform properly for both Subpathway1 and Subpath-way2 under all eight experimental conditions (that is, the single blue opsin and the seven S1P experimental conditions). During the refinement of the model (described above), some Subpathway2 molecular reaction rates were adjusted to correct grossly unphysiological results (Table S11), but a thorough refinement of Subpathway2 remains challenging due to its complexity.
Although the Subpathway2 simulations failed to produce S1pr2-mediated reverse polarization of activated Rac (Figs.  FIG. 6. Simulation of S1P activation of the chemotaxis signaling pathway. Seven Simmune simulations of Subpathway1 (Fig. 3) and Subpathway2 (Fig. S8) were performed. During each simulation, the in silico RAW264.7 cell was not stimulated for 120 s, and then it was stimulated for 30 s using the indicated S1P linear gradient. Each illustration (top of each panel) depicts previously reported in vitro RAW264.7 cell chemotactic behavior (62, 63, 77) (bright-field microscopy of in vitro RAW264.7 cell chemotactic movement in S1P gradients; RAC polarization was not measured). Each Simmune cross section (center and bottom of each panel) depicts the corresponding in silico activated (GTP-bound) Rac distribution across the cross section through the center of the RAW264.7 cell (activated Rac polarization was used as a measure of chemotactic direction). The in vitro data corresponding to panels B, C, and E-G were not used to parameterize the model during model refinement. (A-C) Chemoattraction up S1P gradients. S1PR1-mediated chemoattraction overwhelmed S1PR2-mediated chemorepulsion at relatively low S1P concentrations. (D-F) Chemorepulsion down S1P gradients. S1PR2-mediated chemorepulsion overwhelmed S1PR1-mediated chemoattraction at relatively high S1P concentrations in wild type and S1PR1 knockdown cells (Subpathway1 simulations). The corresponding Subpathway2 simulations failed to reproduce this behavior. (G) Chemoattraction of S1PR2 knockdown cells up a steep S1P gradient. S1PR1-mediated chemoattraction up a relatively steep S1P gradient. 6D-6F), they did successfully produce S1pr1-mediated forward polarization of activated Rac (Figs. 6A-6C and 6G). Therefore, one of the latter (successful) simulations was used to study numerous actin cytoskeleton and FA proteins, and it was found that the activated forms of many of these proteins were polarized in silico (Fig. 7). The activated forms of these nodes are defined in Table S9 (the "Simmune_Node_Final_Complex" column).
PI(3,4,5)P 3 was found to be forward polarized in silico (Fig.  7P), as would be expected in vivo within a typical chemotaxing mammalian cell (1, 2), as well as in a RAW264.7 cell (60,61). Downstream of PI(3,4,5)P 3 are the small GTPases Cdc42, Rac, and Rhoa. The activated forms of these proteins have been studied in numerous chemotaxing cells (78 -83). All three nodes were forward polarized in silico (Figs. 7E, 7Q, 7R), as would be expected in vivo, with the sole exception that activated Rhoa is also known to be present at sites of tail retraction in some cell types in vivo (though not necessarily within the in silico time frame of the simulation). Note that RAW264.7 cells display tail retraction during S1P-mediated chemotaxis (62,63). Downstream of the small GTPases, activated Myh (this node represents myosin, a component of actomyosin that functions by contracting to generate mechanical force) was forward polarized in silico (Figs. 7N-7O), as would be expected in vivo, except that activated myosin is also known to be present at the trailing end at sites of tail retraction in some cell types (2,6,84). As a consequence of Myh being downstream of Rhoa in the signaling cascade, the same polarization of Myh as Rhoa in silico was an expected result.
Both freely diffusing and membrane-associated Myh were forward polarized in silico (Figs. 7N-7O). Simmune performs spatially-resolved dynamic simulations of signal transduction pathways within in silico cells, permitting simulation of the cellular localization of signaling components resulting from pathway stimulation. Some of the activated forms of the proteins in Fig. 7 (Actn, Arpc, Diaph1, Gsn, Myh, and Vcl) had both a freely diffusing and an inner membrane-tethered form. This was possible because each of these proteins could be associated in silico with multiple different molecular complexes, and some of these complexes were freely diffusing and others contained a molecule that is membrane bound (and hence the molecular complex itself was membrane tethered). Almost all of these proteins (Actn, Arpc, Diaph1, Gsn, and Myh) displayed similar polarization for the freely diffusing and inner membrane-tethered forms. Inner membrane-tethered Vcl was forward polarized but negligible (Ͻ Ͻ1 copy/cell), so it was not included in Fig. 7. Freely diffusing Vcl (19,900.2 copies/cell) was not polarized (Fig. 7T).
Far downstream of the small GTPases, activated Arpc (Arp2/3 complex, which is an F-actin branching protein complex) was forward polarized in silico (Figs. 7C and 7D) as would be expected in vivo (6,9,10). The same was true for Diaph1 in silico (Figs. 7G and 7H) and in vivo (5,85). This node  The cells displayed chemoattraction until they reached ϳ20 nM S1P, and then they displayed chemorepulsion.
c This article also reported that calcitriol and eldecalcitol treatment reduced S1pr2 gene expression within RAW264.7 cells.
FIG. 7. Simulation of S1P stimulation of Subpathway2. A Simmune simulation of Subpathway2 (Fig. S8) was performed. During the simulation, the in silico RAW264.7 cell was not stimulated for 120 s, and then it was stimulated for 30 s using the indicated S1P linear gradient. Each panel depicts the relative distribution of the activated form of the indicated pathway node across the cross section through the center of the in silico RAW264.7 cell. Note that in vitro polarization of these molecules within S1P-stimulated RAW264.7 cells (specifically) has not been reported. The volume (freely diffusing) concentration and surface (transmembrane, membrane-bound, or membrane-tethered) concentration of the activated form of each node is depicted, as indicated. consists of the Diaph1 and Diaph2 genes, which encode mDia1 and mDia3, respectively (these proteins nucleate and elongate F-actin polymers). Activated Tln (talin, which is a component of FAs) was forward polarized in silico (Fig. 7S), as would be expected in vivo (85)(86)(87). The binding of talin to integrin is a very early step during FA assembly, and it was used as the sole measure of FA assembly up/down-regulation because larger FA protein complexes could not be simulated due to computational limitations. Activated Itgb (integrin, which is the transmembrane protein upon which FAs are constructed and that has inside-out and outside-in signaling functions) was not polarized in silico (Fig. 7M). However, in silico activated Itgb was defined broadly to function as a rough measure of its combined inside-out and outside-in signaling activity. Consequently, it is difficult to predict what its polarization would be in vivo. Activated Ezr (this node consists of the ERM family proteins ezrin, radixin, and moesin, which anchor F-actin to the cell membrane) was forward polarized in silico (Fig. 7J). In vivo, the localizations of these proteins vary by cell type (5,9,88). Ezr is sometimes present at the leading edge in filopodia and lamellipodia in vivo, where it can play an important role in lamellipodia growth and stability. Additionally, Ezr can be present at the trailing edge in vivo, where it could possibly aid tail retraction.
In contrast, activated Actn, Cfl, Enah, Gsn, and Vcl were polarized differently in silico than would be expected in vivo. Activated Actn (actinin, an F-actin bundling protein) was not polarized in silico (Figs. 7A and 7B), but it is known to be associated with FAs would be expected to be forward polarized in vivo (84,85,89). Similarly, activated Enah (this node represents ENAH and VASP, which are F-actin synthesis proteins) was not polarized in silico (Fig. 7I), but it is expected to be forward polarized in vivo. ENAH and VASP are localized to the very tip of the leading edge of filopodia and lamellipodia where they are bound to the end of actin filament bundles, and they also associate with FAs (90,91). Activated Vcl (vinculin, which is an F-actin bundling protein, and also is important during FA assembly) was not polarized in silico (Fig. 7T), but it is expected to be forward polarized in vivo (85)(86)(87). Vinculin can directly bind to F-actin, and as a homodimer it can bundle F-actin (92). In the Simmune model, in silico activated Vcl refers solely to its actin bundling function. Vinculin can also bind to talin, and while vinculin is bound to both talin and F-actin, it has a key role during FA assembly (93).
It is likely that F-actin, nascent adhesions, focal complexes, and FAs influence the behavior of Actn, Enah, and Vcl in vivo, but the current version of Simmune cannot simulate protein polymers, and the simulation of large protein complexes remains challenging. Also, Subpathway2 was S1P-stimulated for only 30 in silico seconds, and significant polarization of some of these proteins might require more time in vivo and in silico. Lastly, because these proteins have not been studied in S1P-stimulated RAW264.7 cells specifically, it is possible that some of them behave differently in S1P-stimulated RAW264.7 cells compared with how they are understood to behave within chemotaxing mammalian cells in general.
Activated Cfl (cofilin, an F-actin severing protein) was not polarized in silico (Fig. 7F), but it is known to be localized near the leading edge in vivo (possibly to increase actin dynamics) (8). Like Cfl, activated Gsn (this node represents adseverin and gelsolin, which are F-actin severing proteins) was not polarized in silico (Figs. 7K and 7L), but it is known to be localized near the leading edge in vivo (9,94). Activated Cfl and Gsn might have been polarized differently in silico than would be expected in vivo because at a very early stage of lamellipodia and lamellae development (for example, within the first 30 s of S1P-stimulation), it is reasonable to hypothesize that F-actin severing activity would be minimal. As Factin increased, F-actin severing would be up-regulated to produce new barbed ends for new F-actin extensions and meshwork development.
Of the nodes that were not polarized in Fig. 7 (Actn, Cfl, Enah, Gsn, Itgb, and Vcl), the abundances of two were affected by S1P-stimulation in silico. Activated Enah increased and activated Itgb decreased in response to S1P-stimulation (activated Actn, Cfl, Gsn, and Vcl were unaffected). Enah up-regulation would be consistent with its role during chemotaxis.
Testing and Variance Analyses of the Signaling Pathway Model-To test the refined Subpathway1 model, RAW264.7 cells were stimulated with S1P in silico and in vitro, and the resulting RAC-GTP and RHOA-GTP temporal trends were analyzed. For both GTPases, the in silico and in vitro trends were found to have numerous similarities (Fig. S12). The RAC-GTP abundance trend was higher at 100 pM S1P compared with 10 nM S1P both in silico and in vitro. The RHOA-GTP trend was lower at 100 pM S1P compared with 10 nM S1P both in silico and in vitro. RAC-GTP increased rapidly (within roughly 1 min) to reach saturation within the 100 pM S1P stimulated cells in silico and in vitro. The increase in RHOA-GTP was approximately linear in silico and in vitro (for both the 100 pM and 10 nM S1P stimulated cells). Lastly, RAC-GTP was increasing slowly after 5 min of 10 nM S1P stimulation in silico and in vitro. At 100 pM S1P, the relatively rapid RAC-GTP increase and the slower RHOA-GTP linear trend was consistent with the hypothesis that S1PR2-RHOA-ROCK-mediated reversal of RAC-GTP polarization requires more than 30 s of S1P stimulation in vitro (discussed above). This explains some of the apparent Subpathway2 discrepancies that were noted regarding Fig. 6.
The accuracy of the RAW264.7 cell protein quantification affected the corresponding pathway node concentration parameters and ultimately the simulation results. To test the robustness of the refined model, 2,000 simulations of Sub-pathway1 were performed. For each of these simulations, all of the node concentration parameters were varied (the other model parameters were held constant). Each node concentration parameter distribution was designed to roughly ap-proximate the accuracy of the quantification (accurate measurement or estimation) of the node. Ideally, the LC-SRM protein quantification would have been extraordinarily sensitive and accurate, and comparable parameter variation would have resulted in only negligible effects on the pathway simulation results. At the other extreme, parameter variation comparable to highly inaccurate quantification would generally cause simulations to produce unacceptable results. The robustness of a pathway model toward molecule copy number variation can be tested by performing an appropriate variance analysis and determining the percentage of acceptable results.
During each simulation, the in silico RAW264.7 cell was not stimulated for 120 s, and then it was stimulated for 30 s using a 0 -400 pM S1P linear gradient (the gradient spanned 20 m; the experimental conditions were the same as in Figs. 6B and 7). For each simulation, the resulting polarization of PI(3,4,5)P 3 , activated Rac, and activated Rhoa were determined. In 475 of the 2,000 simulations (24%), Ն1 of these nodes (PI(3,4,5)P 3 , activated Rac, and/or activated Rhoa) was not significantly activated at the leading end. Specifically, its concentration at the leading voxel was Ͻ0.1 copies per m 2 (the leading voxel was at the high S1P concentration end of the RAW264.7 cell). In addition, in 2 of the 2,000 simulations (0.1%), Ն1 of these nodes was activated at the trailing end exhibited negative concentrations, which is an artifact of concentrations falling below the precision threshold of 1e-14 (ϳ0.02 copies per m 2 ). Therefore, 477 of the 2,000 simulations (24%) failed to achieve significant pathway activation. For the remaining 1,523 simulations (76%), the polarization distributions of PI(3,4,5)P 3 , activated Rac, and activated Rhoa were visualized using histograms (Fig. 8). All three displayed narrow distributions at reasonable polarization levels. Ultimately, ϳ70% of the simulations resulted in acceptable results, and the pathway model was judged to be robust to the variance produced by the quantification methods used during this project. DISCUSSION Detailed quantitative simulation of complex biochemical pathways provides a means of testing whether conceptual models of such pathways are sufficiently accurate and complete to reproduce experimentally determined aspects of cellular behavior. Ideally, they yield verifiable predictions about cellular responses to perturbations not used to generate the initial model. Concordance between predicted simulated behavior and experimental results increases confidence in the conceptual model, while sufficient testing to reveal discrepancies between prediction and experiment provide the opportunity for further exploration that generates new data and a refined model.
Two major limitations to employing such a computational approach to complex biological systems involve obtaining accurate measurements of model parameters such as protein abundances and generating the complex mathematical representations necessary for performing simulations. In the work reported here, an initial attempt was made to address both challenges, using a combination of MS protein quantification and a computational modeling technique (Simmune) that automatically builds reaction-diffusion networks based on the specification of bimolecular interactions, thereby avoiding the error-prone step of manual assembly of the corresponding differential equations. Taking advantage of this combination of experimental and computational techniques, we developed a detailed, but modest sized, osteoclast precursor chemotaxis pathway model to explore whether it would be able to reproduce the highly dynamic spatiotemporal features of S1P-mediated chemoattraction and chemorepulsion. Many known features of S1P signaling were captured by the model, and aspects of such signaling not explicitly incorporated into the model emerged from its network struc- FIG. 8. Variance analysis of Subpathway1. To test the robustness of the refined model, 2,000 simulations of Subpathway1 were performed. In each, all of the node abundance parameters were varied (the other model parameters were held constant). Each node abundance parameter distribution was designed to roughly approximate the accuracy of the quantification of the node. During each simulation, the in silico RAW264.7 cell was not stimulated for 120 s, and then it was stimulated for 30 s using a 0 -400 pM S1P linear gradient (as in Figs. 6B and 7). Subpathway1 was not significantly activated in 477 of the 2,000 simulations (24%). For the remaining 1,523 simulations (76%), the polarization distributions of PI(3,4,5)P 3 (top), activated Rac (middle), and activated Rhoa (bottom) were visualized using histograms. Note that the leading end was at the high S1P concentration end of the RAW264.7 cell. ture and the use of the quantitative parameters generated by the SRM and transcriptomic methods that were employed.
Quantification of the target proteins was accomplished by developing and performing LC-SRM assays. This was greatly aided by RNA-seq and shotgun mass spectrometry datasets produced by other investigations, which demonstrates the enormous utility of online -omics databases and of multiomics analyses. The quantification of a spiked-in protein standard (firefly luciferase) was accurate and precise, and the quantification of the mouse chemotaxis pathway target proteins resulted in a minimal measurement variance across the biological replicates. This demonstrates that proper LC-SRM analyses of complex biological samples can produce confident peptide identifications and accurate and precise quantification values, as has been reported by others (32,33). However, for a small number of the target proteins, the quantification variance across the internal peptide standards was higher, demonstrating the importance of careful peptide target selection and also of the use of at least two internal peptide standards per target protein.
Changes in activation and localization of the proteins directly downstream of the S1P receptors were simulated and found to be roughly consistent with the experimental data. All of the Subpathway1 simulations and most of the Subpath-way2 simulations correctly reproduced the changes in molecular activation states and distributions when the in silico cell responded to a wide range of S1P gradients. In addition, simulations of knockdowns of the two S1P receptors that are expressed by RAW264.7 cells were performed, and Rac polarization flipped only when abundance of S1pr2 was reduced, consistent with prior experimental data using RAW264.7 cells and imaging-based chemotactic assays. This latter result was not a feature explicitly built into the model but one that emerged from its network connectivity and the specific parameterization that was achieved through protein quantification and model refinement. The success of a model in predicting experimental outcomes not used to construct the model is key to providing biologists with confidence that modeling can be valuable as an adjunct to pure experimental investigation.
Although not used during this investigation, pathway model accuracy could be increased by using MS of modified peptides (such as phosphopeptides) to measure the abundance of posttranslationally modified pathway proteins. Shotgun MS could be used to measure relative abundances, which could be used as constraints during model refinement. Targeted MS could be used to make absolute abundance measurements that could be used both as model constraints and as model parameters (for example, as initial conditions of phosphoproteins within prestimulated cells).
During the simulated experiments, most of the proteins responsible for cytoskeleton remodeling changed their cellular localization in agreement with experimentally observed changes in cell polarization known to depend on the S1P gradient and concentration. Proteins whose behavior were different from the literature-based prediction were likely parts of large protein complexes that remain challenging to model. A major advantage of the modeling approach employed (Simmune) is that scrutinizing model assumptions and reproducing the reported results is straightforward and does not require working with scripted model representations or additional software packages for numerical simulations of differential equations. At the same time, model specifications can readily be exported into standards such as the Systems Biology Markup Language. Simmune also makes it easy to modify the core biochemical model and use it as a basis for more complex (or simple) models to accommodate additional experimental data or to focus on only parts of the pathway being explored.
Detailed mechanistic models of cellular signaling networks based on comprehensive quantifications of their interacting components are still quite rare, in part because many biologists are concerned that it is far too easy to make simulations fit available data when many parameters remain unmeasured and can be arbitrarily adjusted. Model overfitting can also be a result of pathway redundancy, which can be required in vivo to produce signaling robustness so that everyday physiological perturbations do not undermine signaling when unwarranted. Nevertheless, as we show here, it is possible to employ rapidly improving techniques for constraining parameters through quantitative measurements using targeted proteomics and RNA-seq to overcome this concern. Such quantitative methods of enumeration, together with emerging methods for measuring intracellular association, transformation, and dissociation rates, will increase the robustness of pathway simulation approaches, helping to make the combination of experiment and modeling a standard tool in systems biological research.