Using Quantitative Spectrometry to Understand the Influence of Genetics and Nutritional Perturbations On the Virulence Potential of Staphylococcus aureus*

Staphylococcus aureus (Sa) is the leading cause of a variety of bacterial infections ranging from superficial skin infections to invasive and life-threatening diseases such as septic bacteremia, necrotizing pneumonia, and endocarditis. The success of Sa as a human pathogen is attributed to its ability to adapt to different environments by changing expression, production, or secretion of virulence factors. Although Sa immune evasion is well studied, the regulation of virulence factors under different nutrient and growth conditions is still not well understood. Here, we used label-free quantitative mass spectrometry to quantify and compare the Sa exoproteins (i.e. exoproteomes) of master regulator mutants or established reference strains. Different environmental conditions were addressed by growing the bacteria in rich or minimal media at different phases of growth. We observed clear differences in the composition of the exoproteomes depending on the genetic background or growth conditions. The relative abundance of cytotoxins determined in our study correlated well with differences in cytotoxicity measured by lysis of human neutrophils. Our findings demonstrate that label-free quantitative mass spectrometry is a versatile tool for predicting the virulence of bacterial strains and highlight the importance of the experimental design for in vitro studies. Furthermore, the results indicate that label-free proteomics can be used to cluster isolates into groups with similar virulence properties, highlighting the power of label-free quantitative mass spectrometry to distinguish Sa strains.

cells (7)(8)(9). Others, such as staphylococcal complement inhibitor (SCIN), efficiently inhibit opsonization and phagocytosis (10,11) by disarming critical innate immune defense strategies through the prevention of complement activation. In addition, Sa produces surface proteins, including clumping factor A (ClfA) and extracellular fibrinogen binding protein (Efb), to promote attachment to host cells and tissues (12). This pathogen also secretes enzymes such as nucleases and proteases, enabling Sa to counter neutrophil extracellular traps (NETs) and degrade host tissues, respectively (13)(14)(15). Additionally, Sa produces an array of cytotoxins, such as leukocidins and phenol-soluble modulins (PSMs), that target and destroy cells of the innate and adaptive immune systems (16,17). PSMs insert nonspecifically into the cell membrane because of their amphiphilic nature. In contrast, the bi-component pore-forming leukocidins insert into the plasma membrane of target cells upon binding to specific surface receptors. Both mechanisms result in membrane puncture and cell death (18).
Sa modulates expression of virulence factors by sensing signals from the environment through a complex network of regulatory elements. Three master regulators are key for the controlled expression of Sa virulence factors: the accessory gene regulator (AgrBDCA), the repressor of toxins (Rot), and the Sa exoprotein expression (SaePQRS) system. The well-characterized agr locus encodes a "self-recognizing" two-component system (TCS) that detects increases in cell density via a secreted signaling peptide, resulting in the production of an effector regulatory RNA known as RNAIII (19,20). RNAIII is a powerful activator of enzymes and cytotoxins that are thought to promote Sa survival in vivo (19-24). It acts by inhibiting translation of another master regulator, rot (22). Rot directly binds to the promoter elements of toxins/proteases and represses their transcription, while it activates expression of gene products involved in immune evasion (21,25). Whereas Rot represses the expression of certain toxins and enzymes, the Sae-TCS is involved in the activation of the genes encoding toxins and secreted proteins (26-31). The Sae-TCS senses environmental stimuli such as pH and the presence of host phagocytes, resulting in a dramatic increase of toxin production (31)(32)(33). Taken together, when bacterial densities are low during early stages of Sa infections, the Agr system is generally inactive and Rot is abundant, thereby repressing toxin expression. At this time, factors that evade immune detection, and adhesins that help Sa attach to host tissues, are highly expressed. When Sa adheres to and colonizes various tissues, it grows rapidly, leading to activation of the Agr system. Activated Agr leads to suppression of Rot, and thus enhances enzyme and toxin production, while repressing production of immune modulators.
Mutations in master regulators are often associated with clinical infections and can be reflective of the severity of infections. The ability to predict severity of infection of an emerging strain could potentially inform clinical prognostication, management and infection control. Current methods for cataloging Sa strains rely on utilizing genomic sequences to determine clonality, such as multilocus sequence typing (MLST) and spa typing. Strains are also screened for the presence of important genetic biomarkers, such as the staphylococcal cassette chromosome mec gene (SCCmec) and the Panton-Valentine Leukocidin gene (pvl) (34). MLST uses seven housekeeping genes to group the different Sa strains into sequence types (STs). STs sharing 5 of the 7 alleles are grouped as clonal complexes (CCs). Thus, MLST can provide information regarding the lineage of different Sa isolates in population and epidemiological studies (35). Spa typing catalogs the variable number tandem repeat polymorphisms in the 3′ coding region of Spa (36,37). SCCmec confers methicillin resistance to Sa (38). The presence of pvl is often associated with CA-MRSA, such as strains from the USA300 lineage (3).
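The MLST grouping rule described above can be illustrated with a minimal sketch. It assumes that "sharing 5 of the 7 alleles" means at least five matching loci, that the loci are the seven housekeeping genes of the standard S. aureus MLST scheme, and that the allele profiles (ST_A, ST_B, ST_C) are invented for illustration; none of these details come from the study itself.

```python
from typing import Dict, Tuple

# Locus order for the allele profiles (assumed here to be the seven
# housekeeping genes of the standard S. aureus MLST scheme).
LOCI = ("arcC", "aroE", "glpF", "gmk", "pta", "tpi", "yqiL")

Profile = Tuple[int, ...]


def shared_alleles(a: Profile, b: Profile) -> int:
    """Count loci at which two sequence types carry the same allele."""
    if len(a) != len(LOCI) or len(b) != len(LOCI):
        raise ValueError("each profile needs one allele per locus")
    return sum(1 for x, y in zip(a, b) if x == y)


def same_clonal_complex(a: Profile, b: Profile, min_shared: int = 5) -> bool:
    """Assumed reading of the rule: at least `min_shared` of 7 alleles match."""
    return shared_alleles(a, b) >= min_shared


# Hypothetical allele profiles, invented for illustration only.
PROFILES: Dict[str, Profile] = {
    "ST_A": (1, 1, 1, 1, 1, 1, 1),
    "ST_B": (1, 1, 1, 1, 1, 2, 2),
    "ST_C": (1, 1, 2, 2, 2, 2, 2),
}
```

Under these assumptions, ST_A and ST_B would fall into one clonal complex (five shared alleles), whereas ST_C would not group with either.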
Although genotypic methods have been reliable and reproducible in grouping the different Sa strains (34), they provide limited information on the production of virulence factors; thus the virulence potential of these strains cannot be accurately predicted based on typing alone. Since the controlled production of virulence factors is vital for the success of Sa as a human pathogen, we sought to test the utility of label-free quantitative proteomics for characterization of Sa exoproteomes under a variety of different conditions. We quantitatively compared the exoprotein profiles of Sa USA300 1) master regulator mutants, Δagr, Δrot, and Δsae, grown to stationary phase in either minimal (RPMI) or rich (TSB) media; 2) wild type (WT) USA300 grown to either exponential, early stationary, or late stationary phase; and 3) 13 different reference Sa strains belonging to four different clonal complexes. Altogether, our study provides a rich data set cataloging the exoproteome of the highly prevalent CA-MRSA in the United States, USA300, and other important Sa isolates.

EXPERIMENTAL PROCEDURES
Bacterial Culture and Growth Conditions-Sa strains were grown at 37°C on Tryptic Soy Agar (TSA), then in Tryptic Soy Broth (TSB) or Roswell Park Memorial Institute medium (RPMI, Invitrogen, Waltham, MA) supplemented with 1% Casamino Acids. Liquid cultures were grown in 5 ml growth media in 15 ml tubes at a 45° angle or in 150 µl growth media in 96-well plates with shaking at 180 rpm overnight prior to subculture.
For growth curves, Sa strain LAC was subcultured at 1:100 in 100 µl TSB or RPMI from 6 independent colonies grown in 150 µl TSB in a 96-well plate with shaking overnight at 180 rpm. Optical densities at 600 nm were read at the beginning of the subculture and at the indicated time points using a PerkinElmer Envision 2103 Multilabel reader (PerkinElmer, Waltham, MA).
Isolation of Primary Human Neutrophils-Leukopaks were obtained from de-identified donors from the New York Blood Center where written consents were obtained from all participants and human polymorphonuclear neutrophils (hPMNs) were purified as described previously (40). hPMNs were resuspended in RPMI 1640 (Cellgro, Herndon, VA) supplemented with 10% fetal bovine serum (FBS).
Cytotoxicity Assay-Cytotoxicity assays were performed as described previously (21,41). Briefly, culture supernatants were collected from Sa strains subcultured at 1:100 from overnight cultures in 5 ml TSB or 5 ml RPMI in 15 ml conical tubes at the indicated time points. The culture supernatants were serially diluted and added to 2 × 10^5 hPMNs/well for a final volume of 100 µl/well. hPMNs were intoxicated with the culture supernatant from the indicated Sa strain for 1 h at 37°C and 5% CO2. hPMN viability was determined using CellTiter 96 Aqueous One Solution (Promega, Madison, WI). Briefly, 10 µl/well of CellTiter was added and incubated at 37°C and 5% CO2 for 2 h. Cell viability was measured by absorbance at 492 nm using a PerkinElmer Envision 2103 Multilabel reader (PerkinElmer).
Exoprotein Isolation-Sa cultures were grown in 5 ml TSB or RPMI in 15 ml conical tubes for 3, 5, or 8 h in a 1:100 subculture from overnight cultures grown in TSB. Isogenic mutants were grown in TSB for 5 h. All established reference strains were grown in TSB for 5 h. At the indicated time point post-subculture, cultures were normalized to the same optical density by diluting cultures of higher cell density with the respective medium. Culture supernatants were collected by centrifugation at 4000 rpm for 10 min to remove bacteria, followed by filtration through a 0.22 µm filter to remove cell debris. The proteins in the culture supernatants were precipitated in 10% (v/v) trichloroacetic acid (TCA) at 4°C overnight. The precipitated proteins were sedimented by centrifugation and pellets were washed with ethanol. The protein pellets were centrifuged again, the remaining ethanol was removed, and the pellets were allowed to air dry.
Exoprotein Profiling-Precipitated exoproteins were resuspended in 8 M urea for 30 min at RT, then diluted 1:1 with 2× SDS sample buffer and boiled for 10 min. Exoproteins were separated in a 12% SDS-PAGE gel and visualized using Coomassie, silver staining, or Instant Blue. The gels were imaged using the Gel Doc XR System (Bio-Rad, Hercules, CA).
Exoprotein Sample Preparation and LC-MS Analysis-Reconstituted exoprotein isolates were reduced with 0.02 M dithiothreitol and alkylated with 0.05 M iodoacetamide. The exoproteins were in-gel digested as described in (42) and the resulting peptide mixture was desalted as previously described (see supplemental information for details) (43). Aliquots of the peptide mixtures were loaded onto an Acclaim PepMap 100 precolumn (75 µm × 2 cm, C18, 3 µm, 100 Å) in-line with an EASY-Spray PepMap column (75 µm × 50 cm, C18, 2 µm, 100 Å) with a 5 µm emitter using the autosampler of an EASY-nLC 1000 (Thermo Scientific, Waltham, MA). The samples were gradient eluted directly into an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific) and analyzed in a data-dependent manner using a top speed method. Complete details of the LC-MS acquisition can be found in the supplemental materials.
Data Analysis-The MaxQuant software suite (version 1.5.2.8) was used for peptide and protein identifications and label-free quantitation (44). For the master regulator mutant and growth curve studies, the raw data were searched against a UniProt USA300 protein database downloaded on August 31, 2016 containing 2607 entries. For the first search the peptide tolerance was set to 20 ppm, and for the main search the peptide tolerance was 4.5 ppm. Trypsin-specific cleavage was selected with 2 missed cleavages. A peptide spectral match (PSM) FDR of 1% and a protein FDR of 1% were selected for identification. Label-free quantitation was performed with a label-free quantitation minimum ratio of 2 and allowing for unique peptides only. Matching between runs was allowed with a 0.7 min match window and a 20-min alignment time window. Carbamidomethylation of Cys was added as a static modification. Oxidation of Met, deamidation of Asn and Gln, and acetylation of the protein N terminus were the allowed variable modifications.
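To make the search tolerances above concrete, the following sketch converts a parts-per-million tolerance into an absolute m/z window. It only illustrates the arithmetic; the function names are hypothetical and this is not how MaxQuant is invoked or implemented.

```python
from typing import Tuple


def ppm_window(mz: float, tol_ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds for a tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def within_tolerance(observed: float, theoretical: float, tol_ppm: float) -> bool:
    """True if the observed m/z deviates from theory by at most tol_ppm."""
    error_ppm = abs(observed - theoretical) / theoretical * 1e6
    return error_ppm <= tol_ppm
```

For a peptide ion at m/z 1000, the 20 ppm first-search tolerance corresponds to a window of about ±0.02, and the 4.5 ppm main-search tolerance to about ±0.0045.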
Results were filtered to include proteins identified with 2 or more unique peptides in all three replicates of at least one strain type. Label-free quantitation intensity values were log2 transformed, missing values were imputed from the normal data distribution, and z-scores were calculated. A z-score indicates how many standard deviations a value is from the mean (z = (X - μ)/σ, where X = value, μ = population mean, and σ = standard deviation). Unsupervised hierarchical clustering was used to generate a heat map from the z-scores, representing protein groups in the matrix as colors and grouping exoprotein isolates based on the relative intensities of the quantified proteins.
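The filtering, transformation, imputation, and z-scoring steps described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the down-shift (1.8 SD) and width (0.3 SD) of the imputation distribution, the row-wise z-scoring, and all function names are assumptions made for the example.

```python
import math
import random
import statistics
from typing import Dict, List, Optional

Row = List[Optional[float]]  # LFQ intensities for one protein; None = missing


def passes_filter(unique_peptides: int, row: Row, groups: Dict[str, List[int]]) -> bool:
    """Keep a protein with >= 2 unique peptides that is quantified in every
    replicate of at least one group (group name -> column indices)."""
    if unique_peptides < 2:
        return False
    return any(all(row[i] is not None for i in cols) for cols in groups.values())


def log2_row(row: Row) -> Row:
    """Log2-transform intensities; missing or zero values stay missing."""
    return [math.log2(v) if v is not None and v > 0 else None for v in row]


def impute(rows: List[Row], rng: random.Random,
           shift: float = 1.8, width: float = 0.3) -> List[List[float]]:
    """Replace missing values with draws from a normal distribution that is
    down-shifted and narrowed relative to the observed data (assumed values)."""
    observed = [v for row in rows for v in row if v is not None]
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [[v if v is not None else rng.gauss(mu - shift * sd, width * sd)
             for v in row] for row in rows]


def zscore_row(row: List[float]) -> List[float]:
    """z = (X - mean) / SD, using the population SD of the row."""
    mu = statistics.mean(row)
    sd = statistics.pstdev(row)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in row]
```

A protein row would pass through `passes_filter`, `log2_row`, `impute`, and `zscore_row` in that order before clustering.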
For the reference strain comparison analysis, the raw files were searched against a combined NCBI database containing individual databases for 12 strains (LAC and SF8300 are both USA300) consisting of 33 The parameters were the same as above except matching between runs was not allowed.
Comparative analysis of protein sequences was done using orthologous genes as defined by "Reciprocal Best Hits" (RBH) comparisons based on BLAST searches to compile protein groups (45). In short, a RBH is found when the proteins encoded by the sequences of two genes from two different genomes find each other as the best scoring match in the other genome. To compare the expression of the orthologs across all the reference strains of interest, we selected one strain to serve as a pivot-strain (LAC) and proceeded to search for reciprocal best blast hits for every one of its gene loci. In the cases where no RBH was found with another strain, it was assumed that no ortholog was present for that strain. In order to ensure quantitation of orthologs based strictly on comparable mass spectrometry features, protein quantitation was limited only to peptides present in the predicted amino acid sequence of all defined orthologs for a given pivot gene locus and no other protein entry (i.e. the peptides must be present in all putative orthologues and must be unique to the putative ortholog group).
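The reciprocal-best-hit logic and the peptide restriction described above can be sketched in a few lines. The hit tuples, protein identifiers, and function names below are hypothetical; parsing of real BLAST output is not shown.

```python
from typing import Dict, Iterable, Set, Tuple

Hit = Tuple[str, str, float]  # (query_id, subject_id, score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(pivot_vs_other: Iterable[Hit],
                         other_vs_pivot: Iterable[Hit]) -> Dict[str, str]:
    """Pivot proteins whose best hit in the other genome points back to them."""
    forward = best_hits(pivot_vs_other)
    reverse = best_hits(other_vs_pivot)
    return {p: o for p, o in forward.items() if reverse.get(o) == p}


def quantifiable_peptides(group_peptides: Dict[str, Set[str]],
                          other_peptides: Set[str]) -> Set[str]:
    """Peptides present in every ortholog of a group and in no other protein."""
    shared = set.intersection(*group_peptides.values())
    return shared - other_peptides
```

A pivot locus that is absent from the returned mapping would be treated as having no ortholog in the compared strain, matching the assumption stated in the text.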
For each of the resulting putative ortholog groups, the number of peptide spectral matches (PSMs) reported by MaxQuant for all ortho-conserved and ortho-unique peptides was averaged across all experimental replicates to yield the average value per ortholog group and experimental condition pairing. The resulting dataset was clustered in the R statistical software package using complete linkage agglomerative clustering and 1 - r as the distance measure, where r is the Pearson correlation coefficient.
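The clustering step was performed in R; the sketch below re-expresses the same idea (complete-linkage agglomerative clustering with 1 - r as the distance) in plain Python purely for illustration. It assumes non-constant profiles, and the profile names and values used with it are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def complete_linkage(profiles: Dict[str, List[float]]
                     ) -> List[Tuple[Tuple[str, ...], Tuple[str, ...], float]]:
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    names = list(profiles)
    dist = {frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
            for i, a in enumerate(names) for b in names[i + 1:]}
    clusters: List[Tuple[str, ...]] = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(dist[frozenset((a, b))]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

Profiles that rise and fall together merge first at a distance near 0, whereas perfectly anti-correlated profiles are separated by the maximum distance of 2.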
The resulting heat map is shown in Fig. 7A, color coded by z-score (normalized per row). Toxicity assay results are also shown (using a similar but linear color coding from minimum toxicity in white to maximum toxicity in purple). The functional categories of the secretome genes are color coded as follows: immunomodulators in green, exoenzymes in blue, and cytotoxins in red.
Experimental Design and Statistical Rationale-For the virulence factor regulator mutant studies, three biological replicates were used for each of WT, Δrot, Δagr, and Δsae grown in both minimal and rich media, for a total of 24 individual samples. All 24 samples were prepared in parallel and analyzed via LC-MS in a randomized order. We also isolated exoproteins from bacteria grown in minimal and rich media at 3, 5, and 8 h in triplicate, for a total of 18 individual samples. The growth curve samples were prepared in parallel and again analyzed in random order via LC-MS. Results were filtered to include proteins identified with 2 or more unique peptides in all three replicates of at least one strain type. The Sa reference strain analysis used three biological replicates for each of the 13 strains, except for Newman, for which only biological duplicates were available because one replicate was contaminated. The 38 samples were prepared in parallel and analyzed via LC-MS in a random order.
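The sample counts and randomized acquisition order described above can be reproduced with a short sketch. The sample labels, the helper names, and the use of a fixed seed are assumptions for illustration; the text does not state how the run order was randomized.

```python
import itertools
import random
from typing import List, Sequence, Tuple

Sample = Tuple[str, ...]


def build_samples(*factors: Sequence[str], replicates: int = 3) -> List[Sample]:
    """Enumerate every factor combination with the given number of replicates."""
    samples = []
    for combo in itertools.product(*factors):
        for rep in range(1, replicates + 1):
            samples.append(combo + (f"rep{rep}",))
    return samples


def randomized_run_order(samples: Sequence[Sample], seed: int) -> List[Sample]:
    """Shuffle a copy of the sample list; the seed is a hypothetical choice."""
    order = list(samples)
    random.Random(seed).shuffle(order)
    return order


# 4 strains x 2 media x 3 replicates = 24 samples
mutant_study = build_samples(["WT", "delta_rot", "delta_agr", "delta_sae"], ["RPMI", "TSB"])
# 3 time points x 2 media x 3 replicates = 18 samples
growth_study = build_samples(["3h", "5h", "8h"], ["RPMI", "TSB"])
# 13 strains x 3 replicates, minus the one contaminated Newman replicate = 38
reference_count = 13 * 3 - 1
```

Randomizing the acquisition order in this way helps keep instrument drift from being confounded with strain or medium.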

RESULTS
The Effect of Nutrient Availability On the Exoproteome of USA300 Master Regulator Mutants-The virulence of a bacterium depends on its environment and genetic background. To better understand virulence changes associated with environmental and strain variation, we cataloged changes in the Sa exoproteome using label-free quantitative proteomics of the representative USA300 strain LAC (WT) and isogenic mutants (Δrot, Δagr, and Δsae) grown to early stationary phase in nutrient-rich or minimal media. Three separate experiments per growth medium were performed to obtain a robust data set that will serve as a resource for the research community. We quantified 595 proteins across all strains, requiring two or more unique peptides per protein in all three replicates of at least one sample type. Approximately 65% (385/595 proteins) of the identified proteins were detected in all culture filtrates (supplemental Figs. S1A-S1B). Overall, the protein secretion levels were similar for strains grown in rich versus minimal media (supplemental Figs. S1C-S1E). The Δrot strain secreted the highest level of overall protein in both minimal and rich media compared with the WT, Δagr, and Δsae strains (supplemental Fig. S1E). This finding is consistent with the known potent repressive role of Rot for the genes that encode exoproteins (21,39). It is worth noting that unsupervised hierarchical clustering of the protein data revealed that Δrot exoproteomes are more similar to each other regardless of the medium condition. However, clustering of the WT, Δagr, and Δsae strains was driven by nutrient availability (Fig. 1). Importantly, clustering showed the high reproducibility of the replicate samples and the techniques.
Next, we examined the secretion of three classes of virulence factors in these mutants: immunomodulators, exoenzymes, and cytotoxins (Table I). Although certain classes showed a general trend of increase or decrease, not all proteins in a group followed the same trend or exhibited a similar fold change. For instance, the rot mutant showed decreased secretion of immunomodulatory proteins compared with WT (Fig. 2A) and increased secretion of exoenzymes and cytotoxins (Figs. 2B-2C). However, the coagulases (Coa and Vwbp) were lower in abundance in this mutant (Fig. 2B). We did not detect the secreted PSMα2 and PSMα4 toxins in the Δagr mutant, consistent with the requirement of the Agr system for the expression of these virulence factors (46). Altogether, analyses of the exoproteome profiles of the WT LAC strain and the isogenic master regulator mutants demonstrated that deletion of the master regulators affects global protein secretion; the overall trends in secretion of immunomodulators, exoenzymes, and cytotoxins, compared with WT, were similar in minimal and rich media.
FIG. 1. Label-free mass spectrometry quantitation differentiates isogenic Sa mutant exoprotein profiles. A heat map of the protein quantitation data generated using unsupervised clustering is shown. LFQ intensity values were log2 transformed, missing values were imputed from the normal data distribution, and z-scores were calculated. A z-score indicates how many standard deviations a value is from the mean (z = (X - μ)/σ, where X = value, μ = population mean, and σ = standard deviation). Unsupervised hierarchical clustering was used to generate a heat map from the z-scores, representing protein groups in the matrix as colors. Each row in the heat map is a different protein group and each column is an individual sample. The clustering at the top indicates which samples are most closely related based on the relative intensity of the quantified protein groups. Unsupervised hierarchical clustering confirms that the replicates cluster tightly, showing high reproducibility of the workflow.
To correlate our proteomic data to the virulence potential of these strains, we utilized a cytotoxicity assay to evaluate the ability of the exoproteins from each strain to kill hPMNs. Culture filtrate from the Δrot strain was more cytotoxic to hPMNs than that from the WT strain (Fig. 3A). Culture filtrate from the Δsae strain lacked cytotoxic activity toward hPMNs (Fig. 3A), consistent with the observation that Sae is indispensable for toxin production (29). The cytotoxicity of the Δagr strain grown in rich medium was very low, but surprisingly, when the mutant strain was grown in minimal medium, it exhibited enhanced cytotoxicity (Fig. 3B). Examination of the label-free quantitation data showed that the Δagr mutant had lower production of cytotoxins compared with WT (Fig. 2C). However, LukAB was more abundant in the exoproteome of the Δagr mutant grown in minimal medium than in that of the mutant grown in rich medium (Fig. 3B). LukAB has been shown to be the dominant cytotoxic factor in the culture supernatants of Sa (41,47,48); thus we speculate that the cytotoxic activity observed in Δagr grown in minimal medium is likely caused by this toxin.
Altogether, these data suggest that agr deficiency in a nutrient-limited environment can contribute to Sa virulence, consistent with the observations that agr-deficient strains are commonly isolated from both nasal carriers and bacteremic patients (49).
The Effect of Nutrient Availability and Culture Density on USA300 Exoproteome-Next, we investigated the effect of the different bacterial growth phases on the overall exoproteome of LAC. We collected culture filtrates from exponential (3 h), early stationary (5 h), and late stationary (8 h) growth phases in either minimal or rich media (Fig. 4A). These times were selected because during the early stages of infection (analogous to the exponential phase), the bacterium devotes significant resources to upregulating the production of adhesins and immunomodulators, but as the population density increases (early and late stationary phases), secretion switches to exoenzymes and toxins that cause tissue damage and extend pathogenesis (19,20). Thus, we postulated that secretion profiles at these three time points would reveal changes in the expression of different functional protein classes.
Three separate experiments per growth medium were performed to obtain a robust data set. We quantified 438 proteins across all conditions, requiring 2 or more unique peptides per protein in all three replicates of at least one sample type. Only a few proteins were unique between the time points and across the growth conditions (supplemental Figs. S2A-S2B). Unsupervised hierarchical clustering showed that there was more variability among the exponential phase samples than among the stationary phase samples, possibly because of sample handling and low protein amounts (supplemental Fig. S2D).
We again investigated the abundance of the three classes of virulence factors (immunomodulators, exoenzymes, and cytotoxins) in these nutrient-growth phase combinations (Table I). Irrespective of nutrient conditions, secretion of most immunomodulator proteins was highest in exponential growth phase (5 vs 3 h and 8 vs 3 h, Fig. 5A). One protein that did not fit this trend was staphylokinase (Sak), which was more abundant at early stationary phase in minimal medium (8 vs 5 h; Fig. 5A). In contrast, abundances of exoenzymes were high at stationary phase compared with exponential phase (5 vs 3 h). Interestingly, levels of certain proteases (SplA-F) increased further at late stationary phase (8 vs 5 h) (Fig. 5B). Two proteins that deviated from this trend were the coagulases, Coa and Vwbp; both enzymes were most abundant at exponential phase, and their levels dwindled at stationary phase (Fig. 5B).
FIG. 3. The effect of nutrient availability on the cytotoxicity of mutant Sa. A, Cytotoxicity assay data are plotted for LAC WT and each mutant, where a higher percentage of dead cells (hPMNs) indicates greater cytotoxicity. Intoxication of hPMNs from six donors (± the standard error of the mean) with a titration of culture filtrates from the indicated Sa LAC strains grown in rich (TSB) or minimal (RPMI) media is shown. Cell death was measured with CellTiter metabolic dye. The culture filtrate from Δagr grown in minimal medium has a cytotoxicity closer to that of LAC WT, but when the strain is grown in rich medium the cytotoxicity is drastically reduced. A two-way ANOVA was performed comparing means using Sidak correction for multiple comparisons. Data points with p values less than 0.05 are considered significant and are indicated by the following key: 0.01-0.05 = *, 0.01-0.001 = **, 0.001-0.0001 = ***, and <0.0001 = ****. B, The average LFQ intensity values for the selected cytotoxins are plotted with minimal medium in red and rich medium in blue. Error bars represent the standard deviation of the triplicate analyses. All proteins are labeled with the corresponding protein ID. The intensity of the monomers of the bi-component leukocidin LukAB from Δagr grown in minimal medium is significantly higher than that from Δagr grown in rich medium. A two-way ANOVA was performed as described above.
FIG. 4. Sa grows to a higher density when grown in a nutrient rich environment. The optical density at 600 nm of four independent colonies of LAC WT grown in both minimal (RPMI) and rich (TSB) media was measured at t = 0, 2, 4, 6, and 8 h. Minimal medium (RPMI) culture density is plotted in red and rich medium (TSB) culture density is plotted in blue. The density of the cultures after 2 h is slightly higher when grown in a nutrient rich environment.

Molecular & Cellular Proteomics 16 Supplement 4 S21
Cytotoxin secretion was similar to that of exoenzymes regardless of nutrient conditions (Fig. 5C). This functional class was highly abundant in the culture filtrates of stationary phase bacteria (5 vs 3 h and 8 vs 3 h). The bi-component toxins are an important class of cytotoxins with high lytic activities on hPMNs (16). Thus, to correlate bi-component toxin levels in culture filtrates to phenotypic functionality, we performed cytotoxicity assays using culture filtrates of LAC (Fig. 6A). First, consistent with the observation by mass spectrometry that toxin production increased at stationary phase, culture filtrates from the stationary phase were more cytotoxic toward hPMNs compared with the ones from exponential phase. Second, late-stationary phase culture filtrates had a moderate reduction in cytotoxicity compared with early-stationary phase supernatants (5 vs 8 h TSB/RPMI, Fig. 6A), perhaps owing to the increased protease presence in late stationary phase (Fig. 5B), which results in degradation of proteins in the culture filtrate (50). Lastly, LukAB and LukSF-PV have been reported to be the most potent in lysing hPMNs (47). In fact, culture filtrates lacking either of these toxins have led to reduced cytotoxicity toward hPMNs (47). Therefore, we compared the abundances of these two cytotoxins in the various growth phases under different media conditions. Consistent with our data (Fig. 6A), we observed that increased cytotoxicity correlated with higher abundance of LukAB and LukSF-PV during stationary phase (5 and 8 h). Interestingly, we observed significantly higher levels of LukAB in the exoproteomes of Sa grown in minimal medium during stationary phase, but LukSF-PV is up-regulated in stationary phase when Sa is grown in nutrient rich medium (Fig. 6B). These are interesting observations that will require further investigation.
The Effect of Clonal Lineages On the Exoprotein Production by Diverse Sa Reference Strains-Our data demonstrate that label-free quantitation can predict cytotoxicity of a strain based on the exoproteome alone (Figs. 3 and 6). We next extended this hypothesis to test our ability to predict the cytotoxic profiles of 13 Sa reference strains representing four CCs (Table II). These strains include CA-MRSA, hospital-associated MRSA (HA-MRSA), methicillin-sensitive Sa (MSSA), and vancomycin-intermediate Sa (Table II). Label-free quantification showed that each reference strain has a unique exoprotein profile (Fig. 7A). Because a large majority of Sa infections in the USA are caused by strains belonging to CC8 (51-54), we included six different CC8 strains. We examined the abundance of three classes of virulence factors (immunomodulators, exoenzymes, and cytotoxins) in these reference strains. The proteomics data reproduce the CC8 genomic cluster except for COL (Fig. 7A). Interestingly, the cytotoxicity data support the placement of COL outside of the CC8 cluster, in agreement with the exoproteome profile (Fig. 7B). In addition, the exoproteome profiles suggest other distinct differences between strains within the CC8 group (Fig. 7A). For example, two representative CA-MRSA strains in our collection, LAC and SF8300, were among the most cytotoxic to hPMNs (Fig. 7B) and their exoproteomes are distinct from those of the other CC8 strains (Fig. 7A). The CC8 strain Newman, while also highly cytotoxic (Fig. 7B), lacks LukSF-PV, as expected given the absence of the lukS-PV and lukF-PV genes in this strain. Within the highly cytotoxic CC8 cluster, higher levels of HlgA, HlgB, and HlgC are present in the exoproteome of Newman, whereas LukAB is present at a similar level in the exoproteomes of LAC, SF8300, and Newman (Fig. 7A). The higher level of toxins in the Newman exoproteome is consistent with the findings that this strain harbors a hyperactive saeS allele, which results in increased exoproteome production (55)(56)(57) and high cytotoxicity (Fig. 7B). Notably, USA500 has a large abundance of exoenzymes and cytotoxins, consistent with what is known in the literature (21). The CC5 strains Mu50, 502A, and N315 cluster together with the CC1 strains (Fig. 7A). Closer examination reveals that the driving force for this cluster is the overall lower production of all exoproteins and a subset of immunomodulators involved in attachment to host tissues (Fig. 7A). Overall, our data suggest that proteomic data could be used to deduce the cytotoxic potential of diverse strains belonging to various CCs.
FIG. 6. The effect of nutrient availability and culture density on the cytotoxicity of Sa. A, Cytotoxicity assay data are plotted for LAC WT grown in both environments at the three time points. A higher percentage of dead cells (hPMNs) indicates greater cytotoxicity. Intoxication of hPMNs (from six donors ± the standard error of the mean) with a titration of culture filtrates from LAC WT grown in rich (TSB) or minimal (RPMI) media is shown. Cell death was measured with CellTiter metabolic dye. There is a statistically significant increase in the cytotoxicity of the bacteria grown in rich medium as compared with those grown in minimal medium. A two-way ANOVA was performed comparing means using Sidak correction for multiple comparisons. Data points with p values less than 0.05 are considered significant and are indicated by the following key: 0.01-0.05 = *, 0.01-0.001 = **, 0.001-0.0001 = ***, and <0.0001 = ****. B, The median LFQ intensities of the bi-component leukocidins are plotted at log phase (3 h, blue), early stationary phase (5 h, red), and late stationary phase (8 h, green). For Sa grown in minimal medium, LukAB bars are solid and LukSF-PV bars are lined; for Sa grown in rich medium, LukAB bars are checkered and LukSF-PV bars are slashed. The level of LukSF-PV plateaus at stationary phase, but LukAB levels are highest during early stationary phase and decrease in late stationary phase. Interestingly, at both early and late stationary phase LukAB is more abundant than LukSF-PV when Sa is grown in minimal medium, but the reverse is true when Sa is grown in a nutrient rich environment. A two-way ANOVA was performed as described above.

TABLE II
Staphylococcus aureus reference strains with clonal complex assignment.
FIG. 7. The effect of clonal lineages on the exoprotein production by a group of diverse Sa reference strains. A, Heat map of protein quantitation data for the selected virulence factors. The color scheme represents a row-based z-score on a scale from dark purple (most PSMs) to white (fewest PSMs). The protein class is designated by color on the left y axis and the common protein IDs are labeled on the right y axis. Immunomodulators are indicated in green, exoenzymes in blue, and cytotoxins in red. The strain names are indicated along the bottom x axis. The average cytotoxicity using the 5% dilution of the culture filtrate is shown for each strain at the top of the heat map.

DISCUSSION
Sa is a significant human pathogen that has evolved a large repertoire of secreted and cell surface tethered virulence factors to colonize multiple tissue sites and cause various disease states (6). In this study, we used a multi-factorial approach to characterize the Sa exoproteome and correlate changes in exoproteomes with the cytotoxic potential of the conditions tested. Analyzing the Sa exoproteome can be a daunting task, given that hundreds of proteins are secreted or surface-bound (58). Here, we have attempted to simplify the study of such large exoproteome data sets by grouping exoproteins based on virulence function (immunomodulators, exoenzymes and cytotoxins) and by focusing on several representative proteins of each group. This strategy allowed us to identify trends in the secretion of these important virulence factors.
First, we validated our label-free mass spectrometry strategy by comparing proteomes of known regulatory mutants, Δagr, Δsae and Δrot, in nutrient-rich and minimal media. Second, we compared exoproteomes of LAC grown to various growth phases in rich and minimal media. These data sets identified the virulence factors secreted at both low and high cell densities, mimicking conditions of early Sa infection (exponential growth phase) and a state of quorum (early- and late-stationary growth phases). Last, we extended these analyses to a larger set of reference strains to assess whether exoproteome profiling is indicative of strain-specific cytotoxicity.
In our study we used isogenic mutants of master regulators that are well characterized in the literature (21-26, 31, 32). We selected these specific mutants not only for validation, but also to evaluate whether label-free quantitation can provide insights into Sa biology. For instance, we found that inactivation of the repressor, rot, led to a dramatic increase in protein abundance in the culture filtrates (supplemental Fig. S1E); however, few unique protein species were detected in this mutant compared with WT, Δagr and Δsae (supplemental Fig. S1A-S1B). The in vitro biology and the in vivo relevance of this finding would be an interesting area to investigate in future studies. Likewise, we found that mutation of the agr locus abrogated Sa virulence with respect to its cytotoxic potential toward hPMNs when the mutant was grown in rich medium (Fig. 3A). However, culture filtrates from the same strain grown in minimal medium still exhibited high levels of cytotoxicity (Fig. 3A). On closer analysis, we found that the cytotoxin LukAB was more abundant in the exoproteome when Sa Δagr was grown in minimal versus rich media (p value <0.0001), potentially accounting for the differences in cytotoxicity (Fig. 3B). This finding suggested an agr-independent but media-dependent mechanism by which these cytotoxins are produced.
In an effort to understand protein variations during various Sa growth phases, we compared the exoprotein profile at different phases of growth. Our results corroborate what is found in the literature regarding the cell density-induced protein profile of Sa (Figs. 5 and 6) (20,24,59). Of note, the immunomodulatory protein Sak was ~2-fold higher in minimal medium stationary phase culture filtrates compared with the same time point in rich medium and with other proteins in this category (Fig. 5A). Additionally, the coagulases (Coa and Vwbp) exhibit profiles more similar to those of the immunomodulators than to those of the other evaluated exoenzymes (Fig. 5B). Taken together, our data set allows us to identify groups of proteins that did not follow the trend of their functional classes and could be subjects of future studies.
Multiple efforts to understand Sa virulence have used comparative transcriptomics (59-63) to study specific changes in the Sa virulon. Methods such as RNA-Seq or microarrays inform us about regulatory changes at the genomic level. However, these results may not reflect actual levels of secreted effector molecules. In contrast, the exoproteome is a true indicator of virulence factor levels, and potentially Sa pathogenesis (64). We and others have successfully exploited this strategy to study Sa virulence properties (42,65,66). However, these studies have largely compared exoproteomes of single mutants to their WT counterparts (26,67,68) or have elucidated secretion profiles of one strain in different nutrient conditions (69-71). In this study, we use label-free quantitation analysis to generate a large, multicomponent data set comparing exoproteomes in various combinations of different growth phases, nutrient environment and genetic background, thus generating a powerful data set to further studies of Sa pathobiology. We used minimal (RPMI) and rich (TSB) media to mimic environmental conditions encountered by Sa in vivo. Observed differences in the secretion profiles of Sa grown in minimal and rich media suggest the need for greater understanding of the regulation of Sa virulence factor production. Additionally, our study highlights the importance of nutrient availability for the design of in vitro experiments to better address the expression of virulence factors under different conditions.

The same color scheme is used as for the protein data (i.e. the cytotoxicity data were also transformed to z-scores; dark purple (most cytotoxic) to white (least cytotoxic)). The expression of orthologues was compared across the reference strains using LAC as the pivot-strain. The reciprocal best blast hits for every selected virulence factor in each reference strain were determined, peptide intensities for all ortho-conserved and ortho-unique PSMs were summed, results were log10 transformed and z-scores were calculated. The resulting data were clustered using complete linkage agglomerative clustering and 1-r as the distance measure, where r is defined as the Pearson correlation. B, The cytotoxicities of all 13 reference strains are plotted. A higher percent of dead cells (hPMNs) indicates greater cytotoxicity. Intoxication of hPMNs (from three donors ± the standard error of mean) with a titration of culture filtrates from the reference strains is shown. Cell death was measured with CellTiter metabolic dye. The CC8 strains are the most cytotoxic strains, CC5 strains are moderately cytotoxic, and the CC1 and CC30 strains have low cytotoxicity.

Molecular & Cellular Proteomics 16 Supplement 4 S25
We have shown the power of label-free quantitative proteomics for the prediction of the potential virulence of Sa strains. As the pathogen evolves to evade the host immune system, profiling of the exoproteome could assist in predicting the virulence potential of newly emerging strains, which is critical to improve treatment and prevention methods. However, comparing the exoproteome across different bacterial strains presents a technical challenge. Usually quantitative comparisons using mass spectrometry are done across proteomes with similar backgrounds, where very few of the proteins differ and most of the proteins do not change. In stark contrast, the exoproteome of Sa strains shows a large diversity of protein variants, including gene deletions, gene duplications (paralogues) or allelic differences (orthologues). The most common workflow in proteomics involves analysis of the proteomes after proteolytic digestion into peptides rather than as intact proteins. Peptide sequences are then characterized using mass spectrometry and each sequence is in turn assigned to a protein or protein isoforms, which is in turn linked to a gene. Here, searching the peptide data against a database containing all analyzed strains resulted in matches to multiple protein orthologues and paralogues. Further confounding the analysis of the mass spectrometric data is that genome annotation of different Sa strains is neither uniform nor complete, and the entire sequence of a protein is generally not observed by mass spectrometry. Indeed, the sequence coverage of a protein can differ dramatically (~3-100%). Thus, it is challenging to group the peptide data and allow for a one-to-one comparison across different strains. For instance, the leukocidins are known to differ between strains and share sequence similarities between family members (16,72), as evidenced by pronounced cross-reactivity of anti-leukocidin antibodies (73).
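Sequence coverage, as used above, is the percentage of a protein's residues that fall within at least one identified peptide. A minimal sketch of that calculation follows; the function name, the protein sequence and the peptides are invented for illustration and do not come from the data set.

```python
def sequence_coverage(protein, peptides):
    """Percent of residues covered by at least one observed peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            # Mark every residue spanned by this occurrence of the peptide.
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Made-up 18-residue protein with two identified peptides (11 residues covered).
protein = "MKTLLAGRSEQWKVNDPR"
peptides = ["TLLAGR", "VNDPR"]
coverage = sequence_coverage(protein, peptides)
```

Because only the residues between the observed peptides' boundaries count, two strains can yield very different coverage for the same orthologue, which is one reason a one-to-one comparison is difficult.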
To be able to align the gene products of the 13 strains, we compiled protein groups of all 13 strains using orthologous genes as defined by "Reciprocal Best Hits" (RBH) comparisons based on BLAST searches (45). To compare the production of the orthologues across all the reference strains of interest, we selected one strain to serve as a pivot-strain (LAC) and proceeded to search for reciprocal best blast hits for every one of its gene loci. This procedure allowed us to compare across the 13 strains while preventing orthologues with sufficiently different amino acid sequences from appearing as separate protein groups, thus allowing direct comparison of each orthologue across all strains. However, to ensure that quantitation of orthologues is based strictly on comparable mass spectrometry features (i.e. identical peptides), quantitation was restricted to peptides present in the predicted amino acid sequence of all defined orthologues for a given pivot gene locus and in no other protein entry (i.e. the peptides must be present in all putative orthologues and must be unique to the putative orthologue group).
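The two rules described above, the reciprocal best hit criterion and the restriction to peptides shared by all orthologues and by no other protein, can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the best-hit tables are assumed to have been derived from BLAST results beforehand, and all protein identifiers, sequences and function names are hypothetical.

```python
def reciprocal_best_hits(best_pivot_to_other, best_other_to_pivot):
    """Keep pairs where each protein is the other's best hit.

    Both arguments map a query protein id to the id of its best hit in
    the other strain (assumed to be precomputed from BLAST output).
    """
    return {p: o for p, o in best_pivot_to_other.items()
            if best_other_to_pivot.get(o) == p}


def quantifiable_peptides(peptides, group_ids, sequences):
    """Peptides present in every orthologue of the group and nowhere else.

    sequences maps protein id -> predicted amino acid sequence for all
    protein entries in the combined database.
    """
    group = set(group_ids)
    kept = []
    for pep in peptides:
        in_all_members = all(pep in sequences[g] for g in group)
        in_other_entry = any(pep in seq for pid, seq in sequences.items()
                             if pid not in group)
        if in_all_members and not in_other_entry:
            kept.append(pep)
    return kept


# Hypothetical best-hit tables between the pivot strain and one other strain.
best_pivot_to_other = {"pivot_1": "other_1", "pivot_2": "other_2"}
best_other_to_pivot = {"other_1": "pivot_1", "other_2": "pivot_1"}
rbh = reciprocal_best_hits(best_pivot_to_other, best_other_to_pivot)

# Hypothetical sequences: other_1 is the orthologue, other_2 a paralogue.
sequences = {
    "pivot_1": "MKTLLAGRSEQWK",
    "other_1": "MKTLLAGRSDQWK",
    "other_2": "MKTVLAGRSEQWK",
}
usable = quantifiable_peptides(["TLLAGR", "SEQWK", "SDQWK"],
                               ["pivot_1", rbh["pivot_1"]], sequences)
```

In this toy case only TLLAGR is retained: SEQWK is missing from one orthologue and also occurs in the paralogue, and SDQWK is absent from the pivot protein. The same logic would apply to each pivot gene locus across all 13 strains.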
Going forward, we believe our strategy of combining mass spectrometry with the computational alignment of orthologues and paralogues (a pangenome) will be an increasingly important tool to comprehensively characterize the exoproteome (74). We believe that with careful data integration and analysis, exoproteome characterization has the potential to become an indispensable tool for predicting the potential virulence of pathogens, especially in the case of newly emerging strains.

DATA AVAILABILITY
All raw mass spectrometry data and search results have been deposited to the ProteomeXchange Consortium via the MassIVE partner repository with the data set identifiers: ProteomeXchange PXD005203 and MassIVE MSV000080260.