Reproducible and consistent quantification of the Saccharomyces cerevisiae proteome by SWATH-mass spectrometry

Abstract
Targeted mass spectrometry by selected reaction monitoring (S/MRM) has proven to be a suitable technique for the consistent and reproducible quantification of proteins across multiple biological samples and a wide dynamic range. This performance profile is an important prerequisite for systems biology and biomedical research. However, the method is limited to the measurement of a few hundred peptides per LC-MS analysis. Recently, we introduced SWATH-MS, a combination of data independent acquisition (DIA) and targeted data analysis that vastly extends the number of peptides/proteins quantified per sample, while maintaining the favorable performance profile of S/MRM. Here we applied the SWATH-MS technique to quantify changes over time in a large fraction of the proteome expressed in Saccharomyces cerevisiae in response to osmotic stress. We sampled cell cultures in biological triplicates at six time points following the application of osmotic stress and acquired single injection DIA datasets on a high-resolution 5600 tripleTOF instrument operated in SWATH mode. Proteins were quantified by the targeted extraction and integration of transition signal groups from the SWATH-MS datasets for peptides that are proteotypic for specific yeast proteins. We consistently identified and quantified more than 15,000 peptides and 2500 proteins across the 18 samples. We demonstrate high reproducibility between technical and biological replicates across all time points and protein abundances. In addition, we show that the abundance of hundreds of proteins was significantly regulated upon osmotic shock, and pathway enrichment analysis revealed that the proteins reacting to osmotic shock are mainly involved in carbohydrate and amino acid metabolism. Overall, this study demonstrates the ability of SWATH-MS to efficiently generate reproducible, consistent and quantitatively accurate measurements of a large fraction of a proteome across multiple samples.


Introduction
In systems biology and biomedical studies, targeted mass spectrometry via selected reaction monitoring (SRM; also known as multiple reaction monitoring, MRM) has emerged as a powerful technique for the consistent and reproducible quantification of proteins across numerous complex samples [1][2][3][4][5][6]. Optimal sets of precursor/fragment ion pairs, called transitions, uniquely represent a specific peptide. They constitute a definitive mass spectrometric assay for the detection of targeted peptides, and thus the proteins from which they derive, in the complex matrix of trypsinized biological samples [1,7]. Protein quantification is then performed by relating the intensity of the acquired transition signals to suitable reference signals. Most quantification strategies commonly used in proteomics are compatible with this method [8]. Recently, the high-throughput development of S/MRM assays has been achieved via the generation of MS/MS spectral libraries from the measurements of thousands of synthetic peptides representing proteotypic peptides [9]. Moreover, many experimental and bioinformatics workflows have been developed for assay generation, assay optimization, data evaluation and the dissemination of optimized S/MRM assays [10][11][12][13][14][15][16]. In combination, these developments have supported the creation of mass-spectrometric maps of entire proteomes of selected species including Streptococcus pyogenes, Mycobacterium tuberculosis and Saccharomyces cerevisiae [5][17][18][19] and the robust use of these resources to quantify specific protein sets across multiple biological samples.
Currently, targeted proteomics by S/MRM can be multiplexed to a maximum set of approximately 100 proteins that can be measured in a single LC-S/MRM run at optimal quantitative accuracy, limit of detection and dynamic range. The quantification of higher numbers of proteins per run compromises some of the performance parameters of the method due to well understood tradeoffs [8]. Attempts have been made to further increase the degree of multiplexing of S/MRM, either by automated adjustment of the scheduled detection windows [20] or by acquiring, in a data-dependent manner, the complete set of precursor-fragment ion pairs of a given assay [21]. Alternatively, the parallel reaction monitoring (PRM) approach, operated on a quadrupole-orbitrap mass spectrometer, has shown detection and quantification performance similar to or better than that obtained by SRM, due to the increased selectivity of the mass analyzer [22][23][24]. These approaches are promising, but their application relies on prior knowledge of the precursor ions that need to be targeted during data acquisition, and they are still subject to the above-mentioned tradeoffs.
Recently, we developed a novel MS strategy that combines data independent acquisition (DIA) of trypsinized protein samples with S/MRM-like, in silico targeted analysis of the acquired complete fragment ion maps [25]. We termed the method SWATH-MS, and applied the sequential isolation window acquisition principle [26] to repeatedly cycle, in a single injection, through 32 consecutive 25-Da precursor isolation windows (swaths). The process acquires fragment ion spectra of all precursors in a space defined by the 400-1200 m/z precursor range and a user-specified retention time window.
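The window layout described above can be written down in a few lines. The following is a minimal sketch that assumes ideal, non-overlapping windows; real acquisition methods may add a small overlap between adjacent windows, which is ignored here. The function names `swath_windows` and `window_for` are illustrative and not part of any instrument software.

```python
def swath_windows(start_mz=400.0, width_da=25.0, n_windows=32):
    """Return (lower, upper) m/z bounds of consecutive precursor isolation windows."""
    return [(start_mz + i * width_da, start_mz + (i + 1) * width_da)
            for i in range(n_windows)]


def window_for(precursor_mz, windows):
    """Return the index of the swath containing precursor_mz, or None if out of range."""
    for i, (lower, upper) in enumerate(windows):
        if lower <= precursor_mz < upper:
            return i
    return None


windows = swath_windows()
# 32 windows of 25 Da tile the 400-1200 m/z precursor range exactly.
print(len(windows), windows[0], windows[-1])
print(window_for(512.3, windows))
```

A precursor at m/z 512.3, for example, falls into the fifth window (index 4, 500-525 m/z), so its fragment ions would be searched in the fragment ion map of that swath.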
We used the prior information in MS/MS spectral libraries to extract groups of signals that uniquely identify a specific peptide, and to demonstrate that peptides could be identified and quantified over a dynamic range of 4 orders of magnitude, even when the precursors were not detectable in a survey MS scan. For the 45 proteins involved in the central carbon metabolism of yeast we demonstrated that the accuracy of quantification was equivalent to that of S/MRM [25]. However, due to the lack of adequate software tools at that time, the extensive high-throughput targeted data analysis of the SWATH-MS maps could not be fully demonstrated in that first study.
Here we demonstrate the multiplexing capabilities of SWATH-MS for the detection and quantification of significantly larger fractions of a proteome as compared to S/MRM, without compromising reproducibility, consistency and quantitative accuracy. We describe the large-scale deployment of fragment ion spectral libraries and the use of S/MRM-like analysis tools specifically adapted to SWATH-MS data for the detection and quantification of temporal changes of the S. cerevisiae proteome in response to osmotic stress.

Yeast culture, protein isolation and digestion
Three series of six cultures each from the yeast strain BY4741 MATa his3Δ leu2Δ met15Δ ura3Δ were grown in SD medium until they reached an A600 of 0.8. To apply the osmotic shock, 0.4 M NaCl was added to each 50 ml culture and the cells were harvested after 0 min (T0), 15 min (T1), 30 min (T2), 60 min (T3), 90 min (T4) and 120 min (T5). At the respective time points the culture media were quenched by addition of trichloroacetic acid (TCA) to a final concentration of 6.25% and the cells were harvested by centrifugation at 1500 g for 5 min at 4°C. The supernatants were discarded and the cell pellets were washed three times by centrifugation with cold (-20°C) acetone to remove interfering compounds. The final cell pellets were resolubilized in lysis buffer containing 8 M urea, 0.1 M NH4HCO3 and 5 mM EDTA and cells were disrupted by glass bead beating (5 times 5 minutes at 4°C). The total protein amount from the pooled supernatants was determined using the BCA Protein Assay Kit (Thermo, Rockford, US). Yeast proteins were reduced with 12 mM dithiothreitol at 37°C for 30 min and alkylated with 40 mM iodoacetamide at room temperature in the dark for 30 min. Samples were diluted with 0.1 M NH4HCO3 to a final concentration of 1.5 M urea and the proteins were digested with sequencing grade porcine trypsin (Promega) at a final enzyme:substrate ratio of 1:100.
Digestion was stopped by adding formic acid to a final concentration of 1%. Peptide mixtures were desalted using Sep-Pak tC18 reverse phase cartridges (Waters, Milford, MA) according to the following procedure: wet the cartridge with 1 volume of 100% methanol, wash with 1 volume of 80% acetonitrile, equilibrate with 4 volumes of 0.1% formic acid, load the acidified digest, wash with 6 volumes of 0.1% formic acid, and elute with 1 volume of 50% acetonitrile in 0.1% formic acid. Peptides were dried using a vacuum centrifuge, resolubilized in 100 µl of 0.1% formic acid and frozen at -20°C.
For in-depth fractionation experiments of NaCl-untreated yeast cells, the peptide mixtures were separated by off-gel electrophoresis (OGE) using a pH 3-7 IPG strip (Amersham Biosciences) and a 3100 OFFGEL Fractionator (Agilent Technologies) with collection in 12 wells, and then submitted to C18 clean-up. All samples were spiked with the retention time standard peptides iRT-Kit (Biognosys, Schlieren, Switzerland).
Peptides were then separated at a flow rate of 300 nl/min with a 180-minute linear gradient of 2% to 35% Buffer B. For shotgun experiments the mass spectrometer was operated with a "top20" method:

Generation of spectral library and database searching
The shotgun spectral library was generated using a total of 46 shotgun injections on an ABSciex [...] concatenated with 6750 corresponding "tryptic peptide pseudo-reverse" decoy protein sequences). For the search, we allowed for semi-tryptic digests and up to 2 missed cleavages per peptide, and we used carbamidomethylation as a fixed modification on cysteine and oxidation as a variable modification on methionine residues. The Sequest and Mascot search results were converted to pep.xml and then combined using iProphet (included in TPP version 4.5.2). The search results were sorted by decreasing iProphet probability and filtered at 1% FDR by decoy counting (iProphet score cut-off 0.0242) at the peptide spectrum match (PSM) level, resulting in 891'570 identified spectra, 78'605 unique peptides (4.7% FDR at the peptide level) and 5'125 proteins (including isoforms). Those spectra were used to build a consensus spectral library using SpectraST (included in TPP 4.5.2) [27]. The transition MS/MS coordinates for the peptides were then computed using an in-house Python script that used the SpectraST .sptxt library as input and retrieved the 3-4 most intense (singly or doubly charged) y or b fragment ions for each spectrum, subject to the following criteria: i) the fragment ion m/z was above 300 and outside the 25 Da swath (precursor isolation window) from which the fragment ion was acquired, and ii) the extracted m/z values matched the theoretical fragment ion masses within a tolerance of 0.05 Da. We only selected the transitions originating from proteotypic peptides (i.e. those uniquely matching a single protein isoform from the SGD database) that did not contain oxidized methionine. The final fragment ion library comprised 331'449 transitions for 83'520 proteotypic precursors (66'007 unique modified peptides matching 4'596 unique protein isoforms).
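The transition selection criteria listed above can be sketched as a simple filter. This is not the in-house script itself, which is not reproduced in the text; the `Fragment` record, the function name `select_transitions` and the example values are hypothetical and only illustrate the stated rules (y or b ions, charge 1 or 2, m/z above 300, outside the precursor swath, within 0.05 Da of the theoretical mass, at most four per spectrum).

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    ion_type: str          # "y", "b", or another annotation
    charge: int
    observed_mz: float
    theoretical_mz: float
    intensity: float


def select_transitions(fragments, swath_lower, swath_upper,
                       max_n=4, min_mz=300.0, tol_da=0.05):
    """Keep the most intense y/b fragments that pass the stated criteria."""
    kept = [
        f for f in fragments
        if f.ion_type in ("y", "b")
        and f.charge in (1, 2)
        and f.observed_mz > min_mz
        and not (swath_lower <= f.observed_mz <= swath_upper)
        and abs(f.observed_mz - f.theoretical_mz) <= tol_da
    ]
    kept.sort(key=lambda f: f.intensity, reverse=True)
    return kept[:max_n]


# Hypothetical consensus spectrum of a precursor isolated in the 500-525 swath.
spectrum = [
    Fragment("y", 1, 250.10, 250.10, 1000.0),  # rejected: m/z not above 300
    Fragment("y", 1, 510.30, 510.30, 900.0),   # rejected: inside the precursor swath
    Fragment("b", 1, 620.40, 620.30, 800.0),   # rejected: 0.1 Da from theoretical
    Fragment("a", 1, 640.30, 640.30, 750.0),   # rejected: not a y or b ion
    Fragment("y", 1, 700.40, 700.41, 700.0),
    Fragment("y", 2, 350.70, 350.71, 600.0),
    Fragment("b", 1, 800.45, 800.44, 500.0),
    Fragment("y", 1, 900.50, 900.50, 400.0),
    Fragment("y", 1, 950.50, 950.50, 300.0),   # passes, but ranks fifth
]
selected = select_transitions(spectrum, 500.0, 525.0)
print([f.intensity for f in selected])
```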
For each peptide, we appended its iRT value by calculating the average of the iRT values found for the corresponding precursor(s) of that peptide at 1% PSM FDR in the search results across all the shotgun runs by using a simple linear regression after re-alignment/re-scaling onto the spiked-in reference iRT peptides (Biognosys, Schlieren, Switzerland).
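As an illustration of this iRT assignment, the sketch below fits a least-squares line from observed retention time to iRT on the spiked-in reference peptides of each run, converts the retention time of a target peptide, and averages over runs. The reference values and retention times are invented for the example, and the helper names `fit_line` and `to_irt` are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def to_irt(rt, slope, intercept):
    """Convert an observed retention time into the iRT scale."""
    return slope * rt + intercept


# Hypothetical data: iRT values of three reference peptides, their observed
# retention times (minutes) in two runs, and one target peptide per run.
reference_irt = [0.0, 50.0, 100.0]
runs = [
    {"reference_rt": [10.0, 20.0, 30.0], "target_rt": 25.0},
    {"reference_rt": [12.0, 22.0, 32.0], "target_rt": 27.0},
]

per_run_irt = []
for run in runs:
    slope, intercept = fit_line(run["reference_rt"], reference_irt)
    per_run_irt.append(to_irt(run["target_rt"], slope, intercept))

library_irt = mean(per_run_irt)  # value stored in the assay library
print(per_run_irt, library_irt)
```

Although the second run is shifted by two minutes, both runs place the target peptide at the same iRT, which is the point of re-scaling onto the reference peptides before averaging.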

SWATH-MS targeted data extraction
Targeted data extraction of the SWATH-MS data was performed with the Spectronaut software from Biognosys (RC.2.0.3). Spectronaut processed the SWATH-MS data using the above-mentioned assay list of target peptides and a targeted extraction and scoring strategy similar to that of the S/MRM analysis tool mProphet [15]. In addition to the subscores used by mProphet, Spectronaut also used retention time prediction based on iRT [28], the m/z dimension in the SWATH-MS data, mass accuracy and isotopic distribution of fragment ions to identify a peptide. A maximum of four transitions was extracted for each targeted peptide, together with their corresponding decoy-transition groups, which were generated by pseudo-reversing the sequence of the targeted peptides.
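A common way to pseudo-reverse a tryptic peptide is to reverse the sequence while keeping the C-terminal residue (usually K or R) in place. The sketch below assumes this convention; the text does not state which variant Spectronaut implements, and `pseudo_reverse` is an illustrative name.

```python
def pseudo_reverse(peptide):
    """Reverse all residues except the C-terminal one, which stays in place."""
    if len(peptide) < 2:
        return peptide
    return peptide[:-1][::-1] + peptide[-1]


print(pseudo_reverse("PEPTIDEK"))
```

The decoy keeps the precursor mass and the tryptic C-terminus of the target, while the fragment ion masses change. Very short or palindromic sequences map onto themselves and would need separate handling in a real decoy generator.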
False discovery rates (FDR) were determined for SWATH-MS using an adapted decoy model similar to that used by mProphet [15]. This method is based on three critical steps, which can be used in any targeted proteomics experiment. The first step involves the generation of signal groups for decoy analytes, such that the resulting signals are consistent with the patterns in the real data, but correspond to undetectable analytes ( Figure S1). We assessed the quality of this step by the following analyses: we targeted three classes of peptides in a yeast sample, i) yeast peptides likely detectable in the sample (e.g. as determined by their identification by shotgun analysis of the sample), ii) decoy peptides generated from the yeast peptides, and iii) peptides that are not present in the sample, e.g. peptides from human proteins which are not contained in yeast proteins. If the identified decoy peptides are truly representing false identifications, the resulting (noise) signals have to be similar and the resulting score distributions have to overlap. In supplementary Figure S2A we performed such an analysis, and showed that the employed decoy model very accurately represented false identifications. The second step involves fitting a representative probability distribution to the histogram of the decoy scores.
Although in principle various types of probability distributions could be fitted, we observed that with the data at hand a Gaussian distribution provides a very good fit (supplementary Figure S2A). The fitted distributions allow us to calculate, separately for each peptide transition, the p-value of incorrect identification. As is well known, p-values do not account for the multiplicity of the peptides tested in the experiment. However, multiple methods allow us to use the p-values to control the FDR in the list of identifications. This constitutes the third step: we used the method of Storey et al. [29], which uses a two-group mixture model of the p-values to estimate the q-values and to control the FDR. In this study, the estimates based on this method turned out to be very accurate and generally slightly conservative (supplementary Figure S2B). This process of FDR estimation in targeted proteomics experiments has also been validated in a similar manner for S/MRM. For a detailed description please see the supplementary material of [15].
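The second and third steps can be illustrated with a small sketch: a Gaussian is fitted to the decoy scores, each target score is converted to a one-sided p-value, and q-values are derived in the spirit of Storey's approach. This is a simplified version that estimates the null proportion at a single fixed tuning value (lambda = 0.5) rather than smoothing over a range of values, and it is not the implementation used by mProphet or Spectronaut. All scores below are invented.

```python
from statistics import NormalDist


def p_values(target_scores, decoy_scores):
    """One-sided p-values under a Gaussian fitted to the decoy score distribution."""
    null = NormalDist.from_samples(decoy_scores)
    return [1.0 - null.cdf(s) for s in target_scores]


def q_values(pvals, lam=0.5):
    """Simplified Storey-style q-values with a fixed lambda for the pi0 estimate."""
    m = len(pvals)
    # Estimated proportion of true nulls (false targets), capped at 1.
    pi0 = min(1.0, sum(p > lam for p in pvals) / (m * (1.0 - lam)))
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values are monotone in p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvals[i] / rank)
        q[i] = running_min
    return q


decoys = [-2.0, -1.0, 0.0, 1.0, 2.0]
targets = [6.0, 5.0, 0.5, -0.5, -1.0, 0.0]
pvals = p_values(targets, decoys)
qvals = q_values(pvals)
accepted = [s for s, q in zip(targets, qvals) if q <= 0.01]
print(accepted)
```

With these toy scores only the two targets that lie far above the decoy distribution pass the 1% threshold.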
For the technical replicate measurements, the features confidently identified below the 1% FDR threshold were used to estimate the elution iRT value of the feature. We then recovered the features with q-values above 1% FDR but below 10% FDR whose elution iRT value was close (i.e., within 0.35 iRT units) to the estimated elution iRT. To evaluate the FDR after this identification recovery, we considered that the iRT tolerance window only covers 8.33% of the total extraction window, and therefore only 8.33% of the recovered false identifications meet the recovery criterion, assuming that false identifications are spread uniformly in the retention time space. This keeps the actual FDR below 1%.
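The arithmetic behind this bound can be made explicit. The sketch assumes the worst case in which the features considered for recovery contain false identifications at the upper q-value limit of 10%, and that these are spread uniformly over the extraction window, as stated above; the function name is illustrative.

```python
def fdr_after_recovery(q_value_limit, window_fraction):
    """Upper bound on the false fraction among recovered features.

    q_value_limit: largest q-value allowed for recovery (worst-case false fraction).
    window_fraction: share of the extraction window covered by the iRT tolerance.
    """
    return q_value_limit * window_fraction


bound = fdr_after_recovery(q_value_limit=0.10, window_fraction=0.0833)
print(f"{bound:.4%}")
```

Under these assumptions the bound is about 0.83%, which is consistent with the statement that the FDR stays below 1%.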

Statistical analysis
The investigation aimed at identifying proteins that significantly changed their abundance across the different time points. All the identified and quantified peak intensities were first transformed by the logarithm with base 2. To further filter out incorrectly identified or quantified transitions, we calculated the Pearson correlation between the intensity of each transition and the average intensity of the protein's transitions across all MS runs, and transitions with correlation below 0.5 were removed. The remaining data were subjected to constant normalization to equalize the median peak intensities of transitions between runs [30].
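The three preprocessing steps (log2 transformation, correlation filter, median equalization) can be sketched as follows. The data layout, function names and intensities are assumptions made for the example; the actual analysis scripts are those provided with the manuscript, not this code.

```python
import math
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def filter_transitions(log_intensities, min_r=0.5):
    """Drop transitions whose profile across runs correlates poorly with the
    average profile of all transitions of the same protein.

    log_intensities: dict transition name -> list of log2 intensities per run.
    """
    n_runs = len(next(iter(log_intensities.values())))
    average = [mean(v[r] for v in log_intensities.values()) for r in range(n_runs)]
    return {t: v for t, v in log_intensities.items() if pearson(v, average) >= min_r}


def equalize_medians(runs):
    """Shift each run (list of log2 intensities) to a common median."""
    run_medians = [median(r) for r in runs]
    target = median(run_medians)
    return [[v - m + target for v in r] for r, m in zip(runs, run_medians)]


# Hypothetical raw peak areas for three transitions of one protein in four runs.
raw = {
    "t1": [1024, 2048, 4096, 8192],
    "t2": [512, 1024, 2048, 4096],
    "t3": [8192, 4096, 2048, 1024],   # anti-correlated, e.g. an interference
}
logged = {t: [math.log2(a) for a in areas] for t, areas in raw.items()}
kept = filter_transitions(logged)
normalized = equalize_medians([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
print(sorted(kept), [median(r) for r in normalized])
```

Working on the log2 scale turns the constant normalization into a simple per-run shift, which is why the medians are equalized by subtraction rather than by scaling.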
Protein-level quantification and testing for differential abundance were performed using MSstats [31] based on a linear mixed-effects model [32]. The model decomposes log-intensities into the effects of times, biological replicates, SWATH features (i.e., a combination of peptides and transitions), MS runs, and statistical interactions. The model specified the reduced scope of biological replication. For each investigated comparison, MSstats provided model-based estimates of fold changes, as well as p-values that were adjusted to control the False Discovery Rate (FDR) at the cut-off 0.05 [33]. All the input and output files from the MSstats analysis, together with the R scripts used for the manuscript, are provided in Supplementary data 1.

Detection of yeast proteins by SWATH-MS
We first determined the fraction of the S. cerevisiae proteome that could be detected in a trypsinized, unfractionated yeast cell lysate by SWATH-MS. For that purpose, we selected MS coordinates consisting of precursor/fragment ion pairs, the relative fragment ion intensities and the peptide elution time for each targeted peptide, and extracted signal groups corresponding to these coordinates from SWATH-MS datasets. The MS coordinates were generated as prior information via in-depth fractionation and MS sequencing of yeast cell digests using the same model 5600 QqTOF mass spectrometer that was also used to acquire the SWATH-MS datasets (Figure 1). For the DDA measurements we generated protein extracts from cells in different perturbed conditions, including stages of a diauxic shift experiment and cells subjected to salt stress. Tryptic digests of these samples were fractionated by isoelectric focusing via off-gel electrophoresis and analyzed by DDA MS. The resulting MS/MS spectra were searched against the Saccharomyces Genome Database (SGD) and the identified peptides were statistically filtered at 1% FDR at the PSM level. We compiled the resulting MS/MS spectra into a spectral library using SpectraST [27] and reported the final number of spectra, peptides and proteins in supplementary Figure S3 and Table S1. Next, we used this spectral library and an in-house Python script to extract the coordinates of the MS/MS transitions to target in SWATH-MS (supplementary Table S2). In total, we targeted 331'449 fragment ion traces (transition signals) corresponding to 66'007 proteotypic peptides (PTPs) and 4'596 proteins in each SWATH-MS dataset. The fragment ion traces were extracted using the Spectronaut software, and the results were filtered at 1% FDR at the assay level using the error rate model originally described for mProphet [15].
The results indicated that 16178 peptides corresponding to 2578 proteins could be confidently detected at an FDR of 1%, as estimated by mProphet, in an unfractionated digest of a yeast cell lysate (supplementary Table S3). Overall, 71% of the proteins were identified with at least 2 peptides per protein (Figure 2A). According to the comprehensive western blot analysis performed on tandem affinity purification (TAP)-tagged yeast ORFs [34], the detected proteins spanned a concentration range of 4 orders of magnitude, from 1e6 down to 100 copies/cell (Figure 2B). Furthermore, the data showed that the detected proteins were not biased against low-abundance proteins, and that SWATH-MS could confidently detect > 300 proteins that could not be detected by the quantitative western blot analysis (Figure 2C). The fraction of the spectral library that was not detected by SWATH-MS may be explained by the fact that SWATH-MS is not yet as sensitive as the SRM technology [35]. Overall, these results demonstrate that a single-injection data independent acquisition (DIA), combined with targeted data extraction, has the power to detect more than 2500 proteins spanning the dynamic range of protein expression in yeast down to 100 copies/cell.

Reproducibility and consistency of proteome measurements by SWATH-MS
We next tested whether the same fraction of the yeast proteome could be consistently and reproducibly detected and quantified across multiple MS injections. For this purpose, we acquired four (n = 4) SWATH-MS datasets from an unfractionated tryptic digest of a yeast sample composed of extracts of cells at different states, and used the Spectronaut/mProphet software tools for targeted data extraction of the queried proteins. In total we detected an average of 18600 +/- 72 unique peptides corresponding to 2880 +/- 7 proteins per single run with an estimated FDR of 1% at the assay level (supplementary Table S4). Overall we identified on average 28% of the peptides (18600 / 66007) present in the spectral library, of which 80% could be confidently detected in all four injections, and more than 90% in three of four injections (Figure 3A). To evaluate the reproducibility of the method for proteome-wide quantification, we determined the coefficients of variation (CV%) of the integrated transition peak areas across injections. We integrated the fragment ion traces from assays that were confidently detected in all 4 runs, which corresponded to a total of 17552 peptides and 2333 proteins, respectively. Figure 3B indicates that for the majority of the assays (76%) the observed CV was ≤ 10%, and that for a further 20% of assays the CVs were between 10 and 40%. Furthermore, the results showed that assays detected with high signal intensities (>500) were quantified with CVs comparable to those detected with lower signal intensities (<100) (Figure 3C), demonstrating that peptides can be reproducibly quantified over a dynamic range of 4 orders of magnitude by SWATH-MS. Overall, these data indicate that the method is capable of detecting a significant fraction of the proteome across multiple injections with a high degree of reproducibility, and of quantifying proteins with high consistency across a dynamic range of at least four orders of magnitude.
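For reference, the coefficient of variation used above is the sample standard deviation of the integrated peak areas across injections divided by their mean. A minimal sketch with invented peak areas for two hypothetical assays measured in four injections:

```python
from statistics import mean, stdev


def cv_percent(peak_areas):
    """Coefficient of variation (%) of peak areas across replicate injections."""
    return 100.0 * stdev(peak_areas) / mean(peak_areas)


def fraction_at_or_below(cvs, threshold=10.0):
    """Share of assays whose CV does not exceed the threshold (in %)."""
    return sum(cv <= threshold for cv in cvs) / len(cvs)


# Hypothetical integrated peak areas, one value per injection.
assays = {
    "assay_A": [100.0, 110.0, 90.0, 100.0],
    "assay_B": [200.0, 400.0, 100.0, 300.0],
}
cvs = {name: cv_percent(areas) for name, areas in assays.items()}
print({name: round(cv, 1) for name, cv in cvs.items()})
print(fraction_at_or_below(list(cvs.values())))
```

Here `stdev` is the sample (n - 1) standard deviation; whether the original analysis used the sample or population form is not stated in the text.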

Accuracy of proteome quantification by SWATH-MS
To evaluate the accuracy of quantification of large sets of proteins targeted in SWATH-MS datasets we prepared two mixtures containing tryptic digests of yeast cultures grown either on 14N- or 15N [...] Signals for the queried proteins were extracted with the Spectronaut/mProphet software and analyzed as described above. We integrated the fragment ion traces from those peptides that were confidently identified in all 6 runs and in both isotopic channels, i.e. a total of 3354 peptides from 780 proteins. For the 1:1 mixture, MSstats estimated a fold change of 0.92 +/- 0.14 SE, whereas for the 1:1/10 mixture it estimated a fold change of 10.47 +/- 4 SE (Figure 4), thus demonstrating that SWATH-MS achieves accurate relative quantification at the proteome level. The quantification results of the 6 SWATH-MS runs are reported in supplementary Table S5.

Quantification of changes in the yeast proteome upon osmotic shock
We next deployed the technique to quantify the changes in the S. cerevisiae proteome induced by osmolarity stress. The osmolarity stress response has been extensively studied in yeast, and osmolarity stress occurs frequently in the yeast cell's natural environment. Specifically, we investigated the time-resolved response of the yeast proteome to NaCl, a salt that is commonly used for inducing an osmotic stress response [36]. We added NaCl to a concentration of 0.4 M to cells in exponential growth phase and [...]
Next, protein-level quantification and testing for differential abundance were performed based on a linear mixed-effects model [32]. Several comparisons with respect to the baseline at time point 0 min (T0) (specifically, comparing protein abundances at time 0 min (T0) to the abundances at 15 min (T1), 30 min (T2), 60 min (T3), 90 min (T4) and 120 min (T5) after addition of salt) were tested for each protein.
Next, we performed unsupervised clustering of proteins according to their patterns of change in abundance over time. We enumerated all the 3^5 (= 243) possible patterns (i.e. significantly up-regulated, significantly down-regulated or no statistically significant change, at times 15 min (T1), 30 min (T2), 60 min (T3), 90 min (T4) and 120 min (T5) as compared to time 0 min (T0)). We retained four clusters with more than 50 proteins each for further examination (Figures 5A and 5B). Proteins in cluster 1 (n = 266) and cluster 2 (n = 67) are up-regulated along the complete time course, with a > 20 min delay in the response for proteins belonging to cluster 2. Pathway enrichment analysis with the DAVID tool (http://david.abcc.ncifcrf.gov/) revealed an over-representation of proteins involved in carbohydrate metabolism, such as the glycolysis-gluconeogenesis pathway (p = 4.4e-6), starch and sucrose metabolism (p = 7.2e-4) and the pentose phosphate pathway (p = 8.3e-3). These pathways are directly linked to glycerol, trehalose and glycogen metabolism, which are known to be induced by osmotic shock to trigger the production of glycerol, an essential osmolyte for osmoadaptation [36].
In contrast, down-regulated protein profiles were obtained for cluster 3 (n = 219) and cluster 4 (n = 567) upon addition of salt, with a > 20 min delay in the response for proteins belonging to cluster 3. Mainly enzymes involved in amino acid biosynthesis (i.e. the glycine, serine, threonine metabolism pathway, p = 3e-4, and the phenylalanine, tyrosine and tryptophan pathway, p = 2.3e-2) were found to be repressed, as previously suggested [37].
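The pattern-based grouping described above can be sketched as follows. Each protein receives one call per time point (1 for significant up-regulation, -1 for significant down-regulation, 0 for no significant change), proteins with identical call vectors form a cluster, and only clusters above a size threshold are retained. Excluding the all-zero pattern is an assumption of this sketch, and the protein names and counts are invented.

```python
from collections import defaultdict
from itertools import product

# All possible call vectors over the five comparisons T1..T5 versus T0.
ALL_PATTERNS = list(product((-1, 0, 1), repeat=5))


def cluster_by_pattern(calls, min_size=50):
    """Group proteins by identical call vectors and keep the large regulated groups.

    calls: dict protein name -> sequence of five values in {-1, 0, 1}.
    """
    groups = defaultdict(list)
    for protein, pattern in calls.items():
        groups[tuple(pattern)].append(protein)
    unchanged = (0, 0, 0, 0, 0)
    return {p: members for p, members in groups.items()
            if p != unchanged and len(members) > min_size}


# Hypothetical calls for 195 proteins.
calls = {}
calls.update({f"up_{i}": (1, 1, 1, 1, 1) for i in range(60)})
calls.update({f"late_up_{i}": (0, 1, 1, 1, 1) for i in range(55)})
calls.update({f"flat_{i}": (0, 0, 0, 0, 0) for i in range(70)})
calls.update({f"other_{i}": (0, 0, -1, 0, 0) for i in range(10)})

clusters = cluster_by_pattern(calls)
print(len(ALL_PATTERNS), {p: len(m) for p, m in clusters.items()})
```

In this toy example the pattern (0, 1, 1, 1, 1) plays the role of a delayed response, analogous to the clusters with a > 20 min delay.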

Validation of protein fold changes by S/MRM
To validate the protein fold changes obtained by SWATH-MS, we quantified a subset of 100 proteins in the 18 osmotic shock time course samples using S/MRM. Among these proteins, 24 were up-regulated along the complete time course according to SWATH-MS, 22 were down-regulated along the complete time course, and 5 were down-regulated along the complete time course after a > 20 min delay. The remaining 49 proteins were not regulated between any time points according to SWATH-MS. For each of the 100 proteins, we chose the highest-responding peptide that was confidently identified in SWATH-MS, together with its corresponding fragment ions (i.e. four transitions in total), for S/MRM quantification. The fragment ion traces of these peptides detected with S/MRM were extracted and integrated using Skyline [13] together with the corresponding peptides detected with SWATH-MS (Supplementary data 2). This allowed us to consistently process both data sets with the same integration/quantification parameters using the same software. Since the optimal peptides and transitions were chosen for the S/MRM quantification, all the quantified transitions were of relatively good quality, and no downstream filtering was necessary. The quantitative values were used as input to MSstats to estimate log fold changes and their associated confidence intervals, and to test proteins for differential abundance, between 15 min (T1), 30 min (T2), 60 min (T3), 90 min (T4), 120 min (T5) and the initial time point 0 min (T0).
Supplementary data 3 shows log fold change profiles and their associated confidence intervals for the 51 regulated proteins quantified by S/MRM and SWATH-MS. To be consistent, for the SWATH-MS results we only used the subset of the peptides and fragments that were also quantified by S/MRM, even though for SWATH-MS additional transition signals for these peptides and additional peptides were concurrently acquired. The data show that the majority of the confidence intervals overlap, indicating good agreement between S/MRM and SWATH-MS quantification. To further formalize the comparison between the methods, we compared the outcomes of tests for differential abundance. We classified the outcome of each test for differential abundance as significant up-regulation (denoted as 1), significant down-regulation (denoted as -1), or absence of significant regulation (denoted as 0) (supplementary Table 7). These tests for differential abundance were applied to the 5 time points of the study, resulting in five values per protein across the time course. Table 1 shows that for 64.7% of the proteins the conclusions from the two datasets agree in at least 4 time points. Conversely, for only 17.6% of the proteins do the conclusions from the two datasets agree in fewer than 3 time points, whereas 7 proteins had no agreement due to interferences or low quality peak shape (see supplementary data 4). The remaining discrepancies can also be due to other reasons. First, the S/MRM and SWATH-MS acquisitions occurred several weeks apart. Second, external factors such as differences in chromatographic conditions (3 hours in SWATH-MS versus 30 min in S/MRM) may generate differences in ionisation or ion suppression between both measurements, in a way that impacts the peptide quantification accuracy of one or the other method. Finally, the nature of the mass analysis in the two methods may result in differences in sensitivity and detection. Overall, the results demonstrate that SWATH-MS targeted analysis of complex samples can provide biological conclusions that are in high agreement with S/MRM, but at a much higher throughput.
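The agreement measure used in this comparison can be sketched as a count of time points at which both methods reach the same call (1, -1 or 0), tabulated over proteins. The protein names and calls below are invented for illustration and are not values from Table 1 or supplementary Table 7.

```python
from collections import Counter


def agreement_count(calls_a, calls_b):
    """Number of time points at which two call vectors agree."""
    if len(calls_a) != len(calls_b):
        raise ValueError("call vectors must cover the same time points")
    return sum(a == b for a, b in zip(calls_a, calls_b))


def agreement_table(srm_calls, swath_calls):
    """Histogram: number of agreeing time points -> number of proteins."""
    shared = [p for p in srm_calls if p in swath_calls]
    return Counter(agreement_count(srm_calls[p], swath_calls[p]) for p in shared)


# Hypothetical calls at T1..T5 versus T0.
srm_calls = {
    "protein_1": (1, 1, 1, 1, 1),
    "protein_2": (0, -1, -1, -1, -1),
    "protein_3": (0, 0, 1, 1, 1),
}
swath_calls = {
    "protein_1": (1, 1, 1, 1, 1),
    "protein_2": (-1, -1, -1, -1, -1),
    "protein_3": (1, 1, 0, 0, 0),
}
table = agreement_table(srm_calls, swath_calls)
print(dict(table))
```

Percentages such as those reported for Table 1 would then follow by dividing the number of proteins at or above (or below) a given agreement level by the number of compared proteins.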

Correlation of protein fold changes with transcriptomics
We next correlated the protein abundance profiles obtained by SWATH-MS with their corresponding transcript profiles across the four most significantly regulated pathways (i.e. the glycolysis-gluconeogenesis pathway, the pentose phosphate pathway, the glycine, serine, threonine metabolism pathway and the phenylalanine, tyrosine and tryptophan pathway). For this purpose, we used a transcriptomic data set that was previously generated for yeast treated under similar experimental conditions [37]. Figure 6 represents the quantitative patterns (as heat maps) for each transcript and [...]
Recently we developed the SWATH-MS technology, which combines the DIA acquisition approach with the targeted data analysis of S/MRM [35], and could thus demonstrate on a set of 60 peptides a higher sensitivity than that obtained by analysing the data with regular database searching. In the present study we demonstrated the multiplexing capabilities of SWATH-MS for large-scale quantitative proteomics studies. We were able to confidently detect around 2500 proteins spanning the dynamic range of protein expression in yeast in a single 3-hour sample injection. In contrast, S/MRM would have required 48 hours of instrument time to detect the same number of proteins in one single sample due to its lower multiplexing capacity (i.e. ~100 proteins/run). Thus, the SWATH-MS measurements for the 18 osmotic shock time course samples were completed in ≤ 3 days. Since the data structure of SWATH-MS data is equivalent to that of S/MRM, a similar/extended bioinformatics workflow was implemented. As in S/MRM, several chromatographic peak groups are extracted for the same targeted peptide and evaluated using an automated and objective probabilistic scoring model. To confidently identify the targeted peptides by SWATH-MS, we applied Spectronaut, a bioinformatics tool that builds on the scoring strategy of mProphet, which was developed for the automatic evaluation of S/MRM signals [15]. In addition to the chromatographic (S/MRM-like) scores (i.e. coelution, peak shape similarity, intensity and correlation of fragmentation pattern between peak groups and assays), SWATH-MS specific scores were added, such as mass accuracy and isotopic distribution of fragment ions. A combined score (i.e. discriminant score) was then calculated for each detected peak group and finally used for FDR estimation. The automated analysis of the 18 SWATH-MS runs by Spectronaut required ≤ 2 days and allowed the detection of ≥ 2500 yeast proteins with high confidence along the time course study. To further pinpoint the proteins that were significantly changing in abundance between the different time points, we applied MSstats, a statistical modeling framework for protein significance analysis previously designed for S/MRM experiments [32]. MSstats uses an intensity-based approach decomposing the MS signals obtained for each protein across isotopic labels, peptides, charge states, transitions, samples, and conditions. It has been shown that MSstats performs better in terms of sensitivity and accuracy than simple statistical methods like t tests. By applying MSstats to our SWATH-MS data sets, we could show that out of the 2589 quantified proteins, 333 and 786 were found to be significantly up- and down-regulated, respectively, along the complete time course study or with a delay of 20 min. Many of these proteins were shown to be involved in metabolic pathways and were known to be active upon osmotic shock [37]. In addition, we could show on a set of 51 yeast proteins that SWATH-MS delivered quantitative protein profiles similar to those of S/MRM along the 18 osmotic shock time course samples. Furthermore, the reproducibility of SWATH-MS runs was demonstrated with low within-run coefficients of variation (CVs) of ≤ 10% for the majority of the targeted peptides, with only minimal dependency on peptide abundance, results that are similar to those obtained by S/MRM [38].
In conclusion, the results demonstrate that SWATH-MS allows for the quantification of large sets of proteins across multiple samples with a precision, reproducibility and accuracy comparable to those obtainable by S/MRM. With the adaptation of S/MRM-based software tools for the SWATH-MS targeted data analysis, the technique can be rapidly applied to any type of systems biology or biomedical investigation, as has been successfully demonstrated in recent years with S/MRM, but with a higher throughput and a higher degree of multiplexing.

[...] Finally, the extracted chromatograms were scored using the probabilistic scoring model of mProphet and protein significance analysis was performed by MSstats.

[...] in abundance is shown in yellow. Color intensity reflects the corresponding log2 fold change.