Protein Expression Regulation under Oxidative Stress

Oxidative stress is known to affect both translation and protein turnover, but very few large scale studies describe protein expression under stress. We measure protein concentrations in Saccharomyces cerevisiae over the course of 2 h in response to a mild oxidative stress induced by diamide, providing detailed time-resolved information for 815 proteins, with additional data for another ∼1,100 proteins. For the majority of proteins, we discover major differences between the global transcript and protein response. Although mRNA levels often return to baseline 1 h after treatment, protein concentrations continue to change. Integrating our data with features of translation and protein degradation, we are able to predict expression patterns for 41% of the proteins in the core data set. Predictive features include, among others, targeting by RNA-binding proteins (Lhp1 and Khd1), RNA secondary structures, RNA half-life, and translation efficiency under unperturbed conditions and in response to oxidative reagents, but not chaperone binding. We are able to both describe general dynamics of protein concentration changes and suggest possible regulatory mechanisms for individual proteins.

son's disease (6), and cardiovascular disease (7). Moreover, protein aggregation is related to the impairment of the ubiquitin proteasome system, and both processes are hallmarks of several neurodegenerative diseases (8,9). Oxidatively modified proteins are also characteristic of cellular senescence and aging (10). The abnormal or prolonged production of oxidants is linked to DNA damage that results in gene mutations, altered gene expression and eventually cancer (11).
Baker's yeast has been successfully used as a model for neurodegenerative diseases and aging (12,13), and its response pathways to oxidative stress are evolutionarily conserved with those in mammals (3). The yeast response to oxidative stress comprises extensive transcription regulation, for example through activation of transcription factors Yap1, Skn7, Msn2, and Msn4 (14). However, oxidative stress also impacts translation and protein degradation, affecting protein expression levels in addition to changes at the mRNA level. Translation and protein synthesis are generally down-regulated during oxidative stress, but specific RNAs are independently regulated in their translation depending on the type of stress (15,16). For example, translation of the yeast Gcn2 protein kinase is inhibited, preventing phosphorylation of the eukaryotic translation initiation factor eIF2 (16). The reduced activity of the eIF2 complex results in decreased rates of translation initiation and protein synthesis (17,18). In general, ribosomal run-off and transit times are slower upon H2O2 exposure, but stress-regulatory factors are preferentially associated with ribosomes, suggesting increased translation. Several RNA-binding proteins play essential roles during oxidative stress (19), but their specific actions and targets are often unknown.
Cellular protein concentrations are also affected by proteolysis. The proteasome is the main protease complex responsible for degradation of unfolded, damaged, and un-needed intracellular proteins in eukaryotic cells. Proteasomal degradation decreases under strong oxidative stress and increases under mild oxidative stress (20). Nondegradable oxidized proteins are prone to cross-linking and aggregation, and the aggregates may interact with the proteasome, decreasing its efficiency (9). Although proteasome function is well defined during normal proteolysis, the exact expression and functional response of the proteasome to oxidative stress is still a matter of debate (21,22).
Although these examples illustrate that protein expression with respect to translation and protein degradation is heavily affected during oxidative stress, only a few small scale measurements of protein concentration changes in response to oxidative stress exist to date, and these studies do not compare protein levels to transcript levels (23-26). Time-dependent proteomics measurements with matching mRNA data are still scarce (27,28). We provide the first time-resolved concentration measurement of >1,900 yeast proteins during the 2 h after diamide-induced oxidative stress. We focus on protein expression changes, because the transcriptional response has been described extensively elsewhere (14). Integrating our measurements with transcription (14,29), translation (16,30), and other regulatory data, we characterize the general and specific proteome response to oxidative stress, highlighting possible regulatory mechanisms and their targets. We characterize expression patterns of groups of proteins and individual examples, such as Tsa1, a multi-functional protein involved in oxidative stress resistance (31), genomic instability (32), apoptosis protection (33), and prion formation (34).

Transcript Concentrations from Published Microarray Data-Transcript information was taken from a published data set (14). The data are relative, i.e. measurements refer to expression levels at time = 0 min. To estimate absolute mRNA concentrations, we multiplied the relative values at each data point with the expected average concentration of the mRNA under unperturbed conditions, as has been done previously (35). The data used in this study (14) correlate well with transcriptomics data from other studies (36,37) (data not shown), indicating that yeast mRNA expression changes measured in different laboratories are comparable.
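The scaling step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names, the example ratios, and the baseline of 2.0 copies per cell are hypothetical, and the sketch assumes the relative values are linear ratios to time = 0 min (log2 ratios would first be converted with the helper shown).

```python
def log2_to_ratio(log2_values):
    """Convert log2 ratios to linear ratios (only needed if the input is log2-scaled)."""
    return [2.0 ** v for v in log2_values]


def absolute_mrna(relative_series, baseline_concentration):
    """Scale ratios relative to time = 0 min by the unperturbed baseline concentration."""
    return [ratio * baseline_concentration for ratio in relative_series]


# Hypothetical example: ratios at the eight time points, baseline of 2.0 copies per cell.
ratios = [1.0, 0.5, 0.25, 0.25, 0.5, 1.0, 1.0, 1.0]
absolute = absolute_mrna(ratios, 2.0)
```

By construction, the value at time = 0 min equals the assumed baseline concentration.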
Protein Concentrations from Quantitative Shotgun Proteomics Experiments-Proteomics experiments were performed on yeast grown in conditions identical to those used by Gasch et al. (14). Briefly, we grew yeast DBY7286 cells to the early log phase in rich medium (YPD), treated them with 1.5 mM diamide, and collected 100 ml of cell culture at 0, 10, 20, 30, 40, 60, 90, and 120 min. The cells were still in the logarithmic growth phase when harvested (data not shown). From each fraction we extracted total soluble protein as described before (38). The cells were disrupted using glass beads, and cellular lysate was extracted by centrifugation at 5,000 × g for 50 min. Lysis buffer consisted of 25 mM Tris-HCl, pH 7.5, 5 mM DTT, 1.0 mM EDTA, 1× CPICPS (Calbiochem protease inhibitor mixture; Sigma). Protein concentration was measured, and lysate was diluted to 2 mg/ml with buffer (50 mM Tris-HCl, pH 8.0). 50 μl of diluted cell lysate was mixed with 50 μl of 100% trifluoroethanol and incubated at 55°C for 45 min (15 mM DTT). The sample was cooled to room temperature and incubated with 55 mM iodoacetamide in the dark for 30 min. The sample was then diluted to 1 ml with buffer (50 mM Tris-HCl, pH 8.0) and 1:50 (w/w) trypsin was added to digest for 4.5 h at 37°C. Tryptic digestion was halted by adding 2% (v/v) formic acid. The sample was lyophilized to 20 μl, resuspended in buffer C (95% H2O, 5% acetonitrile, 0.01% formic acid), and washed using a HyperSep C18 spintip (Thermo Fisher). The eluted sample was again lyophilized to 10 μl, resuspended in 120 μl of buffer C, and filtered through a Microcon-10 filter at 12,000 × g. The sample was stored at −80°C until LC-MS/MS analysis.
LC-MS/MS Analysis-The samples were injected into an LTQ-Orbitrap Classic (Thermo Electron) mass spectrometer and analyzed in a 5 to 90% acetonitrile gradient over 5 h via reverse phase chromatography on a Thermo BioBasic-18 column (150 mm × 0.10 mm inner diameter). Each of the runs was analyzed independently with Bioworks (Thermo Fisher), searching a database of yeast protein sequences (SGD, 2009). The results were combined for analysis by PeptideProphet (39) or ProteinProphet (40) and post-processed in the APEX pipeline (35,41) to estimate absolute and differential protein expression based on spectral counts. We accepted proteins as confidently identified if the ProteinProphet probability was above a cutoff corresponding to <5% global false discovery rate. Absolute protein concentrations were normalized to an average of 4,000 molecules/cell based on published estimates (35,42). Relative protein expression changes are calculated with respect to measurements at time = 0 min, log-transformed, and normalized to mean 0 and standard deviation 1. Significance of expression changes was calculated (relative to the measurement at time = 0 min) according to the method by Lu et al. (35). Cysteine-containing peptides were extracted from the prot.xml files provided by the software.
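The normalization steps named above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the APEX pipeline itself: the function names are hypothetical, the log base is taken as 10, the population standard deviation is used, and the set of values over which mean 0 and standard deviation 1 are enforced (per protein or per time point) is left to the caller because the text does not specify it. Zero or missing concentrations would need separate handling.

```python
import math
import statistics


def scale_to_mean(values, target_mean=4000.0):
    """Rescale absolute concentrations so that their average equals target_mean."""
    factor = target_mean / statistics.mean(values)
    return [v * factor for v in values]


def log_ratio_to_t0(profile):
    """Log10 ratio of each time point to the first (time = 0 min) measurement."""
    t0 = profile[0]
    return [math.log10(v / t0) for v in profile]


def zscore(values):
    """Shift and scale values to mean 0 and (population) standard deviation 1."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd for v in values]
```

A typical call chain would be `zscore(log_ratio_to_t0(profile))` for one protein profile.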
We conducted the experiment twice (biological replicates) and collected several technical replicates (repeat mass spectrometry measurements). The raw data are published at http://www.marcottelab.org/MSdata/, as data set 15. The pep.xml files are uploaded on Tranche (ProteomeCommons), data hash: Zz/J8b5YBb8yRqW4lukA-w17Mzk56f/ItjicIy8v87h1aXAxgrj8Hcbin323ovJR4ZBcdl9yVMppGza-9REHxiqhOJj7UAAAAAAAAW6w. More information on experimental replicates is provided in the supplemental materials (notes section and supplemental Fig. S1). Basic mass spectrometry data are provided in the supplemental materials.
Data Processing and Analysis-Earlier work has shown that protein concentrations are expected to be accurate within 2-fold on average (35), which is the lower boundary of expression changes that we would consider biologically meaningful. For 69% of the proteins, concentrations vary less than 2-fold across replicates (supplemental Table S1). High quality and reproducibility of the individual files from both biological and technical replicates allowed for pooling of all data sets to increase coverage (supplemental Fig. S1). Pooling of data sets has the advantage that for a given protein whose identification is subthreshold in individual data sets, the combined information from all data sets may be strong enough to push it above threshold. Details on data quality control are presented in the supplemental materials (supplemental Figs. S1-S5 and supplemental Tables S1-S4).
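As an illustration of the reproducibility criterion, the fraction of proteins that vary less than 2-fold across replicates could be computed as below. This is a sketch only: the text does not state how the variation was computed, so the ratio of the largest to the smallest replicate value is an assumption, and the function name and data layout are hypothetical.

```python
def fraction_within_fold(replicates, fold=2.0):
    """Fraction of proteins whose max/min concentration across replicates is below `fold`.

    replicates: dict mapping protein name -> list of positive replicate concentrations.
    """
    within = sum(1 for values in replicates.values() if max(values) / min(values) < fold)
    return within / len(replicates)
```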
Auto-correlation (see Fig. 2) was calculated using log-transformed absolute expression values, comparing protein and mRNA expression vectors of different time points against the vector at time = 0 min. We clustered the column-normalized expression profiles using ClusterX (43), extracting clusters with a correlation coefficient |R| ≥ 0.80 (see Fig. 1). Prior to clustering, absolute expression data were smoothed and then back-transformed into relative, log-normalized expression values. Smoothing involved recalculation of each data point as the average of the data point itself and the preceding and following data points, i.e. concentration(t) = average(concentration at t − 1, t, t + 1) (moving-average method). Fig. 1 shows smoothed log-normalized relative expression data; Fig. 3 shows raw (unsmoothed) data that have been log-normalized (base 10). The goal of our analysis is to reveal general trends in time-dependent mRNA and protein expression. For that reason, we chose the simple, but relatively drastic "moving average" smoothing method to eliminate noise in the data. The moving average method enables us to extract strong trends that are consistent across many genes (e.g. the drop in protein concentration at 20 min in cluster C). The method has the disadvantage that it dampens subtle expression differences of individual genes within one cluster, e.g. those observed for Ccs1 and Sod2 (see Fig. 3, B and C, respectively). For that reason we present the unsmoothed data in Fig. 3 to enable the reader to view the original data.
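The smoothing and auto-correlation steps can be sketched as follows. This is a minimal sketch under stated assumptions, not the original analysis code: the function names are hypothetical, the first and last time points are averaged over the neighbors that exist (the text does not describe edge handling), and Pearson correlation is assumed because the type of correlation coefficient is not stated.

```python
import math


def moving_average(series):
    """Replace each point by the mean of itself and its immediate neighbors."""
    smoothed = []
    for i in range(len(series)):
        window = series[max(0, i - 1): i + 2]  # shorter window at both ends
        smoothed.append(sum(window) / len(window))
    return smoothed


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def autocorrelation_to_t0(matrix):
    """Correlate the log10 expression vector at each time point with the one at time = 0 min.

    matrix[g][t] is the absolute (positive) expression of gene g at time index t.
    """
    logged = [[math.log10(v) for v in row] for row in matrix]
    t0 = [row[0] for row in logged]
    return [pearson(t0, [row[t] for row in logged]) for t in range(len(logged[0]))]
```

A low value at a given time index indicates that expression across genes has diverged from the unperturbed state at that time.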
To create a random model, we shuffled gene identifiers for the proteomics data and repeated the clustering with the new, synthetic mRNA-protein profiles (supplemental Fig. S7). Function analysis was performed with FuncAssociate (44). Reported function enrichments were significant with a p value of <0.001.
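The random model can be illustrated with a short sketch that pairs each mRNA profile with the protein profile of a randomly chosen gene. The function name, the dictionary layout, and the fixed seed are assumptions made for illustration; the clustering that follows the shuffle is not shown.

```python
import random


def shuffle_protein_profiles(mrna, protein, seed=0):
    """Pair each gene's mRNA profile with the protein profile of a randomly chosen gene.

    mrna, protein: dicts mapping gene identifier -> expression profile.
    Only genes present in both dicts are used.
    """
    genes = sorted(set(mrna) & set(protein))
    shuffled = genes[:]
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return {g: (mrna[g], protein[s]) for g, s in zip(genes, shuffled)}
```

Because identifiers are permuted rather than values resampled, the overall distributions of mRNA and protein profiles are preserved; only their pairing is randomized.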
We compiled a set of expression attributes (features) that we used to characterize cluster membership and to reveal possible underlying regulatory mechanisms. These attributes included both sequence-based and experimental attributes (see Table I). We excluded features that were invariant across all of the 815 genes in the core data set (e.g. targets of several chaperones and RNA-binding proteins), and features that showed high correlation to other features (R > 0.90), e.g. FOP and Codon Bias Index.
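The two exclusion rules can be sketched as a simple filter. This is an illustrative sketch, not the procedure used in the study: the function names are hypothetical, Pearson correlation is assumed, and keeping the first-listed feature of a correlated pair is an arbitrary choice because the text does not say which feature of a pair was retained.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def filter_features(features, max_r=0.90):
    """Drop invariant features and features correlated (R > max_r) with one already kept.

    features: dict mapping feature name -> list of values, one per gene.
    """
    kept = {}
    for name, values in features.items():
        if len(set(values)) < 2:
            continue  # invariant across all genes
        if any(pearson(values, other) > max_r for other in kept.values()):
            continue  # redundant with a feature that was kept earlier
        kept[name] = values
    return kept
```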
To learn cluster membership, we used the WEKA machine learning software (45). Bagging with RandomForest performed best (supplemental Fig. S8). When learning individual (binary) cluster membership (member of cluster or not, {1,0}), we used cost-sensitive learning with a confusion matrix adjusted to the number of positives in the training data. Ten-fold cross-validation was used to evaluate learning success. The F-measure of prediction is the harmonic mean of precision and recall, calculated as F = 2 * precision * recall/(precision + recall). The closer the F-measure is to 1, the better the prediction. Similarly, the closer the area under the curve of a ROC plot is to 1, the better the prediction (see Table II).
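The F-measure formula above can be written out for a binary prediction as follows. The function name and the use of raw confusion counts as input are illustrative choices; returning 0.0 when there are no true positives is a convention adopted here to avoid division by zero.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, computed from confusion counts."""
    if true_pos == 0:
        return 0.0  # convention: no correct positive predictions
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

For example, a classifier with precision 0.8 and recall 0.5 has an F-measure of about 0.62, closer to the lower of the two values than their arithmetic mean would be.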
Attribute (feature) selection was also conducted in a cost-sensitive manner (CostSensitiveSubsetEval), using GreedyStepwise and CfsSubsetSelection as search and evaluator algorithms, respectively. Attribute selection cannot be evaluated for statistical significance, but the "merit" of the selected subset of features indicates the relative success of the procedure. After testing learning with all 157 features, we selected a subset of 17 features with the strongest predictive ability. Table II lists the t test scores for these 17 features for the three main clusters. Note that these features do not necessarily represent all features with significant t test scores (provided in supplemental Data File 1), but they are those that enable prediction of membership in the clusters. The supplemental materials also describe further details on clustering, learning algorithms, feature selection, etc. (supplemental Fig. S8 and Tables S5-S7).
Sequence motifs were identified using MEME (46) with the settings "any number of repetitions" and 4 ≤ width ≤ 10 (supplemental Fig. S12). Supplemental Data File 1 contains detailed information on the data set of this study. Supplemental Data File 2 contains primary information on peptide and protein assignments.

Concordance and Discordance between Protein and mRNA Expression Changes
Our experiments produced absolute protein expression data for a total of 1,907 proteins. Fig. 1 shows the normalized, log-transformed expression changes for both mRNA (14) and protein expression for a core set of 815 proteins that have data available for ≥6 of the eight time points. Protein concentrations cover 5 orders of magnitude (supplemental Fig. S4) and show a maximum of ∼200-fold expression change. Even after accounting for delays caused by translation, most proteins (>80%) have protein expression profiles that are very different from their corresponding mRNA expression profiles (Fig. 1 and supplemental Figs. S6 and S7), suggesting extensive regulation at the level of translation and protein degradation.
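The selection of the core set (data for at least six of the eight time points) can be sketched as a simple filter. The function name and the use of `None` to mark a missing measurement are assumptions made for illustration.

```python
def core_set(profiles, min_points=6):
    """Keep proteins with measurements at `min_points` or more time points.

    profiles: dict mapping protein name -> list of values, with None for missing data.
    """
    return {
        name: values
        for name, values in profiles.items()
        if sum(v is not None for v in values) >= min_points
    }
```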
Transcript and protein expression responses display very different kinetics, as evidenced by an auto-correlation analysis (Fig. 2). Most transcriptional changes occur at ∼30 min after stress induction, indicated by the lowest auto-correlation at this time point (Fig. 2). Ninety minutes after treatment, many transcript abundances have returned to normal levels (high auto-correlation), consistent with previous results on transcription and mRNA degradation (29). In contrast, most of the protein expression response is much slower (Fig. 2), and expression profiles continuously diverge even 2 h after treatment. For many proteins, we observe strong protein abundance changes at 10-20 min after treatment (see examples below), which are later complemented by different expression patterns. This first and early response occurs entirely at the protein level, before the majority of the transcription response. For individual proteins, we observe a 10-20 min delay between the mRNA and protein response (supplemental Fig. S7). In contrast to transcript abundances, concentrations for many proteins are not yet back to normal even 2 h post-stress treatment. Because we did not continue our measurements beyond 2 h, we cannot directly compare the expression changes with those from a previous study using rapamycin (28).

[Figure legend fragment: ... Table II. Gray denotes missing values.]
Despite conducting the experiment in log phase, some of the observed expression changes may not be due to an oxidative stress response, but to changing conditions in the batch culture. Although recent work has shown that exponential growth in batch culture is a good model of steady state (47), future studies may conduct the experiments in continuous growth culture or include further controls at additional time points. We also note that our method estimates concentrations of unmodified proteins; proteins that are heavily posttranslationally modified will not be detected, which lowers their apparent concentration. (However, see discussion of cysteine oxidation below.)

Similar to transcript abundances (14), protein abundance profiles show distinct clusters of coregulated proteins (Fig. 1). We observe 12 clusters with ≥10 members whose combined mRNA and protein expression profiles are highly similar (R ≥ 0.80, smoothed data) (supplemental Fig. S10). The three largest clusters, called A, B, and C, have 127, 76, and 66 members, respectively, and are described in detail below. Proteins in these clusters have distinct characteristics (see Table II), including functional biases (p value < 0.001). Fig. 3 shows for each cluster examples of proteins with roles during the oxidative stress response. The clusters describe approximate expression patterns; the expression for individual proteins within each cluster may vary. In contrast to the smaller clusters, membership in clusters A, B, and C can be predicted using a subset of 17 of the 157 expression attributes that we compiled (Tables I and II). The tested features include amino acid composition, codon bias, targets of RNA-binding proteins or chaperones, RNA secondary structure, measures of transcript and protein stability, as well as translation efficiency. Features with predictive power suggest molecular mechanisms that may cause the observed expression patterns.

Genes with Decreasing Protein and mRNA Abundance
Cluster A-The largest cluster (A, 127 proteins; Fig. 3A) is strongly enriched for ribosomal proteins, translation factors, and tRNA synthetases (p value < 0.001; supplemental Table S5). Both protein and mRNA abundances are immediately down-regulated after stress treatment; mRNA abundances start returning to normal at ∼40 min. Cluster membership can be predicted well (area under the curve = 0.80; Table II). Ribosomal proteins are generally highly abundant under normal conditions, in accordance with their genomic sequences that are characterized by high codon adaptation indices, little structured 5′-UTRs, and high protein production rates (Table II; p value < 0.001, |t value| > 3.40). However, proteins in cluster A are also significantly less stable than proteins from clusters B and C, as indicated by their high intrinsic structural disorder (48) (p value < 0.001, |t value| > 3.40). Such instability is consistent with the observed decrease in protein abundance, despite recovery of the mRNA levels. In response to mild oxidative stress, translation efficiency decreases in cluster A, as measured through ribosomal association with the mRNA (30). Decreasing translation and short protein half-lives explain the decrease in protein abundance despite recovery of mRNA expression levels (Table II and Fig. 3A). Fig. 3A shows examples of cluster A: two aminoacyl tRNA synthetases (Gln4 and Ils1), two ribosomal subunits (Rps11b and Rps2), and the eukaryotic translation initiation factor 4B (Tif2). In mammals, eIF4B is a target of the RNA-binding protein TIAR, down-regulating translation (19,49). In yeast, Tif2 mRNA is also a target of Lhp1 as discussed below.
Members of cluster A are targets of significantly more RNA-binding proteins than the average protein in the data set (p value < 0.001; supplemental Data File 1). One significant predictor of cluster membership is the RNA-binding protein Lhp1. Lhp1, the La homologous protein, is required for maturation of tRNA and U6 small nuclear RNA precursors, and it acts as a molecular chaperone for RNAs transcribed by polymerase III (50,51). Lhp1 is required for the normal pathway of tRNA maturation through protection of nascent transcripts from exonucleolytic degradation. Lhp1 also binds the coding RNAs of a number of genes and gene families, including ribosomal mRNAs, Hac1, and other genes involved in the unfolded protein response and its own Lhp1 transcript (52). Lhp1 targets are significantly enriched in cluster A (54 of 126, p value < 0.001), explaining the predictive power of Lhp1 for cluster membership. One can hypothesize that Lhp1 may stabilize coding mRNAs in a manner similar to its chaperone function with noncoding RNAs, resulting in the observed increase in mRNA levels after 40 min (Fig. 3A).
Furthermore, targets of the RNA-binding protein Khd1 are significantly depleted in cluster A (p value < 0.001, |t value| > 3.40), making it a predictor of cluster membership (Table II). Khd1 has been shown to repress translation of bud-localized mRNAs (53,54), which is consistent with the presence of many highly abundant proteins (e.g. ribosomes) in cluster A.
Secondary structures in both the coding and untranslated regions impact transcript stability and translation efficiency. Indeed, a significant lack of secondary structures in the 5′-UTR may support high translation initiation among cluster A proteins (Table II; p value < 0.001, |t value| > 3.40). Cluster A also has some members with a large number of secondary structures, i.e. RNA double-strands, in the coding strand (Table II; p value < 0.001, |t value| > 3.40), the biological reason behind which remains to be investigated.
Finally, proteins in cluster A are significantly enriched in arginine (p value < 0.001, |t value| > 3.40) and, accordingly, have a higher isoelectric point than other proteins (Table II). Ribosomal proteins and translation factors, which are abundant in cluster A, bind to RNA, and many RNA-binding domains in these proteins are rich in arginine, e.g. in the RG, RGG, or RS motifs (55).

[Fig. 3 legend fragment: ... Table II. The scales are adjusted to be the same across each row. The mRNA expression pattern of Ccs1 deviates from the average expression in cluster B, which may be an artifact of hierarchical clustering.]

TABLE I Potential predictors of translation and protein degradation regulation
A total of 157 attributes were analyzed for their ability to explain membership of proteins in expression clusters as identified in the data in Figure 1. We assembled experimental data sets, as well as sequence features that are known to relate to post-transcriptional expression and protein degradation. These attributes include binding of RNA-binding proteins (putative regulators), protein stability estimates (experimental and theoretical), measurements of translation efficiency and transcript stability, sequence features, and a few other features outside these categories. References are provided in parentheses. DISEMBL, DISorder predictor from the European Molecular Biology Laboratory; MIPS, Munich Institute for Protein Sciences; uORF, upstream ORF; PARS, parallel analysis of RNA structure (experimental measure of double-strandedness in RNA); PEST, proline, glutamate, serine, threonine degradation signal; RBP, RNA-binding protein.

Data type: PARS score in the coding region, 3′- and 5′-UTR. Calculated were the average, standard deviation, relative standard deviation, minimum, maximum, and median score across the whole sequence, and across the first and last ten nucleotides of the sequence.
Source/comment: The higher the PARS score, the higher the probability of nucleotides in the sequence to be in double-stranded conformation. RNA secondary structure influences transcript stability and/or accessibility to regulators and ribosomes (103).

Genes with Decreasing mRNA and Constant or Increasing Protein Abundance
Features Common to Clusters B and C-Clusters B and C have several characteristics in common that distinguish them from cluster A (Table II and Fig. 3, B and C). They both are significantly enriched in proteins of the direct stress response (p value < 0.001; supplemental Table S5): cluster B contains many oxidoreductases and chaperones, whereas cluster C is enriched for components of the proteasome (Table II).

TABLE II (legend)
The three largest clusters of expression patterns (R > 0.80) as indicated in Figure 1 (of 12 clusters with >10 members). Function enrichment analyzed with FuncAssociate (p value < 0.001) (44). The F measure is the harmonic mean of precision and recall. Combined prediction aims to predict membership for all 12 clusters simultaneously, i.e. membership in cluster (A, B, C, D, ...); individual predictions predict membership of a gene in one cluster at a time, i.e. in cluster A (yes or no). All of the attributes (features) used are listed in Table I with more detailed descriptions. The value listed next to the selected attribute describes the result of a t test. A t value of >3.40 is significant at a p value of <0.001 level for all three clusters (given the cluster size) and is printed in bold type. Negative and positive t values indicate depletion and enrichment of the feature in the test set, respectively. [T]5-10 and [TA]3-6 are binding motifs for the poly(A)-binding proteins Pub1 and Pab1, respectively (54). AUC, area under curve (where the curve is the receiver-operator characteristic), the closer to 1 the better the prediction; PARS, parallel analysis of RNA structure, i.e. experimental measure of double-strandedness in RNA taken from Ref. 103.
Clusters B and C are each only about one-half to two-thirds the size of cluster A, and their membership is less easily predicted (Table II). Proteins in both clusters have a low degree of intrinsic disorder, suggesting high protein stability (Table II). The mRNAs in both clusters are also significantly more stable than other mRNAs under normal conditions (Table II; p value < 0.001, |t value| > 3.40), in contrast to the transcript response to stress (Fig. 3, B and C). Thus, transcript stability may be subject to stress-related regulation.
Clusters B and C are also enriched for binding sites of the poly(A)-binding protein Pab1 (Table II), and all three clusters show binding sites for another poly(A)-binding protein, Pub1. We could not match any other motifs to putative regulators (supplemental Fig. S12). Pab1 binds to the poly(A) tails of mRNAs and interacts with eIF4G to promote (cap-dependent) translation initiation (56), consistent with the stable or increasing protein levels in clusters B and C compared with the decreasing protein levels in cluster A. Pab1 also affects formation of stress granules (57), and it has been implicated in cap-independent translation through binding to an A-rich element in the 5′-UTR (58), which we observe in cluster B (Table II). The expression of Pab1 itself is only slightly affected by oxidative stress, and it is not a member of clusters A, B, or C (data not shown).
Cluster B-To cope with stress induced by thiol oxidation, a diverse set of antioxidant responses is triggered in yeast. Several antioxidant genes (peroxidases, disulfide reductases, and chaperones) are up-regulated and grouped together in cluster B. In cluster B, protein levels increase during the first 30 min, although many proteins have a 20-min lag in their response (Fig. 3B). At later time points, protein levels are constant or slightly decreasing. Cluster B contains one of the primary enzymes in the oxidative stress response, superoxide dismutase Sod1, and its chaperone, Ccs1 (Fig. 3B). Both proteins are essential for developing resistance against oxidative stress (59). Ccs1 is necessary for the folding of Sod1, forming active Sod1 from an apo-protein (60). Protein expression is consistent with this function; Ccs1 is present when Sod1 is present (Fig. 3B).
Besides Ccs1, two yeast glutaredoxins, Grx1 and Grx2, influence Sod1 function (61). These enzymes catalyze the reduction of intra- and interprotein disulfides and low molecular weight thiols such as glutathione, which is produced in abundance during diamide-induced stress (62,63). Grx1 and Grx2 can, similar to NADPH and thioredoxins, stimulate translation (64), and Grx2 may be regulated through in-frame start codons (65). Although the two enzymes have highly similar sequences (66), they differ in their structure and biochemical activity (67), and Grx2 accounts for most of the glutathione-dependent oxidoreductase activity (68). The functional difference is also reflected in the expression data, where Grx2 has a stronger response than Grx1 both at the mRNA and protein level (Fig. 3B). Both enzymes show stabilized protein expression levels compared with decreasing transcription 40 min after treatment, suggesting that protein stability is regulated. We find, for example, that Grx2 has fewer PEST degradation sites than the average protein (supplemental Data File 1).
In addition to glutaredoxins, mRNAs are up-regulated for thioredoxins and glutathione oxidoreductases, which fulfill crucial antioxidant roles during diamide-induced stress. In contrast to mRNA levels, which decrease slowly 30 min after treatment, protein concentrations remain constant, suggesting high protein stability or an increase in translation rate per available mRNA. Our data set comprises some members of the thioredoxin and glutathione systems: two peroxidases (Ahp1 and Gpx3), thioredoxin 2 (Trx2), glutathione synthetase (Gsh2) and, interestingly, the thioredoxin reductase 2 (Trr2) and glutathione reductase 1 (Glr1), enzymes that catalyze the final step in the reduction cascade of both systems (69). Fig. 3B shows Zwf1, the glucose-6-phosphate dehydrogenase that catalyzes the first, irreversible and rate-limiting step of the pentose phosphate pathway (70), which is an essential component of oxidative stress resistance (71,72). Zwf1 is involved in maintenance of cytosolic levels of NADPH, which in turn is an electron donor to several antioxidant systems. Despite decreases in mRNA abundance for 30 min after stress treatment, Zwf1 protein levels remain stable throughout the entire measurement time (Fig. 3B). The stability of Zwf1 may be linked to its being bound by eleven chaperones, more than observed on average for the proteins in our data set (average is four chaperones per protein; supplemental Data File 1).
Cluster B shows some enrichment in secondary structures in the 5′-UTRs of its member mRNAs, as well as depletion of structures in the first 10 nucleotides of the coding region. Secondary structures in the 5′-UTR are thought to hinder translation (73); thus the increase in protein levels may derive from protein stability regulation and not translation increase.
Cluster C-Cluster C is enriched for components of the proteasome (Table II). The 26 S proteasome holoenzyme is a multisubunit protease composed of the 20 S catalytic core capped with the 19 S regulatory particle, which recognizes ubiquitin-tagged proteins. Although the exact role of the proteasome during oxidative stress is still a matter of debate, most authors suggest that only the 20 S proteasome is responsible for the hydrolysis of oxidized proteins in a ubiquitin- and ATP-independent fashion (74,75). The 26 S proteasome, i.e. capped with regulatory particles, is more stress-sensitive than the 20 S core (76,77), and we observe no or only a few copies of 19 S subunits in our data (data not shown).
Protein expression of subunits of the 20 S core in cluster C (Fig. 3C) sharply decreases during the first 20 min but recovers at 30 min and is constant for the rest of the measurement time. The stabilized protein levels seem necessary to cope with the increasing levels of oxidatively damaged proteins. The "dip" in proteasome abundance at 20 min is only present at the protein level and not at the mRNA level, and it is a marked characteristic of cluster C (Fig. 3C). It is most pronounced among subunits of the 20 S proteasome but is also visible among other members of cluster C, e.g. transport proteins or superoxide dismutase 2.
We cannot tell from the data whether the proteasomal subunits are truly degraded, change localization, or are modified (e.g. through cysteine oxidation) and subsequently escape mass spectrometric detection. The observed changes in protein concentrations are not accompanied by changes in the fraction of cysteine-containing peptides (supplemental Fig. S5); thus cysteine oxidation does not influence the measurements of protein concentrations. The 20 S proteasome is, however, sensitive to oxidative stress, and its S-glutathiolation can affect proteasome activity (78,79). Oxidation of cysteines by diamide can impair protein function (2) and may trigger degradation of proteins. Little is known to date about degradation of the proteasome and its regulation, i.e. whether it occurs primarily by the lysosome (80) or by the proteasome itself (81). The increase in protein concentrations at later time points (Fig. 3C) may occur through replenishment via translation, reversal of the amino acid modifications, or protein localization changes.
The characteristic dip in protein expression also occurs in nonproteasomal stress proteins, for example the peroxiredoxins Ahp1 and Tsa1 (Fig. 3C). Tsa1 is a key regulator of the oxidative stress response, and an understanding of its regulation is of great importance. Tsa1 is generally expressed at high levels (82). Our data show a discrepancy between the regulation of Tsa1 protein and mRNA expression. Protein abundance decreases at first but then consistently increases throughout the measurement, in contrast to the transcriptional down-regulation 1 h after treatment. The Tsa1 mRNA can be bound by five different RNA-binding proteins, among which Yra1 and Mex67 may be regulators of post-transcriptional changes in concentration of the protein (83,84) (supplemental Data File 1).

Conclusions
Our work provides a large scale, time-resolved data set of yeast protein expression in response to oxidative stress. The protein measurements map directly to a transcriptome study that employed identical conditions (14), describing eight time points over 2 h after diamide treatment. Because of the intrinsic bias of mass spectrometry toward high abundance proteins, many transcription factors (of low abundance) are not included, but their roles in the oxidative stress response have been described elsewhere. Our analyses focus on protein expression changes beyond what can be explained by transcript changes, i.e. we examine the results of translation and protein degradation.
Overall protein expression changes reflect what is expected from the oxidative stress response: down-regulation of translation (cluster A) and up-regulation of oxidoreductases and chaperones (cluster B) and of the proteasome and other stress-response proteins (cluster C) (Figs. 1 and 3). However, for at least one-third of the genes in the data set, the time-dependent mRNA and protein expression profiles are different from each other (Fig. 1 and supplemental Fig. S7), and mRNA and protein fold changes differ by up to 2 orders of magnitude (data not shown), suggesting extensive regulation at the level of translation and protein degradation. Typical stress experiments monitor transcript changes 30-60 min post-treatment. Protein concentrations in our data often continue to change until 2 h post-treatment (Fig. 2), an observation to be considered when designing stress experiments.
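To make the phrase "orders of magnitude" concrete, the gap between an mRNA and a protein fold change can be expressed as the absolute difference of their log10 values. The following is a minimal sketch only; the function name and the example fold changes are hypothetical illustrations and are not taken from the data set.

```python
import math


def log10_fold_change_gap(mrna_fold_change, protein_fold_change):
    """Absolute difference between two fold changes, in orders of magnitude.

    Both arguments are ratios (treated / untreated) and must be positive.
    """
    if mrna_fold_change <= 0 or protein_fold_change <= 0:
        raise ValueError("fold changes must be positive ratios")
    return abs(math.log10(protein_fold_change) - math.log10(mrna_fold_change))


# Hypothetical example: mRNA drops 10-fold while protein rises 10-fold.
gap = log10_fold_change_gap(mrna_fold_change=0.1, protein_fold_change=10.0)
print(f"gap = {gap:.1f} orders of magnitude")
```

A gap of 2 on this scale corresponds to a 100-fold discrepancy between the transcript and protein responses.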
Integrating the mRNA and protein expression profiles with features of translation and protein degradation (Table I), we predicted membership in regulatory clusters for 41% of the proteins in the core data set (270 of 651), with an area under the curve of 0.66 to 0.88, i.e. the probability that the classifier will rank a random positive instance higher than a random negative instance (Table II). Predictive features focused on translation and protein degradation, because we did not aim to explain changes in transcription. However, many of the features likely play a role in both transcript and protein expression regulation.
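The probabilistic reading of the area under the curve given above can be illustrated by counting pairs directly. The sketch below is not the classifier or evaluation code used in this study; the function name and the scores are hypothetical, and ties are counted as one half by assumption.

```python
from itertools import product


def pairwise_auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs in which the positive
    instance receives the higher score; ties count as half a win."""
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for pos, neg in product(positive_scores, negative_scores):
        if pos > neg:
            wins += 1.0
        elif pos == neg:
            wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores for cluster members and non-members.
members = [0.9, 0.8, 0.4]
non_members = [0.7, 0.3, 0.2, 0.1]
print(round(pairwise_auc(members, non_members), 3))  # 11 of 12 pairs ranked correctly
```

On this scale a value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported range of 0.66 to 0.88 lies between the two.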
The 17 most predictive features included protein stability (measured as the presence of unstructured loops), RNA secondary structures in the 5′- and 3′-UTRs and the coding sequence (measured as double-strandedness), general mRNA half-life and translation efficiency under unperturbed conditions, and binding of the RNA-binding proteins Lhp1 and Khd1 (Table II). Changes in translation efficiency in response to menadione (30), but not hydrogen peroxide (16), were predictive of cluster membership, suggesting similarity between the diamide and menadione response. Interestingly, the predictive features did not include the 50 chaperones for which target data are available (85), suggesting that they have only a minor role in targeted protein expression regulation. This observation may change with future, more complete chaperone data sets.
Although much previous work has demonstrated general down-regulation of translation and translation regulators, e.g. phosphorylation of translation initiation factors (86), there is much less information on the specific effects of translational regulation during stress. Our data set resolves some of the discrepancies between mRNA expression and protein activity, and we provide hints for some regulatory mechanisms. The proteomics measurements are sensitive enough to describe the detailed dynamics of protein concentration changes that have been missed by transcriptome analysis, for example the temporary decrease in concentrations of reactive cysteine-containing proteins such as subunits of the 20 S proteasome.
For individual proteins, e.g. Tsa1, the proteomics data corroborate classic biochemistry experiments (87,88) and provide additional information on the time-dependent protein expression changes. The transcript and protein expression profiles for Tsa1 differ substantially (Fig. 3C), which could be caused by changes in protein localization, stability, or translation or by post-translational modifications. Future studies may provide many more time-resolved proteome measurements that will improve our understanding of general and specific post-transcriptional expression dynamics. They will also help to explain even more intricate processes such as the long term adaptation of cells to stress, involving regulation of translation (89) and protein degradation.