Establishment of Dimethyl Labeling-based Quantitative Acetylproteomics in Arabidopsis *

Protein acetylation, one of many types of post-translational modifications (PTMs), is involved in a variety of biological and cellular processes. In the present study, we applied both CsCl density gradient (CDG) centrifugation-based protein fractionation and a dimethyl-labeling-based 4C quantitative PTM proteomics workflow in the study of dynamic acetylproteomic changes in Arabidopsis. This workflow integrates the dimethyl chemical labeling with chromatography-based acetylpeptide separation and enrichment followed by mass spectrometry (MS) analysis, the extracted ion chromatogram (XIC) quantitation-based computational analysis of mass spectrometry data to measure dynamic changes of acetylpeptide level using an in-house software program, named Stable isotope-based Quantitation-Dimethyl labeling (SQUA-D), and finally the confirmation of ethylene hormone-regulated acetylation using immunoblot analysis. Eventually, using this proteomic approach, 7456 unambiguous acetylation sites were found from 2638 different acetylproteins, and 5250 acetylation sites, including 5233 sites on lysine side chain and 17 sites on protein N termini, were identified repetitively. Out of these repetitively discovered acetylation sites, 4228 sites on lysine side chain (i.e. 80.5%) are novel. These acetylproteins are exemplified by the histone superfamily, ribosomal and heat shock proteins, and proteins related to stress/stimulus responses and energy metabolism. The novel acetylproteins enriched by the CDG centrifugation fractionation contain many cellular trafficking proteins, membrane-bound receptors, and receptor-like kinases, which are mostly involved in brassinosteroid, light, gravity, and development signaling. In addition, we identified 12 highly conserved acetylation site motifs within histones, P-glycoproteins, actin depolymerizing factors, ATPases, transcription factors, and receptor-like kinases. 
Using SQUA-D software, we have quantified 33 ethylene hormone-enhanced and 31 hormone-suppressed acetylpeptide groups, also called unique PTM peptide arrays (UPAs), each of which shares an identical unique PTM site pattern (UPSP). This CDG centrifugation protein fractionation, in combination with dimethyl labeling-based quantitative PTM proteomics and SQUA-D, may be applied to the quantitation of any PTM proteins in any model eukaryotes and agricultural crops, as well as in tissue samples of animals and human beings.

In vitro dimethyl labeling of peptides has emerged as one of the fastest and most dependable chemical-labeling strategies for quantitative proteomics since its introduction in 2003 (1). This approach has the advantages of high cost-effectiveness, high labeling efficiency, and minimal side reactions (2)(3)(4). This unique chemical labeling method utilizes formaldehyde and sodium cyanoborohydride to react with the primary amines of the N-terminal residues of peptides as well as with lysine side chains to generate dimethylamines (1). In vitro light and heavy isotope-coded dimethyl labeling can be achieved once the total cellular proteome is proteolytically digested into peptides (5). Dimethyl labeling together with label-free strategy or other types of chemical labeling, such as TMT (tandem mass tags) and iTRAQ (isobaric tags for relative and absolute quantitation), has become common choices for quantitative proteomics in cell lines, tissues, and organisms that cannot be labeled via metabolic isotopic labeling (6,7). In contrast to label-free quantitative proteomics, which normally requires a higher number of technical replicates both to ensure the consistency of sample preparation and to reduce the variability arising from LC-MS/MS analysis (3,8,9), dimethyl labeling may require fewer technical replicates as peptides are labeled and mixed at the beginning of peptide enrichment, which eliminates the variation derived from enrichment processes. In theory, there should be no limitations for applying this labeling method to samples from any biological source (10).
Because of the unique advantages of dimethyl labeling in quantitative proteomics, it has been applied to study the proteomes of many unicellular and multicellular organisms, including bacteria (11,12), human cells (1,13), mouse and rat cells (14,15), and plants (16 -18). Dimethyl labeling approaches have further been applied to quantitatively identify phosphoproteins, the predominant class of post-translationally modified proteins, in rat and mouse (19,20), zebrafish (21), and human cells (22,23). Similar examples of dimethyl labeling include glycoproteomics studies in rat (24,25) and human cells (26). The early work of Boersema et al. has combined dimethyl labeling-based quantitative proteomics and antibody-based enrichment to profile tyrosine phosphorylation in Hela cells (27).
To further develop a workflow for high-throughput dimethyl labeling-based quantitative post-translational modifications (PTM) 1 proteomic analysis, we took protein acetylation as an exemplary PTM proteomic study because it is widely recognized as one of the key PTMs involved in the regulatory mechanisms of many physiological processes (28 -31). Many acetylation sites have been identified in histone proteins (31)(32)(33)(34). The acetyl moieties of histones generally function as an epigenetic code that is recognized by "readers," which leads to changes in chromatin structure to modulate gene transcription (35). Acetyl-CoA is believed to be an important molecule for balancing gene transcription and metabolic pathways through protein acetylation (36). Given the importance of acetylation, thousands of acetylation sites have been previously found and the acetylation levels have been quantified through either isotopic labeling or label-free quantification strategies in bacteria (37,38), mammalian cell lines (39 -41), malaria parasites (42), mouse hearts (43), mouse livers (44 -46), and human cells (47).
In plants, protein acetylation is viewed as a widespread type of PTM that mediates a diverse range of biological processes and metabolic pathways through the modification of targets such as chlorophyll binding proteins, Rubisco subunits, and ATP synthase (48,49). Histone acetylases and deacetylases are known to regulate the acetylation levels of histone proteins (50). Many studies on acetylation at the single protein level, especially those of histone acetylation in various dicots and monocots, support the importance of protein acetylation in plant growth and development (51)(52)(53)(54)(55)(56)(57). To achieve a global view of acetylproteins in plants, acetylproteomics has been performed in numerous plant species, revealing nearly 1015 acetylproteins in rice (58 -60), 245 in soybean (61), 353 in stiff brome (62), 277 in wheat (63), 684 in strawberry (64), and 1383 in Arabidopsis (48,(65)(66)(67). These findings have greatly improved our understanding of plant acetylproteomes in general and have provided useful information for elucidating the mechanisms underlying the regulation of biological processes by protein acetylation.
Many proteomic approaches have been proposed to quantify either peptides or proteins (68 -71). Among them, the isotopic-labeling coupled with XIC has been widely used in many biological studies (68). Both SILAC (72) and dimethyl labeling-based quantitative proteomics (1) belong to this type of approach. To accurately analyze the data generated from these quantitative proteomics, various computer programs have also been constructed (73)(74)(75). Among them, there are several widely used software, such as MaxQuant (76), pQuant (77), XInteract (XPRESS, (78)), and Skyline (79). Their performances and robustness have been tested in many biological studies (80) except for a few of unique biological studies. Thus, more quantitative tools, such as DeMix-Q (81), PyQuant (82), moFF (83), and pyQms (84) were consequently developed. In the present study, we have also built an in-house PTM peptide quantitation software SQUA-D dedicated to the dimethyl labeling-based quantitative PTM proteomics. Several reasons are behind the development of SQUA-D software rather than using the common tools mentioned above directly. Firstly, these commonly used tools only quantify post-translationally modified peptides, whereas our aim was to quantify UPAs. Each UPA comprises a group of acetylpeptides with different missing cleavages but sharing the same UPSP. Analyzing UPAs allows us to quantify acetylpeptides in groups instead of analyzing each peptide individually. 
Secondly, SQUA-D introduces a batch effect adjustment function for quantification, as batch effects exist in almost all high-throughput experiments (85). Thirdly, MaxQuant does not support Mascot-based identification (86) for the subsequent quantitation. Considering that Mascot is one of the most widely used proteomic software, we aimed to integrate its identification results into our quantification workflow.
1 The abbreviations used are: PTMs, post-translational modifications; XIC, extracted ion chromatogram; 4C proteomic workflow, 4-step workflow of quantitative PTM proteomics consisting of (1) either in vitro or in vivo chemical labeling, (2) chromatographic separation and enrichment followed by mass spectrometry analysis, (3) computational analysis for identification, quantitation, and statistical evaluation, (4) confirmation via alternative method; SQUA-D, stable isotope-based quantitation-dimethyl labeling; UPAs, unique PTM peptide arrays; iTRAQ, isobaric tags for relative and absolute quantitation; TMT, tandem mass tags; Col-0, Columbia-0; ein3/eil1, ethylene insensitive 3/ethylene insensitive 3-like 1; AOA, aminooxyacetic acid; ACC, 1-aminocyclopropane-1-carboxylic acid; CDG, detergent-free cesium chloride (CsCl) density gradient; UEB, urea protein extraction buffer; t, top fraction; m, middle fraction; b, bottom fraction; MSPD, membrane-solubilizing and protein-denaturing buffer; F, forward mixing of peptide samples; R, reciprocal mixing of peptide samples; SCX, strong cation exchange; NCE, normalized collision energy; DDA, data-dependent acquisition; AGC, automatic gain control; FDR, false discovery rate; ANOVA, analysis of variance; SD, standard deviation; GO, Gene Ontology; L, light dimethyl labeling; H, heavy dimethyl labeling; PSM, peptide-spectrum match; OAP, overly acetylated proteins; UPSP, unique PTM site pattern; HSP, heat shock protein; HSF, heat stress transcription factor; BH, Benjamini-Hochberg.
To that end, SQUA-D software was developed to pair unidentified yet highly confident peaks with previously identified ones given that each sample contains both light-and heavy-labeled ions. As a result, SQUA-D analysis revealed significantly more signal-regulated acetylation sites than previous studies. This entire workflow of dimethyl-labeling-based acetylproteomics can be summarized as four steps of functional quantitative PTM proteomics (or abbreviated as 4C proteomic workflow), consisting of in vitro chemical labeling (i.e. dimethyl labeling), chromatographic separation and enrichment followed by mass spectrometry analysis, computational analysis by identification, quantification and statistical evaluation software and finally confirmation of identified PTM sites with alternative method.

EXPERIMENTAL PROCEDURES
Experimental Design and Statistical Rationale-The in vitro chemical labeling-based quantitative PTM proteomics ( Fig. 1; supplemental Fig. S1) is abbreviated as "4C proteomic workflow." The "C" was taken from an English letter in the title of each step of the quantitative PTM proteomics. After plant preparation and treatment ( Fig. 1B-I, including both control and the ethylene hormone-treated plants) and CDG centrifugation fractionation of the total cellular protein followed by tryptic digestion and peptide preparation (  (4) finally confirmation using alternative molecular and cellular methods. Normally, three biological replicates were performed to meet the minimum requirement of subsequent statistical evaluation. Given that multiple steps of preparation of PTM peptides via biological replicates, CsCl density gradient fractions, labeling and mixing of peptides and HPLC separation and enrichment, it is likely that there exist batch effects among replicates and multiple steps of preparation. Thus, batch effect adjustment was introduced to process the MS data (supplemental Fig.  S1 and supplemental Table S2). The peptide-spectrum match (PSM) cut-off threshold was set at a false discovery rate (FDR) of 1% for PTM peptide identification. The FDR was estimated based on the target-decoy strategy. Because of incomplete digestion of proteins and the presence of isoforms in a protein family, various peptides sharing the identical UPSP were merged into the UPAs or PTM peptide groups. To identify significantly altered UPAs, we frequently performed one-sample t-test on the log-ratios of the same UPAs identified and grouped from different samples. Following the t-test, a p-value was determined for each UPA. 
Finally, multiple hypothesis test correction by the Benjamini-Hochberg (BH) procedure (87) was applied to the p-values of the UPAs to produce a list of candidates that passed the correction at a BH-FDR < 0.05 or < 0.1, which means that the expected proportion of false positives among the selected UPAs is controlled at 5% or 10%, respectively.
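The Benjamini-Hochberg correction mentioned above can be sketched as follows. This is a minimal illustration of the standard procedure, not the SQUA-D implementation; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (BH-FDR) in the original input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, one per UPA.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
bh = benjamini_hochberg(p)
selected_005 = [i for i, q in enumerate(bh) if q <= 0.05]
selected_010 = [i for i, q in enumerate(bh) if q <= 0.10]
```

With these made-up inputs, the stricter 0.05 cut-off keeps fewer UPAs than the 0.1 cut-off, which is the trade-off between the two thresholds named in the text.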
Plant Materials, Growth Conditions, and Hormone Treatment-The wild-type Arabidopsis thaliana ecotype Columbia-0 (Col-0) seeds were purchased from the Arabidopsis Biological Resource Center (ABRC, Columbus, OH). The ein3/eil1 double mutants were gifts from Dr. Hongwei Guo at the South University of Science and Technology of China. The Arabidopsis plants were cultivated on solid agar medium in 7.7-cm diameter and 12.7-cm tall transparent white glass jars, containing 100 μM aminooxyacetic acid (AOA, an inhibitor of 1-aminocyclopropane-1-carboxylic acid (ACC) synthase frequently used to block the biosynthesis of endogenous ethylene) as previously developed (88). The Arabidopsis growth room was controlled at 23.5°C ± 1°C with a constant light supply (180-240 μE m⁻² s⁻¹). Ethylene treatment was performed by growing Arabidopsis plants on agar medium supplemented with 10 μM ACC (the ethylene precursor). Plants were grown on 50 ml of Murashige and Skoog (MS) basal agar medium in transparent glass jars for 21 days with a density of 15 plants per jar. In a biological replicate, the aerial tissues were harvested from a nearly equal number of control and the treated plants. Each group of plants, either control or the treated, was grown in 120-160 jars. The harvested tissues were frozen immediately in liquid nitrogen and stored at −140°C.
CsCl Density Gradient Centrifugation-Based Protein Fractionation-The total cellular protein was extracted individually from both control and the treated plants using detergent-free CsCl density gradient (CDG) buffer for centrifugation fractionation, which contains 3 M CsCl and a modified urea protein extraction buffer (UEB, (88)). The frozen plant tissues were ground into fine powder and mixed with CDG buffer at a ratio of 1:4 (w/v). After centrifugation at 218,000 × g for 2 h at 10°C, the top (t), middle (m), and bottom (b) protein fractions were obtained. The highly membrane-concentrated t and b fractions were re-dissolved using 5 volumes of membrane-solubilizing and protein-denaturing buffer (MSPD), which contains 20 mM Tris-HCl, pH 7.8, 8 M urea, 10 mM EDTA (ethylene diamine tetraacetic acid), 10 mM EGTA (ethylene glycol tetraacetic acid), 50 mM NaF, 2% glycerol, *1% glycerol-2-phosphate disodium salt hydrate, *1 mM PMSF (phenylmethylsulfonyl fluoride), 1% SDS (sodium dodecyl sulfate), and 1.2% Triton X-100. Proteins from each of the three fractions were precipitated and quantitated as previously described (89). By the CDG centrifugation fractionation, the total cellular protein isolated either from control or the treated plant tissue was separated into three protein fractions (t, m, and b).
Tryptic Digestion of Proteins and Dimethyl Labeling of Peptides-Protein samples from both control and the ethylene-treated plants were dissolved into preheated (37°C) trypsin digestion buffer (40 mM Tris-HCl, pH 8.0) with protease trypsin (100:1, w:w). The final concentration of urea should be lower than 1 M. The in-solution digestion was performed for 12 h at 37°C. The digested peptides were desalted and enriched by C18 Sep-Pak cartridges (Waters Corporation, United Kingdom). The peptides prepared either from control or the ethylene-treated plants were re-suspended in 100 mM sodium acetate (pH 5.5) solution and were then divided equally into two parts. The two parts of peptides were labeled separately with light isotope-coded chemicals (¹²CH₂O and NaBH₃CN) and heavy isotope-coded dimethyl chemicals (¹³CD₂O and NaBH₃CN) (2). The mixing of an equal amount of heavy isotope-labeled peptides from the treated plants with light isotope-labeled peptides from control plants was defined as the Forward mixing experiment (F), and vice versa as the Reciprocal mixing experiment (R). Both F and R mixing samples were independent experimental replicates but were not different biological replicates. Each biological replicate thus produces six mixings, three from F mixings (i.e. tF1, mF1, and bF1) and another three from R (tR1, mR1, bR1). These six mixing replicates are six different batches of peptide samples.
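The forward/reciprocal mixing design described above can be written down as a small sketch. It enumerates the batch names used in the text and shows how the label swap changes which intensity ratio corresponds to treated over control. The function names and the choice of a base-2 logarithm are assumptions for illustration only.

```python
import math
from itertools import product

FRACTIONS = ("t", "m", "b")  # top, middle, bottom CDG fractions
# Which isotope label each plant group receives in each mixing type.
MIXINGS = {
    "F": {"control": "light", "treated": "heavy"},
    "R": {"control": "heavy", "treated": "light"},
}


def mixing_batches(n_bio_reps=3):
    """List every mixing batch, e.g. tF1, mF1, bF1, tR1, mR1, bR1, ..."""
    batches = []
    for rep, mixing, fraction in product(range(1, n_bio_reps + 1), MIXINGS, FRACTIONS):
        batches.append({"name": f"{fraction}{mixing}{rep}", "labels": MIXINGS[mixing]})
    return batches


def treated_over_control_log2(light, heavy, mixing):
    """Log2(treated/control) from light and heavy intensities of one pair."""
    if mixing == "F":   # treated peptides carry the heavy label
        return math.log2(heavy / light)
    if mixing == "R":   # treated peptides carry the light label
        return math.log2(light / heavy)
    raise ValueError("mixing must be 'F' or 'R'")


batches = mixing_batches()
names_rep1 = [b["name"] for b in batches[:6]]
```

Because the labels are swapped between F and R, a true biological change should give the same treated/control log-ratio in both mixings, whereas a labeling artifact would change sign.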
HPLC Fractionation and Acetyl-Affinity Enrichment of Acetylated Peptides-Each of the mixed peptide samples was further fractionated into 3 sub-fractions on HPLC equipped with a 200 × 9.4 mm strong cation exchange (SCX) column (PolySULFOETHYL A™, 5 μm, 200 Å, 209SE0502, PolyLC Inc., Columbia, MD) at a flow rate of 2.5 ml/min. Ultraviolet absorption was measured at 214 nm to monitor the peptide eluates. The resulting SCX-HPLC-fractionated peptide samples were suspended in NETN buffer (100 mM NaCl, 1 mM EDTA, 50 mM Tris-HCl, 0.5% Nonidet P-40, pH 8.0) and incubated with beads conjugated with anti-acetyllysine antibody (PTM Biolabs, Hangzhou, China) at 4°C overnight with gentle shaking. The resulting mixtures were washed 4 times with NETN buffer and 2 times with ddH2O, followed by the final elution with 0.1% TFA. The acetylpeptide eluates were subsequently desalted using C18 ZipTip columns (ZTC18S960, Millipore, MA) and were finally subjected to LC-MS/MS analysis.
Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) Analysis-The LC-MS/MS analysis of acetylpeptides was performed on an EASY-nLC 1000 UPLC system (Thermo Fisher Scientific, Odense, Denmark) coupled with a Q Exactive™ Hybrid Quadrupole-Orbitrap™ mass spectrometer (Thermo Fisher Scientific, Germany). The resulting acetylpeptides were loaded onto an Acclaim PepMap 100 C18 pre-column (Dionex, Sunnyvale, CA) and separated in an Acclaim PepMap RSLC C18 analytical column (Dionex) at a constant flow rate of 300 nL/min on a gradient of 0-24 min, 6-22% B; 24-32 min, 22-% B; 32-37 min, 40-80% B; 37-40 min, 80% B, where B is 0.1% FA in 98% ACN. Mass resolution was set at 70,000 for intact peptides (MS1) and at 17,500 for ion fragments (MS2) under a normalized collision energy (NCE) of 30. The data-dependent acquisition (DDA) mode was adopted for the top 20 precursor ions, all of which exceeded a threshold ion count of 5E3, with 15 s dynamic exclusion. To generate MS/MS spectra, about 5E4 ions were accumulated, and automatic gain control (AGC) was used to avoid overfilling the Orbitrap. The precursor m/z scan range was set from 350 to 1800.
Mass Spectrometry Data Analysis-The raw MS data files generated by the Thermo Q Exactive Mass Spectrometer were converted into the mzXML and MGF formats using ProteoWizard (version: 3.0.11133 64-bit) (90). The FDR was estimated using the target-decoy strategy (91). Thus, we searched the data files against both target and decoy databases. The target database was TAIR10 (35,387 proteins, https://www.arabidopsis.org/download_files/Sequences/TAIR10_blastsets/TAIR10_pep_20101214_updated), whereas the decoy database was generated by randomly shuffling the target peptide sequences. We applied Mascot (Version 2.5.1, Matrix Science) for peptide identification as described previously (92). Trypsin was used as the protease to digest acetylproteins. The mass tolerance was ±10 ppm for mass spectra and 0.02 Da for tandem mass spectra. The maximum number of missed cleavages was six. The fixed modification was carbamidomethyl (57.021464 Da) on cysteine. The variable modifications were oxidation (15.994915 Da) on methionine, light dimethyl labeling (C₂H₄, 28.031300 Da) on lysine residues and the N terminus of a peptide, heavy dimethyl labeling (¹³C₂D₄, 34.063117 Da) on lysine residues and the N terminus of a peptide, and acetylation (C₂H₂O, 42.010565 Da) on lysine residues and the N terminus of a peptide. Mascot-Percolator (Version 3.1) (93) was used to estimate the FDR. Mascot-Percolator converts the original FDR into the q-value. We refer to the q-value as the FDR. The peptide-spectrum match (PSM) cut-off threshold was set at an FDR ≤ 0.01.
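The target-decoy idea described above can be illustrated with a short sketch. It is not the Mascot-Percolator algorithm, which computes q-values with its own model; the decoys/targets estimator, the function names, and the example scores are assumptions chosen for illustration.

```python
import random


def shuffled_decoy(sequence, seed=0):
    """Build a decoy by randomly shuffling the residues of a target sequence."""
    residues = list(sequence)
    random.Random(seed).shuffle(residues)
    return "".join(residues)


def target_decoy_fdr(psms, threshold):
    """Estimate FDR as (#decoy PSMs) / (#target PSMs) at or above a score threshold.

    psms is a list of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Hypothetical PSM scores; True marks a hit against the decoy database.
psms = [(50, False), (45, False), (40, True), (35, False), (30, False)]
decoy = shuffled_decoy("GGKGLGKGGAKR")
fdr_loose = target_decoy_fdr(psms, threshold=30)
fdr_strict = target_decoy_fdr(psms, threshold=45)
```

Raising the score threshold until the estimated FDR drops to 0.01 or below is the general logic behind a 1% PSM cut-off.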
Quantification of Acetylpeptides-In-solution protease digestion of acetylproteins may produce peptides with missed cleavages. To quantitate the acetylpeptides with the common acetylation site(s), both completely digested and partially digested acetylpeptides sharing the identical unique PTM site pattern (UPSP) were combined into a peptide group, called the unique PTM peptide array (UPA). For example, consider "n(34.063)GGK(42.011)GLGK(42.011)GGAK(34.063)Rc" and "n(34.063)GGK(42.011)GLGK(42.011)GGAK(34.063)c," where "n" stands for the peptide N terminus, "c" for the peptide C terminus, "34.063" for heavy dimethyl labeling, and "42.011" for acetylation. These two peptides share the same UPSP. A new computer program, named SQUA-D (STable isotope-based Quantitation-Dimethyl labeling, Version 1.0), was developed to perform the quality control, quantification analysis, and statistical evaluation. The quality control of the quantifiable PTM peptides was based on the following criteria: (1) the number of PSMs of the light dimethyl-coded acetylpeptide is larger than or equal to one; (2) the number of PSMs of the heavy dimethyl-coded acetylpeptide is larger than or equal to one; (3) the acetylpeptide must be identified in at least five out of all six experimental replicates (F1, F2, F3, R1, R2, and R3); (4) the number of identified PSMs from the F mixing experiments divided by the total number of PSMs is larger than or equal to 0.15; (5) the number of identified PSMs from the R mixing experiments divided by the total number of PSMs is larger than or equal to 0.15.
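The grouping of peptides into UPAs can be sketched as below. The sketch assumes that a UPSP can be represented by the protein identifier plus the absolute positions of the acetylated residues, and that the start position of each peptide in its protein is known; both the representation and the names `PROTEIN_X`, `acetyl_sites`, and `group_into_upas` are hypothetical and are not taken from SQUA-D.

```python
import re
from collections import defaultdict

ACETYL = "42.011"  # acetylation mass tag as written in the peptide strings
# One token is the N-terminus marker "n" or a residue, with an optional "(mass)".
TOKEN = re.compile(r"([A-Zn])(?:\(([\d.]+)\))?")


def acetyl_sites(mod_peptide, start):
    """Return acetylated sites as (absolute protein position, residue) tuples.

    start is the 1-based protein position of the peptide's first residue.
    Dimethyl labels (28.031 / 34.063) are ignored on purpose.
    """
    sites = []
    offset = 0  # residues consumed so far
    for symbol, mass in TOKEN.findall(mod_peptide):
        if symbol == "n":
            if mass == ACETYL:
                sites.append((start, "N-term"))
            continue
        if mass == ACETYL:
            sites.append((start + offset, symbol))
        offset += 1
    return tuple(sites)


def group_into_upas(peptides):
    """Group (protein, start, modified peptide) records by their site pattern."""
    upas = defaultdict(list)
    for protein, start, mod_peptide in peptides:
        upas[(protein, acetyl_sites(mod_peptide, start))].append(mod_peptide)
    return dict(upas)


# The first two strings are the examples from the text; the start positions,
# the protein name, and the third peptide are made up for illustration.
peptides = [
    ("PROTEIN_X", 6, "n(34.063)GGK(42.011)GLGK(42.011)GGAK(34.063)Rc"),
    ("PROTEIN_X", 6, "n(34.063)GGK(42.011)GLGK(42.011)GGAK(34.063)c"),
    ("PROTEIN_X", 9, "n(28.031)GLGK(42.011)GGAK(28.031)Rc"),
]
upas = group_into_upas(peptides)
```

Under this assumed representation, the two example peptides fall into one group because the missed cleavage at the C-terminal arginine does not change which residues are acetylated.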
After selection of quantifiable PTM peptides, SQUA-D extracted ion chromatograms of both differentially labeled isotopic peptides, paired light-and heavy-labeled ion chromatograms, and calculated the log-ratios using the maximum intensities (76) of the smoothed ion chromatograms. To make the most use of the data, SQUA-D also paired the unidentified yet highly confident ion chromatogram with the identified one. For example, if SQUA-D detected that the identified light-labeled peptide does not pair with a heavy-labeled corresponding peptide, it would calculate the theoretical m/z value and the isotopic pattern of the deduced heavy-labeled peptide. Then, SQUA-D finds the highly possible ion satisfying three criteria: (1) the m/z value equals the theoretical m/z value with a predefined tolerance; (2) the Pearson correlation of the theoretical isotopic pattern and the observed isotopic pattern is larger than or equal to 0.7 (94); and (3) there is overlap between two retention time ranges from the light-and heavy-labeled ion chromatograms. Finally, SQUA-D pairs these two ion chromatograms and calculates a log-ratio. SQUA-D also uses half of the minimum value among all extracted intensities as a replacement for the zero intensities of certain peptides. After calculating the log-ratios for the peptides from the UPAs, each log-ratio is adjusted with the median value of all the log-ratios from the same replicate to circumvent the inevitable mixing error. Finally, SQUA-D can adjust the batch effects and perform the statistical evaluation of the log-ratios.
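The pairing and normalization steps described above can be sketched as follows. The mass values come from the database search settings; the 10 ppm tolerance, the base-2 logarithm, the heavy/light ratio direction, and all function names are assumptions, since the text only states that a predefined tolerance and log-ratios were used.

```python
import math
from statistics import median

HEAVY_MINUS_LIGHT = 34.063117 - 28.031300  # Da per dimethyl label


def theoretical_heavy_mz(light_mz, charge, n_labels):
    """m/z expected for the heavy partner of an identified light peptide."""
    return light_mz + n_labels * HEAVY_MINUS_LIGHT / charge


def pearson(x, y):
    """Pearson correlation of two equally long intensity vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def is_acceptable_pair(obs_mz, theo_mz, obs_iso, theo_iso, rt_light, rt_heavy,
                       ppm_tol=10.0, min_r=0.7):
    """Check the three pairing criteria: m/z match, isotope pattern, RT overlap."""
    ppm_error = abs(obs_mz - theo_mz) / theo_mz * 1e6
    rt_overlap = min(rt_light[1], rt_heavy[1]) > max(rt_light[0], rt_heavy[0])
    return ppm_error <= ppm_tol and pearson(obs_iso, theo_iso) >= min_r and rt_overlap


def median_centered_log_ratios(pairs):
    """Log2(heavy/light) per pair, zero-filled and centered on the replicate median."""
    nonzero = [v for pair in pairs for v in pair if v > 0]
    floor = min(nonzero) / 2  # replacement for zero intensities
    ratios = [math.log2((h if h > 0 else floor) / (l if l > 0 else floor))
              for l, h in pairs]
    center = median(ratios)
    return [r - center for r in ratios]


heavy_mz = theoretical_heavy_mz(500.0, charge=2, n_labels=2)
paired = is_acceptable_pair(506.0320, heavy_mz, [100, 60, 20], [100, 55, 18],
                            (20.0, 21.0), (20.5, 21.5))
centered = median_centered_log_ratios([(100, 200), (100, 100), (100, 50)])
```

Subtracting the replicate median assumes that most peptides are unchanged by the treatment, so that a non-zero median reflects unequal mixing rather than biology.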
Statistical Evaluation of Acetylpeptides-In the 4C quantitative PTM proteomics workflow, there were six batches of mixing replicates in a single biological replicate, and the batches are tF1, tR1, mF1, mR1, bF1, and bR1. Three biological replicates produced 18 batches, to which experimental factors, i.e. plant harvesting repeats, three CDG fractions, two types of mixings, "Forward" and "Reciprocal," all contributed to the errors in data collection and measurement. Thus, three-way analysis of variance (ANOVA) was performed on these 18 peptide samples to test if there exist any batch effects (95). If there were, we would adjust them based on an empirical Bayes method proposed by Johnson et al. (96).
In principle, the log-ratios were first modeled as

$$Y_{ijg} = \alpha_g + X\beta_g + \gamma_{ig} + \delta_{ig}\,\varepsilon_{ijg} \qquad (1)$$

where i indicates different batches, j indicates different log-ratios in the same batch, g indicates different UPAs, $Y_{ijg}$ is the log-ratio, $\alpha_g$ is the mean of all log-ratios from UPA g, X is the experimental design matrix, $\beta_g$ is the coefficient for experimental conditions, $\gamma_{ig}$ is the additive batch effect, and $\delta_{ig}\varepsilon_{ijg}$ is the multiplicative batch effect, with $\varepsilon_{ijg}$ the error term. In our data, X is an identity matrix because we have taken care of two experimental conditions and used log-ratios to represent them. Thus, the $X\beta_g$ term can be ignored in Equation (1). With Equation (1), the batch effect adjustment is achieved by adjusting the $\gamma_{ig}$ term and the $\delta_{ig}\varepsilon_{ijg}$ term.
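For orientation, the empirical Bayes method of Johnson et al. (96) is usually written with a standardization step and an adjustment step of the following shape. The symbols $Z_{ijg}$, $\hat{\sigma}_g$ (estimated standard deviation for UPA g), and the starred empirical Bayes estimates are not defined in the text above and are introduced here as assumptions; whether SQUA-D uses exactly these forms is not confirmed by the text.

```latex
% Standardization of the log-ratios (location-scale form, after ref. 96)
Z_{ijg} = \frac{Y_{ijg} - \hat{\alpha}_g - X\hat{\beta}_g}{\hat{\sigma}_g}

% Batch-adjusted log-ratios computed from the empirical Bayes estimates
Y^{*}_{ijg} = \frac{\hat{\sigma}_g}{\hat{\delta}^{*}_{ig}}
              \left( Z_{ijg} - \hat{\gamma}^{*}_{ig} \right)
              + \hat{\alpha}_g + X\hat{\beta}_g
```

In words, the additive batch estimate is subtracted and the multiplicative batch estimate is divided out on the standardized scale, after which the data are returned to the original scale.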
The data were first standardized. The empirical Bayes approach was then applied to estimate $\gamma_{ig}$ and $\delta_{ig}$. After estimating the parameters, the log-ratios were adjusted accordingly. At the end, a one-sample t-test was employed to test if a given UPA's log-ratios had a mean significantly different from zero. After the t-test, multiple hypothesis test correction was performed based on the Benjamini-Hochberg procedure, which resulted in the BH-FDR (87). The BH-FDR is different from the target-decoy-based FDR used during the selection of acceptable PSMs. The BH-FDR threshold was set to 0.1. The standard deviation (S.D.) of all UPA log-ratio mean values was calculated afterward. The UPAs with log-ratio mean values outside 0.5 × S.D. and a BH-FDR value smaller than or equal to 0.1 were selected as the significant findings (see supplemental Table S3). The flowchart of computational programs of the Mascot-based identification in conjunction with the extracted ion chromatogram (XIC)-based quantification is shown in supplemental Fig. S1.
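The final selection step can be sketched as follows. The sketch computes the one-sample t statistic with the standard library only (a p-value would additionally need a t-distribution, for example from a statistics package) and applies the two selection rules to precomputed BH-FDR values. Reading "outside 0.5 × S.D." as an absolute mean log-ratio larger than 0.5 × S.D. is an assumption, and the function names and example numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def one_sample_t(log_ratios):
    """t statistic for H0: the mean log-ratio equals zero (needs >= 2 values)."""
    n = len(log_ratios)
    return mean(log_ratios) / (stdev(log_ratios) / math.sqrt(n))


def select_significant(upa_means, bh_fdr, sd_factor=0.5, fdr_cutoff=0.1):
    """Keep UPAs with |mean log-ratio| > sd_factor * S.D. and BH-FDR <= cutoff."""
    sd = stdev(list(upa_means.values()))  # S.D. over all UPA mean log-ratios
    return sorted(
        upa for upa, m in upa_means.items()
        if abs(m) > sd_factor * sd and bh_fdr[upa] <= fdr_cutoff
    )


# Hypothetical mean log-ratios and BH-FDR values for four UPAs.
upa_means = {"A": 1.0, "B": 0.1, "C": -0.9, "D": 0.05}
bh_fdr = {"A": 0.02, "B": 0.01, "C": 0.3, "D": 0.5}
t_stat = one_sample_t([1.0, 1.2, 0.8])
significant = select_significant(upa_means, bh_fdr)
```

In this made-up example, UPA "B" passes the BH-FDR cut-off but is rejected for its small effect size, while "C" has a large effect but fails the BH-FDR cut-off, so only "A" satisfies both rules.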
Bioinformatics Analysis-The conserved acetylation site motif analysis was performed using the Motif-X software (97). Molecular function, cellular component, and biological process enrichment analyses were conducted online with the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 (https://david-d.ncifcrf.gov/tools.jsp) with the following thresholds: gene count ≥ 5, p-value ≤ 0.01, and FDR ≤ 0.1. The results were then drawn with Cytoscape 3.3.0. The databases for Gene Ontology (GO) analysis were obtained from the website of The Arabidopsis Information Resource (TAIR). Comparison of the GO analysis results between the acetylproteins and the leaf proteome of Arabidopsis was performed using the equation described previously (92), in which N and N′ represent the total matching number of all categories of the identified acetylproteins and that of the proteins from the leaf proteome of Arabidopsis (98), respectively, and N_i and N_i′ are the matching number of the i-th category of the identified acetylproteins and that of the proteins from the leaf proteome of Arabidopsis (98), respectively. Comparison of the GO analysis results between the acetylproteins and phosphoproteins (full data sets of PhosPhAt 4.0, (99)) was performed using the same equation.
Immunoblot Analysis-Plant proteins were extracted from plant tissues using the UEB as described previously (89). Immunoblot analysis was performed using the following antibodies: (1) Anti-Human acetyl-histone H4(K5) monoclonal antibody (PTM-163), which targets both the K5 acetylation site of SGRGacKGGKGLGK and the K6 acetylation site of Arabidopsis histone superfamily protein (AT1G07660); (2) the Anti-actin (plant) monoclonal antibodies (a0480, Sigma Aldrich, St Louis, MO, USA). The first antibody was purchased from PTM Biolabs, Hangzhou, China. 50 μg of the resulting extracted proteins were loaded onto 12% SDS-PAGE gels before being immobilized onto a PVDF membrane (GE Healthcare) and then probed with the antibodies mentioned above. The t-test was used to evaluate the significance between control and treatment.

LC-MS/MS and Computational Analysis of Arabidopsis Acetylproteomes-A comprehensive dimethyl-labeling-based quantitative acetylproteomics has been performed on a model plant organism Arabidopsis (Fig. 1), which consists of ethylene treatment of plants (Fig. 1B-I), protein fractionation by a detergent-free CsCl density gradient centrifugation followed by trypsin digestion (Fig. 1B- The total cellular protein isolated from both control and the treated plants was further fractionated by CDG centrifugation (Fig. 1B-II). The protein yields of the top (t), middle (m), and bottom (b) fractions were ~120 mg, 360 mg, and 40 mg, respectively, for either control or the treated tissues. As a result, in one biological replicate, two protein samples were prepared from both control and the treated Arabidopsis. A total of three biological replicates were performed for each of control and the treated plants. Peptides were generated subsequently by the tryptic digestion from these six protein samples. About 80 mg, 270 mg, and 30 mg of peptides were eventually produced from the t, m, and b protein samples from each of control and the treated plant tissues, respectively (Fig. 1B-II). For dimethyl chemical labeling of peptides from both control and the ethylene-treated plant tissues, about 80 mg, 90 mg, and 30 mg of peptides were taken from each of CDG fractions t, m, and b, respectively. These peptides, chemically labeled at the initial step of the quantitative PTM proteomics, were immediately subjected to subsequent forward (F) and reciprocal (R) mixings (also called pairs, Fig. 1B-III, which was defined to be the first C step of the 4C proteomic workflow). To generate three forward (F) peptide pairs (tF1, mF1, and bF1) prepared from three CDG fractions (Fig. 1B), three peptide samples prepared from either control or the treated plants were chemically labeled with light (L) and heavy (H) dimethyl chemicals, respectively, and mixed subsequently (see Experimental Procedures for details).
Conversely, light and heavy chemicals were used to label peptides prepared from each of the treated and control protein samples in a similar fashion to produce three reciprocal (R) mixings (tR1, mR1, bR1, Fig. 1B). As a result, a single biological replicate produced two sets of mixings (F1 and R1), also called experimental replicates, which consist of six batches of mixed peptide samples (tF1, mF1, bF1, tR1, mR1, and bR1, where "1" represents the first biological replicate). Thus, three biological replicates altogether produced a total of 18 batches of peptide samples, each of which was further fractionated into three sub-fractions on SCX-HPLC, resulting in a total of 54 L/H isotopic peptide samples. The yield of peptides varied from 2 to 15 mg among these fractions. Only a fraction of the isotopic peptide samples (1 to 1.5 mg of peptides per fraction) was aliquoted and subsequently enriched for acetylpeptides using the acetyl moiety-specific antibodies (see Experimental Procedures for details). These highly enriched acetylpeptide samples were then subjected to LC-MS/MS analysis. By a standard 1% FDR cut-off, a total of 16,503 non-redundant (unique) L/H isotopic acetylpeptides were determined from 105,958 acceptable PSMs or redundant acetylpeptides that were identified by the commonly used Mascot search engine. These acetylpeptides correspond to 4480 acetylproteins (supplemental Table S1a). Further processing these PSMs by applying a Mascot delta score ≥ 10 (supplemental Table S1b) as previously described (100), we obtained a total of 63,490 redundant acetylpeptides (supplemental Table S1b), out of which 60,077 acetylpeptides were found to be repeatable (i.e. acceptable PSMs ≥ 2; supplemental Table S1c). Of these repetitively identified acetylpeptides, 5246 (i.e. 8.3% of 60,077) were derived from histone proteins directly (supplemental Table S1c), indicating that the histone superfamily is one of the most abundant acetylprotein families in Arabidopsis.
Further analysis of these 63,490 redundant acetylpeptides (supplemental Table S1b) found 5399 and 5434 non-redundant light and heavy isotopic acetylpeptides, respectively (Fig. 2A), in the current Arabidopsis acetylproteome. The ratio of the numbers of the two types of isotopic acetylpeptides was determined to be 1.0065, suggesting a successful non-biased chemical labeling and mixing of these peptides at the first C step of the quantitative PTM proteomic workflow. To summarize the number of unique acetylpeptides measured from each experimental replicate (i.e. F1, F2, F3, R1, R2, and R3), we combined all of the SCX-HPLC sub-fractionated peptide data derived from the three forward mixings of the same biological replicate (tF1, mF1, and bF1) into a single data set for an experimental replicate and named it F1. Similarly, the nine data sets of sub-fraction peptides from the tR1, mR1, and bR1 mixings were combined into a single data set for R1. The number of unique acetylpeptides identified increases with the number of experimental replicates (Fig. 2B). In total, we identified 6734 unique acetylpeptides (the number of either L or H acetylpeptides ≥ 1) from these six experimental replicates (F1, R1, F2, R2, F3, and R3, Fig. 2B), among which 4540 unique acetylpeptides were identified repetitively (the number of L + H peptides ≥ 2 or L + L ≥ 2 or H + H ≥ 2, supplemental Table S1d).
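The labeling-balance check and the repeatability criterion described above can be illustrated with a short Python sketch. The two counts are taken from the text; the helper name `is_repeatable` and its reading of the criterion (at least two identifications in any combination of L and H forms) are assumptions made for illustration, not part of the SQUA-D software.

```python
# Counts of non-redundant light (L) and heavy (H) acetylpeptides reported in the text.
n_light = 5399
n_heavy = 5434

# Ratio of heavy to light counts; a value close to 1 is consistent with
# unbiased labeling and mixing.
label_ratio = n_heavy / n_light
print(round(label_ratio, 4))


def is_repeatable(n_light_ids: int, n_heavy_ids: int) -> bool:
    """Hypothetical helper: treat a unique acetylpeptide as repeatable when it
    was identified at least twice in total (L + H, L + L, or H + H)."""
    return n_light_ids + n_heavy_ids >= 2
```

Under this reading, a peptide seen once as L and once as H counts as repeatable, whereas a peptide with a single identification does not.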
Bioinformatic analysis of these 4540 unique and repeatable acetylpeptides revealed 5250 acetylation sites (supplemental Table S1d) and 12 highly conserved acetylation site motifs, each of which was associated with either a single or a double acetylation site (Fig. 2C, supplemental Fig. S2). The double acetylation site motifs, AEKKPAEK, GKGGKGL, and PKAGKKLP, were identified among histone proteins and are shown in Fig. 2C. This suggests that the acetylation site motifs of histone proteins are ubiquitous and prevalent. Moreover, acetylation site motifs were also identified in P-glycoproteins, actin depolymerizing factors, ATPases, transcription factors, and receptor-like kinases (supplemental Fig. S2). The presence of acetylation site motifs in several transcription factors (supplemental Fig. S2G) suggests that acetylation may affect the functions of transcription factors in the regulation of gene expression, as described previously (101). A particularly interesting finding is the presence of acetylation site motifs in certain kinases (supplemental Fig. S2B), which suggests that there may be cross-talk between acetylation and phosphorylation events in proteins, as previously reported (37). The molecular mechanisms by which acetylation and phosphorylation mutually affect each other and the functions of PTM proteins remain to be elucidated.
These 5250 acetylation sites are distributed on 2638 acetylproteins (supplemental Table S1d) in Arabidopsis, and 17 of them are located at the N termini of acetylproteins. Among these acetylation sites, 4228 lysine residue acetylation sites are classified as novel sites (Fig. 2D) according to previous publications (48, 65-67). Compared with the GO analysis results of the previously identified protein acetylation sites, the GO analysis results of the current novel acetylproteins (supplemental Fig. S3) revealed that the acetylproteins bearing novel acetylation sites are enriched in the endoplasmic reticulum (ER), Golgi apparatus, and plasma membrane, and in the molecular functions of receptor binding, receptor activity, and kinase activity (supplemental Fig. S3). Fifty-nine plasma membrane acetylproteins bearing novel acetylation sites have putative kinase activity (supplemental Table S4), eight of which also have receptor binding function or receptor activity (supplemental Table S4) and include Barely any meristem 1 (BAM1, AT5G65700), Barely any meristem 2 (BAM2, AT3G49670), BAK1-interacting receptor-like ...

The measurement of log-ratios of the extracted ion chromatogram (XIC) is followed by the statistical evaluation, which included both a t-test and multiple test correction (i.e. the Benjamini-Hochberg procedure).

The trafficking proteins are tightly associated with the ER, Golgi apparatus, and plasma membrane, as CsCl density gradient fractionation enriches these cellular membranes (see Experimental Procedure for details). Interestingly, membrane-associated receptor-like kinases are well known to be involved in hormone (e.g. brassinosteroid), light, gravity, and developmental stage signaling pathways.
Moreover, over a half (56.6%) of the 2638 acetylproteins contain a single acetylation site, whereas 1.3% of them contain more than 8 acetylation sites (Fig. 2E). These acetylproteins with multiple acetylation sites were defined to be overly acetylated proteins (OAPs). Both molecular function enrichment and cellular component enrichment of these OAPs showed that they are distributed among the cytosol, chloroplast, plasma membrane, nucleus, vacuole, and ribosome, and many of them have binding activities (supplemental Fig. S4). Biological process enrichment also showed that the OAPs are involved in histone protein-based protein complex organization and assembly, translation, protein folding and responses to stress stimuli, such as metal ion concentrations and temperature (Fig. 2F). These observations suggest that the cellular activities of these OAPs may be regulated by acetylation in the above mentioned biological processes.

FIG. 2. Proteomic analysis of Arabidopsis acetylpeptides.
A, A Venn diagram shows the identified non-redundant (or unique) light- and heavy-isotope-coded acetylpeptides. B, An accumulation curve shows all of the identified non-redundant acetylpeptides generated from six experimental replicates (i.e. F1, R1, F2, R2, F3, and R3). C, The highly conserved acetylation site motifs of histone proteins. Asterisks (*) indicate acetylation sites. D, A Venn diagram contains the previously reported acetylation sites and the acetylation sites identified from this study. Blue circle represents previously identified acetylation sites on lysine side chains and red circle represents the identified lysine side chain acetylation sites from this study. E, A distribution of acetylation sites per protein. F, Biological process enrichment for the overly acetylated proteins (OAPs, which are proteins containing > 8 acetylation sites). The thickness of the line represents the number of overlapping proteins between two processes. Thicker lines indicate more proteins.
Quantitative Analysis of the Acetylproteomes of Ethylene-treated Arabidopsis-From the 4540 unique and repeatable acetylpeptides, we selected 1288 quantifiable acetylpeptides for quantitation according to the selection criteria described in the Experimental Procedure ((92), supplemental Table S2a). These quantifiable acetylpeptides were first converted into 1155 UPAs (supplemental Table S2b). Because the multiple protein isoforms of a protein gene family, or the highly conserved protein domains within a family of proteins, may contain an identical UPSP, and because incomplete tryptic digestion may produce many unique acetylpeptides that share the same UPSP, all acetylpeptides sharing the same UPSP were combined to form a single UPA for quantitation (supplemental Table S2b).
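The merging of acetylpeptides that share the same UPSP into a single UPA can be sketched in Python as a simple grouping step. This is a minimal illustration, not the SQUA-D implementation: the function name `group_into_upas`, the record layout, and all peptide names, UPSP labels, and log-ratio values below are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical records: (peptide identifier, UPSP label, log2 ratio).
# All values are made-up placeholders, not data from this study.
peptides = [
    ("peptide_a", "PROT1-K6-K9", -1.2),
    ("peptide_b", "PROT1-K6-K9", -1.0),  # e.g. a missed-cleavage variant
    ("peptide_c", "PROT2-K17", 0.8),
]


def group_into_upas(records):
    """Collect the log-ratios of all acetylpeptides sharing one UPSP,
    so that each UPSP yields a single UPA for quantitation."""
    upas = defaultdict(list)
    for _peptide, upsp, log_ratio in records:
        upas[upsp].append(log_ratio)
    return dict(upas)


upas = group_into_upas(peptides)
print(len(upas))
```

In this toy input, three quantifiable acetylpeptides collapse into two UPAs, in the same way that the 1288 quantifiable acetylpeptides collapse into 1155 UPAs in the text.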
The batch effect adjustment (96) was then conducted among 18 batches (3 biological replicates × 2 mixing types × 3 protein fractions of CsCl density gradient fractionation) of mixed isotopic acetylpeptide samples. All MS data collected from three SCX-HPLC sub-fractions were combined to form a single batch of acetylpeptide data set (Fig. 1B). The batch effect adjustment aimed at eliminating the technical errors caused by variations in plant harvesting, protein fractionation, peptide preparation, chemical labeling, and mixing (supplemental Fig. S1; see Experimental Procedure for details). As a result, 33 ethylene-enhanced and 31 ethylene-suppressed UPAs were identified and quantified by the dimethyl labeling-based quantitative PTM proteomics and SQUA-D software (supplemental Table S3).
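The batch layout and the idea of removing a batch-specific offset can be illustrated with the Python sketch below. It enumerates the 18 batch labels from the design stated above and then centers each batch on its median log-ratio. The centering step is a deliberately simplified stand-in chosen for illustration; it is not the empirical Bayesian adjustment (96) that was actually applied, and the function name `center_by_batch` is an assumption.

```python
from itertools import product
from statistics import median

# Design stated in the text: 3 biological replicates x 2 mixing types
# (F = forward, R = reciprocal) x 3 CsCl gradient fractions (t, m, b).
replicates = (1, 2, 3)
mixings = ("F", "R")
fractions = ("t", "m", "b")
batches = [f"{frac}{mix}{rep}"
           for rep, mix, frac in product(replicates, mixings, fractions)]


def center_by_batch(log_ratios_by_batch):
    """Simplified illustration: subtract each batch's median log-ratio.
    This only removes an additive per-batch shift and is NOT the
    empirical Bayesian method referenced in the text."""
    centered = {}
    for batch, values in log_ratios_by_batch.items():
        shift = median(values)
        centered[batch] = [v - shift for v in values]
    return centered


print(len(batches))
```

A full empirical Bayesian adjustment additionally shrinks the per-batch location and scale estimates toward values pooled across features, which matters when a batch contains few measurements.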
During the quantitative acetylproteomic analysis, we surprisingly found that the histone protein AT1G07660 was the most abundant acetylprotein in the ein3/eil1 mutant. It accounts for 2606 redundant acetylpeptides, which is 49.7% of all histone acetylpeptides discovered from the current Arabidopsis acetylproteome. The acetylation UPA (orange circle, Fig. 3A) contains three acetylation sites, K6, K9, and K13, and is defined by a UPSP of AT1G07660-K6-K9-K13. Both its MS and MS/MS spectra, as shown in Fig. 3B and 3C, demonstrated that the UPA of the AT1G07660-K6-K9-K13 acetylprotein is ethylene hormone-suppressed. In contrast, the other acetylation UPA, which contains a UPSP of AT1G07660-K6-K13-K17, is derived from the same protein, and shares the two acetylation sites AT1G07660-K6 and -K13, is ethylene hormone-enhanced (supplemental Table S3, supplemental Fig. S6).
Immunoblot Validation of Ethylene-regulated Unique Acetylation Site on Histone Proteins-To validate the quantitative PTM proteomic results obtained from XIC-based quantitation (the fourth C step of the proteomic workflow, i.e. confirmation of hormone-regulated acetylation sites, Fig. 3J), immunoblot analysis was performed on a histone superfamily protein (AT1G07660) using AT1G07660-K6 acetylation-specific antibodies. The level of AT1G07660-K6 acetylation was reduced (−2.46- and −3.64-fold) in the wild-type Arabidopsis (Col-0) and the ethylene-insensitive mutant ein3/eil1, respectively, upon a long-term ethylene treatment (see Experimental Procedure for details, Fig. 3J). The acetylation site motif SGRGKGG (AT1G07660-K6, Fig. 3K), which was shared by eight isoforms of the same histone superfamily, was predicted to be ethylene-suppressed according to the quantitative PTM proteomic data (Fig. 3A and 3B). The ethylene production rate of ACC-treated wild-type Arabidopsis was measured to be 19.5 times higher than that of the control plants (supplemental Fig. S7, see Experimental Procedure for details). These immunoblot results confirmed the conclusion drawn from the quantitative PTM proteomics.
Bioinformatic Analysis of Acetylproteins-To show which molecular function, biological process, and cellular component categories these acetylproteins belong to, the GO analysis results of these newly identified acetylproteins were compared with those of the leaf proteome of Arabidopsis (see Experimental Procedure for details). As a result, we found that these acetylproteins are concentrated in the molecular functions of structural molecule activity, DNA or RNA binding, and transporter activity (A1-A3), in the biological processes of electron transport or energy pathways, response to abiotic or biotic stimulus, and response to stress (A4-A6), and in the cellular components of ribosome, cell wall, and plastid (A7-A9) (Fig. 4A and supplemental Fig. S8).
It was also interesting to find that 55.14% of the 2628 repetitively identified acetylproteins (i.e. 1449 acetylproteins) contain at least one phosphorylation site according to the PhosPhAt 4.0 database (supplemental Fig. S9A), suggesting that other types of PTMs may be tightly associated with the acetylation of proteins. GO analysis of acetylproteins and phosphoproteins demonstrated that acetylproteins are concentrated in the following molecular functions: structural molecule activity, other enzyme activity, and transporter activity; in the biological processes: electron transport or energy pathways, response to abiotic or biotic stimulus, and response to stress; and in the cellular components: ribosome, cell wall, and plastid (supplemental Fig. S9). In contrast, phosphoproteins are concentrated in the molecular functions: nucleic acid binding, kinase activity, and transcription factor activity; in the biological processes: signal transduction, DNA or RNA metabolism, and DNA-dependent transcription; and in the cellular components: Golgi apparatus, plasma membrane, and nucleus (supplemental Fig. S9). The asymmetric distribution of specific PTMs among cellular proteins in different organelles may suggest a unique role played by specific types of PTM proteins in diverse cellular events.
One interesting observation from the biological process enrichment is that ethylene-regulated acetylation was significantly enriched in the processes of histone protein-based complex organization and assembly, photosynthesis, and responses to various stimuli and stresses (e.g. metal ions, radiation, light, osmotic stress, hormone, and bacteria, Fig. 4B). Through analysis of protein-protein interactions, we found that the predominant ethylene-regulated acetylproteins were histone proteins, ribosomal proteins, and heat shock proteins, which may collectively control transcription, translation, protein folding, and other biological processes, including responses to stimuli and stresses, photosynthesis, and ATP synthesis and transport (supplemental Fig. S10).

DISCUSSION
The results of our comprehensive and quantitative PTM proteomics workflow have substantially expanded the database of Arabidopsis protein acetylation sites, revealing 4245 acetylation sites that were not described in previous publications (48, 65-67). To compare the acetylproteins found from the CDG fractions (i.e. t, m, and b fractions) with those found by Hartl et al. (67), we performed cellular component analysis (supplemental Fig. S11). Acetylproteins of the ER, Golgi apparatus, plasma membrane, and mitochondria were found to be concentrated in fractions t and b, whereas acetylproteins of the cytosol, nucleus, and ribosome were instead concentrated in fraction m (supplemental Fig. S11). The cellular locations of acetylproteins from fraction m were like those identified by Hartl et al., especially in the Golgi apparatus, mitochondria, nucleus, and ribosome (supplemental Fig. S11). The ratio of the protein amounts of the t, m, and b fractions was determined to be 3:9:1. It is likely that the abundant acetylpeptides prepared from fraction m would mask the acetylpeptides from the other two fractions in LC-MS/MS analysis if the total cellular acetylprotein were not fractionated by CDG centrifugation. Thus, the membrane protein enrichment procedure via CDG centrifugation may be the reason why we have identified more acetylpeptides and novel acetylation sites.

FIG. 3. XIC-based quantification and immunoblot confirmation.
A, The XIC-based quantification of dimethyl labeled acetylpeptide groups. In the upper panel, each solid circle represents a quantified UPA (supplemental Table S2b). The up- and down-regulated UPAs are marked with red and blue circles, respectively. The yellow circle indicates the UPA that is derived from a histone superfamily protein (AT1G07660) and verified by the immunoblot assay. The x axis indicates the log-ratio means from pairs of the treated and the control peptides. The y axis represents the values of −log10 (p-value). The significance threshold for the selection of ethylene-regulated acetylpeptide UPAs is BH-FDR ≤ 0.1. B, C, The MS and MS/MS spectra of the acetylation UPA derived from the histone superfamily protein (AT1G07660), which has been demonstrated to be ethylene-suppressed according to both XIC-based quantification and immunoblot assay. D-I, Profiles of acetylation level of the quantified UPAs derived from heat shock protein 70 family protein (AT3G09440) (D), translational elongation factor EMB2726 (AT4G29060) (E), histone superfamily protein (AT2G28720) (F), heat shock protein 90-7 (AT4G24190) (G), AAC1, ADP/ATP carrier 1 (AT3G08580) (H), and ATP synthase subunit beta (ATCG00480) (I). Green bars represent those acetylated UPAs that pass the cutoff threshold (BH-FDR ≤ 0.1). J, Immunoblot assay on the ethylene-regulated acetylation level at the K6 site of a histone superfamily protein (AT1G07660). The acetylation level of the protein samples treated with 10 μM of the ethylene precursor 1-aminocyclopropane-1-carboxylic acid (ACC) is marked with a yellow bar whereas that of the control sample is marked with a green bar. The blot signal of actin is used as a loading control for normalization. Three biological replicates were performed for each genotype. Two technical replicates of immunoblot assays were performed on each biological replicate to confirm the proteomic finding of a PTM (defined as the fourth C of the proteomic workflow). The average values of six repeats are shown with error bars (±S.E., standard error). Statistical significance is determined using a one-sample t-test. *, **, *** represent p < 0.05, < 0.01 and < 0.001, respectively. K, The conserved acetylation site motifs built on the immunoblot assay-validated histone superfamily proteins. Asterisks (*) stand for the acetylation sites.
Comparison of GO analysis results between the proteins bearing novel acetylation sites and those bearing previously identified acetylation sites (supplemental Fig. S3) also supports our conclusion. The novel acetylation sites were mostly discovered on acetylproteins associated with the ER, Golgi apparatus, and plasma membrane (supplemental Fig. S3), which were concentrated in fractions t and b of the CsCl density gradient (supplemental Fig. S11). Another reason why we have successfully identified a significantly larger number of acetylpeptides may be the use of a urea-based protein-denaturing buffer (UEB) during the initial cell lysis, which immediately inactivates proteases, deacetylases, and other protein modification enzymes to prevent in vitro degradation, deacetylation, or modification of acetylproteins (88).
To find relatively low-abundance post-translationally modified peptides in the total cellular peptide digests, an enrichment step based on charge properties, chemical reactions, antibody recognition, or other affinity-based binding methods has frequently been introduced into PTM proteomic workflows (113). For example, both charge property-based immobilized metal affinity chromatography (IMAC) and titanium dioxide (TiO2) have been successfully applied in phosphoproteomics (114). Other strategies, including lectin affinity binding, boronic acid chemistry, and hydrazide chemistry, have been used in the purification of glycopeptides (115). Antibody-based immunoprecipitation has been extensively used in the identification of PTM sites such as ubiquitination (116), phosphorylation (117), methylation (118), acetylation (45), β-hydroxybutyrylation (119), and glutarylation sites (120). In the present study, we further improved these steps of protein fractionation and PTM peptide enrichment (Fig. 1B) and applied a four-dimensional (4D) enrichment workflow in the isolation of acetylpeptides, which consists of (1) protein fractionation by CsCl density gradient centrifugation, (2) peptide separation on SCX-HPLC, (3) enrichment of acetylpeptides by antibody affinity purification, and (4) C18-HPLC separation before MS/MS analysis. This new workflow allowed us to further enrich acetylpeptides derived from the plasma membrane and endomembrane systems. Based on the Mascot search results, the proportion of acetylpeptides in the six experimental replicates, F1, R1, F2, R2, F3, and R3, was 92.71%, 96.64%, 86.90%, 88.60%, 95.28%, and 96.46%, respectively. This high purification efficiency strongly supports the advantages of the multi-dimensional acetylpeptide enrichment workflow adopted in this acetylproteomics. This novel procedure may be further applied to the study of other PTM proteomes.
To establish a versatile quantitative PTM proteomic approach for research on organisms such as crops, animals, or even humans, rather than the medium-grown model plant Arabidopsis or cultured cell lines, both chemical labeling and label-free quantification methods are largely preferred over metabolic labeling. Label-free quantification relies on highly consistent peptide sample preparation and repeatable analysis (3,8,9). Such consistency and repeatability are normally difficult to achieve because a multiple-step chromatography-based separation and affinity enrichment workflow is frequently adopted to enrich the post-translationally modified peptides. To mitigate this disadvantage of label-free quantitative PTM proteomics, chemical labeling and subsequent mixing at an early step in the proteomic protocol are frequently introduced to reduce variation among biological replicates, experimental replicates, or batches of peptide preparation. Because the costs of both TMT and iTRAQ labeling of the total cellular peptides digested from cell lysates or from the total cellular protein are prohibitive for most laboratories (2), a relatively inexpensive stable-isotope-coded dimethyl chemical is therefore introduced to perform PTM proteomics. The ion intensities of light and heavy isotopic PTM peptides are often converted into ratios. The ratio adjustment of the ion intensities of isotopic acetylpeptides can be used to address problems such as incomplete dimethyl labeling and biased mixing.
Moreover, multiple replicates are often performed under variable conditions, such as altered room temperatures and various experimental biases. These circumstantial effects are artificially introduced into multiple data sets. Batch effects therefore exist in most high-throughput data sets (85). To our knowledge, few studies have so far considered the adjustment of these batch effects during the analysis of PTM proteomics data. In the present study, we found three types of batch effects (see Experimental Procedure for details). We therefore applied a three-way ANOVA test to our data and found a p-value = 2.08 × 10⁻⁷ for the effect of the CsCl density gradient centrifugation fractionation and a p-value = 0.00246 for the effect of the plant harvesting replicates. Consequently, we adjusted these effects using an empirical Bayesian approach (96). Such an adjustment has been widely used in microarray data analysis. The batch effect adjustment turned out to be quite successful in our experiments. For example, before the adjustment, we identified only 8 acetylation UPAs with a BH-FDR ≤ 0.1, whereas after the adjustment we found a total of 64 acetylation UPAs with a BH-FDR ≤ 0.1. The follow-up validation showed that those findings were true positives. Taken together, we have demonstrated that batch effects exist in multiple proteomics data sets and that the batch effect adjustment results in more findings.
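The BH-FDR ≤ 0.1 cut-off used to select regulated UPAs rests on the Benjamini-Hochberg procedure, which can be sketched in plain Python as follows. This is a generic textbook implementation offered for illustration, not code from SQUA-D; the function name `benjamini_hochberg` and the example p-values are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four hypothetical UPAs.
example_p = [0.01, 0.04, 0.03, 0.5]
adjusted_p = benjamini_hochberg(example_p)
selected = [i for i, q in enumerate(adjusted_p) if q <= 0.1]
print(selected)
```

Because the adjusted values depend on the whole distribution of p-values, reducing batch-related variance before testing can change how many UPAs pass the same cut-off, which is consistent with the increase from 8 to 64 UPAs reported above.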
Previous studies have found that various organisms utilize specialized PTMs on ribosomal proteins (121-123). These PTMs help guide nuclear events, expand molecular structures, and facilitate activity regulation (124-129). Yang et al. (127) found that mitochondrial protein synthesis is enhanced by the reversible acetylation of mitochondrial ribosomal protein L10. The N-terminal acetylation of ribosomal proteins has been demonstrated previously to be necessary for the maintenance of protein synthesis (128). An important finding from the present quantitative acetylproteomics was the identification of four ethylene-regulated acetylation UPAs from ribosomal proteins. Both the assembly of ribosomes and translation activity may be controlled via acetylation events at these sites. We thus hypothesize that acetylation of ribosomal proteins may be an additional mechanistic regulation of protein translation regardless of the possible positive or negative impact brought about by acetylation.
In addition to ribosomal proteins, heat shock proteins (HSPs), which are known as stress proteins, are responsible for protein folding, translocation, disaggregation, and degradation of damaged proteins (130-132). In plants, the transcription of HSPs is regulated by at least 21 heat stress transcription factors (HSFs) (133). Previous studies have found that PTMs influence the ATP binding and chaperone activity of HSP 90 (134,135). A point mutation, K294Q, in the HSP 90α protein was used to mimic constitutive acetylation, and it reduced protein-protein interactions between this protein and some of its target proteins. Additionally, the K294R mutated isoform of HSP 90α was used to mimic constitutive deacetylation, which resulted in stronger interactions with some of its target proteins (136). Our quantitative acetylproteomics has shown that the acetylation of novel sites on an HSP 70 family protein is ethylene-suppressed whereas that on an HSP 90 family protein is ethylene-enhanced (supplemental Table S3). These results indicate that the plant hormone ethylene, or other signals, may regulate HSP-dependent protein folding, protein-protein interactions, and stress responses by regulating the acetylation level of these HSPs. This finding may represent a discovery on the ethylene-mediated regulation of stress responses through protein acetylation events.
In addition, this acetylproteomics has successfully identified a total of 44 acetylated histone proteins, among which 8 histone acetylation UPAs are regulated by ethylene. Quantitative PTM proteomics revealed that the acetylation UPA with a UPSP of AT1G07660-K6-K9-K13 is ethylene-suppressed whereas the acetylation UPA with the UPSP of AT1G07660-K6-K13-K17 is, on the contrary, ethylene-enhanced in the same protein (supplemental Table S3). The ion intensities of the acetylpeptides containing AT1G07660-K6-K9-K13 were much higher than those of AT1G07660-K6-K13-K17 (supplemental Fig. S6), indicating that the acetylpeptide group containing AT1G07660-K6-K9-K13 may contribute more to the overall acetylation level at the K6 and K13 acetylation sites, because K6 and K13 are common to these two acetylation UPAs. Immunoblot analysis was performed on the K6 acetylation site of the histone superfamily protein (AT1G07660, Fig. 3J), which demonstrated that the acetylation site is ethylene-suppressed in both Col-0 and the ethylene-insensitive double mutant ein3/eil1. This immunoblot-validated and ethylene-suppressed acetylation site of the histone protein (AT1G07660) was found to be homologous to the K5 acetylation site of both human and yeast histone H4, suggesting that the K5 acetylation site of yeast and human histone H4 may also be regulated by a cellular signal. Similarly, given that the histone H3 proteins of Arabidopsis and yeast are homologous to each other, Zhang et al. (137) used acetylation site-specific antibodies for the yeast histone H3 protein to study the acetylation of the K14 and K23 sites of the Arabidopsis histone H3 protein. They also found that the acetylation level of these sites was enhanced by ethylene in etiolated Col-0 seedlings following 4 h of ethylene pre-treatment.
If the acetylation of histone proteins facilitates transcription mostly by acting as transcription factor recognition sites and by providing space for the transcription process (50), these new findings suggest that the ethylene-regulated acetylation of histone proteins may contribute to ethylene-regulated gene transcription in an EIN3 and EIL1-independent manner.
Comparative GO analysis of both acetylation- and phosphorylation-modified proteins revealed that phosphoproteins are more abundant in kinase-dependent signal transduction and transcription factor-dependent transcription in the plasma membrane and nucleus (supplemental Fig. S9). This bioinformatic finding is consistent with the involvement of protein phosphorylation and dephosphorylation in ethylene signaling (138-142). Changes in protein phosphorylation have been demonstrated to directly regulate the activities of transcription factors (142,143). In contrast, the identified acetylproteins are largely involved in the regulation of enzyme activities (supplemental Fig. S9), thus affecting energy metabolism (144-147). Acetylation of both histone and non-histone proteins is mainly involved in stress responses (supplemental Fig. S9). For example, histone deacetylase 6 (HDA6) regulates responses to salt stress (148). Drought stress-induced deacetylation of protein N termini is regulated by the plant hormone abscisic acid (149). In addition, protein acetylation and phosphorylation may mutually affect each other (supplemental Fig. S9A), as previously reported (150,151), even though the cause-effect relationship of these two modifications is unknown so far.
In short, the combination of detergent-free CDG centrifugation fractionation of the total cellular protein with a 3C quantitative PTM proteomic workflow, consisting of dimethyl chemical labeling, chromatographic separation and enrichment, and XIC-based, SQUA-D-assisted computational analysis of PTM peptides, allowed us to identify 5250 acetylation sites and 64 ethylene hormone-regulated acetylation UPAs from Arabidopsis (this number of UPAs may increase if SQUA-D is later improved in its efficiency and versatility in processing the results obtained from other PSM identification programs such as SEQUEST, MaxQuant, pFind, Comet, and Crux). These ethylene hormone-regulated changes in the acetylation UPAs may result either from an alteration of the acetylation level of proteins or from an alteration of the protein level itself (141). To differentiate these two possibilities, we also analyzed the total cellular protein peptides (supplemental Table S5) and found that at least 7 ethylene hormone-regulated acetylation UPAs have the same level of corresponding unmodified peptides in both the control and the ethylene hormone-treated protein samples (i.e. 7 out of 15 regulated acetylation UPAs having the corresponding quantified and unmodified peptides, supplemental Table S5), or they may have opposite trends of regulation between the acetylated and unmodified peptides. These data strongly confirmed (the fourth C of the PTM proteomic workflow) that, on some of the PTM proteins, it is the acetylation level, rather than the protein level, that changes in response to ethylene treatment. A convincing example of such regulation is the histone superfamily protein (AT1G07660). Both the quantitative PTM proteomics results (Fig. 3A and 3B) and the immunoblot analysis (Fig. 3J) have demonstrated that the acetylation level at the K6 site is ethylene-suppressed, whereas its protein level is not significantly changed by ethylene treatment (supplemental Table S5).
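The distinction between a change in acetylation level and a change in protein level can be illustrated numerically with the Python sketch below. It assumes that log2 ratios (treated over control) are available for both the acetylated UPA and the corresponding unmodified peptides; the function name `site_specific_log_ratio` and the input values are hypothetical, and the subtraction is an illustrative simplification rather than the comparison procedure used for supplemental Table S5.

```python
def site_specific_log_ratio(acetyl_log2_ratio: float,
                            unmodified_log2_ratio: float) -> float:
    """Illustrative assumption: subtract the log2 ratio of the unmodified
    peptides (a proxy for protein-level change) from the log2 ratio of the
    acetylated UPA to approximate the acetylation-specific change."""
    return acetyl_log2_ratio - unmodified_log2_ratio


# Made-up example: the acetylated UPA decreases while the unmodified
# peptides stay unchanged, so the whole change is attributed to acetylation.
print(site_specific_log_ratio(-1.5, 0.0))
```

When the acetylated and unmodified peptides change by the same amount, the difference is zero and the observed regulation is more plausibly explained by a change in protein abundance.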