Application of an Improved Proteomics Method for Abundant Protein Cleanup: Molecular and Genomic Mechanisms Study in Plant Defense*

High abundance proteins like ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) pose a persistent challenge for whole-proteome characterization using shot-gun proteomics. To address this challenge, we developed and evaluated Polyethyleneimine Assisted Rubisco Cleanup (PARC), a new method that combines abundant protein removal with fractionation. The new approach was applied to a plant insect interaction study to validate the platform and investigate mechanisms for plant defense against herbivorous insects. Our results indicated that PARC can effectively remove Rubisco, improve protein identification, and discover almost three times more differentially regulated proteins. The significantly enhanced shot-gun proteomics performance was translated into in-depth proteomic and molecular mechanisms for plant insect interaction, where carbon re-distribution was found to play an essential role. Moreover, transcriptomic validation confirmed the reliability of the PARC analysis. Finally, functional studies were carried out for two differentially regulated genes revealed by the PARC analysis. Insect resistance was induced by over-expressing either jacalin-like or cupin-like genes in rice. The results further highlighted that PARC can serve as an effective strategy for proteomics analysis and gene discovery.

One of the constant challenges for proteomics is inadequate protein identification because of the interference of high abundance proteins (1). The challenge is particularly critical for plant proteomics analysis because of the prevalence of Rubisco (ribulose-1,5-bisphosphate carboxylase oxygenase) in green tissue. As a major enzyme involved in carbon fixation, Rubisco constitutes 30 to 50% of total plant protein in green tissues and reduces the sensitivity, dynamic range, and protein identification of plant proteomics (2)(3)(4). High abundance proteins like Rubisco affect both gel-based and shot-gun proteomics analysis. In one of the most popular shot-gun proteomics platforms, which uses data-dependent MS/MS acquisition, peptides derived from abundant proteins are more likely to be sampled by the MS instrument than peptides from other functional proteins. Thus, dynamic range and detection sensitivity are sacrificed because of the prevalence of high abundance proteins (2). To address this challenge, we developed and evaluated a new method that combines PEI (polyethyleneimine) precipitation and protein sample fractionation to improve the performance of multidimensional protein identification technology (MudPIT)-based proteomics analysis.
PEI is a positively charged polymer broadly used for removing nucleic acids from proteins (5). The compound can also be employed to remove acidic proteins like Rubisco from the total protein (6). PEI precipitation can be considered as a fractionation process to separate acidic proteins from the total protein, and thus can be used for both Rubisco removal as well as fractionation of plant proteins from green tissues. Despite the potential to be used for sample preparation, very few studies optimized the PEI precipitation for plant proteomics analysis and evaluated the effectiveness of the approach for enhancing proteomics performance. In this study, we demonstrated that PEI-Assisted Rubisco Cleanup (PARC) and fractionation could significantly improve protein identification. The method was also used to investigate the molecular mechanisms for plant insect interaction and discover two important regulators for rice defense against herbivorous insects.
Plant insect interaction represents an important traditional research field because of its relevance for crop growth and yield. Proteomics has emerged as a major approach to study plant physiological, pathological, and developmental processes at the systems level (7)(8)(9). Despite this progress, only a few studies have comprehensively examined plant defense against herbivorous insects using a proteomics approach. Most of the previous research was carried out using gel-based platforms, resulting in the identification of limited numbers of differential proteins (10-13). Plants respond to insect attacks with systemic up-regulation of regulatory, metabolic, and defense-related proteins, and the coverage of the proteome is crucial for revealing in-depth mechanisms (14). Our previous transcriptomic analysis using rice as a model plant revealed molecular and genomic mechanisms for plant defenses, including the regulation of enzymes involved in secondary metabolism and volatile production (15). Despite the transcriptomic analysis, the protein-level regulation of rice defense against herbivorous insects is largely unknown. Comprehensive proteome profiling will thus help to reveal novel mechanisms for protein-level defense regulation and to identify new regulators in plant defense that will guide the crop improvement essential for food security.
In this study, we developed and evaluated a novel plant protein isolation and fractionation method to improve the performance of MudPIT-based shot-gun proteomics. We further applied the method to a plant insect interaction study. The method increased protein identification as compared with protein isolated by a commercial kit. The novel method also resulted in the discovery of almost three times more differential proteins. The significantly improved differential protein identification helped to reveal new proteome-level mechanisms for plant defense against herbivorous insects and to identify key genes involved in defense regulation. In particular, the proteomics results revealed dynamic re-allocation of carbon resources as a defense mechanism against herbivorous insects in plants. Furthermore, we carried out two levels of validation of the proteomics data. The first was real-time PCR analysis to verify the gene expression of the up-regulated proteins. The second was the functional validation of two key regulatory proteins for plant defense. The first gene was a cupin-like protein specifically identified in PARC. Cupin-like proteins were previously reported to be important for reactive oxygen species (ROS) production during the defense response. For example, in a study by Carrillo et al., a cupin-like protein Os03g48470 was demonstrated to be an oxalate oxidase that produces H2O2 (16). However, it was not clear which of the more than 100 cupin-like proteins in the protein family was actually involved in plant defense against herbivorous insects (17)(18)(19). The second gene is a jacalin-like protein with unknown molecular function. Transgene overexpression experiments validated both genes to be important regulators for rice defense against herbivorous insects.
Overall, our results highlighted that the PARC method could significantly improve the performance of shot-gun proteomics to enable the in-depth analysis of plant defense mechanisms and the discovery of new regulators for plant defense against herbivorous insects.

EXPERIMENTAL PROCEDURES
Plant and Insect Growth-Rice (Oryza sativa ssp. Japonica cv. Nipponbare) seeds were germinated at 30°C in the dark for 5 days. Then the seedlings were grown at 28°C with 14 h of light for 2 weeks. Fall armyworm (FAW) (Spodoptera frugiperda) eggs were obtained from Benzon Research (Carlisle, PA, USA) and hatched at 28°C. FAW larvae were raised on an artificial diet and second-instar FAW were used for herbivore treatment. Two larvae were placed on the leaves of a single two-week old rice seedling around nightfall at 10 pm. After 12 h, ~5-10% of the leaf area was consumed. Insects were then removed and the rice plants were harvested and snap-frozen in liquid nitrogen. The samples were stored in an ultralow freezer at −80°C until further analysis.
Protein Sample Preparation Methods-The protein sample preparation integrates both PEI precipitation optimization (Fig. 1) and sample fractionation. The flow of the different fractionation and protein extraction approaches is shown in Fig. 2. As the reference, the total plant protein was extracted with the Plant Total Protein Extraction Kit (Sigma-Aldrich, St Louis, MO) according to the manufacturer's instructions with minor modifications. This protein sample is referred to as the TP (Total Protein) fraction, which represents plant total protein extracted by a conventional method. Briefly, 100-250 mg of leaf tissue was ground into a fine powder in liquid nitrogen. The powder was washed with methanol and acetone, then pelleted and dried with a SpeedVac. The plant tissue pellet was dissolved in Reagent Type 4 Working Solution supplied by the kit. The protein extraction solution provided by the commercial Sigma kit contains 7 M urea, 2 M thiourea, 40 mM Trizma base, and 1% 3-(4-Heptyl)phenyl-3-hydroxypropyl dimethylammoniopropanesulfonate (C7BzO) as the detergent. The solution was adjusted to pH 10.4. The TP sample preparation using this commercial kit involves chaotropic reagents together with a zwitterionic detergent to dissolve most proteins, including hydrophobic membrane proteins. After the extraction and removal of plant tissue debris, the TP samples were ready for proteomic analysis. However, the TP fraction cannot be used for PEI precipitation to remove Rubisco (Fig. 1B), potentially because the extraction buffer, with its high concentration of zwitterionic detergent, changes the protein charges.
Besides the TP sample prepared by the commercial kit, a modified SDS (sodium dodecyl sulfate)-based total protein extraction method was used to validate the effectiveness of the commercial kit (20). The SDS-solubilized total protein was subjected to SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) analysis. The gel fragments containing proteins were washed, destained, dehydrated, and further processed for in-gel digestion according to previous studies (21).
To carry out PEI precipitation, a soluble protein extraction method was developed with a nonionic detergent. Approximately 200 mg of rice leaves were ground into a fine powder in liquid nitrogen using a mortar and pestle. One milliliter of extraction buffer (50 mM Tris-HCl pH 7.5, 10 mM MgCl2, 150 mM NaCl, 0.1% Nonidet P-40, 1 mM PMSF) was then added and incubated with the sample. The extract was then centrifuged at 13,000 rpm for 10 min at 4°C, and the supernatant was transferred to a new micro-centrifuge tube. The supernatant contained most of the soluble protein in the cell and was thus referred to as TS (total soluble protein). The pellet was used for further protein extraction with the same Plant Total Protein Extraction Kit (Sigma-Aldrich) according to the manufacturer's instructions. This sample contained mostly insoluble protein and was thus referred to as IS. In addition to PEI precipitation, we also evaluated ammonium sulfate for its capacity to selectively precipitate Rubisco. Ammonium sulfate at 40% was used to precipitate Rubisco according to a previous publication (22). The ammonium sulfate precipitated pellets were then re-suspended and analyzed by SDS-PAGE in the same way as the PEI precipitated proteins.
PEI precipitation was carried out to remove the Rubisco contamination in the TS fraction. A titration was performed to determine the effective concentration for PEI precipitation as shown in Fig. 1A. For LC-MS/MS experiments, the TS samples were first mixed with 100 mg/g of PEI, vortexed vigorously for 10 s, and then precipitated on ice for 5 min. Samples were then centrifuged for 15 min at 13,000 rpm. The supernatant was removed by pipetting and collected as SS (Supernatant Soluble) fraction. The SS fraction is the Rubisco removed soluble protein. The PEI precipitated pellets were then resuspended in resuspension buffer and centrifuged for 10 min at 13,000 rpm to re-pellet the debris. The supernatant was collected as PS (precipitated soluble protein). The fraction should be Rubisco enriched fraction containing many acidic proteins. The protein concentration for each fraction was measured by Bradford Protein Assay Kit (Thermo Scientific Pierce).
MudPIT and Shot-gun Proteomics-MudPIT-based shot-gun proteomics was carried out to analyze each aforementioned fraction as described by Washburn et al. (23). Approximately 100 µg of protein was digested with Mass Spectrometry Grade Trypsin Gold (Promega, Madison, WI) at 1:40 (w/w) at 37°C for 24 h. The digested peptides were desalted using a Sep-Pak C18 plus column (Waters Corporation, Milford, MA) and then loaded onto a biphasic (strong cation exchange/reversed phase) capillary column using a high pressure tank. The two-dimensional liquid chromatography separation and tandem mass spectrometry were carried out as previously described (24).
The separated peptides were analyzed using a linear ion trap mass spectrometer, Finnigan LTQ (Thermo Finnigan). The mass spectrometer was set to the data-dependent acquisition mode. Full mass spectra of the peptides were recorded over a 300-1700 m/z range, followed by five tandem mass (MS/MS) events for the most abundant ions from the first MS analysis. The Xcalibur data system (Thermo Fisher Scientific) was used to control the LC-LTQ system and collect the data. The experiments were carried out with duplicate biological samples for all fractions except the TP fraction.
Data Analysis-A data preprocessing pipeline based on the Xcalibur Development Kit was used to generate DTA files in the same way as the ThermoFinnigan Bioworks (2.0) software. Tandem mass spectra were extracted from the raw files and converted into the MS2 file. The MS2 file was searched against the rice protein database containing 66,338 protein sequences from the MSU Rice Genome Annotation (Release 7), the same number of reverse sequences, and common contaminant proteins. The reverse sequences of the original data set were included to calculate confidence levels and false-positive rates. The common contaminant proteins were included for data quality control. The ProLuCID (version 1.0) algorithm was used to assign peptide sequences to peptide fragmentation spectra using the Texas A&M Supercomputing Facility (25). The ProLuCID parameters were set as follows: the precursor mass accuracy was set at 100 ppm; the fragment ion mass accuracy was set at 600 ppm. No fixed or variable modifications were considered. At least two distinct peptides (semitryptic) were required to identify a protein with no sequence coverage assigned.
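The target-decoy idea behind the reverse sequences can be sketched in a few lines. This is a minimal illustration only, not the ProLuCID or DTASelect implementation: the `Reverse_` prefix, the function names, and the simple decoy/target ratio used as the FDR estimate are all assumptions made for the example.

```python
def add_reverse_decoys(proteins, decoy_prefix="Reverse_"):
    """Return the target entries plus one reversed-sequence decoy per target.

    `proteins` maps accession -> amino acid sequence. The prefix is a
    hypothetical naming convention chosen for this sketch.
    """
    database = dict(proteins)
    for accession, sequence in proteins.items():
        database[decoy_prefix + accession] = sequence[::-1]
    return database


def estimate_fdr(accepted_accessions, decoy_prefix="Reverse_"):
    """Estimate FDR among accepted matches as (decoy hits) / (target hits).

    This is one common estimator; other definitions exist and the source
    does not state which one was applied.
    """
    decoys = sum(1 for a in accepted_accessions if a.startswith(decoy_prefix))
    targets = len(accepted_accessions) - decoys
    return decoys / targets if targets else 0.0
```

Under this estimator, a score threshold would be tightened until the estimated rate falls at or below the chosen cutoff.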
Protein quantification was achieved by spectral counting. The validity of peptide/spectrum matches was assessed in DTASelect v.2.0 using a 0.05 false discovery cutoff, with a cross-correlation score (XCorr) larger than 1 and a normalized difference in cross-correlation scores (DeltaCN) larger than 0.08. Semitryptic peptides were included in the final calculation. DTASelect software listed the accession numbers together for those proteins combined into one group. Each individual accession number was then listed in the supplementary data. The numbers of peptides and spectra used for protein quantification are shown in supplemental Table S1. Detailed information is shown in supplemental Table S6. Proteins identified with more than two peptides were used for quantification. Protein identification was based on both specific and nonspecific peptides. Quantification was based on duplicate biological samples.
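As a rough sketch of how spectral counting with these score filters works, the snippet below counts accepted spectra per protein. It is not DTASelect code: the record layout (`protein`, `peptide`, `xcorr`, `delta_cn` keys) and the `min_peptides` default are assumptions for illustration, and the FDR cutoff is left out.

```python
from collections import defaultdict


def spectral_counts(psms, min_xcorr=1.0, min_delta_cn=0.08, min_peptides=2):
    """Count accepted spectra per protein.

    `psms` is a list of dicts with hypothetical keys 'protein', 'peptide',
    'xcorr' and 'delta_cn'. A spectrum is accepted when both scores exceed
    their thresholds; a protein is reported only when it has at least
    `min_peptides` distinct accepted peptides (an assumed setting).
    """
    spectra = defaultdict(int)
    peptides = defaultdict(set)
    for psm in psms:
        if psm["xcorr"] > min_xcorr and psm["delta_cn"] > min_delta_cn:
            spectra[psm["protein"]] += 1
            peptides[psm["protein"]].add(psm["peptide"])
    return {
        protein: count
        for protein, count in spectra.items()
        if len(peptides[protein]) >= min_peptides
    }
```

The resulting per-protein counts are the quantities that are later compared between insect-treated and control samples.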
Ontology and Pathway Analysis-PatternLab (26) software (version 2.1.1.5) was used for data analysis to discover differentially expressed proteins. Isoforms of protein were combined in groups by PatternLab. The Row Sigma normalization was carried out to adjust systemic errors. The TFold test was applied to derive differentially expressed proteins. The cutoff of p value (BH q-value) and FDR were 0.05 for both. F-stringency was optimized by software automatically. Gene ontology analysis was performed using agriGO (27). Each protein was classified with respect to its biological process using GO annotation. The pathway analysis of differentially regulated proteins was analyzed using KEGG (28).
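For readers unfamiliar with Benjamini-Hochberg (BH) q-values, the standard procedure can be sketched as follows. This is a generic textbook version written for illustration; it is not PatternLab's code, and the function name is an assumption.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as `pvalues`."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum
    # of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues
```

A protein would then be called differentially expressed when its q-value is at or below 0.05, subject to the additional fold-change criterion that the TFold test applies.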
Alignments and Phylogenetic Analysis-A phylogenetic tree was generated using sequences downloaded from the NCBI reference proteins and UniProtKB/Swiss-Prot databases. The threshold for BLAST hits to Os03g48770 was 1e-54. The amino acid sequences were initially aligned using ClustalX (29). The trees were created using the neighbor-joining method of MEGA version 5.05 with the Poisson model having uniform rates for all sites. The phylogenetic trees were tested with 1000 bootstrap replicates (30). In addition to the phylogenetic tree, multiple sequence alignment was carried out for wheat oxalate oxidase GF-3.8 (P26759), rice germin-like protein 3-6 (Os03g48780), and the cupin domain containing protein Os03g48770 by Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/).
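The Poisson model converts the observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = -ln(1 - p). The sketch below computes this pairwise distance for two aligned sequences; it is an illustration rather than MEGA's code, and the treatment of gaps (sites with a gap in either sequence are skipped) is an assumption.

```python
import math


def poisson_distance(aligned_a, aligned_b, gap="-"):
    """Poisson-corrected distance between two aligned protein sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    # Keep only columns where neither sequence has a gap (assumed handling).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no comparable sites")
    p = sum(1 for a, b in columns if a != b) / len(columns)
    if p >= 1.0:
        return math.inf  # correction is undefined when every site differs
    return -math.log(1.0 - p)
```

A matrix of such pairwise distances is the input that a neighbor-joining algorithm uses to build the tree.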
Real-time Quantitative PCR-For the qRT-PCR analysis, wild type rice seedlings were infested with FAW larvae and the gene expression was analyzed at a series of time points: 0 h, 12 h, 24 h, and 48 h. RNA isolation using TRI Reagent was performed following the manufacturer's instructions (31).
Reverse transcription of total RNA was performed using the SuperScript III First-Strand Synthesis kit (Invitrogen, Carlsbad, CA). Triplicate quantitative assays were performed using the SYBR Green Master Mix (Applied Biosystems, Foster City, CA) with an ABI 7900HT sequence detection system. All data were normalized with 18S rRNA as an internal reference gene using the following primers: forward 5′-CGGCTACCACATCCAAGGAA-3′; reverse 5′-TGTCACTACCTCCCCGTGTCA-3′. Primers used for qRT-PCR analysis are listed in supplemental Table S2. The relative quantification method (modified ΔΔCT) was used to evaluate the differential gene expression. The ΔCTs were normalized against the lowest expressed sample within the group to derive ΔΔCT, where the smallest ΔΔCT for each group will be 0. The 2^ΔΔCT value represents the ratio of gene expression over the lowest expressed sample in the group. Both the ratio of gene expression and the 95% confidence intervals derived from triplicate assays are presented.
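The modified ΔΔCT calculation can be illustrated with a short sketch. Two assumptions are made here that the text does not spell out: ΔCT is taken as CT(target) minus CT(18S rRNA), and ΔΔCT is taken as the largest ΔCT in the group minus the sample's ΔCT, so that the lowest expressed sample gets ΔΔCT = 0. Perfect doubling per cycle is also assumed. The function and sample names are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2**ddCT per sample, relative to the lowest expressed sample.

    Both arguments map a sample label to a mean CT value. Assumes
    dCT = CT_target - CT_reference and 100% amplification efficiency.
    """
    delta_ct = {s: ct_target[s] - ct_reference[s] for s in ct_target}
    # The lowest expressed sample has the largest dCT.
    largest = max(delta_ct.values())
    return {s: 2.0 ** (largest - d) for s, d in delta_ct.items()}
```

With placeholder CT values (not data from this study) of 28, 25, and 26 for the target and 10 for the reference at every time point, the function returns ratios of 1, 8, and 4.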
Rice Transformation-For overexpression of Os03g48770 and Os12g14440, full-length cDNA were subcloned into a binary vector pCXUN with ubiquitin promoter (32) and introduced into Nipponbare rice embryonic calli by Agrobacterium tumefaciens-mediated transformation (33). Transgenic plants were selected on selection plates with 50 mg/L hygromycin. T1 generation plants were used to perform insect feeding assay.
Leaf Area Damage Measurement-Transgenic and wild type rice leaves from T1 plants were collected and cut into small pieces. The pieces were inserted into 10 ml of 0.7% agarose in a 30 ml plastic cup. One third-instar larva was put on the leaf and allowed to feed for 48 h. The percentage of leaf area damage was calculated by ImageJ (34). ANOVA analysis was carried out to compare control groups versus transgenic lines with 95% confidence intervals. Three validated transgenic lines for each target gene were used. For each transgenic line, six leaf segments were used for the insect treatment bioassay and leaf area measurement.

RESULTS
The study integrated method development with functional proteomics analysis. First, a new strategy for plant shot-gun proteomics, PARC (PEI-Assisted Rubisco Cleanup) was developed to improve the protein identification and differential expression analysis. Second, the method was used to investigate the proteomic and molecular mechanisms for plant defense against herbivorous insects, which further validated PARC as an effective method to reveal in-depth mechanisms and novel regulators for plant insect interaction.
PARC as an Efficient Approach to Remove Rubisco-The method development started with optimization of PEI precipitation for removing Rubisco from extracted protein samples (Fig. 1). PEI precipitation was carried out using two types of input protein, the total soluble protein (TS) extracted by soluble protein extraction buffer and the total protein (TP) extracted by the urea/thiourea based buffer in the Plant Total Protein Extraction Kit (See Experimental Procedures for details). As shown in Fig. 1A, PEI precipitation could effectively remove Rubisco at the concentration of 50 to 100 mg/g from the total soluble proteins TS. However, PEI precipitation could not selectively precipitate Rubisco from the total protein (TP) extracted by the commercial Sigma Aldrich kit (Fig. 1B). As shown in Fig. 1B, when 25 and 50 mg/g concentrations of PEI were used, both Rubisco and other proteins were precipitated from the total protein sample TP. The difference in selectivity might result from the change of protein charges in the TP sample caused by the urea and thiourea buffer.
Based on the results, we developed the strategy described in Experimental Procedures to remove Rubisco and to fractionate protein (Fig. 2). In this strategy, the total soluble protein TS was first extracted and then fractionated into a supernatant Rubisco-reduced fraction (SS) and a pelleted Rubisco-enriched fraction (PS) using 100 mg/g PEI. Two additional control experiments were also carried out. First, the effectiveness and suitability of the commercial kit as a reference was further evaluated by comparing the protein identification using SDS-based protein sample preparation with that of TP. As shown in supplemental Fig. S1, more nonredundant proteins were identified in the TP fraction than with the SDS-based method. Second, different chemical compounds including PEI and ammonium sulfate can be used to "selectively" precipitate Rubisco. We briefly compared the effectiveness of PEI versus ammonium sulfate. As shown in supplemental Fig. S2, the addition of ammonium sulfate resulted in the precipitation of more non-Rubisco proteins as compared with PEI. Our PARC method development is thus mainly based on PEI because of its higher specificity. In particular, the efficacy of PARC for protein identification and differential protein expression can be evaluated by comparing (1) SS and PS versus TS for soluble proteins as well as (2) SS, PS, and IS versus TP for total proteins.
Effective Removal of Rubisco, Improved Protein Identification and Enhanced Differential Analysis of Shot-gun Proteomics-The aforementioned protein fractions were subjected to the MudPIT-based proteomics analysis. The effectiveness of PARC was evaluated from three aspects. First, we analyzed the percentage of spectral counts from Rubisco as part of the total spectral count for each fraction (TS, SS, PS, IS, TP). As shown in Fig. 3, PARC could efficiently remove Rubisco from soluble proteins. The Rubisco spectral counts were 21.33% of the total spectral counts in the total protein (TP) sample. Similar to the TP sample, the total soluble protein TS fraction contained 23.22% Rubisco spectral counts. The insoluble fraction (IS) contained a surprisingly low percentage of Rubisco spectral counts at 9.81% of total spectral counts. SS and PS fractions were derived from TS when the sample was subjected to PEI precipitation. As expected, the supernatant Rubisco-reduced fraction (SS) had very little Rubisco contamination, with Rubisco at 0.14% of total spectral counts. Meanwhile, the precipitated Rubisco-enriched fraction (PS) had a significantly higher percentage of Rubisco spectral counts at 36.12%.
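The per-fraction Rubisco percentage is simply the Rubisco share of all spectral counts in that fraction. The sketch below shows the calculation and then applies it to the percentages reported above; the accession labels in the usage note are placeholders, and the fold reduction is a ratio of spectral-count shares, not a measurement of absolute protein mass.

```python
def rubisco_percent(spectral_counts, rubisco_accessions):
    """Percentage of a fraction's spectral counts that belong to Rubisco.

    `spectral_counts` maps accession -> spectral count for one fraction;
    `rubisco_accessions` is the set of accessions treated as Rubisco.
    """
    total = sum(spectral_counts.values())
    if total == 0:
        raise ValueError("no spectra in this fraction")
    rubisco = sum(count for accession, count in spectral_counts.items()
                  if accession in rubisco_accessions)
    return 100.0 * rubisco / total


# Percentages reported in the text for each fraction.
reported = {"TP": 21.33, "TS": 23.22, "IS": 9.81, "SS": 0.14, "PS": 36.12}

# Ratio of the Rubisco share before (TS) and after (SS) PEI precipitation.
fold_reduction = reported["TS"] / reported["SS"]
```

For example, a fraction with 30 and 20 spectra from two Rubisco subunits and 150 spectra from other proteins would give 25%. From the reported values, the Rubisco share of spectral counts drops roughly 166-fold between TS and SS.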
Second, the protein identification was improved by fractionation and Rubisco removal as shown in Fig. 4. Total soluble protein TS was fractionated into SS and PS fractions through Rubisco removal and PEI precipitation. The fractionation process led to the identification of 523 more proteins and a total of 24.2% increase in the number of protein identified (Fig. 4A). Furthermore, the Rubisco removed SS, Rubisco enriched PS, and the total insoluble proteins (IS) should represent the total proteins, which can be compared with the total protein (TP) extracted by the commercial kit. The three fractions together identified 1488 more proteins than TP. In other words, PARC increased the protein identification by 68.0% (Fig. 4B). Further validation was carried out by comparison of the number of proteins identified by three biological TP samples versus the one set of SS, PS, and IS fractions (Fig. 4C). This comparison is justified by the same number of LC gradient to remove the bias caused by more experimental LC-MS/MS runs for combination of the three fractions. As shown in Fig. 4C, the combined SS, PS and IS fractions still identify 431 more proteins than the three times of TP sampling. The result highlighted that the fractionation can increase protein identification, even though the exact level of improvement is hard to define.
Third and most importantly, the improved protein identification led to the better evaluation of differential protein expression. To further validate the effectiveness of PARC and to investigate mechanisms for plant insect interaction at the systems level, differential protein expression analysis was carried out between insect treated rice and untreated reference. The study revealed superior performance of the PARC, as shown in Fig. 4D. The comparison of TP samples between insect treated and control only identified 40 differentially expressed proteins. PARC enabled significantly improved differential protein discovery. One hundred and eighteen differentially expressed proteins were identified when SS, PS, and IS fractions were analyzed. PARC therefore increased the identification of differentially expressed proteins by almost three times.
Overview of Differential Protein Expression for Plant Insect Interaction as Revealed by the Proteomics-The systems-level overview of proteomics results included both Gene Ontology (GO) and pathway analysis. Considering that PARC significantly improved the performance of proteomics analysis, the functional proteomics analysis of rice-FAW (Fall Armyworm) interaction was focused on the PARC-based study (i.e. SS, PS, and IS fractions) rather than the traditional approach (i.e. TP fraction). In fact, the PARC analysis identified many more relevant pathways than the traditional method.
Gene Ontology analyses revealed that proteomics analysis based on PARC can identify many more GO terms than the reference sample (TP), presumably because of the greater number of proteins identified in the PARC-based analysis (Fig. 5). Even though both methods identified metabolic processes and responses to stimuli as major GO groups, the PARC-based analyses enabled the identification of new GO terms such as signaling and cellular component organization.
A group of differentially expressed metabolic proteins is shown in Table I, and a detailed functional classification of all differentially expressed proteins is shown in supplemental Table S3. The analyses revealed that plants responded to herbivorous insect damage by up-regulating proteins involved in a broad range of processes, including primary metabolism, secondary metabolism, upstream jasmonic acid biosynthesis (such as lipoxygenase), defense-relevant proteins, and others. An important aspect of the metabolic changes was carbohydrate biosynthesis and degradation. As shown in Fig. 6, both the starch and carbohydrate degradation pathway and sucrose biosynthesis were up-regulated, leading to net carbon fluxes toward both sucrose and secondary metabolite biosynthesis.
The correlation of protein and mRNA changes induced by insect damage was evaluated at both global and gene-specific levels. The comparison of this proteomics study with previous transcriptomics analysis of FAW treated rice revealed correlated mRNA and protein expression for some key secondary metabolism genes. These genes include Os10g28200, an NAD dependent epimerase and dehydratase family protein, and Os12g37260, a lipoxygenase chloroplast precursor (15). In addition, Os12g14440 (jacalin-like lectin domain containing protein) was also shown to be up-regulated by the microarray study. This gene was chosen for the downstream functional analysis. It should be noted that previous study used an early version of long-oligo microarray which contains only 22,000 features and a relatively high error rate (15). A more comprehensive comparison of protein and mRNA expression can be achieved by comparing the proteomics data with the MPSS data for insect treatment (35). As shown in supplemental Table S4, among the 88 up-regulated proteins caused by FAW treatment, 32 were up-regulated at mRNA level when treated with beet armyworm (BAW). For reference, all proteins identified from proteomics data were included in supplemental Table S5.
Validation of Proteomics Results by qRT-PCR-The gene-specific validation was focused on the up-regulated proteins identified by the PARC-based proteomics analysis. Eleven up-regulated proteins were chosen for gene expression validation using real-time PCR. Most of these proteins were chosen because they are related to insect defense according to previous studies. As shown in Fig. 7 and supplemental Fig. S3, eight out of 11 genes showed correlated mRNA and protein expression levels. In other words, the triplicate qRT-PCR assays revealed that eight genes had significantly increased mRNA expression in insect-treated plants relative to control plants at 12 h after treatment.
Among the 8 genes verified by qRT-PCR, Os03g48770 (cupin domain containing protein) and Os12g14440 (jacalin-like lectin domain containing protein, JRL) were chosen for functional study using transgenic analysis. The cupin protein was exclusively identified as an up-regulated protein by the PARC approach. Both genes were up-regulated at 12 h after insect treatment according to the qRT-PCR results (Fig. 7).
Biological Validation of Proteomics Results by Transgenic Analysis-To verify the gene function of Os03g48770 and Os12g14440, transgenic lines were generated to over-express the two target genes for the insect treatment bioassay. The transgene overexpression was validated for T1 rice (supplemental Fig. S4). Six leaf segments from each transgenic line and the wild-type rice were subjected to the insect treatment bioassay by feeding the leaves to FAW larvae (Fig. 8 and supplemental Fig. S5). After 2 days of larval feeding, the leaf area damage was measured and calculated. The average percentages of leaf area damage were 1.73%, 2.71%, and 2.48% for the Os03g48770 over-expressing lines and 2.36%, 4.01%, and 5.71% for the Os12g14440 over-expressing lines, respectively. In contrast, the wild type rice underwent 12.19% leaf damage. Statistical analysis of leaf area damage after FAW larval feeding showed a significant difference between the six transgenic lines and wild type rice with p < 0.01 (Fig. 8). The statistical analysis was also carried out for each individual line, which separated the samples into two groups (a and b). The samples in group b (all transgenic lines) had significantly decreased leaf area damage as compared with group a (wild type). The result indicated that the overexpression of Os03g48770 and Os12g14440 in rice contributed to resistance to FAW larvae. More reliable functional studies will need to be carried out in T3 transgenic lines in the future and are beyond the scope of PARC method development. The functional validation suggested that the differentially regulated proteins identified by PARC played important roles in insect defense.

PARC as an Effective Platform for Shot-gun Proteomics-Even though proteomics has emerged as a powerful platform to study systems and molecular mechanisms for biological processes like plant defense, the application of the technology is still limited by some inherent challenges. One example is that high abundance proteins reduce sensitivity and hinder protein identification. Different strategies have been developed to address the issues caused by high abundance proteins and to improve the detection limit and performance of shot-gun proteomics. These strategies include targeted proteomics, fractionation, antibody-based protein removal, improved algorithms for protein identification, and others. Recently, we developed an organelle-enriched method to enhance the identification and differential expression analysis of mitochondrial proteins (24). The method involved both organelle enrichment for sample preparation and bioinformatics classification. Even though the method improved the identification of mitochondrial proteins, the impact of Rubisco was still not properly mitigated. We present here another approach to improve the sensitivity, detection, protein identification, and differential expression analysis for MudPIT-based shot-gun proteomics. Moreover, the method was used to dissect the proteomic and molecular mechanisms for plant defense against herbivorous insects, leading to the identification of two key regulators in plant defense.
Several sample preparation methods have been reported to deplete Rubisco from plant protein samples. These methods include sucrose density gradient centrifugation, Ca2+/phytate precipitation, and Rubisco antibody columns (4, 36-39). Despite this progress, these procedures are labor intensive, time consuming, or costly, or they yield only limited improvement in protein identification. We therefore developed PARC as a new strategy that combines Rubisco removal and protein fractionation. In PARC, the total protein was first separated into two fractions with a nonionic detergent: a total soluble protein fraction (TS) and an insoluble protein fraction (IS). The TS fraction was then treated with PEI to derive two further fractions, the Rubisco-removed soluble protein fraction (SS) and the PEI-precipitated fraction (PS). PEI is a polymer with repeating units that can contain primary, secondary, and tertiary amino groups. The positive charge of the protonated amino groups can selectively precipitate negatively charged (acidic) proteins like Rubisco. Our results indicated that PEI could be more selective for Rubisco removal than ammonium sulfate, which is another common compound for Rubisco precipitation.
In the PS fraction, proteins other than Rubisco were also precipitated. In Fig. 3E, the spectral count from Rubisco makes up 36.12% of the total spectral count in the PS fraction, which suggests coprecipitation of other negatively charged proteins. However, because the PEI-precipitated fraction is also subjected to shot-gun proteomics analysis, the coprecipitated proteins are still included in the identification process, so no information is lost because of Rubisco removal. The integration of fractionation and Rubisco removal by PEI led to the detection of more than 3500 proteins in the rice proteome. This represents a 68% increase over the number of proteins identified using a commercial kit without any fractionation. The number of identified proteins was also significantly higher than those previously published (40, 41). More importantly, the PARC method essentially tripled the number of differentially regulated proteins identified in the plant insect interaction study.
The identification of more plant proteins may also benefit from other novel approaches, such as combining proteomic and genomic information (42). Briggs' lab has identified novel proteins by using the unique peptide sequence information from a comprehensive proteomic survey of the Arabidopsis proteome. This proteo-genomics and bioinformatics approach led to the discovery of more than 10,000 novel peptides, which suggests that the current state-of-the-art Arabidopsis genome annotation is still incomplete. Such informatics-intensive proteo-genomics can be further pursued in future research.
Nevertheless, considering the limitations of current proteomics platforms, fractionation is still one of the viable alternatives to expand the dynamic range of proteomics detection. The disadvantages of the fractionation approach are prolonged analysis time and increased reagent cost. However, the gain in protein identification and in-depth differential protein expression analysis in this study could compensate for the tripled instrument time and the increased data analysis time, because the proteomic setup and data analysis rely heavily on computerized and instrument-based platforms. The principle of abundant protein precipitation and fractionation can be adapted and applied to studies beyond plant species, as PEI can be used to precipitate acidic proteins in general. More importantly, the improved protein identification led to significantly improved differential protein expression analysis, enabling much more in-depth systems and molecular mechanism analysis in the rice-FAW interaction study. The increased number of differentially expressed proteins led to many more GO terms being discovered, more comprehensive pathway annotation, and therefore a more in-depth understanding of the mechanisms for plant defense.
Systems and Molecular Mechanisms of Plant Defense against Herbivorous Insects as Revealed by PARC-As mentioned above, the PARC-based analysis enabled more comprehensive proteome profiling than the traditional approach, delivering in-depth mechanisms from the systems level to the molecular level. The GO and protein classification indicated the up-regulation of pathways involved in secondary metabolism, primary metabolism, defense, and other processes. The data agree with the fact that plants respond to herbivore attack through many dimensions, such as direct and indirect, physical and chemical, and constitutive and inducible defenses (14, 43-46). The proteomics data suggested a complex, yet surprising, protein-level regulation of carbon utilization. As shown in Fig. 6, the starch degradation enzymes were up-regulated, which could potentially lead to increased production of α-D-Glucose-1P. Furthermore, the α-D-Glucose-1P could be used either for glycolysis and secondary metabolite production or for sucrose biosynthesis. Some enzymes that channel carbon flux to secondary metabolites were up-regulated. In addition, enzymes involved in sucrose biosynthesis were also up-regulated. Previous studies suggested that sucrose biosynthesis was up-regulated during plant defense against pathogens and insects in a wide range of crop and model plant species, including rice, cotton, tomato, and Arabidopsis (47-54). Our results agree with these previous studies, which indicated important roles for sucrose in plant defense against herbivorous insects. In addition, as compared with our previous transcriptomic analysis, the proteomic analysis successfully identified new up-regulated pathways in carbohydrate metabolism during defense (15). Overall, active changes in carbon metabolism seemed to be a major feature of plant defense against herbivorous insects. Further validation of this hypothesis will rely on metabolite analysis, which is a major focus in the plant defense field.
The study also revealed correlations between transcriptomic and proteomic regulation. Even though the comparison of previous microarray data with this proteomics analysis revealed only a few overlapping genes, the comparison of the more comprehensive MPSS analysis with this study revealed about 36% overlap between transcriptomic and proteomic up-regulated genes. This overlap is notable given that several factors contribute to the differences between transcriptomic and proteomic dynamics. First, the insect species, treatment conditions, and plant growth stages all differed between the previous transcriptomics studies and the current proteomics study. Second, as shown in the qRT-PCR experiments (supplemental Fig. S3), mRNA level changes are often transient and dynamic, and might not correlate with protein level changes at a given time point. Third, several biological processes, including RNA degradation, protein translation, and protein stabilization, could lead to differences between protein and gene expression levels. Despite these issues, we still found that 32 out of 88 up-regulated proteins were also up-regulated at the mRNA level. The study indicated that transcriptional regulation of plant defense is crucial in regulating defense-related protein levels. In contrast to the up-regulated genes and proteins, the down-regulated proteins are not well correlated with mRNA level down-regulation. This could be because of dynamic protein degradation during the defense process.
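The overlap figure quoted above follows directly from the reported counts. As a small worked check, assuming the percentage is simply the number of proteins also up-regulated at the mRNA level divided by the number of up-regulated proteins, the calculation can be sketched as below. The function and variable names are illustrative and do not come from the original analysis.

```python
def overlap_fraction(shared, total):
    """Fraction of `total` items that are also found in the other data set."""
    if total <= 0:
        raise ValueError("total must be positive")
    return shared / total


up_regulated_proteins = 88  # up-regulated proteins reported in the text
also_up_at_mrna_level = 32  # of those, also up-regulated at the mRNA level

frac = overlap_fraction(also_up_at_mrna_level, up_regulated_proteins)
print(f"Transcript/protein overlap: {frac:.1%}")
```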
Functional Validation of Plant Defense Genes as Revealed by PARC-Besides the systems and pathway level analysis, further functional verification was carried out by transgenic functional study of two genes identified from the proteomics study. The Os03g48770 (cupin-like protein) and Os12g14440 (jacalin-like lectin protein) over-expressing lines showed improved resistance against the rice herbivore FAW. These two genes represent new regulators of rice defense. Jacalin-related lectins (JRLs) can bind specifically to different carbohydrates to regulate biological processes (55). Previous studies showed that several JRLs were significantly up-regulated during insect defense, and it was suggested that binding to foreign glycans might be important for the defense process (56). Other research has indicated that JRLs might regulate signal transduction by modulating plant protein-carbohydrate interactions (57). For example, JRLs can antagonistically regulate the size of the Arabidopsis beta-glucosidase PYK10 complex to retain its activity against pathogens (58). In phytophagous insects, JRLs can bind to sugars on the gut epithelial cells and thereby inhibit nutrient uptake by the insect (59). Here we showed that a specific JRL induced by insect treatment can promote rice defense against FAW larvae. However, the detailed mechanisms by which this JRL regulates plant defense need further investigation.
Os03g48770 belongs to the cupin gene superfamily and was identified by the PARC method only. The members of this superfamily have very diverse functions, and their roles in plant disease defense have been studied in several plant species (19, 60, 61). In particular, one of the cupin superfamily members was shown to be an oxalate oxidase that produces H2O2, which can serve as an important signal in plant disease defense (62). In this study, we showed that insect treatment induced a member of the rice cupin superfamily and that overexpression of this gene resulted in significantly improved defense against FAW. Further phylogenetic analysis indicated that this particular cupin gene is a homolog of a wheat oxalate oxidase, GER3 (63, 64) (supplemental Fig. S6A and B). Os03g48770 also shares 97% similarity with Os03g48780, which encodes an oxalate oxidase involved in ROS signaling (16). It will be interesting to study further whether this gene is involved in rice defense against both insects and pathogens, and how upstream and downstream regulation is involved. The Os03g48770 gene has the potential to be used directly in crop improvement. Even though the T1 transgenic lines have shown promising insect resistance for both genes, extensive research in homozygous T3 transgenic lines needs to be carried out to validate whether these genes can be used in engineering crop resistance to herbivorous insects.
Overall, PARC represented an effective method to improve the sensitivity, protein identification, and differential expression characterization in shot-gun proteomics studies. The method was implemented to identify both novel mechanisms and important regulators for plant defense against herbivorous insects. The technique can be broadly applied to proteomics studies for new gene discovery and mechanism elucidation.