Comparative Proteomics of Phytase-transgenic Maize Seeds Indicates Environmental Influence is More Important than that of Gene Insertion

Proteomic differences were compared between phytase-transgenic (PT) maize seeds and nontransgenic (NT) maize seeds through two-dimensional electrophoresis (2-DE) with mass spectrometry (MS). When maize was grown under field conditions, 30 differentially accumulated proteins (DAPs) were successfully identified in PT seeds (PT/NT). Clusters of Orthologous Groups (COG) functional classification of these proteins showed that the largest group was associated with posttranslational modifications. To investigate the effects of environmental factors, we further compared the seed protein profiles of the same maize planted in a greenhouse or under field conditions. There were 76 DAPs between the greenhouse- and field-grown NT maize seeds and 77 DAPs between the greenhouse- and field-grown PT maize seeds. However, under the same planting conditions, there were only 43 DAPs (planted in the greenhouse) or 37 DAPs (planted in the field) between PT and NT maize seeds. The results revealed that DAPs caused by environmental factors were more numerous than those caused by the insertion of exogenous genes, indicating that the environment has a greater effect on seed protein profiles than gene insertion does. Our maize seed proteomics results also indicated that the occurrence of unintended effects is not specific to genetically modified crops (GMCs); instead, such effects often occur in traditionally bred plants. Our data may be beneficial for biosafety assessments of GMCs at the protein profile level in the future.


Results
Comparison of protein profiles between field-grown PT and NT maize. The 2-DE maps of total proteins from field-grown PT and NT maize seeds were obtained as previously described 30 . Analysis of the protein profiles of PT and NT maize seeds revealed a total of 1027 ± 121 spots in NT maize seed gel maps and 1228 ± 284 spots in PT maize seed gel maps (Figs 1, S1). There were approximately 1079 matched spots between NT and PT maize seed gel profiles. Only those spots showing changes of >1.5-fold or <0.67-fold and detected in all replicates were considered DAPs 30 . The 2-DE image analysis revealed 37 DAPs (5 higher abundance spots and 32 lower abundance spots compared with those in NT maize) between PT and NT maize seed samples grown in the field (Table S2).
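The spot-selection rule can be illustrated with a minimal Python sketch. It combines the criteria stated here (detection in all replicates, PT/NT change of >1.5-fold or <0.67-fold) with the Student's t test threshold (p < 0.05) given in the Materials and Methods. The function names, the use of None for an undetected spot, and the use of mean volume% as the quantity being compared are assumptions made for illustration; this is not the ImageMaster workflow itself. With three replicates per line the test has 4 degrees of freedom, so p < 0.05 (two-sided) is equivalent to |t| > 2.776.

```python
from math import sqrt
from statistics import mean, variance

FOLD_UP = 1.5        # PT/NT ratio above which a spot counts as higher abundance
FOLD_DOWN = 0.67     # PT/NT ratio below which a spot counts as lower abundance
T_CRIT_DF4 = 2.776   # two-sided 5% critical value of Student's t for 4 d.f.


def pooled_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def classify_spot(pt, nt):
    """Return 'up', 'down', or None for one matched spot.

    pt and nt are lists of three volume% values, one per replicate gel.
    None marks a replicate in which the spot was not detected
    (a hypothetical encoding chosen for this sketch).
    """
    # Criterion 1: the spot must be detected in every replicate of both lines.
    if len(pt) != 3 or len(nt) != 3 or None in pt or None in nt:
        return None
    # Criterion 2: p < 0.05, i.e. |t| above the critical value for 4 d.f.
    if abs(pooled_t(pt, nt)) <= T_CRIT_DF4:
        return None
    # Criterion 3: mean PT/NT ratio beyond the fold-change thresholds.
    ratio = mean(pt) / mean(nt)
    if ratio > FOLD_UP:
        return "up"
    if ratio < FOLD_DOWN:
        return "down"
    return None
```

For example, classify_spot([0.30, 0.32, 0.28], [0.10, 0.11, 0.09]) returns "up". Replicates that are exactly identical in both lines give a pooled variance of zero and would need separate handling, which the sketch omits.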
Protein identification via MALDI TOF/TOF MS. A total of 37 DAPs were manually excised from colloidal Coomassie Blue (CCB)-stained 2-DE gels for MS/MS analysis, and 30 protein spots were successfully identified (Fig. S2). Among these identified DAPs, 3 were up-regulated proteins, and 27 were down-regulated proteins (Fig. 1). The average volume% ratios of the identified protein spots are shown in Tables 1 and S2. The database search for protein identification was based on homology to Zea mays proteins. If one spot was identified as containing more than one protein via MS/MS, then the protein with the highest score was chosen for further functional analysis 30 . There were 29 unique proteins among the 30 identified protein species, since one protein (glucose-1-phosphate adenylyltransferase large subunit 1) was represented by two spots (Tables 1, S3). The protein that was annotated as an unknown protein was subjected to BlastP (protein-protein Blast) against the National Center for Biotechnology Information (NCBI) database (http://blast.ncbi.nlm.nih.gov/Blast.cgi) to determine its identity.
A radial chart was used to evaluate the quality of the identified protein spots. The theoretical and experimental ratios of the molecular mass (Mr) are presented in the radial chart as the radial axis labels, and the theoretical and experimental ratios of the isoelectric point (pI) are presented as the annular radial axis labels (Fig. 2A). Approximately 91% of the identified proteins exhibited a relative Mr ratio in the range of 1.0 ± 0.2, and 94.3% of the identified proteins exhibited a relative pI ratio in the range of 1.0 ± 0.2, which suggested that the experimental Mr and pI values of most identified proteins were similar to their theoretical values.
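The 1.0 ± 0.2 criterion can be expressed as a short Python sketch. The direction of the ratio (experimental divided by theoretical), the function names, and the example values are assumptions made for illustration; the values are not taken from Table 1.

```python
def ratio_in_range(experimental, theoretical, tolerance=0.2):
    """True if experimental/theoretical lies within 1.0 +/- tolerance."""
    return abs(experimental / theoretical - 1.0) <= tolerance


def share_in_range(pairs, tolerance=0.2):
    """Fraction of (experimental, theoretical) pairs within the tolerance."""
    hits = sum(ratio_in_range(e, t, tolerance) for e, t in pairs)
    return hits / len(pairs)


# Hypothetical Mr values in kDa, given as (experimental, theoretical).
example_mr = [(70.0, 71.0), (50.0, 48.0), (30.0, 45.0), (25.0, 24.0)]
print(share_in_range(example_mr))  # three of the four pairs fall within range
```

The same helper applies unchanged to pI values, since only the ratio of the two numbers is used.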
To predict protein-protein interaction networks, the 29 identified unique proteins were subjected to online STRING (v10.5) analysis (http://string-db.org) at the high-confidence setting. Among these proteins, 13 were involved in protein-protein interactions, comprising 3 up-regulated and 10 down-regulated proteins. After disconnected nodes were hidden, the network contained five tightly connected clusters following MCL clustering (Fig. 2C). The red cluster contained 4 proteins (HSP70, EIF4A, EF2, and RuBisco β), followed by the yellow cluster with 3 proteins. HSP70 and EF2 were found to be the most interactive proteins in these interaction networks, associating with three other proteins. Each of the 3 remaining clusters contained two proteins that interacted with each other. Among the interacting proteins, four were mainly related to "post-translational modification, protein turnover, and chaperones", while three were related to "energy production and conversion" among the COG categories.
www.nature.com/scientificreports
To confirm the significantly enriched Gene Ontology (GO) functional groups of the identified DAPs in the cellular component, biological process, and molecular function categories, GO annotation was further conducted through an online search using WEGO software (http://wego.genomics.org.cn/cgi-bin/wego/index.pl). GO information was obtained with BLAST2GO. The results showed that 30 proteins were successfully mapped with GO annotations, which were classified into three ontologies containing 43 functional groups (Fig. 3A). For the cellular component ontology, 11 GO terms were obtained, including the cellular component category (GO:0044464), which contained 38.7% of the proteins. For the molecular function ontology, 11 GO terms were found, and the major functional groups were binding (GO:0005488), containing 44.7% of the proteins, and catalytic activity (GO:0003824), containing 35% of the proteins. For the biological process ontology, 21 GO terms were assigned. The major functional group was metabolic process (GO:0008152), including 53.6% of the proteins, followed by cellular process (GO:0009987) with 51.2% of the proteins.
A Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the identified DAPs was performed using the BLAST2GO 4.0 program to investigate their biological functions. The results showed that a total of 18 proteins (58%) were mapped to 28 pathways in the KEGG database. The most represented pathway was "purine metabolism", which contained five sequences (spots 3, 5, 11, 15 and 27). The other major pathway was "carbon fixation pathways", which contained three sequences (spots 4, 13 and 30). Two proteins were involved in each of the following pathways: "thiamine metabolism", "starch and sucrose metabolism", "glutathione metabolism", "pyruvate metabolism", "cysteine and methionine metabolism", "streptomycin biosynthesis", "alanine, aspartate and glutamate metabolism", and "amino sugar and nucleotide sugar metabolism". The remaining pathways contained only one protein sequence (Fig. 3B, Table S4).
Comparison of the protein accumulation and gene expression patterns. We selected ten identified proteins for qRT-PCR analysis to validate the expression patterns of their corresponding genes. To obtain the PT/NT fold-change ratios, the transcript level of the NT maize template was set to 1.0. The changes in the protein accumulation and mRNA expression levels of the selected identified proteins are shown in Fig. 4. Most of the proteins exhibited similar changes at the translational and transcriptional levels; only one down-regulated protein (glutathione transferase 41, spot 20) showed no difference at the transcriptional level. Such inconsistency between the patterns of change in protein accumulation and mRNA expression levels was described in our previous studies 21,30 ; this phenomenon probably resulted from the presence of various posttranslational modifications 31 .
Comparison of protein profiles in maize seeds from different environments. We previously identified quantitative differences in the protein profiles between greenhouse-planted PT and NT maize seeds using both the traditional 2-DE and the newly developed high-throughput iTRAQ-based approaches 21 . In this study, we compared the protein profiles between field-planted PT and NT maize seeds using the traditional 2-DE approach. To analyze the effects of different planting environments on the PT maize seeds and the control, we further compared the 2-DE gel profiles of maize seeds planted in the field or in a greenhouse (Figs 5, S1, Table 2). Protein spots with changes of >1.5-fold were termed DAPs. There were 76 DAPs between the NT maize seeds grown in the two environments, including 45 protein spots up-regulated in the greenhouse and 31 protein spots up-regulated in the field (Fig. 5A,B, Table S5). Seventy-seven DAPs were detected in the PT maize seeds, with 32 protein spots up-regulated in the greenhouse and 45 up-regulated in the field (Fig. 5C,D, Table S6). However, as mentioned above, after comparing the 2-DE profiles of PT and NT maize seeds in the same planting environment, there were only 43 DAPs (PT/NT, planted in the greenhouse) 21 or 37 DAPs (PT/NT, planted in the field). These results demonstrated that the growth environment was more important than the gene modification itself for the protein profiles in maize seeds.
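The comparison of DAP counts can be tabulated in a few lines of Python. The counts are those reported in this section; the dictionary layout, the variable names, and the use of a simple mean ratio as a summary are choices made for illustration and are not part of the original analysis.

```python
from statistics import mean

# DAP counts reported in this section; each environment contrast is split
# into spots up-regulated in the greenhouse and spots up-regulated in the field.
environment_contrasts = {
    "NT greenhouse vs field": {"up_greenhouse": 45, "up_field": 31},
    "PT greenhouse vs field": {"up_greenhouse": 32, "up_field": 45},
}
genotype_contrasts = {
    "PT vs NT, greenhouse": 43,
    "PT vs NT, field": 37,
}

# Totals per environment contrast (76 for NT, 77 for PT).
environment_totals = {
    name: split["up_greenhouse"] + split["up_field"]
    for name, split in environment_contrasts.items()
}

mean_environment = mean(environment_totals.values())
mean_genotype = mean(genotype_contrasts.values())

print(environment_totals)
print(mean_environment / mean_genotype)  # environment-to-genotype DAP ratio
```

On these counts, the environment contrasts yield roughly twice as many DAPs as the PT/NT contrasts within a single environment.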

Discussion
Many DAPs in the field-grown maize seeds were posttranslational modification-related chaperone proteins. The 30 identified DAPs were obtained by comparing PT and NT maize seeds collected from the field. COG functional classification showed that the largest group (23% of the DAPs) was associated with "posttranslational modification, protein turnover, chaperones", and included HSP70, ubiquitin carboxyl-terminal hydrolase, glutathione transferase, and ubiquitin-conjugating enzyme. Under field conditions, plants are vulnerable to various stresses, such as drought, disease and insect pests. Posttranslational modification proteins may play important roles in the response to abiotic stresses 32 . As a chaperone protein, HSP70 promotes the degradation of aberrant proteins, prevents the aggregation of denatured proteins and promotes proper folding of denatured proteins 33,34 . Ubiquitination is an important process in all eukaryotic cells, and the ubiquitin proteasome pathway participates in all aspects of eukaryotic cell regulation through the degradation of proteins 35,36 . Ubiquitin-conjugating enzyme E2 catalyzes the transfer of ubiquitin to substrate proteins destined for hydrolysis 37 .
Environmental influence is more important than gene insertion. In evaluating the unintended effects in GMCs, an important factor to consider is the impact of environmental conditions during maize planting 38 . We previously compared the proteomes of PT maize seeds and a control planted in the same greenhouse to isolate variation related to the genome alteration 21 . In contrast, comparing the proteomic profiles of the same variety (NT/NT, PT/PT) grown under different environmental conditions isolates the variation related to environmental effects on maize seed proteomic profiles 26 .
In a comparison of the seed proteome profiles of the same variety grown under different environmental conditions, e.g., in a greenhouse or the field, DAPs are attributable to the environmental impact, because the genomes of NT or PT maize seeds do not differ between the greenhouse and the field. The 2-DE gel maps of NT maize seeds revealed 76 DAPs between the greenhouse- and field-planted seeds, and similarly, there were 77 DAPs in PT maize seeds between greenhouse- and field-planted samples. However, under the same growth conditions, there were only 43 DAPs (greenhouse) or 37 DAPs (field) when PT maize was compared with NT maize. These data revealed that the insertion of exogenous genes can lead to plant genomic changes causing DAPs, but the influence of the environment on protein profiles (numbers of DAPs) is stronger than the influence of exogenous genes. We therefore conclude that environmental factors have more important effects than exogenous gene insertion on seed protein profiles. In addition, comparative proteomics of NT maize seeds planted in a greenhouse vs. in the field also revealed that the occurrence of unintended effects is not specific to GM crops. This is a common inherent phenomenon, as it often occurs in the traditional breeding of crops. Environmental impacts on crops are much stronger than those of gene insertion, which is consistent with previous reports 26,39 . Previous observations also indicated that transgenes have very limited unintended effects, while large differences were observed between lines produced by conventional breeding [40][41][42][43] .
To clearly understand whether PT maize causes unintended effects, we previously compared the proteomes of seedling leaves and seeds between PT and NT maize grown under controlled conditions 21,30 . We detected insubstantial differences between the seeds of PT maize and those of NT maize. In this study, we further compared the proteomes of PT and NT maize seeds planted under field conditions, and 30 DAPs were successfully identified in these samples. COG functional classification showed that the largest group was associated with "posttranslational modification, protein turnover, chaperones". In addition, we compared the seed proteome profiles of the same maize variety grown in different environments (greenhouse or field). Our results revealed that the number of DAPs caused by the environment was much greater than that caused by the insertion of exogenous genes. Thus, the environment had more important effects on seed protein profiles than exogenous gene insertion. The occurrence of unintended effects is not specific to GM crops, and it often occurs in traditional breeding. Our comparative proteomics techniques serve as an exploratory method to assess the safety of GM maize seeds. A limitation of this study is that the proteomic comparison of maize seeds was carried out for only one season of field planting, whereas the proteome is highly dynamic and can be changed by the cell cycle, environmental influences, and tissue/cell types 44 . Therefore, the proteomes of maize seeds planted over the long term and across multiple seasons need to be further compared. In conclusion, the proteomics data of PT maize seeds provide additional information that will be beneficial for the biosafety assessment of PT maize in the future.

Materials and Methods
Plant materials and growth conditions. The phytase-transgenic maize variety is 10TPY006 (PT maize), and the corresponding near-isogenic variety is the conventional hybrid LIYU16 (NT maize). PT and NT maize seeds were provided by Beijing Origin Seed Technology, Inc. The genetic background of the materials was as previously described 30 . First, conventional maize (LIYU91158 and LIYU953) was crossed with the phyA2 transgenic maize line BVLA430101, and a phyA2-insertion event was introduced into the LIYU91158 and LIYU953 backgrounds. Then, the LIYU91158 and LIYU953 transgenic lines were backcrossed six times to their recurrent parents to minimize genetic background mixing, and two self-pollinations were performed to obtain homozygous plants (OSL931 and OSL930) of each inbred line. Because its DNA was similar to that of LIYU16, the GM line LIYU006 was further derived by crossing OSL931 and OSL930. In the same manner, the NT line of LIYU16 (used as a non-GM control) was derived by crossing the LIYU91158 and LIYU953 inbred lines as described in our previous study 30 . Materials were planted at the experimental base of the Institute of Tropical Biosciences and Biotechnology (E: 110°45′42″; N: 19°32′18″). These PT and NT seeds were planted side-by-side in the field, and each line was planted in three microplots to represent three replicates. These maize seeds were planted in the same experimental sites as those grown in a greenhouse 21 . After sowing, the plants were treated according to local agricultural practices. Ears of each microplot were harvested at the same time on the same day when physiologically mature and immediately stored at −80 °C for further study.
Protein extraction. For comparative proteomic analysis, central seeds of each ear were ground into fine powders in liquid nitrogen using a mortar and pestle. Semiquantitative RT-PCR and western blotting analysis were conducted to detect the expression of exogenous genes and the accumulation of target proteins as described previously 21,30 .
Three biological replicates of PT and NT seed proteins were extracted using a modified Borax/PVPP/Phenol (BPP) protein extraction method as described previously 21,45 . Approximately 3 g of frozen maize seed fine powder was resuspended in precooled extraction buffer. After an equal volume of Tris-saturated phenol (pH 8.0) was added, the mixtures were centrifuged. The upper phase was then transferred into a new centrifuge tube and clarified twice. Protein precipitates were obtained after adding ammonium sulfate-saturated methanol. The proteins were quantified according to the Bradford method for the following experiments or were stored at −20 °C.

2-DE. 2-DE was performed on an Ettan IPGphor isoelectric focusing system according to the manufacturer's instructions (2-DE Manual, GE Healthcare, Uppsala, Sweden). Immobilized pH gradient (IPG) strips (24 cm) with a linear pH gradient of 4-7 (GE Healthcare) were used, approximately 1,300 µg of protein sample was loaded, and 12.5% sodium dodecyl sulfate (SDS) polyacrylamide gels were used for SDS-polyacrylamide gel electrophoresis (SDS-PAGE). Each protein extract was run on 2-DE gels in triplicate as technical replicates. The experimental procedures were as previously described 30 .
Gels were stained using a GAP staining method 46 and scanned with the ImageMaster Labscan V3.0 (GE Healthcare). Image analysis was conducted using the ImageMaster 2D Platinum software package (GE Healthcare). Only the spots that were present in all replicate gels and showed a Student's t test p-value < 0.05 and a relative change in quantity of at least 1.5-fold were considered DAPs for further analysis 30 .
Protein identification in 2-DE Gels via MALDI TOF MS. DAPs were manually excised from 2-DE gels, washed with MilliQ water, and then destained using a destaining solution containing 50 mM NH4HCO3 and 50% acetonitrile (ACN). After the gel pieces were air dried, in-gel digestion with bovine trypsin (Trypsin, Roche, Cat. 11418025001) was performed as previously described 47 .
The digested peptides were analyzed by peptide mass fingerprinting (PMF) using an AB SCIEX matrix-assisted laser desorption/ionization time-of-flight (MALDI TOF) 5800 system (AB SCIEX, Shanghai, China) equipped with a neodymium laser operating at a wavelength of 349 nm. Mass spectra were obtained as previously described 48 and searched against the Zea mays amino acid sequence database (including 87,603 sequences) using in-house MASCOT software for protein identification. The search parameters were set as described 30 . If peptides matched multiple proteins, the protein with the highest score was selected for bioinformatics analysis. For unknown proteins, a BLAST search was performed in NCBI (http://www.ncbi.nlm) to identify homologous proteins.
Bioinformatics analysis. Functional annotation of the identified DAPs was performed as follows. COG analysis of DAPs was conducted for functional classification through an online search (http://eggnogdb.embl.de/). GO classification was carried out online using WEGO software according to GO terms as described (http://wego.genomics.org.cn) 49 . In addition, KEGG pathways were analyzed using Blast2GO 4.0 software to predict the main reaction networks in which DAPs were involved. Finally, protein-protein interactions were analyzed online using the STRING database (version 10.5) (http://string-db.org), and the network was clustered using MCL with a specified inflation parameter.
qRT-PCR analysis. Total RNA was isolated from maize seeds with TRIzol reagent (CWBIO, Beijing, China), and cDNA was generated with a reverse transcriptase kit (TaKaRa, Tokyo, Japan) for quantitative real-time RT-PCR. Approximately 20 μL of mixed solution was prepared for qRT-PCR reaction using SYBR Green PCR Master Mix (TaKaRa, Tokyo, Japan), and the reactions were performed on an Mx3005P sequence detection system according to the manufacturer's instructions. The maize endogenous gene zSSIIb was used as an internal control to normalize the amount of template cDNA. qRT-PCR primer pairs were designed with Primer 5.0 software (Table S1). Data were analyzed with MxPro software (version 4.10).
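The relative quantification can be sketched in Python under one loud assumption: the text does not state which calculation was used, so the widely used 2^(-ΔΔCt) method is shown here as an illustration only. It normalizes the target gene to the internal control (zSSIIb in this study) and expresses the PT level relative to NT, which is thereby fixed at 1.0. The function name and the Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target_pt, ct_ref_pt, ct_target_nt, ct_ref_nt):
    """PT/NT fold change by the 2^(-ddCt) method (assumed, not from the source).

    ct_target_* : Ct of the gene of interest in PT or NT cDNA
    ct_ref_*    : Ct of the internal control gene in the same sample
    """
    delta_pt = ct_target_pt - ct_ref_pt   # normalize PT to the internal control
    delta_nt = ct_target_nt - ct_ref_nt   # normalize NT to the internal control
    return 2.0 ** -(delta_pt - delta_nt)  # NT evaluates to 1.0 by construction


# Hypothetical Ct values: the target amplifies one cycle later in PT than in NT.
print(relative_expression(24.0, 20.0, 23.0, 20.0))  # 0.5, i.e. down-regulated in PT
```

Feeding the NT values in as both samples returns 1.0, which corresponds to setting the NT transcript level to 1.0 in the Results.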