Comprehensive Cell-specific Protein Analysis in Early and Late Pollen Development from Diploid Microsporocytes to Pollen Tube Growth

Pollen development in angiosperms is one of the most important processes controlling plant reproduction and thus productivity. At the same time, pollen development is highly sensitive to environmental fluctuations, including temperature, drought, and nutrition. Therefore, pollen biology is a major focus in applied studies and breeding approaches for improving plant productivity in a globally changing climate. The most accessible developmental stages of pollen are the mature pollen and the pollen tubes, and these are thus most frequently analyzed. To reveal a complete quantitative proteome map, we additionally addressed the very early stages, analyzing eight stages of tobacco pollen development: diploid microsporocytes, meiosis, tetrads, microspores, polarized microspores, bipolar pollen, desiccated pollen, and pollen tubes. A protocol for the isolation of the early stages was established. Proteins were extracted and analyzed by means of a new gel LC-MS fractionation protocol. In total, 3817 protein groups were identified. Quantitative analysis was performed based on peptide count. Exceedingly stage-specific differential protein regulation was observed during the conversion from the sporophytic to the gametophytic proteome. A map of highly specialized functionality for the different stages could be revealed from the metabolic activity and pronounced differentiation of proteasomal and ribosomal protein complex composition up to protective mechanisms such as high levels of heat shock proteins in the very early stages of development.

As pollen represents the severely reduced male gametophyte of higher plants, it expresses a unique set of genes (29) required for the fast and energy-consuming polar outgrowth of the pollen tube during the fertilization process (30). Enzymes required for metabolism and energy generation are overrepresented, but there are also components of the exocytotic machinery, including signaling proteins (14) required for the deposition of pectin compounds at the tip of the growing pollen tube.
Although mature pollen and in vitro-grown pollen tubes have been the focus of research because of the ease of harvesting procedures and are widely used for cell biological studies (31)(32)(33)(34)(35)(36), this is not the case for earlier stages of pollen development.
In angiosperms, mature pollen develops from microsporocytes in the anthers of the flower in a series of distinct stages (37). After the microsporocytes have completed meiosis, they form tetrads that release microspores with one central haploid nucleus. These microspores undergo polarization and asymmetric mitosis. The bigger vegetative cell internalizes the smaller cell, which later divides again and forms the two sperm cells. Finally, the pollen desiccates. When the pollen falls on the stigma, it rehydrates, and the vegetative cell forms a pollen tube that delivers the two sperm cells through the transmitting tract to the ovule (38).
Even though pollen development studies using electron microscopy date back to the 1960s (39) and many mutants are described that are disrupted in this process (14), information on the proteome of developing pollen remains relatively sparse and is mostly restricted to whole anthers (40,41). Only very recently was a study on tomato pollen conducted that covered five developmental stages (42).
The transcriptome of Arabidopsis pollen has been analyzed from the microspore stage on (43), but the earlier stages of microsporocytes, meiosis, and tetrads were not studied, most likely because of a limitation of available material. However, this study was able to show dramatic changes in the transcriptome during the development from microspores to the mature pollen. Similar studies have been performed with Brassica napus (44) and rice pollen (45).
A comparative analysis of the proteome from these stages, as presented in our study, can have special relevance, because in pollen the proteome can greatly differ from the transcriptome not only quantitatively, but also qualitatively, as has been shown for Arabidopsis pollen (14). It seems that often the mRNA is degraded while the protein persists, or the mRNA is stored in desiccated pollen to be translated after rehydration (14).
Additionally, in our proteomic study, we extended the analysis to even earlier stages, including the stage of meiosis. We were able to compare, for the first time, the proteome of a total of eight stages: the diploid microsporocytes, cells undergoing meiosis, tetrads, microspores, polarized microspores (undergoing mitosis I), bipolar pollen, desiccated pollen, and finally pollen tubes. We found that the proteome underwent great changes during development, especially during the polarized microspore stage.

EXPERIMENTAL PROCEDURES
Plant Growth and Pollen Collection-Tobacco was grown under greenhouse conditions (12 h of light, 120 μmol m⁻² s⁻¹, 23°C during the day, 20°C at night, 60% humidity). Flowers of different sizes were collected, and the anthers of individual flowers were sampled in 200 μl of 10% mannitol. Anthers were gently squeezed open and vortexed, and the supernatant including the released pollen was transferred to a new tube. Pollen was spun down at 100 × g for 1 min and washed twice with 10% mannitol. A subfraction of the pollen of each individual flower was analyzed under a microscope to determine the developmental stage. Samples in which less than 90% of the pollen represented a single stage were discarded.
Young leaves and roots were ground in liquid nitrogen, and proteins were extracted accordingly.
Microscopy-Pollen samples were fixed in 10% mannitol and 4% formaldehyde overnight, collected via centrifugation, and resuspended in 1 μg/ml DAPI and 1% Triton X-100 5 min prior to microscopy.
Images were recorded with an upright point laser scanning confocal microscope (LSM780, Zeiss, Oberkochen, Germany) using a 405-nm diode laser for excitation and a band-pass filter ranging from 450 to 550 nm. Acquired images were processed using Fiji software.
Quantitative Proteome Analysis (GeLC-LTQ-Orbitrap MS)-For each sample, pollen from between 5 and 30 flowers (depending on the stage) was pooled, freeze-dried, cooled in liquid nitrogen, and ground for 3 min in a shaking mill using three 2-mm steel balls per tube. The pollen fragments were resuspended in 200 μl of protein extraction buffer (62.5 mM Tris-HCl pH 6.5, 5% SDS (w/v), 10% glycerol (v/v), 10 mM DTT, 1.2% (v/v) plant protease inhibitor mixture (Sigma P9599)) and incubated for 5 min at room temperature. After this time, the samples were mixed again by pipetting, incubated for 3 min at 90°C, and then centrifuged at 21,000 × g for 5 min at room temperature. Supernatants were carefully transferred to a new tube. After the addition of an equal volume of 1.4 M sucrose, proteins were extracted twice with Tris-EDTA buffer-equilibrated phenol. The combined phenolic phases were counter-extracted with 0.7 M sucrose and subsequently mixed with five volumes of 0.1 M ammonium acetate in methanol to precipitate the proteins. After 16 h of incubation at −20°C, samples were centrifuged for 5 min at 5000 × g at 5°C. The pellet was washed twice with 0.1 M ammonium acetate and once with acetone and then air-dried. Pellets were redissolved in 6 M urea, 5% SDS, and protein concentrations were estimated via bicinchoninic acid assay (47).
Proteins were analyzed via a new gel-LC-MS protocol (48). 40 μg of protein were loaded into a mini-protean cell and run for 1.5 cm. Gels were fixed and stained with methanol:acetic acid:water:Coomassie Brilliant Blue R-250 (40:10:50:0.001). Gels were destained in methanol:water (40:60), and then each lane was divided into two fractions. Gel pieces were destained, equilibrated, and digested with trypsin as previously described (49). Peptides were then desalted with the use of Bond-Elute C-18 stage tips (50) and concentrated in a SpeedVac. Prior to mass spectrometric measurement, protein digest pellets were dissolved in 4% (v/v) acetonitrile, 0.1% (v/v) formic acid. 10 μg of digested peptides were loaded per injection into a one-dimensional nano-flow LC-MS/MS system equipped with a pre-column (Eksigent, Redwood City, CA, USA). Peptides were eluted using a monolithic C18 column Chromolith RP-18r (Merck, Darmstadt, Germany) of 15-cm length and 0.1-mm internal diameter during an 80-min gradient from 5% to 50% (v/v) acetonitrile/0.1% (v/v) formic acid with a controlled flow rate of 500 nl/min. MS analysis was performed on an Orbitrap LTQ XL mass spectrometer (Thermo Fisher Scientific, Bremen, Germany). Specific tune settings for the MS were as follows: the spray voltage was set to 1.8 kV using a needle with a 30-μm inner diameter (PicoTip Emitter, New Objective, Woburn, MA), and the temperature of the heated transfer capillary was set at 180°C. Fourier transform MS was operated as follows: full scan mode, centroid, resolution of 30,000, covering the range of 300 to 1800 m/z, and cyclomethicone used as a lock mass. Each full MS scan was followed by 10 dependent MS/MS scans performed in the ion trap, in which the 10 most abundant peptide molecular ions were dynamically selected with a dynamic exclusion window set to 90 s and an exclusion list set to 500.
Dependent fragmentations were performed in collision-induced dissociation mode with a normalized collision energy of 35, an isolation width of 2.0, an activation Q of 0.250, and an activation time of 30 ms. Ions with an unassigned charge or a charge of +1 were excluded for fragmentation. The minimum signal threshold was set at 1000.
Raw data were searched with the SEQUEST algorithm present in Proteome Discoverer version 1.3 (Thermo, Germany) as described elsewhere (51). In brief, identification confidence was set at a 5% false discovery rate, and the variable modifications were set as acetylation of the N terminus, oxidation of methionine, and carbamidomethyl cysteine formation, with mass tolerances of 10 ppm for the parent ion and 0.8 Da for the fragment ion. Up to two missed cleavage sites were permitted. Three different databases were employed (tobacco 7.0, a cDNA library from the gene index project with 120,122 entries; a tobacco protein database from UniProt 09.2011 with 4826 entries; and a genomic sequence database from the Tobacco Genome Initiative 11.2008 with 349,877 entries, resulting in 2,099,262 entries after six-frame translation). Databases were translated with an in-house tool, taking into consideration only the longest open reading frame of all reading frames. In the case of the genomic sequences, the longest open reading frame of each of the six reading frames was considered.
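The entry count given above (349,877 genomic sequences yielding 2,099,262 entries) corresponds to six translated entries per sequence. The in-house translation tool is not described in detail, so the following is only a minimal sketch of a six-frame translation that keeps the longest open reading frame per frame. The function names are hypothetical, and defining an open reading frame as the longest stretch between stop codons is an assumption, not the authors' stated criterion.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def longest_orf_per_frame(dna):
    """Return six peptides: the longest stop-free stretch of each frame.

    Assumption: an ORF is delimited by stop codons only (no start codon
    is required); the original tool may have used a different rule.
    """
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]  # reverse complement
    entries = []
    for strand in (dna, reverse):
        for offset in range(3):
            protein = translate(strand[offset:])
            entries.append(max(protein.split("*"), key=len))
    return entries


frames = longest_orf_per_frame("ATGGCCTAAATGAAACCCGGG")
```

Each input sequence yields six database entries under this scheme, which matches the sixfold increase in the number of entries reported for the genomic database.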
When the database from the gene index project was used, additional variable modifications were allowed: phosphorylation of threonine, serine, and tyrosine; methylation and dimethylation of lysine and arginine; and acetylation and trimethylation of lysine.
Peptides were matched against these databases plus decoys, with a significant hit considered as one in which the peptide confidence was medium or high, and the Xcorr score threshold was established at 2.5 for +2 ions and 3.5 for charge states of +3 or greater. The high thresholds were chosen to minimize false identifications based on the incomplete databases used.
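The acceptance rule described above can be written as a small filter. This is a sketch only: the function name and the confidence labels are hypothetical, and treating the Xcorr thresholds as inclusive (greater than or equal) is an assumption, because the text does not say whether the boundary values themselves pass.

```python
def is_significant_hit(charge, xcorr, confidence):
    """Apply the charge-dependent Xcorr thresholds to one peptide match.

    charge      -- precursor charge state (2 for +2, 3 for +3, ...)
    xcorr       -- SEQUEST cross-correlation score
    confidence  -- "low", "medium" or "high" (labels are illustrative)
    """
    if confidence not in ("medium", "high"):
        return False
    if charge < 2:
        # Singly charged ions were excluded from fragmentation.
        return False
    threshold = 2.5 if charge == 2 else 3.5
    return xcorr >= threshold  # inclusive comparison is an assumption


matches = [
    (2, 2.7, "high"),    # passes: +2 ion above 2.5
    (3, 2.7, "high"),    # fails: +3 ion needs 3.5
    (3, 3.9, "medium"),  # passes
    (2, 4.0, "low"),     # fails: confidence too low
]
accepted = [m for m in matches if is_significant_hit(*m)]
```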
The identified proteins were quantitated via a label-free approach based on peptide count followed by a normalized spectral abundance factor (NSAF) normalization strategy (52), in which the total number of spectral counts for the matching peptides from protein k (PSM) was divided by the protein length (L) and then divided by the sum of PSM/L for all N proteins.
Multivariate Statistical and Bioinformatic Data Analysis-Multivariate statistical analyses such as principal components analysis (PCA) and k-means clustering were performed with the statistical toolbox COVAIN (53). The software and parameter settings can be accessed online. Missing values were estimated from the dataset, and data were log transformed before the PCA. For cluster analysis, the mean NSAF value of each developmental stage was calculated and normalized for each protein, setting the total amount throughout the stages to 1.
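The NSAF calculation and the per-protein normalization used for clustering can be illustrated with a short sketch. The function names and the toy numbers are hypothetical; only the arithmetic (PSM divided by L, divided by the sum of PSM/L over all proteins, and stage means rescaled to sum to 1) follows the description above.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for every protein in one sample.

    spectral_counts -- {protein: number of matching spectra (PSM)}
    lengths         -- {protein: protein length in residues (L)}
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def normalize_across_stages(stage_means):
    """Rescale the mean NSAF values of one protein so that they sum to 1."""
    total = sum(stage_means)
    if total == 0:
        return list(stage_means)  # protein not detected in any stage
    return [value / total for value in stage_means]


# Toy sample with two proteins (values are invented for illustration).
sample = nsaf({"P1": 10, "P2": 20}, {"P1": 100, "P2": 400})
profile = normalize_across_stages([1.0, 3.0])
```

In the toy sample, P1 has fewer spectra than P2 but the higher NSAF value, because it is four times shorter.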
All proteins in the three used databases were blasted for the closest Arabidopsis (TAIR10) homologue using an unpublished Python script in conjunction with stand-alone BLAST v2.2.26+ using the default matrix, and entries in the TAIR Arabidopsis MapMan mapping file (Ath_AGI_LOCUS_TAIR10_Aug2012) were replaced as previously described (54). This way, most tobacco protein accessions could be assigned to a functional bin and an Arabidopsis homologue.
Tobacco and Arabidopsis microarray results were binned according to the MapMan mapping files Ntob_AGILENT44K_mapping and Ath_AGI_LOCUS_TAIR10_Aug2012, respectively. Tobacco bin numbers were slightly adjusted to fit the tobacco protein bins.
Further blasting of the tobacco protein sequences versus the list of Arabidopsis proteins found in Arabidopsis pollen (14) and a list of pollen-affected Arabidopsis mutants (extended list from Ref. 14) was performed using the same Python script.

Isolation and Proteomic Analysis of Early and Late Pollen Developmental Stages-Pollen from a total of eight developmental stages was harvested for proteomic analysis (Figs. 1 and 2).
Immature pollen was obtained by gently opening the anther buds of individual flowers and vortexing in 10% mannitol. In this way, pollen in the supernatant could be easily separated from larger cell debris via simple pipetting. Smaller cell debris and soluble proteins could be removed with the supernatant after low-speed centrifugation.
Although pollen from the microsporocyte and meiosis stages was obtained as large aggregates, which were associated with cell debris (Fig. 1), it was possible to isolate individual cells (or tetrads) from later stages (Figs. 1 and 2).
As the stage of pollen development cannot be easily determined by the size of the flower or anthers, especially in early stages, the developmental stage of the pollen of each individual flower was determined via microscopy, and a sufficient amount of pollen was pooled for protein extraction.
Desiccated pollen was harvested after anthesis, and pollen tubes were grown in vitro for 6 h.
For comparison, proteins from young tobacco leaves and roots were analyzed.
Three biological replicates of each stage (or tissue) were analyzed and separated into two fractions via SDS-PAGE prior to tryptic digestion and LC-MS/MS analysis.
The spectra of all identified peptides (supplemental Table S1) from the different stages can be reviewed online in the proteomics database PROMEX (http://promex.pph.univie.ac.at/promex/Experiment; Nic taba002 for pollen and Nic taba003 for roots and leaves). Additionally, the mass spectrometry proteomics data have been deposited in the ProteomeXchange Consortium via the PRIDE partner repository (55) with the dataset identifier PXD000469.
In total, 3817 protein groups were identified from all pollen stages (Table I, supplemental Table S2), with stages A-D and F-H showing the most overlap (Fig. 3A). When the results were compared with data on extracts from tobacco roots and leaves, a total of 4262 protein groups were identified: 1217 from leaves, 1285 from roots, and 3888 from pollen (Fig. 3B, Table I, supplemental Table S3; the increased number of pollen protein groups is due to different groupings of the proteins). The high number of identified pollen proteins was in part caused by the large tobacco genome (4.5 billion bp), leading to the finding of many homologue isoforms, but it is also attributable to the great changes that took place in the proteome during development.
The protein groups represent a total of 12,728 putative protein accessions in pollen (supplemental Table S2) and 14,323 proteins in all the samples (supplemental Table S3). For easier reading, the protein groups are referred to as proteins hereinafter.
Protein abundances were quantified by peptide count and an NSAF normalization strategy (52). For further analysis, only proteins that were detected in all three biological replicates of at least one of the developmental stages (or tissues) were considered, leading to datasets of 1869 proteins when only pollen proteins were considered and 2135 proteins when leaves and roots were included. Proteins were classified by identifying the closest Arabidopsis homologue and assigning a function according to functional Arabidopsis mapping for MapMan (supplemental Tables S2, S3, and S7).
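The replicate filter described above (a protein is kept if it was detected in all three biological replicates of at least one stage or tissue) can be sketched as follows. The data layout, function name, and accessions are hypothetical, and "detected" is taken to mean a spectral count above zero, which is an assumption.

```python
def is_quantifiable(counts_by_stage, n_replicates=3):
    """Return True if one stage has the protein in every replicate.

    counts_by_stage -- {stage: [spectral count per biological replicate]}
    """
    return any(
        len(replicates) == n_replicates and all(c > 0 for c in replicates)
        for replicates in counts_by_stage.values()
    )


# Invented example: stages A and E with three replicates each.
proteins = {
    "prot_1": {"A": [4, 2, 5], "E": [0, 0, 0]},  # kept: complete in stage A
    "prot_2": {"A": [3, 0, 1], "E": [0, 2, 2]},  # dropped: never in all three
    "prot_3": {"A": [0, 0, 0], "E": [1, 1, 1]},  # kept: complete in stage E
}
kept = [name for name, counts in proteins.items() if is_quantifiable(counts)]
```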
Of the 2135 proteins used for quantification, 837 were not detected in any of the root or leaf samples (supplemental Table S3). It cannot be ruled out that these proteins are also present in minor amounts in these organs or in other nonanalyzed tissues. However, the proteins showing high expression levels in one of the pollen stages (Table II) can be considered as especially strong candidates for being specifically expressed in developing pollen, or at least for serving specific purposes in these cell types. One example is the highly abundant alcohol dehydrogenase, which serves a specific function in the primary metabolism of pollen tubes (56) and is not needed in roots and leaves, at least under normal conditions. Cell-wall-degrading enzymes are another example. A further protein that was not found in roots and leaves is the Rab-GDP dissociation inhibitor, which is crucial for G-protein signaling and, thus, for maintaining cell polarity during polar tip growth.
Multivariate Statistical Data Mining-A PCA of the pollen proteins alone revealed that, on the proteome level, tobacco pollen development could be separated into three major phases (Fig. 4A), with the first one including the stages from microsporocytes to microspores (A-D), the second one including only the polarized microspores (E), and the third one including the binuclear pollen stage to the pollen tubes (F-H).
This separation in the PCA and the stage specificity of the proteomes are clearly based on different cell functionalities. In the first four stages (A-D), the principal function of the pollen is its own transformation from diploid microsporocytes to microspores, whereas the obviously very different function of the rehydrated pollen is to produce and elongate a pollen tube. To facilitate a quick outgrowth, many proteins that support this function are apparently already synthesized prior to desiccation, which leads to the observed similarity of the last three stages (F-H). The polarized microspore stage (E) could be a transition stage. However, this stage also contains a unique set of proteins not present in any of the other stages.
The distinct composition of the proteome of this stage was also apparent in the individual principal components (supplemental Table S4). PC1 separated the samples according to their ongoing development, and PC2 separated stage E from all other stages (Fig. 4A). The proteins with the highest loadings in PC2 showed comparatively high abundance in stage E (Fig. 4B), but they were also present in stages A, F, and H. Among these proteins were several subunits of the 26S proteasome.
In comparison, the proteins with the most negative loadings in PC2 showed an inverse expression pattern (Fig. 4B). The proteins with the most negative PC2 loadings were ribosomal proteins, hinting at a severe rearrangement of the ribosomal complex during stage E: of a total of 155 detected ribosomal proteins, 71 were missing in stage E, and 44 of these were detected in all other stages. A specific set of 38 ribosomal proteins had higher abundance in stage E than the average over all samples.
Thus, we observed a pronounced reprogramming of the protein synthesis machinery that might prepare the ribosomal complex machinery for the high demands of protein synthesis during pollen tube growth.
As PC1 differentiated the samples according to development, the negative loadings represented proteins with high abundance in early stages that declined during development, whereas positive loadings represented proteins with high expression levels in the last stages (Fig. 4B).
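The relation between loadings and abundance patterns can be made concrete with a toy PCA. The study used the COVAIN toolbox; the sketch below is not that implementation. It computes only the first principal component by power iteration, replaces missing-value estimation with a small pseudocount (an assumption), and orients the axis so that later samples score higher. All names and numbers are hypothetical.

```python
import math


def first_principal_component(samples, pseudocount=1e-6, iterations=200):
    """Return (loadings, scores) of PC1; rows are samples, columns proteins."""
    n, p = len(samples), len(samples[0])
    # Log-transform; the pseudocount stands in for missing-value estimation.
    logged = [[math.log10(v + pseudocount) for v in row] for row in samples]
    # Center each protein (column) on its mean across samples.
    means = [sum(row[j] for row in logged) / n for j in range(p)]
    x = [[row[j] - means[j] for j in range(p)] for row in logged]
    # Start from the sample furthest from the mean, then iterate X^T (X v).
    v = list(max(x, key=lambda row: sum(c * c for c in row)))
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]
        v = [sum(scores[i] * x[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm == 0:
            break  # no variance in the data
        v = [c / norm for c in v]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    # Orient the axis so that the last sample scores higher than the first.
    if scores[-1] < scores[0]:
        v = [-c for c in v]
        scores = [-s for s in scores]
    return v, scores


# Rows: four consecutive stages. Columns: an "early" protein, a "late"
# protein, and a protein with constant abundance (invented values).
stages = [
    [100, 1, 10],
    [50, 10, 10],
    [10, 50, 10],
    [1, 100, 10],
]
loadings, scores = first_principal_component(stages)
```

In this toy dataset the protein that declines during development receives a negative loading, the protein that accumulates receives a positive loading, and the constant protein contributes essentially nothing to the component.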
Among the negative loadings, histones were especially well represented, probably because of the lower cell-volume-to-nucleus ratio in early stages relative to mature pollen. The highest positive loadings included proteins required for pollen tube growth such as enzymes of the primary metabolism, ethanolic fermentation, and cell wall synthesis.
A PCA additionally including roots and leaves revealed a clear separation of the different tissues (Fig. 4C, supplemental Table S4). Whereas PC1 discriminated especially between the different pollen stages and the sporophytic tissues, PC2 discriminated between roots and leaves, with the earlier developmental pollen stages being more closely related to leaves. This closer connection of leaves and the early male gametophytes was also apparent from the correlation coefficients (Pearson's R, Fig. 4C).
In order to further group the pollen proteins according to their presence in the different stages, the NSAF scores were normalized for each protein and the proteins were clustered using the k-means algorithm (supplemental Table S6, supplemental Fig. S1). Twelve of the 35 clusters showed proteins that were almost exclusively expressed in one of the stages (Table III).
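How k-means groups normalized abundance profiles into stage-specific clusters can be shown with a minimal sketch. This is a plain Lloyd's algorithm with a deterministic farthest-point initialization, chosen here only to make the example reproducible; it is not the COVAIN implementation, and the profiles are invented.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(profiles, k, max_iterations=100):
    """Cluster profiles (lists of equal length); return (labels, centroids)."""
    # Farthest-point initialization: start with the first profile, then
    # repeatedly add the profile furthest from all chosen centroids.
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        farthest = max(
            profiles,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Normalized profiles over three stages (each row sums to 1).
profiles = [
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.05, 0.90, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
]
labels, centroids = k_means(profiles, k=3)
```

Each resulting cluster here collects proteins that are almost exclusively present in one stage, which is the kind of cluster highlighted in the text.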
Again, stage E stood out in terms of the number of specifically expressed proteins (clusters 5, 6, and 21; Fig. 5A). Among them were three proteins similar to Skp1 (BP531238, TC168823, and 191216821), a core component of the E3 ubiquitin ligase that targets proteins for degradation by the 26S proteasome. The isoform expressed in stage E might interact with specific F-Box proteins, which could target a distinctive set of proteins for ubiquitination and breakdown, in order to adjust the proteome as required for the change in cellular function.
It is also possible that the Skp1 proteins are directly involved in mitosis. The 26S proteasome is a key factor in the degradation of cell cycle proteins (57). Also, Skp1-like 1 (ASK1) of Arabidopsis is essential for meiosis in pollen (58), where it is required for nuclear reorganization and homolog juxtaposition (59). It was also found to be involved in mitosis (60).
Several proteases identified in stage E could also be in part responsible for the major rearrangement of the pollen proteome during this transition stage.
Many potential cellulases, glucosidases, and mannosidases were expressed during stage E (supplemental Table S6). They have previously been proposed to support the loosening of the cell wall required for the cell expansion taking place between stages D and F (61). A subset of proteins grouped in cluster 23 (Fig. 5B) was expressed predominantly during meiosis (B) and mitosis (E), and these proteins might take part in the regulation of mitosis and meiosis, as they are specifically expressed in these stages (supplemental Table S6). One example is the annexins (TC137724, BP533244, Q56D09), which might act in the targeted secretion (62) required for cytokinesis. Another example is a set of potential subtilases (TC132351, TC133164, TC133288, 191501021) that could take part in signaling or specific protein cleavage and degradation.
Other proteins expressed during these two stages are predicted to play a role in secondary metabolism, which makes them unlikely to take direct part in the cell cycle. Three of them (191361943, CN949712, and O24625) show homologies to the anther expressed proteins less adhesive pollen 5 and 6, which show similarities to chalcone synthases and are essential for exine formation (63).
Before pollination, the pollen desiccates and has to drastically adjust its physiology to protect its membranes from breaking and its proteins from denaturation. In the cluster analysis, a set of proteins was grouped (clusters 13 and 24; Fig. 5C) that was almost exclusively expressed in the desiccated stage (G) and disappeared after rehydration and pollen tube growth (H). Among these proteins were late embryogenesis abundant proteins (TC132846, TC146808, TC165472, 191501982) that also play a role in the desiccation of seeds (64) and are proposed to protect pollen during dehydration (65). Also, a homologue of an Arabidopsis tonoplast monosaccharide transporter (TC129132) and potential signaling proteins that could play a role in adaptation to desiccation were grouped in this cluster.
Additionally, a set of proteins (cluster 22; Fig. 5D) was identified showing that some proteins might be degraded prior to desiccation and resynthesized after rehydration. Proteins in this cluster included enzymes of β-oxidation and of other primary metabolic pathways (supplemental Table S4). We can only speculate about the reason for this temporary degradation. The proteins might have a negative effect on the adaptation to desiccation or be unstable under this condition.
Functional Remodeling of the Proteome During Pollen Development-During development, the pollen cells have to adjust their metabolism to suit their functions. In order to get a better overview of the functionality, proteins were matched against their closest Arabidopsis homologues and grouped according to their predicted functions (supplemental Table S7). The NSAF scores within each functional group were summed (Fig. 6).
Some functional groups were predominantly present in specific stages, such as the already mentioned late embryogenesis abundant proteins in desiccated pollen (stage G); the gluco-, galacto-, and mannosidases and factors of protein degradation in polarized microspores (stage E); and the enzymes of secondary metabolism during meiosis and mitosis (stages B and E).
Starch synthesis seems to occur in microsporocytes and binuclear pollen prior to desiccation (stages A and F), most likely to store energy for cell division and pollen tube growth. Many enzymes required for energy-consuming pollen tube growth are synthesized starting from the polarized microspore or binuclear pollen stage (stages E and F). These include enzymes required for ethanolic fermentation, which is performed by pollen tubes due to anoxia caused by rapid oxygen consumption during their growth (56). Furthermore, enzymes of sucrose, lipid, and amino acid degradation and proteins involved in cell wall metabolism and vesicle trafficking follow a similar expression pattern. These proteins are required to supply pollen tube growth with sufficient energy (30) and to provide the machinery to deposit large amounts of cell wall and membrane at the tip of the growing pollen tube (67).
Interestingly, the abundance of proteins involved in anabolic pathways like gluconeogenesis and sucrose synthesis is also increased in the later stages, which is somewhat surprising, as sucrose can be taken up by the pollen tube from the surrounding tissue (68) and was also present in large amounts in the medium used for cultivation in this study.
Proteins associated with cell division showed increased abundance during the polarized microspore stage, but their levels were also increased in desiccated pollen, probably in preparation for the mitosis of the generative cell, which takes place after pollen tube germination. One subgroup of proteins (C5MQG8, FG636560, Q1G0Z1, TC141620) in this functional group, the cell cycle controlling CDC48 (69), has numerous functions (70), including spindle disassembly at the end of mitosis (71). The heterozygous Arabidopsis mutant cdc48a (72) displayed incomplete pollen germination of the mutated pollen. Our data suggest an additional role of CDC48 in pollen mitosis, which was not affected in the described mutant, the reason being most likely the presence of two other isoforms of CDC48a in the Arabidopsis genome (73).
Hot and cold temperature stresses can be detrimental to all phases of pollen development and have major effects on sexual reproduction. Heat shock proteins have been described as strongly abundant in pollen during the later stages of development in comparison to other tissues (74). The analysis of earlier stages including microsporocytes, meiotic cells, and tetrads in our study revealed an even greater abundance of proteins associated with heat stress.
A total of 67 heat-stress-associated proteins were identified in our study, including isoforms of the chaperones HSP 70 and HSP 90 and luminal binding proteins (supplemental Table S7). Taken together, these proteins constituted up to 10% of the total protein abundance according to our NSAF calculation (Fig. 6), highlighting the importance of these groups of proteins for pollen development, especially in early stages.
The functional comparison of the pollen stages with roots and leaves displayed a number of functional groups, including ethanolic fermentation, polyamine metabolism, and late embryogenesis abundant and storage proteins, which were almost absent from the sporophytic tissues and which highlight the highly specialized functionality of the developing pollen (Fig. 6). The comparison also displayed the previously described high rate of anabolic metabolism, especially in the late pollen stages. Also, the synthesis of fatty acids seemed to be much higher in pollen tubes than in leaves and roots, as the abundance of the related proteins was much higher. This shows that pollen tubes do not rely solely on previously synthesized fatty acids stored in oil bodies but also require de novo synthesis to cope with the rapidly expanding membranes.
The rate of protein synthesis, in contrast, did not seem to be strongly enhanced in developing pollen or, especially, pollen tubes when taking into account the abundance of proteins associated with protein synthesis (ribosomal and nonribosomal). This observation once more supports the idea that growing pollen tubes rely strongly on presynthesized proteins.
Analysis with Respect to Previous Transcriptomics, Proteomics, and Genetic Studies-In order to find out to what extent protein and transcript levels differed, the data in this study were compared with expression data from a previous study (75). The microarrays of the latter study were based on transcripts obtained from mature tobacco pollen grains and pollen tubes.
Because different accessions were used in this and the previous study and in order to simplify the comparison, the transcripts and proteins were grouped according to their MapMan bins, and the individual values were added (supplemental Table S8).
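The bin-wise aggregation described above can be sketched as follows: each accession is mapped to a functional bin, and the values of all accessions in the same bin are added. The accessions, bin names, and numbers are invented for illustration, and collecting unmapped accessions under an "unassigned" label is an assumption.

```python
from collections import defaultdict


def sum_by_bin(values, bin_of):
    """Add up abundance values per functional bin.

    values -- {accession: abundance (NSAF score or transcript signal)}
    bin_of -- {accession: functional bin name}
    """
    totals = defaultdict(float)
    for accession, value in values.items():
        totals[bin_of.get(accession, "unassigned")] += value
    return dict(totals)


# Different accessions in the two datasets map to the same bins,
# which makes the summed values comparable bin by bin.
protein_bins = sum_by_bin(
    {"prot_a": 0.02, "prot_b": 0.03, "prot_c": 0.01},
    {"prot_a": "glycolysis", "prot_b": "glycolysis", "prot_c": "signalling"},
)
transcript_bins = sum_by_bin(
    {"probe_1": 120.0, "probe_2": 900.0, "probe_3": 40.0},
    {"probe_1": "glycolysis", "probe_2": "signalling"},
)
```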
From the comparison, it is apparent that the transcript levels of proteins involved in signaling were much higher than the corresponding protein levels, maybe because of high turnover rates (Table IV). In contrast, protein levels of enzymes of the primary metabolism were much higher relative to their transcript levels (Table IV). This could be because the translation rate of these transcripts is much higher or the turnover rate of the proteins is lower. Another possibility is that the proteins are synthesized in earlier stages of pollen development and persist while the mRNA is degraded. Unfortunately, no expression data on developing tobacco pollen are available to date.
Transcript data on Arabidopsis stages ranging from unicellular microspores to pollen tubes have been previously generated (43,76), and we compared our dataset to these transcript levels. Again, the transcripts and proteins were grouped according to their MapMan bins (supplemental Table S9). Once more it was apparent that the abundance of proteins of the primary metabolism was greater than the corresponding transcript levels. Additionally, they followed a different pattern. The enzymes phosphoglycerate kinase and pyruvate decarboxylase, for example (Fig. 7), showed their greatest abundance in desiccated pollen, whereas the corresponding Arabidopsis transcripts peaked much earlier and were completely abolished in mature pollen and pollen tubes, respectively. It could be speculated that this difference is simply due to the different analyzed species. The detection of both proteins in substantial amounts in a proteomic survey of mature Arabidopsis pollen (14), however, makes a strong case that proteins are synthesized in earlier stages and persist through desiccation, rehydration, and pollen tube growth, by which time their transcripts are already degraded. This seems to be true for many enzymes of glycolysis, as the total protein and transcript abundance of this pathway follows a similar pattern. Another group that shows high protein levels in desiccated tobacco pollen and pollen tubes contains proteins associated with cell wall metabolism. Here, however, the Arabidopsis transcripts show a similar dynamic, maybe because the proteins in this group show a higher turnover rate and have to be resynthesized.
To study the different dynamics of transcripts and proteins in greater detail, a future goal should be to generate either protein data from earlier Arabidopsis stages or transcript data throughout tobacco pollen development.
It must be concluded that transcript levels can be very misleading when judging the importance of a specific gene for pollen tube growth or, even worse, pollen development, especially when only the transcript levels in mature pollen are considered, which is often the case (as, for example, in the commonly used open access version of GENEVESTIGATOR).
Notes: Transcript data were previously published by Hafidh et al. (75). Transcripts and proteins were binned according to MapMan. The complete dataset can be surveyed in supplemental Table S8.

To find out how tobacco pollen might differ from Arabidopsis pollen, we compared the proteins in our study to the proteins found in the already mentioned proteomic survey of Arabidopsis (14), blasting all the sequences of the identified proteins from tobacco against the protein sequences from Arabidopsis pollen. All matches with an E-value equal to or less than 10⁻¹⁰ were considered as homologues (supplemental Table S10).
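The homologue assignment described above can be sketched as a simple filter over BLAST results. This is a minimal illustration and not the procedure actually used in the study: it assumes the hits are available in BLAST tabular output (12 tab-separated columns with the E-value in column 11), and the function name, the sequence identifiers, and the example rows are hypothetical. Only the cutoff of 10⁻¹⁰ is taken from the text.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # cutoff stated in the text


def homologue_hits(blast_tabular, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: best_subject_id} for hits with E-value <= cutoff.

    Assumption: BLAST tabular output with 12 tab-separated columns,
    query id in column 1, subject id in column 2, E-value in column 11.
    """
    best = {}
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue  # not considered a homologue
        # keep only the hit with the lowest E-value per query
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical rows for illustration only (not data from the study).
example = io.StringIO(
    "tob_0001\tath_A\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180\n"
    "tob_0001\tath_B\t60.0\t180\t72\t2\t1\t180\t5\t184\t1e-12\t70\n"
    "tob_0002\tath_C\t35.0\t90\t58\t3\t10\t99\t4\t93\t1e-5\t40\n"
    "tob_0003\tath_D\t55.0\t150\t67\t1\t1\t150\t1\t150\t1e-10\t65\n"
)
hits = homologue_hits(example)
print(hits)
```

In this sketch a tobacco protein counts as having a homologue if at least one hit passes the cutoff; proteins absent from the returned mapping would fall into the "no homologue" group.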
Of the 3817 proteins in this study, only 1055 (about 28%) did not have a homologue in Arabidopsis pollen. Of the 1869 proteins considered for quantification, an even lower proportion (320 proteins, about 17%) had no homologues in Arabidopsis pollen. Even though this indicates high similarity of the proteomes, there are some distinct differences.
The ortholog of Arabidopsis alcohol dehydrogenase (At1g77120), which catalyzes the conversion of acetaldehyde to ethanol, was one of the enzymes with the greatest abundance in tobacco pollen tubes, but the Arabidopsis protein itself was not found in Arabidopsis pollen. On the transcript level, this gene also showed only very weak expression in microspores and bicellular pollen and no expression in later stages. As the production of ethanol is one of the hallmarks of pollen tube metabolism in many species such as lily (77), tobacco, and petunia (56), this indicates a strong difference in primary metabolism between Arabidopsis and tobacco, probably based on the much shorter growing distance of Arabidopsis pollen tubes, which decreases the problem of anoxia. It is also possible that another enzyme takes this role in Arabidopsis pollen tubes, as several other proteins in Arabidopsis pollen showed a high similarity to the tobacco alcohol dehydrogenase. However, the protein with the highest similarity, ADH2, is already described as a glutathione reductase. It remains to be investigated whether Arabidopsis pollen tubes produce ethanol during growth.
Another possibility is that the acetaldehyde produced by pyruvate decarboxylase (which was found in Arabidopsis pollen) is directly converted to acetate and, later, acetyl-CoA, a pathway termed the pyruvate dehydrogenase bypass. An enzyme that is a strong candidate to perform the oxidation of acetaldehyde, aldehyde dehydrogenase (78), was detected in substantial amounts in Arabidopsis.
Another example of how primary metabolism might differ is the conversion of fructose-6-phosphate to fructose-1,6-bisphosphate. Whereas in Arabidopsis pollen only the phosphofructokinase was found, tobacco pollen and pollen tubes additionally contained several pyrophosphate-fructose-6-P phosphotransferase isoforms with a total abundance that was more than 8-fold greater than that of the phosphofructokinase (supplemental Table S9). In this way, pyrophosphate can be used instead of ATP for the second activation step of glycolysis, increasing the ATP yield per hexose by one. Although this increase might not be so significant under aerobic conditions, it does make a big difference when ATP is generated via fermentation, which is the case in tobacco but might not be in Arabidopsis.
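The size of this effect can be made explicit with a short back-of-the-envelope calculation. The sketch below uses textbook stoichiometry rather than measurements from this study: glycolysis forms four ATP per hexose by substrate-level phosphorylation and consumes two ATP in the activation steps when the ATP-dependent phosphofructokinase is used. The aerobic yield of about 30 ATP per hexose is an assumed approximate value used only for comparison, and all names in the code are illustrative.

```python
# Textbook stoichiometry per hexose (assumption, not data from this study).
ATP_FORMED_GLYCOLYSIS = 4   # substrate-level phosphorylation
ATP_INVESTED_PFK = 2        # hexose phosphorylation + ATP-dependent phosphofructokinase
ATP_INVESTED_PFP = 1        # hexose phosphorylation only; pyrophosphate replaces the second ATP
AEROBIC_NET_ASSUMED = 30    # assumed approximate net yield with respiration


def net_atp(atp_formed, atp_invested):
    """Net ATP gained per hexose."""
    return atp_formed - atp_invested


def relative_gain(net_yield_before):
    """Fractional increase obtained by saving one ATP per hexose."""
    return 1 / net_yield_before


fermentation_pfk = net_atp(ATP_FORMED_GLYCOLYSIS, ATP_INVESTED_PFK)  # 2
fermentation_pfp = net_atp(ATP_FORMED_GLYCOLYSIS, ATP_INVESTED_PFP)  # 3

print(f"fermentation: {fermentation_pfk} -> {fermentation_pfp} ATP "
      f"(+{relative_gain(fermentation_pfk):.0%})")
print(f"aerobic (assumed): +{relative_gain(AEROBIC_NET_ASSUMED):.1%}")
```

Under these assumptions, saving one ATP raises the fermentative yield from two to three ATP per hexose, a gain of one half, whereas the same saving adds only a few percent to the aerobic yield.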
In Arabidopsis, many mutants have been described that are affected in pollen development and pollen tube growth. After a survey of the literature, we updated a previously published list (14) of affected genes from 127 to 215 (supplemental Fig. S11). Of these, we found 135 (about 63%) to have homologues (E-value equal to or less than 10⁻¹⁰) in tobacco pollen. This supports the theory that most proteins with important functions in pollen development could be detected in tobacco pollen. However, of the 3817 proteins identified, only 320 homologues have been described so far in Arabidopsis mutant studies according to our literature survey, leaving tremendous room for future pollen research.
Post-translational Modifications: The identification of post-translational modifications was not the focus of this study, and the available material from early developmental stages was too limited for the enrichment of modified peptides. However, 655 potentially modified peptides were detected, including methylation of lysine and arginine; acetylation of lysine; and phosphorylation of serine, threonine, and tyrosine (supplemental Table S12). The protein with the greatest number of modifications was a homologue of elongation factor 1-α (Fig. 8). This protein is a member of the family of small G-proteins and serves a multitude of functions (79), including elongation of protein translation (80), regulation of the cytoskeleton (81,82), and signaling (83-85). It has been previously shown to be methylated (86,87) and acetylated (88). Multiple potential methylation sites were found; however, they must be treated with caution, as the mass of a methylated peptide can be identical to that of a nonmodified peptide of a homologous protein, leading to ambiguous identifications (supplemental Table S12). This is an even bigger problem when the organism used in the study is not entirely sequenced, as potentially ambiguous identifications might be missed because of the incomplete database. Also, methylated lysine and arginine residues lie at the C terminus of peptides generated by tryptic digest, making confirmation of the modification based on the MS2 spectrum harder, as the y-ions are too small to be detected. Therefore, methylation sites can be considered strong candidates (Fig. 8) only if the modified amino acid lies in the middle of a miscleaved peptide and the MS2 spectrum shows the correct b- and y-ions for the modified amino acid.
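One way in which such mass ambiguities can arise is that a single methylation adds the mass of one CH2 group, which is exactly the mass difference between several pairs of amino acids. The following sketch lists those pairs among a small set of residues. It is an illustration only and not part of the analysis pipeline of this study: the residue masses are standard monoisotopic values, while the tolerance and the function name are assumptions chosen for the example.

```python
# Standard monoisotopic residue masses in Da for a subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768,
    "V": 99.06841, "L": 113.08406, "D": 115.02694, "E": 129.04259,
    "N": 114.04293, "Q": 128.05858, "K": 128.09496, "R": 156.10111,
}
METHYL_SHIFT = 14.01565  # mass of CH2, added by one methylation
TOLERANCE = 0.001        # Da; assumed matching tolerance for this sketch


def methyl_like_substitutions(masses=RESIDUE_MASS, shift=METHYL_SHIFT,
                              tol=TOLERANCE):
    """Return (lighter, heavier) residue pairs differing by one methyl shift."""
    pairs = []
    for light, m_light in masses.items():
        for heavy, m_heavy in masses.items():
            if abs((m_heavy - m_light) - shift) <= tol:
                pairs.append((light, heavy))
    return sorted(pairs)


pairs = methyl_like_substitutions()
for light, heavy in pairs:
    print(f"{light} -> {heavy} mimics a single methylation")
```

A homologous protein that carries, for example, alanine where the identified sequence has glycine would therefore yield an unmodified peptide with the same precursor mass as the methylated one; because leucine and isoleucine are isobaric, the same holds for a valine to isoleucine exchange. Only fragment ions that localize the mass shift to the lysine or arginine residue can resolve this.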
The identification of additional modification sites and the study of their dynamics should be goals for the future. At least, the material available from the microspore stage on should be sufficient for metal oxide affinity chromatography enrichment of phosphopeptides, as has been performed recently for desiccated and activated pollen (28).

CONCLUSION

The comparative proteomic analysis of pollen development was, for the first time, extended to eight stages ranging from diploid microsporocytes to pollen tubes. In order to compare the data to results for sporophytic tissues, leaves and roots were also investigated, leading to the identification of a total of 4262 proteins.
Based on these data, pollen development can be divided into three phases (Fig. 9). The early phase, which is still more closely related to leaves, ranges from the microsporocytes through meiosis and the formation of tetrads and ends with the release of the microspores. The proteome of this phase is relatively static, with a high abundance of heat shock proteins to protect the cells in the process of meiosis and cell division. It appears that the "sporophytic proteome" synthesized in the microsporocytes is sustained throughout development up to the early microspores.
The late phase ranges from binuclear pollen via desiccated pollen to pollen tubes and presents a "gametophytic proteome." Many proteins required for pollen tube growth, such as enzymes of primary metabolism and of cell wall synthesis, are already produced prior to desiccation, so as to later allow rapid outgrowth of the pollen tube without bulk protein synthesis, as has also been previously observed (18,24).
From the comparison of our protein data to Arabidopsis transcript data, it could also be concluded that many proteins, especially from the primary metabolism, do not seem to be further synthesized during pollen tube growth.
In between the early and the late phase, which are clearly very different in their cellular functionality, the pollen undergoes an intermediate phase. During this polarized microspore stage, the "sporophytic proteome" is partially degraded, accompanied by ribosomal rearrangement and a strong increase in the abundance of proteins associated with protein degradation. However, this phase not only appears to be a turning point between the sporophytic and the gametophytic phase, but also seems to represent a phase on its own, because many proteins identified in this work were exclusively found during this stage.
Reasons for this could be the strong expansion in cell size, which is unique to this cell stage in pollen development; the polarization and asymmetric mitosis that take place; and the degradation of the sporophytic proteome, which would require a distinct protein degradation machinery.
The great changes in the proteome observed during the three phases underline the complexity of the protein networks required for male gametogenesis, which are only beginning to be unraveled. As this work represents a first thorough proteomic map of pollen development, it could lay the foundation for a better understanding of these networks, especially in the early stages.