Dissecting the Subcellular Compartmentation of Proteins and Metabolites in Arabidopsis Leaves Using Non-aqueous Fractionation

Non-aqueous fractionation is a technique for the enrichment of different subcellular compartments derived from lyophilized material. It was developed to study the subcellular distribution of metabolites. Here we analyzed the distribution of about 1,000 proteins and 70 metabolites, including 22 phosphorylated intermediates in wild-type Arabidopsis rosette leaves, using non-aqueous gradients divided into 12 fractions. Good separation of plastidial, cytosolic, and vacuolar metabolites and proteins was achieved, but cytosolic, mitochondrial, and peroxisomal proteins clustered together. There was considerable heterogeneity in the fractional distribution of transcription factors, ribosomal proteins, and subunits of the vacuolar-ATPase, indicating diverse compartmental location. Within the plastid, sub-organellar separation of thylakoids and stromal proteins was observed. Metabolites from the Calvin–Benson cycle, photorespiration, starch and sucrose synthesis, glycolysis, and the tricarboxylic acid cycle grouped with their associated proteins of the respective compartment. Non-aqueous fractionation thus proved to be a powerful method for the study of the organellar, and in some cases sub-organellar, distribution of proteins and their association with metabolites. It remains the technique of choice for the assignment of subcellular location to metabolites in intact plant tissues, and thus the technique of choice for doing combined metabolite–protein analysis on a single tissue sample.

Metabolic and regulatory processes within eukaryotic cells are compartmented between cellular organelles, with transport occurring between these subcellular compartments (see Ref. 1 for a review). In addition, sequential enzymes of some metabolic pathways function as distinct protein complexes, known as metabolons, and are often associated with structural elements within the cell (see Refs. 2 and 3 for reviews). The functional association of enzymes in such complexes allows the efficient channeling of intermediates between active sites of enzymes without mixing in the bulk phase within the organelle or compartment. Evidence of metabolons has been provided for the Calvin-Benson cycle (see Ref. 2 for a review), the tricarboxylic acid (TCA) cycle (see Ref. 4 for a review), and, more recently, the glycolytic pathway (5). These protein complexes are dynamic, and their association and dissociation offer an additional level of metabolic control within subcellular compartments in response to cellular needs.
In order to locate or identify proteins within subcellular compartments, organelles need to be isolated in high purity. This can be achieved through density gradient fractionation in aqueous media using sucrose or Percoll as a matrix. Homogenized material or crude organelle preparations are applied to density gradients, and fractions are collected after centrifugation. The distribution profiles of organelles in these partially enriched fractions are characterized, for example, by assays of organelle-specific enzymes (6,7). Proteins displaying a distribution profile similar to those of organelle marker proteins can then be assigned to this subcellular compartment. This technique has for a long time been used in plant biology to study enzyme activities associated with organelle-enriched fractions (8). In recent decades, the number of proteins identified within cellular compartments has significantly increased, in particular because of improvements in mass-spectrometry-based proteomic methods. Among others, the human centrosome proteome (9), the mouse organelle proteome (10), and membrane proteins from Arabidopsis (11–13) have been characterized.
Sucrose or other aqueous media for density gradient fractionation cannot, however, be used to determine the subcellular distribution of metabolites. In contrast to proteins, particular care needs to be taken to conserve in vivo metabolite levels during the procedure that is used for cell fractionation (14). It has been demonstrated that some metabolites, particularly phosphorylated intermediates, have extremely short turnover times, sometimes less than 1 s (15). Large changes in metabolite levels will occur during slow fractionation methods such as sucrose gradient centrifugation. This is because of the mixing of metabolites and enzymes as a result of cell and organelle breakage or the leakage of metabolites from organelles, and because metabolism may continue in intact organelles, leading to major changes in the metabolite levels within these organelles. Harvesting of the plant material has to be performed via rapid quenching of metabolism, by using either a freeze-clamping device in the case of species with large leaves (16) or rapid pouring of liquid nitrogen for species with small leaves, such as Arabidopsis (15). To prevent changes in metabolite levels due to enzymatic activity or potential leakage of metabolites out of organelles, the subsequent fractionation needs to be carried out in conditions that prevent metabolite leakage and completely suppress enzymatic activity (13). For whole tissues, the only available method for this is non-aqueous density gradient fractionation (14, 17–19). In this method, to prevent the modification of metabolite levels, all steps are performed at extremely low temperatures or under anhydrous conditions. After freezing in liquid N2, the biological material is lyophilized at low temperature, taken up in water-free heptane, ultrasonicated, and then applied to a gradient made of anhydrous organic solvents prior to centrifugation and the subsequent collection of fractions.
Briefly, during lyophilization, proteins, metabolites, and other cellular components that are in spatial proximity precipitate together. Ultrasonication then breaks the lyophilized material into small particles, which are enriched for different localities in the cell. These are then separated on the basis of density in a centrifugation step; materials rich in lipids, proteins, or salts tend to be of low, medium, or high density, respectively (14). This method provides good enrichment of material derived from the chloroplasts, cytosol, or vacuoles, as shown by the distribution of marker enzyme activities and metabolites. First applied in plants to spinach leaf material (18,19), this procedure has been adapted for leaves of other species, including bean, maize, and barley (20–23). It has also been applied to non-photosynthetic organs such as potato tubers (24). Recently the non-aqueous fractionation method was successfully optimized for the study of metabolite compartmentation in Arabidopsis leaves (25, 26).
Originally, non-aqueous fractionation analysis was used to study a relatively small number of metabolites from photosynthesis (18). Subsequent studies investigated amino and organic acids (21,24). More recently, the subcellular distributions of 1,117 polar and 2,804 lipophilic mass spectrometric features associated with known and unknown compounds from Arabidopsis leaves were characterized using GC-MS- and LC-MS-based metabolite profiling (25). However, this did not increase the coverage of phosphorylated intermediates of central carbon metabolism. Greater coverage can be achieved using a combination of different techniques, including enzymatic assays, reverse phase/ion pair chromatography coupled to tandem MS (15), anion exchange chromatography coupled to tandem MS (27), and GC-MS (28). Together, these allow the quantification of 66 metabolites, covering almost all intermediates of the Calvin-Benson cycle (the exceptions being 1,3-bisphosphoglyceric acid and erythrose-4-phosphate), sucrose and starch synthesis, and glycolysis, as well as a large number of metabolites involved in the TCA cycle or photorespiration.
Because of the different methods involved, subcellular fractionation techniques have been used to determine either protein distributions or metabolite distributions in subcellular compartments. To date, no studies have performed both types of analysis in the same gradient. In the current study, non-aqueous fractionation of Arabidopsis leaf material was performed and used for an extensive proteomic analysis to determine the distributions of about 1,000 proteins. These data were used for two main purposes. The first goal was to provide more information about the separation of organelles and membrane classes in the gradients. Up to now, only small sets of enzyme and metabolite markers have been used for the identification of three compartments: plastids, vacuoles, and cytosol (25). The use of proteomics allows the robustness of the separation to be assessed and can provide information about the distribution of material from other subcellular compartments within the gradients. In addition, the heterogeneities within subcellular compartments can be estimated via the proteomic approach. The second goal was to search for spatial associations between proteins and metabolites. Metabolite analysis was performed and combined with the proteomic analysis to provide a comprehensive analysis of the subcellular distributions of metabolite intermediates and proteins involved in primary carbon pathways of Arabidopsis. These metabolite distributions were integrated with protein distributions, focusing on the major metabolic pathways. The resulting clusters can be used to flag proteins that might previously have been misannotated or unassigned, and to potentially reassign them to the correct subcellular compartment.

Plant Growth and Harvesting Procedure-Arabidopsis thaliana [L.] Heynh., accession Col-0, was grown for 5 weeks under 8 h/16 h day/night cycles at an average irradiance of 120 µmol m⁻² s⁻¹, temperatures of 22°C/20°C, and relative humidities of 60%/75% or grown for 3 weeks under 8 h/16 h day/night cycles at an average irradiance of 150 µmol m⁻² s⁻¹, temperatures of 20°C/18°C, and relative humidities of 60%/75% and transferred to 12 h/12 h day/night cycles at an average irradiance of 150 µmol m⁻² s⁻¹, temperature of 22°C, and relative humidity of 80% for 2 weeks. Plants were grown in individual 6-cm-diameter pots with water/gas-permeable plastic foil (Aquafol, Meyer, Emsdetten, Germany) between the rosette and the soil. Rosettes were harvested at 4 to 5 h into the light period. A large volume of liquid nitrogen was poured over the rosette leaves, avoiding shading of the plants at all times. All frozen plant material above the plastic foil was collected, ground to a fine powder at −70°C using a cryogenic grinding robot (29), and stored at −80°C until further use.
Non-aqueous Fractionation-Frozen Arabidopsis material (4 g fresh weight) was freeze-dried (ice condenser temperature, −80°C; vacuum strength, 0.03 mbar) for 96 h. Non-aqueous fractionation was performed as described in Refs. 25 and 26, except that the gradient was divided into 12 fractions, which is more than in previous studies. After centrifugation at 3,200 × g (4°C) for 10 min, the supernatant was discarded to remove the solvent from the fractions. The resulting tissue was resuspended in 7 ml of heptane and subsequently divided into seven aliquots of equal volume. After the pellets had been dried in a vacuum concentrator without heating, aliquots were stored at −80°C until further use (workflow presented in Fig. 1). Gradients were prepared in biological triplicate (gradients 1, 2, and 3) using for each of them material corresponding to about 50 pooled Arabidopsis plants grown for 5 weeks under 8 h/16 h day/night cycles.
In addition, gradients from a different biological material (Arabidopsis plants grown for 3 weeks under 8 h/16 h day/night cycles and 2 weeks under 12 h/12 h day/night cycles) were prepared in technical triplicate. These gradients were sampled into 10 fractions instead of 12 and named 10-fraction gradients 1, 2, and 3. As data for 10-fraction gradient 1 are presented in this manuscript, it is referred to as the 10-fraction gradient.
Enzyme and Metabolite Assays-Prior to analysis, dried pellets were thoroughly homogenized with the appropriate extraction buffer by the addition of two steel ball bearings (2-mm diameter) and shaking at 25 Hz for 1 min in a ball mill (Retsch MM300, Retsch GmbH, Haan, Germany). This procedure was used for all of the extractions described in "Experimental Procedures." The activities of adenosine diphosphate glucose pyrophosphorylase, phosphoenolpyruvate carboxylase, and acid invertase were measured as described in Ref. 30. Uridine diphosphate glucose pyrophosphorylase and Rubisco activities were measured according to Refs. 31 and 32, respectively. Starch, glucose, fructose, sucrose, and nitrate were quantified via coupled enzymatic assays, and chlorophyll content was determined spectrophotometrically, as described in Ref. 33.
Extraction and Metabolite Analysis via GC-MS-Pellets were homogenized in 1,460 µl of methanol containing 60 µl of 1.3 mM ribitol. Afterward, extraction, derivatization, standard addition, and sample injection were performed exactly as described in Ref. 28. The GC-MS system comprised a CTC CombiPAL autosampler, an Agilent 6890N gas chromatograph (Agilent, Böblingen, Germany), and a LECO Pegasus III TOF-MS running with electron ionization in positive mode. Metabolites were identified in comparison to database entries of authentic standards (34). Chromatograms and mass spectra were evaluated using Chroma TOF 1.0 (Leco, Mönchengladbach, Germany) and TagFinder 4.0 software (35).
Extraction and Metabolite Analysis via LC-MS/MS-Samples were extracted as described in Ref. 15, with some modifications. The chloroform:methanol mixture (3:7, v/v) was added to the dried pellets (1,000 µl for fractions 0, 2, 3, and 12; 500 µl for the others). Extraction was carried out using 250 µl from homogenized fractions 0, 2, 3, and 12 and 500 µl from the remaining homogenized fractions. The chloroform phase was washed four times with 400 µl of water. The combined aqueous phases were freeze-dried and dissolved in 300 µl of water. We removed viscous, high-molecular-mass components from the samples by applying individual extracts to separate wells on a Multiscreen Ultracel-10 filter (Millipore, Darmstadt, Germany) and centrifuging at 2,300 × g for 2 to 3 h at 12°C. Metabolites were quantified by means of ion pair chromatography MS/MS as described in Ref. 15 and anion exchange chromatography MS/MS as described in Ref. 27, except that the liquid chromatograph was coupled to a QTrap 5500 triple quadrupole mass spectrometer (AB Sciex, Darmstadt, Germany). A full list of percentage distributions for all enzyme activities and metabolites quantified through either mass spectrometry or enzymatic assays is presented in supplemental Table S1.
Extraction and Protein Analysis via LC-MS/MS-Dried material from each fraction was solubilized in 6 M urea, 2 M thiourea, 10 mM Tris/HCl, pH 8, and the protein concentration was determined via dye-binding assay (36). Equal amounts of protein (20 µg) were reduced by DTT, and disulfide bonds were alkylated using iodoacetamide (37). Proteins were predigested for 3 h with endoproteinase Lys-C (0.5 µg µl⁻¹; Wako Chemicals, Neuss, Germany) at room temperature. After 4-fold dilution with 10 mM Tris/HCl, pH 8, samples were digested further with 4 µl of sequencing-grade modified trypsin (0.5 µg µl⁻¹; Promega, Mannheim, Germany) overnight at room temperature. Subsequently, digested samples were acidified by the addition of TFA to a final concentration of 1% and desalted using C18 tips (38).
Tryptic peptide mixtures were analyzed via LC-MS/MS using a nanoflow Easy-nLC (Thermo Scientific) and an Orbitrap hybrid mass spectrometer (LTQ-Orbitrap, Thermo Scientific) as a mass analyzer. Peptides were eluted from a 75-µm analytical column (Reprosil C18, Dr. Maisch GmbH, Tübingen, Germany) with a linear gradient of 4% to 64% acetonitrile over 120 min and sprayed directly into the mass spectrometer. Proteins were identified via MS/MS through information-dependent acquisition of fragmentation spectra of multiple charged peptides. Up to five data-dependent MS/MS spectra were acquired in the linear ion trap for each full-scan spectrum acquired at 60,000 full-width half-maximum resolution in the Orbitrap. The overall cycle time was approximately 1 s.
Protein identification and ion intensity quantitation were carried out using MaxQuant version 1.3.0.5 (39). Spectra were matched against the Arabidopsis proteome (TAIR10, 35,386 entries) as well as typical contaminants (e.g. human keratin, trypsin) using Andromeda (40). For peptide identification, carbamidomethylation of cysteine was set as a fixed modification, oxidation of methionine was set as a variable modification, and up to two missed cleavages were allowed. The mass tolerance for the database search was set at 20 ppm on full scans and 0.5 Da for fragment ions. Multiplicity was set to 1. For label-free quantitation, retention time matching between runs was chosen within a time window of 2 min. The peptide and protein false discovery rates were set at 0.01. Hits to contaminants (e.g. keratins) and reverse hits identified by MaxQuant were excluded from further analysis. A full list of all identified peptides from the proteome profiling experiment is presented in supplemental Table S2. The raw files and identified spectra were submitted to ProteomeXChange (http://proteomecentral.proteomexchange.org) and can be accessed under the identifier PXD000609.
Quantitative Data Analysis-Quantitative analysis of protein distribution was carried out based on summed peptide ion intensity values as obtained from MaxQuant in the result file evidence.txt (41). Protein ion intensity sums were calculated using the R-based script cRacker (42).
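The summing step described above can be illustrated with a minimal Python sketch. It is not the cRacker script, which performs additional processing; it only shows the idea of adding up peptide ion intensities per protein and fraction. The dictionary keys `protein`, `fraction`, and `intensity`, as well as the example values, are hypothetical and do not reproduce the actual column layout of evidence.txt.

```python
from collections import defaultdict


def sum_protein_intensities(evidence_rows):
    """Sum peptide ion intensities per (protein, fraction) pair.

    evidence_rows: iterable of dicts with the hypothetical keys
    'protein', 'fraction', and 'intensity' (None when not quantified).
    """
    totals = defaultdict(float)
    for row in evidence_rows:
        intensity = row["intensity"]
        if intensity is None:  # skip peptides without a quantified intensity
            continue
        totals[(row["protein"], row["fraction"])] += intensity
    return dict(totals)


# Made-up example rows (illustrative values only, not data from the study).
rows = [
    {"protein": "protein_A", "fraction": 2, "intensity": 1.5e6},
    {"protein": "protein_A", "fraction": 2, "intensity": 0.5e6},
    {"protein": "protein_A", "fraction": 3, "intensity": None},
    {"protein": "protein_B", "fraction": 2, "intensity": 2.0e6},
]
protein_sums = sum_protein_intensities(rows)
```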
For each gradient, quantitative values for metabolites and proteins in each fraction were expressed as the percentage of the total recovered from the whole gradient. These distribution values were then submitted to a principal component analysis and used to generate a Pearson correlation matrix.
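As an illustration of this normalization and of the correlation measure, the following Python sketch converts per-fraction amounts into percentages of the gradient total and computes a Pearson coefficient between two distribution profiles. The function names and the example numbers are assumptions made for illustration; the principal component analysis itself is not reproduced here.

```python
import math


def percent_distribution(values):
    """Express per-fraction amounts as a percentage of the gradient total."""
    total = sum(values)
    if total == 0:
        raise ValueError("no signal recovered from the gradient")
    return [100.0 * v / total for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles.

    Note: a constant profile has zero variance and would cause a
    division by zero; such profiles should be filtered out beforehand.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return cov / norm


# Made-up amounts for a three-fraction toy gradient (not study data).
profile = percent_distribution([1.0, 1.0, 2.0])
```

Because each profile is rescaled to sum to 100%, proteins and metabolites measured on very different absolute scales become directly comparable across fractions.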
Functional annotation of metabolites and proteins was carried out according to MapMan (43), and subcellular consensus location was obtained from SUBA3 (44) and from PPDB for plastidial subcompartments (45).

RESULTS
To analyze the subcellular compartmentation of proteins and intermediates of primary metabolism in Arabidopsis rosette leaves, we quenched illuminated material by pouring a large volume of liquid nitrogen over the rosette, avoiding shading at all times. The frozen material was subsequently ground, lyophilized, ultrasonicated, and fractionated using non-aqueous gradient centrifugation. Twelve fractions were collected from the top to the base of the gradients (Fig. 1). Non-aqueous fractionations were performed in biological triplicate (gradients 1, 2, and 3). Compartment-specific markers were determined in each fraction and expressed as the percentage of the total from the summed fractions, as shown for gradient 1 (Fig. 2A). The observed dark green ring in the gradient corresponds to a plastid-enriched fraction as shown by the distributions of adenosine diphosphate glucose pyrophosphorylase and Rubisco markers (46–48). As this fraction represented a rather large volume of the gradient, it was not taken as a whole but subdivided into two fractions, namely, fractions 2 and 3. Acid invertase and nitrate, markers for the vacuolar compartment (49–52), were predominantly present in fraction 12, which was essentially a pellet. The markers for the cytosol, uridine diphosphate glucose pyrophosphorylase and phosphoenolpyruvate carboxylase (53,54), appeared to be evenly distributed throughout the gradient. Eight fractions were sampled between the plastid-enriched and the vacuole-enriched fractions. In previous studies, non-aqueous gradients were divided into a smaller number of fractions (25). We divided the gradients into a greater number of fractions to improve resolution and increase the probability of identifying fractions enriched in subcellular compartments other than the three compartments (plastid, cytosol, and vacuole) routinely identified by enzyme and metabolite markers. The overall distribution of markers was reproducible between the independent gradient preparations (gradient 1 in Fig. 2A, gradients 2 and 3 in supplemental Fig. S1); therefore proteomic analysis was carried out on a representative gradient (gradient 1).
FIG. 1. Simplified scheme of non-aqueous fractionation, sampling of the gradient, and subsequent analyses. A total of 12 fractions were taken from the gradient (F1 to F12). After centrifugation, the pellet of each fraction was resuspended in 7 ml of heptane and divided into seven aliquots of 1 ml, each of which was used for the various analyses indicated. Two of the aliquots were kept as reserves.
FIG. 2. Typical distribution profiles of compartment-specific enzyme and metabolite markers from Arabidopsis rosettes in non-aqueous gradients. Markers were measured in enzymatic-coupled assays (A), and the total protein amount was measured via dye-binding assay or calculated as the total ion intensity count (TIC) from LC-MS/MS peptide analysis (B). In A, the compartment-specific markers in each fraction are expressed as the percentage of the total from the summed fractions. Data are from gradient 1.
As shown in Fig. 2B, total protein amounts in the fractions ranged from 30 to 50 µg. Fractions 1 and 11 contained extremely low amounts of compartment-specific markers, as well as extremely low amounts of protein and total ion counts (Figs. 2A and 2B, respectively). Following centrifugation of the gradient, fraction 1 contained little cellular material; in some previous studies this fraction was excluded from further analysis (25). However, in our study, all gradient fractions were included for metabolite and protein analyses. In addition to gradient 1, proteomic analysis was carried out on a gradient obtained from another biological material and sampled into 10 fractions instead of 12 (referred to as the 10-fraction gradient throughout this manuscript, and for which results are presented in supplemental Fig. S2).
Proteins associated with the plastidial, cytosolic, or vacuolar compartments (Figs. 3A, 3B, 3C, and 3H) showed more or less distinct distribution profiles that were reminiscent of those obtained with compartment-specific markers (Fig. 2A, supplemental Fig. S1). The proteins of the stroma and thylakoids presented similar distribution profiles, with a strong peak in fractions 2 and 3 and relatively low abundance further down the gradient. Cytosolic proteins were distributed fairly evenly throughout the gradient. Vacuolar proteins showed substantial enrichment in fraction 12. Mitochondrial, nuclear, and peroxisomal proteins displayed abundance profiles that strongly resembled the profiles of cytosolic proteins (Figs. 3D, 3E, 3F, and 3C). Extracellular proteins showed a distribution similar to that of vacuolar proteins (Figs. 3I and 3H). Plasma membrane proteins displayed a distribution profile that was most similar to that of proteins from endoplasmic reticulum (Figs. 3J and 3G), being most abundant in fractions 7-10. Based on these protein abundance profiles, non-aqueous fractionation separated the cell into four main groups of subcellular compartments: (i) plastids; (ii) cytosol, mitochondria, nucleus, and peroxisomes; (iii) extracellular proteins and vacuole; and (iv) endoplasmic reticulum and plasma membrane. However, to a certain extent, subcompartmental structures remained intact and clustered together, as seen, for example, in the subcompartmentation of plastidial proteins.
These observations were supported by a principal component analysis (PCA) of protein abundance distributions. Proteins that had a similar distribution in the gradient grouped in the PCA analysis. The first component separated proteins according to their known or SUBA3-predicted subcellular locations. Plastidial proteins were clustered in a large group with a broad distribution along the PC1 axis but a narrow distribution along the PC2 axis (Fig. 4A, green circles; see supplemental Fig. S3 for separate displays of each compartment). The plastidial proteins separated from the other compartments. The cytosolic, mitochondrial, nuclear, peroxisomal, endoplasmic reticulum, extracellular, plasma membrane, and vacuolar proteins showed some differences but did not group into clearly distinct, compartment-specific clusters. These findings were also observed for the 10-fraction gradient (supplemental Fig. S2A).
To test whether these observations were statistically significant, mean and standard deviations of PCA distances between proteins from the same compartment (supplemental Table S3 for gradient 1) were used for one-way analysis of variance with Holm-Sidak pairwise multiple comparison analysis (Fig. 4B, supplemental Table S4 for gradient 1; supplemental Fig. S2B for the 10-fraction gradient). Proteins whose distributions showed no significant differences from each other were displayed as a connected network. In this view, it became obvious that plastidial proteins had a significantly different distribution along the gradient relative to proteins from all compartments, with the exception of the endoplasmic reticulum proteins for gradient 1 and cytosol for the 10-fraction gradient (Fig. 4B and supplemental Fig. S2B, respectively). However, the length of the connection between plastidial and vacuolar proteins in the 10-fraction gradient indicated a substantial separation between these two groups. In gradient 1, proteins from the cytosol, endoplasmic reticulum, mitochondria, peroxisomes, and Golgi grouped together, and some of these compartments showed tight connections to nuclear and plasma membrane proteins (Golgi, and endoplasmic reticulum, peroxisome and Golgi, respectively). Vacuolar proteins and extracellular proteins had significantly similar distributions along the gradient and were not connected to the plastids or the cluster of proteins from the cytosol, other intracellular organelles, endoplasmic reticulum, and plasma membrane (Fig. 4B). Thus, according to the stringent PCA, non-aqueous fractionation achieved separation of subcellular compartments into three groups of proteins: (i) plastids; (ii) cytosol, mitochondria, nucleus, peroxisomes, endoplasmic reticulum, Golgi, and plasma membrane; and (iii) extracellular and vacuole (these groups were also found for the 10-fraction gradient as shown in supplemental Fig. S2B). 
This assessment differs from the previous visual analysis of distribution profiles (see above), which indicated that endoplasmic reticulum and plasma membrane were resolved from the other compartments in group ii. As PCA is the more stringent test, caution must be exercised in assigning proteins to compartments within this group based only on their distribution profiles.
For the plastidial proteins from gradient 1, further subcompartmental annotation was obtained from PPDB (45), allowing stromal, thylakoid, envelope, plastoglobular, and plastidial ribosome proteins to be distinguished (Fig. 4C). Although separation was not absolute for these subcompartments, significant clustering was observed for all of these substructures within the plastids (Fig. 4D, supplemental Table S4). For better visibility, the PCA plots for individual compartments and plastidial locations can be found in supplemental Fig. S3.
The metabolites included in the PCA showed a clear separation in PC1 and PC2 for gradient 1 and in PC1 for the 10-fraction gradient (Fig. 4A and supplemental Fig. S2A, respectively). A large portion of them co-clustered with plastidial proteins, and the remainder did not co-cluster with any compartmental markers. Metabolites that clustered with plastidial proteins are shown for gradient 1 in Fig. 4C. Some of these metabolites are involved in the Calvin-Benson cycle and starch metabolism. Others are involved in glycolysis and nitrogen assimilation, which are processes that occur partly in the plastid and partly in other subcellular compartments (55–57).
FIG. 4. [...] Proteins are shown as circles with coloring according to SUBA3 locations, and metabolites are indicated as black triangles. B, network of compartments based on the separation in the PCA of proteins assigned to these compartments. A connection between two compartments indicates that there was no significant distribution difference, and the absence of a connection means that there was a significant distribution difference as determined via one-way analysis of variance with Holm-Sidak pairwise multiple testing correction (supplemental Table S4). Edge distance is inversely proportional to the p value. C, PCA plot of all plastidial proteins and co-distributing metabolites colored according to their plastidial subcompartmentation, as given in PPDB. Metabolites were included in the same data matrix and are indicated as triangles. D, circular layout of a network of plastidial subcompartments based on the separation in the PCA of proteins assigned to these compartments. Network obtained as described in A. PCA plots of individual compartments are shown in supplemental Fig. S3.
Subcompartmental Homogeneity-To test whether the various organellar compartments had different degrees of heterogeneity, distances in PC1 and PC2 between proteins from the same compartment were used as a measure for heterogeneity, assuming that large scatter on the gradient separation reflected different locations of that protein within the same organellar compartment. In gradient 1, the greatest heterogeneity was observed among proteins annotated as either extracellular proteins or vacuolar proteins (Fig. 5A). By contrast, peroxisomal proteins and plasma membrane proteins displayed the least scatter in the non-aqueous gradient, followed by plastidial, cytosolic, mitochondrial, and endoplasmic reticulum proteins. This general finding was also confirmed in the 10-fraction gradient (supplemental Fig. S2C), where the greatest heterogeneity was also observed among vacuolar proteins. However, in the 10-fraction gradient, greater scatter was observed among plasma membrane and nuclear proteins than in the 12-fraction gradient. Among the plastidial proteins, the least scatter was observed for peripheral thylakoid proteins from either the stromal or lumenal sides, and for plastidial ribosomal proteins (Fig. 5B).
These results indicate that different cellular compartments present diverse degrees of heterogeneity. The diversity of cell types within the leaf material applied to the gradients is a potential source of variation. For example, vascular tissue and mesophyll tissue will have different organellar compositions, as will young versus old tissue. The clearest separation and least scatter were found for proteins from compartments defined by distinct envelope membranes such as peroxisomes, mitochondria, and plastids, and for plasma membrane proteins.
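One way to turn scatter in the PC1/PC2 plane into a per-compartment heterogeneity score is sketched below in Python. The exact distance statistic used in this study is not specified beyond "distances in PC1 and PC2", so the mean pairwise Euclidean distance is an assumption, as are the function names, the protein labels, and the example coordinates.

```python
import math
from collections import defaultdict
from itertools import combinations


def mean_pairwise_distance(points):
    """Mean Euclidean distance between all pairs of (PC1, PC2) points."""
    pairs = list(combinations(points, 2))
    if not pairs:
        raise ValueError("need at least two proteins")
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def heterogeneity_by_compartment(scores, compartments):
    """Group PCA scores by annotated compartment and score the scatter.

    scores: dict protein -> (PC1, PC2); compartments: dict protein -> name.
    Compartments with fewer than two proteins are skipped.
    """
    grouped = defaultdict(list)
    for protein, point in scores.items():
        grouped[compartments[protein]].append(point)
    return {
        name: mean_pairwise_distance(points)
        for name, points in grouped.items()
        if len(points) >= 2
    }


# Made-up scores and annotations (hypothetical, not data from the study).
scores = {"p1": (0.0, 0.0), "p2": (3.0, 4.0), "p3": (1.0, 1.0)}
compartments = {"p1": "vacuole", "p2": "vacuole", "p3": "peroxisome"}
scatter = heterogeneity_by_compartment(scores, compartments)
```

Under this assumed measure, a larger value indicates that proteins sharing an annotation are spread more widely across the gradient.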
Complex Formation and Functional Distributions-Proteins involved in the same metabolic pathway often associate with each other to provide efficient channeling of metabolic intermediates. In gradient 1, proteins involved in glycolysis (MapMan bin 4.1), starch synthesis (MapMan bin 2.1.2), nitrogen assimilation (MapMan bin 12.2), or the Calvin-Benson cycle (MapMan bin 1.3) showed significantly lower standard deviations for their distribution profiles in the non-aqueous gradient (Fig. 6). Other classes of proteins showed greater heterogeneity. Proteins annotated as cytosolic ribosomal proteins (MapMan bin 29.2.1) displayed higher standard deviations. There was considerable heterogeneity among transcription factors (MapMan bin 27.3), indicating that this class of proteins is not necessarily present only within one organellar compartment (i.e. the nucleus), and can also be associated with the cytosol or even the plasma membrane (58,59). However, only two of the transcription factors identified in this study are predicted to contain transmembrane domains. The V-ATPase subunits (MapMan bin 34.1.1) had particularly high standard deviations in the PCA plot, probably because they are present in very different organellar compartments, including the tonoplast, endoplasmic reticulum, Golgi, and plasma membranes. As a result of this broad distribution, the same V-ATPase subunit protein may be identified as a member in any of these compartments, leading to large heterogeneity and thus scatter across the non-aqueous gradient. This might reflect the known location of V-ATPase in different endomembrane systems. Again these findings were consistent in the independent gradient sampled with 10 fractions, particularly the considerable heterogeneity observed among transcription factors and V-ATPase subunits (supplemental Fig. S2D).
Integration of Protein with Metabolite Data-To integrate metabolite distributions across the non-aqueous gradient with protein abundance profiles, a pairwise Pearson correlation analysis of percentage distributions of peptide ion intensity sums, metabolite contents, and enzyme activities was performed across all fractions. From this correlation matrix, a network was created using a very stringent cutoff (0.98) for the correlation coefficients. Correlation pairs were visualized as an edge-weighted spring-embedded network. Components within each subnetwork were assigned to metabolic pathways, and the most common assignment was used to name the subnetwork. The following analysis has a particular focus on the major pathways for carbon metabolism because the majority of the metabolites analyzed in this study were intermediates in these pathways (Fig. 7, supplemental Fig. S4, supplemental Table S5).
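The correlation step described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the four six-fraction profiles are invented for illustration, the function names are hypothetical, and connected components stand in for the visually identified subnetworks of the spring-embedded layout.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation information
    return sxy / sqrt(sxx * syy)

def to_percent(values):
    """Express raw abundances as a percentage distribution across fractions."""
    total = sum(values)
    return [100.0 * v / total for v in values]

def correlation_edges(profiles, cutoff=0.98):
    """Return (a, b, r) for every pair whose correlation reaches the cutoff."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= cutoff:
            edges.append((a, b, r))
    return edges

def subnetworks(edges):
    """Group connected nodes (connected components of the edge list)."""
    adjacency = {}
    for a, b, _ in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for node in sorted(adjacency):
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(adjacency[current] - group)
        seen |= group
        groups.append(group)
    return groups

# Invented raw abundances over six gradient fractions (illustration only).
raw = {
    "plastidial_enzyme": [40, 30, 15, 10, 4, 1],
    "plastidial_metabolite": [38, 31, 16, 9, 5, 1],
    "cytosolic_enzyme": [2, 8, 20, 35, 25, 10],
    "cytosolic_metabolite": [3, 7, 21, 34, 26, 9],
}
profiles = {name: to_percent(values) for name, values in raw.items()}
edges = correlation_edges(profiles, cutoff=0.98)
groups = subnetworks(edges)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.3f}")
```

With these invented profiles, only the within-compartment pairs pass the 0.98 cutoff, so each compartment forms its own subnetwork.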
When we examined the locations of proteins and metabolites, we annotated the nodes in the Calvin-Benson cycle and starch synthesis subnetworks as plastidial, as expected (Fig. 7). The nodes in the glycolysis subnetwork were annotated as cytosolic, although not all glycolytic proteins are fully cytosolic. The proteins involved in sucrose synthesis were annotated as cytosolic, in agreement with the known location of this pathway (60). Proteins in the TCA cycle were assigned to the mitochondria. Metabolic intermediates and proteins of the photorespiration subnetwork were annotated as predominantly localized to the peroxisome and mitochondria. However, a closer examination of this protein-metabolite network revealed some conflicts with the subcellular locations indicated in the SUBA3 database. For example, AT1G30120, a pyruvate dehydrogenase, and AT1G34430, an acetyl transferase, are both currently assigned in SUBA3 to the plastid, based on the consensus of prediction programs and direct experimental evidence. However, they displayed close interactions with TCA cycle proteins and metabolites, which would suggest that they are located in the mitochondria and not the plastids. This finding was also confirmed in the 10-fraction gradient (supplemental Fig. S2D). In addition, AT1G01090, a pyruvate dehydrogenase E1 α subunit currently assigned in SUBA3 to the plastid, showed close interaction with the TCA cycle, suggesting a mitochondrial localization. In fact, some prediction programs do assign these proteins to mitochondria, but this conflicts with experimental evidence of a plastidial location provided by proteomic analysis of isolated organelles (61,62).
Proteins and metabolites involved in the Calvin-Benson cycle formed a tight subnetwork in which Rubisco was especially closely connected to ribose-5-phosphate and 3-phosphoglycerate. The TCA cycle proteins and metabolites formed a subnetwork with two hubs that were connected by the metabolites succinate and fumarate. In the glycolytic pathway, phosphoenolpyruvate carboxylase was closely connected to phosphoenolpyruvate and pyruvate. The Calvin-Benson cycle was connected to the glycolytic subnetwork by dihydroxyacetonephosphate and fructose-6-phosphate. Dihydroxyacetonephosphate is a component of the Calvin-Benson cycle and is exported to the cytosol via the triose phosphate:phosphate translocator, where it is used for sucrose synthesis. Fructose-6-phosphate is actually present in similar amounts in the plastid stroma and cytosol (18-22), being a component of the Calvin-Benson cycle and the precursor for starch synthesis in the plastid, and an intermediate in the sucrose synthesis pathway in the cytosol (63). Starch and sucrose synthesis, as well as photorespiration, formed distinct subnetworks for which the connections to glycolysis and the TCA and Calvin-Benson cycles were below the significance threshold. Lowering the cutoff did not reveal connections between these subnetworks, because of increased noise.
In general, the tight connection of proteins and metabolites within pathways (Calvin-Benson cycle, photorespiration, glycolysis, TCA cycle, sucrose synthesis, and starch synthesis) was also observed in the 10-fraction gradient (supplemental Fig. S2E and supplemental Table S6). However, the network was more compact, with glucose-1-phosphate and glucose-6-phosphate acting as further connecting metabolites between the Calvin-Benson cycle and glycolysis, in addition to dihydroxyacetonephosphate and fructose-6-phosphate. Serine, glycerate, and glycine were found to connect photorespiration, the TCA cycle, and the Calvin-Benson cycle. Pyruvate was a key metabolite connecting glycolysis and the TCA cycle. Connections were identified between sucrose synthesis and starch synthesis, as well as between starch synthesis and the Calvin-Benson cycle.

Increased Information about the Location of Subcellular Compartments in Non-aqueous Gradients-Previously, the use of non-aqueous fractionation to analyze the subcellular distribution of metabolites was based on the use of one or two marker enzymes for the plastids, cytosol, and vacuoles, and little or no information about the distribution of other subcellular components in the gradients. The mathematical analysis of such gradients depended on (i) the selected markers being representative for the corresponding subcellular compartment and (ii) the investigated metabolites being present only in these compartments and absent from all other subcellular compartments for which markers were not analyzed.
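The marker-based calculation and its two assumptions can be made concrete with a minimal sketch. It models a metabolite profile as a weighted sum of three marker profiles and searches a grid of weights for the smallest squared error. The marker profiles and the metabolite are synthetic, `best_fit` is a hypothetical name, and the grid search is one simple way to do this rather than the procedure of any particular published study.

```python
def best_fit(metabolite, markers, n_steps=100):
    """Find compartment weights (summing to 1) that best reproduce a profile.

    metabolite: percentage distribution of the metabolite across fractions.
    markers: dict mapping three compartment names to marker distributions.
    Returns (weights, squared_error) for the best grid point.
    """
    names = list(markers)
    if len(names) != 3:
        raise ValueError("this sketch handles exactly three compartments")
    best_weights, best_error = None, None
    for i in range(n_steps + 1):
        for j in range(n_steps + 1 - i):
            k = n_steps - i - j  # the third weight is fixed by the other two
            weights = (i / n_steps, j / n_steps, k / n_steps)
            error = 0.0
            for f, observed in enumerate(metabolite):
                predicted = sum(
                    w * markers[name][f] for w, name in zip(weights, names)
                )
                error += (observed - predicted) ** 2
            if best_error is None or error < best_error:
                best_weights, best_error = dict(zip(names, weights)), error
    return best_weights, best_error

# Synthetic marker distributions over six fractions (illustration only).
markers = {
    "plastid": [40, 30, 15, 10, 4, 1],
    "cytosol": [2, 8, 20, 35, 25, 10],
    "vacuole": [1, 2, 5, 12, 30, 50],
}
# A synthetic metabolite that is 60% plastidial and 40% cytosolic.
metabolite = [
    0.6 * p + 0.4 * c for p, c in zip(markers["plastid"], markers["cytosol"])
]
weights, error = best_fit(metabolite, markers)
print(weights, error)
```

The fit can only distribute a metabolite among the compartments that have a marker. Material located in a compartment without a marker, or one that co-migrates with a marked compartment, is silently attributed to one of the three, which is why assumptions (i) and (ii) matter.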
Our large-scale proteomic approach unequivocally shows that the plastidial, cytosolic, and vacuolar compartments are well separated by non-aqueous fractionation, and it gives insights into the homogeneity of the distribution for each compartment. It also provides important information about the distribution of other subcellular fractions in this type of gradient. For example, abundance profiles of proteins known to be localized to the mitochondria, nucleus, or peroxisome closely matched the abundance profiles of proteins from the cytosol. The abundance profiles of extracellular proteins showed similarities to those of vacuolar proteins, and plasma membrane proteins were distributed along with endoplasmic reticulum proteins. From PCA, in which additional subcellular compartments were defined a priori based on proteins of known location, clusters corresponding to subcellular compartments could be detected and their overlap statistically analyzed. Plastid proteins were significantly separated from all other compartments except the endoplasmic reticulum or cytosol, but even here, and especially for the latter, substantial separation could be seen. There were three identifiable but overlapping groups: (i) cytosol, endoplasmic reticulum, mitochondria, peroxisomes, and Golgi; (ii) Golgi and nucleus; and (iii) endoplasmic reticulum, peroxisomes, Golgi, and plasma membrane. Within organelles, subcompartmental structures, protein complexes, and metabolic chains such as thylakoid proteins, light harvesting complexes, the Calvin-Benson cycle, or starch synthesis formed tighter subnetworks than other proteins in the same organelle, although the separation was incomplete.
Taken as a whole, this analysis of protein distributions validates the use of non-aqueous fractionation to study metabolite distributions among the plastid, cytosol, and vacuole, but it also points to two considerations that will affect the precision of the results. First, we must consider the incomplete separation of these compartments from other compartments that up to now were rarely monitored in non-aqueous fractionation studies. The endoplasmic reticulum, cytosol, nucleus, mitochondria, peroxisomes, plasma membrane, and Golgi were poorly separated. Although plastidial and vacuolar proteins separated well from most other compartments, there was significant overlap with endoplasmic reticulum and extracellular proteins, respectively. This needs to be taken into account in interpreting the analyses of metabolite distributions in the gradient. Second, there was considerable heterogeneity within some compartments, which means the subcellular location assigned by analysis of the gradient in similar studies will depend on the choice of marker proteins. It is probably best to select marker proteins that are functionally related to the set of metabolites whose location is being queried.
Quality of Separation in Non-aqueous Gradients-We collected a greater number of fractions (10-12) than has been customary in non-aqueous gradient studies. This corresponds more closely to classical fractionation using sucrose density gradients, which has been widely used for proteomic analyses of organelles (10,12). However, little improvement in resolution was achieved by dividing the non-aqueous gradients into more fractions, and no major differences were observed in the 12-fraction gradient versus the 10-fraction gradient; only the plastids and vacuoles were well separated from each other and from other organelles and compartments, as observed in previous non-aqueous fractionation studies. This indicates that the limiting factor is inherent in the physical properties of the organelles and the derived particles (density, size, etc.), rather than in the sampling resolution of the density gradient. This confirms earlier studies that, based on metabolite distributions, concluded that taking greater numbers of fractions does not result in increased compartmental resolution (64).
One reason for the limited resolution is the variance shown by proteins derived from a given compartment. This might be partly due to cellular heterogeneity, and it might be decreased by using stringently chosen biological material in which this variance is minimized. However, our study also underlines the need to increase the resolution of non-aqueous gradient centrifugation. This could be done by applying orthogonal separation methods. For example, phase-partitioning-based separation has been successfully used in parallel with density-based separation to separate mitochondria from thylakoids (65), and two-phase partitioning is a common method for membrane separations (66). Otherwise, in order to further refine the separation based on density, one possibility would be to alter the ratio of the two non-polar solvents tetrachloroethylene and heptane, and therefore the range of the density along the gradient. Sequential centrifugation on gradients with different density ranges is another possibility. Non-aqueous density gradient fractionation was originally developed to separate mitochondria and cytosolic compartments in rat liver (67,68). For this purpose, a shallower gradient over a smaller density range (1.29 to 1.43 g/ml) was used, relative to gradients designed to fractionate plant material (e.g. 1.28-1.50 plus a cushion at 1.59 g/ml for spinach leaves (18); 1.28-1.51 with a cushion at 1.62 g/ml for barley leaves (23); 1.43-1.62 g/ml for Arabidopsis leaves (25 and this study)). This larger range results in a more compressed gradient and was necessary because the particles obtained from plant material after lyophilization and sonication show a wider distribution of densities, in particular particles enriched in material from plastids, which are slightly less dense than material from rat liver, and particles enriched in material from vacuoles, which are much denser (18).
It is possible that recombination of fractions 5-11 and recentrifugation on a shallower gradient might improve the separation of the cytosol, mitochondria, peroxisomes, and various membrane fractions. Further refinement of non-aqueous fractionation will be greatly aided by the use of large-scale quantitative proteomics, as this will facilitate multi-parallel tracking of many subcellular compartments.
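The relation between the solvent ratio and the density range can be illustrated with a short calculation. This is a sketch under stated assumptions, not a protocol: it assumes ideal, volume-additive mixing and uses approximate room-temperature densities of about 1.62 g/ml for tetrachloroethylene and about 0.68 g/ml for heptane. The function name is hypothetical.

```python
# Approximate solvent densities in g/ml (assumed values, near 20 degrees C).
TETRACHLOROETHYLENE_DENSITY = 1.62
HEPTANE_DENSITY = 0.68

def tetrachloroethylene_fraction(target_density,
                                 dense=TETRACHLOROETHYLENE_DENSITY,
                                 light=HEPTANE_DENSITY):
    """Volume fraction of the dense solvent giving the target density.

    Assumes volumes are additive, so that
    target = x * dense + (1 - x) * light.
    """
    if not light <= target_density <= dense:
        raise ValueError("target density lies outside the solvent range")
    return (target_density - light) / (dense - light)

# Density ranges quoted in the text (g/ml).
ranges = {
    "rat liver": (1.29, 1.43),
    "spinach leaves": (1.28, 1.50),
    "Arabidopsis leaves": (1.43, 1.62),
}
for tissue, (low, high) in ranges.items():
    x_low = tetrachloroethylene_fraction(low)
    x_high = tetrachloroethylene_fraction(high)
    print(f"{tissue}: span {high - low:.2f} g/ml, "
          f"tetrachloroethylene {100 * x_low:.0f}-{100 * x_high:.0f}% (v/v)")
```

For a gradient of fixed length, a smaller density span corresponds to a shallower gradient, which is the rationale for recentrifuging pooled fractions over a narrower range.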
Use of Non-aqueous Gradients to Investigate Protein Compartmentation-Other separation methods such as continuous or discontinuous sucrose gradients and continuous iodixanol gradients, which have commonly been used for proteomic analyses (12,13,69), are more suitable for the separation of membrane compartments. For example, centrifugation of crude membrane preparations from Arabidopsis on self-generating iodixanol gradients (12) resolved plasma membrane, tonoplast, Golgi apparatus, and endoplasmic reticulum from one another and from mitochondrial and plastidial membranes. However, such separation requires additional purification steps prior to separation on an aqueous gradient. Stronger enrichment and separation of some recalcitrant organelles can be achieved by using protoplasts as a starting material, but, again, the prior formation of protoplasts is likely to lead to major perturbations of some metabolic processes (14). Therefore, and despite the issues raised above, non-aqueous fractionation remains the technique of choice for the assignment of subcellular location to metabolites in intact plant tissues and thus the technique of choice for doing combined metabolite-protein analysis of a single sample. Moreover, our proteomic analysis showed that non-aqueous fractionation is also suitable for investigating protein location; for example, in gradient 1 two-thirds of the 303 proteins defined as cytosolic in our study were also identified as cytosolic in a survey that yielded over 1,000 identifications of cytosolic proteins (70). In addition, our analysis highlighted three proteins (AT1G01090 and AT1G30120, pyruvate dehydrogenase subunits, and AT1G34430, an acetyl transferase) that are currently assigned to the plastid but for which non-aqueous fractionation points strongly to a mitochondrial location.
Effects of Tissue Heterogeneity-In this study the material used for non-aqueous fractionation consisted of whole rosettes with leaves of different age and developmental status. It is known that protein and metabolite composition varies along a leaf's developmental gradient, as demonstrated for photosynthetic complexes (71) and for about 2,000 proteins (72). In addition, within a leaf different cell types such as mesophyll tissue, vascular tissue, and epidermal cells are known to express different proteins and contain different metabolites (73,74). Although cell-type-specific protein expression has not yet been systematically studied in Arabidopsis leaves, it is likely to be substantial, as indicated by a recent study with Arabidopsis roots in which 700 proteins were found to be cell-type specific and only 300 proteins were identified in at least six different cell types (75).
Even though the mesophyll from source leaves is the predominant tissue in Arabidopsis rosettes (76), other cells and leaf developmental stages will make a contribution. In particular the observed heterogeneity of cytosolic proteins, but also that of other compartments, may partly reflect developmental and cell-type-specific differences. In addition, regulation of translation, and thus control of protein abundance, was shown to play a role in developmental processes (77). Based on available protein abundance data obtained at different stages of Arabidopsis leaf development (72), we annotated the cytosolic proteins identified in non-aqueous gradient 1 as preferentially expressed in "expanding" or "mature" leaves. PCA of these cytosolic protein abundances showed a clear tendency for separation of proteins from expanding leaves relative to mature leaves (supplemental Fig. S5). This finding is in line with the possibility that the observed broad distribution of cytosolic proteins along the non-aqueous gradient was a result of a mixture of cytosolic proteins from different leaf developmental stages. Cells may have contrasting organization and composition in compartments that may lead to different densities in particles that are derived from them. As additional support of cellular heterogeneity on the gradient, we found rather high variability among the transcription factors and ribosomal proteins. These protein classes are particularly involved in developmental processes and differentiation and show strong abundance differences in developing leaves.
It is possible to obtain specific cell types via cryodissection or single cell sorting (78,79), but such processes are not suitable for measurements of metabolites that turn over rapidly. Even if non-aqueous fractionation could be performed with material obtained via these methods, obtaining sufficient material would be extremely difficult and time consuming. The most pragmatic route might be to use tissue samples that are as homogeneous as possible. For this, large-leaved species would bring advantages over Arabidopsis, and this will be possible in the future as genome sequencing and stringent gene annotations allow large-scale proteomics to be applied to an increasing number of plant species.
Overlap of Protein and Metabolite Distributions-Despite the issues raised above, non-aqueous fractionation remains the technique of choice for the assignment of subcellular location to metabolites in intact plant tissues, and it could also provide useful information for non-plant systems. Our study allowed comparison of the distributions of large numbers of proteins with those of metabolites. For metabolites that are wholly or mainly assigned to the plastid, more or less close agreement was obtained with the distribution of plastidial proteins in the gradients (not shown) and in the PCA (Fig. 4C). As shown in supplemental Fig. S6, the associations within proteins and within metabolites of the same compartment are similar in gradient 1 and in the 10-fraction gradient. There was a tendency for the association between proteins and metabolites to be slightly weaker, as shown by the greater PCA distance of these interactions (supplemental Fig. S6). For plastids, connections between metabolites were particularly strong, as indicated by the small PCA distance, relative to associations between proteins or between metabolites and proteins.
However, we observed close connections in our protein-metabolite network between Calvin-Benson cycle enzymes and metabolites, which does support the general suitability of non-aqueous fractionation for the study of protein-metabolite interactions. As starch metabolism and the Calvin-Benson cycle are plastidial pathways, whereas sucrose synthesis and glycolysis are cytosolic pathways (or at least partially for glycolysis), it was surprising that starch metabolism and sucrose synthesis formed distinct subnetworks only in the 12-fraction gradient where the Calvin-Benson cycle and glycolysis were connected. This variability might have arisen because starch granules and associated proteins/metabolites separated slightly differently from the Calvin-Benson cycle enzymes on non-aqueous gradient 1, and the slightly higher number of fractions might have influenced the resolution of this separation. However, we do not have experimental evidence for this hypothesis. In addition, the coverage of identified proteins might not have been high enough to generally reveal a connection between the Calvin-Benson cycle and starch metabolism. The starch metabolism subnetwork in the 12-fraction gradient strictly contained starch, starch synthase, and both adenosine diphosphate glucose pyrophosphorylase subunits, but no enzymes involved in early steps of starch synthesis such as plastidial phosphogluco-isomerase and phosphoglucomutase. In the 10-fraction gradient, ADPG was identified as a further metabolite in the starch synthesis network. Concerning sucrose synthesis and glycolysis, the inconsistent finding of connections between the two might be due to the fact that sucrose metabolism occurs in the cytosol of mesophyll cells, whereas glycolysis occurs in many different cell types.
However, the elements identified from sucrose synthesis were strictly sucrose, sucrose-6-phosphate, cytosolic fructose-1,6-bisphosphatase, and sucrose-phosphate synthase (and additionally UDP-glucose in the 10-fraction gradient), and this small number might explain the absence of observed connections to other cytosolic proteins.
In this study, metabolites were measured in absolute amounts, whereas only relative values were obtained for proteins. This might have affected the dynamic range and resolution of the dataset and thus might also influence correlation profiles. For more accurate protein abundance measurements, more precise quantitation via the inclusion of reference peptides and targeted protein analysis could be used. Such a targeted quantitative analysis was recently carried out to classify the abundances of protein isoforms involved in the central metabolism in yeast cells (80). By using absolute quantification for the proteins, one might obtain a better resolution of the network and links between the subnetworks.
Conclusion-Non-aqueous density gradient centrifugation proved to be a suitable method for subcellular separation of proteins and metabolites, including those involved in major pathways of carbon metabolism. Analysis of the distributions of large numbers of proteins validated the use of non-aqueous fractionation to study the distribution of metabolites among the plastids, cytosol, and vacuole, but it also highlighted limitations of the technique and the need to improve resolution, especially separation of the cytosol from other compartments. Our analyses also suggest new possibilities for systems biology research in which parallel monitoring of proteins and metabolites in response to environmental perturbations is of high interest (81). The co-separation of proteins and metabolites could then be used for profiling-type discovery studies, as well as for targeted analyses of particular metabolic pathways and metabolic models.