Characterization of Anaerobic Catabolism of p-Coumarate in Rhodopseudomonas palustris by Integrating Transcriptomics and Quantitative Proteomics

In this study, the pathway for anaerobic catabolism of p-coumarate by a model bacterium, Rhodopseudomonas palustris, was characterized by comparing the gene expression profiles of cultures grown in the presence of p-coumarate, benzoate, or succinate as the sole carbon sources. Gene expression was quantified at the mRNA level with transcriptomics and at the protein level with quantitative proteomics using 15N metabolic labeling. Protein relative abundances, along with their confidence intervals for statistical significance evaluation, were estimated with the software ProRata. Both -omics measurements were used as the transcriptomics provided near-full genome coverage of gene expression profiles and the quantitative proteomics ascertained abundance changes of over 1600 proteins. The integrated gene expression data are consistent with the hypothesis that p-coumarate is converted to benzoyl-CoA, which is then degraded via a known aromatic ring reduction pathway. For the metabolism of p-coumarate to benzoyl-CoA, two alternative routes, a β-oxidation route and a non-β-oxidation route, are possible. The integrated gene expression data provided strong support for the non-β-oxidation route in R. palustris. A putative gene was proposed for every step in the non-β-oxidation route.

Molecular & Cellular Proteomics 7:938-948, 2008.
Lignin constitutes almost one-third of all plant dry mass, making it the second most abundant organic compound on earth after cellulose. Biodegradation of lignin during the decay of plant residues in natural environments is a massive biological process within the global carbon cycle (1). Lignin biodegradation is also of great practical significance because of its potential application to biological treatment and the reuse of agricultural wastes. Lignin is a polymer of phenylpropanoid units, and its biodegradation involves depolymerization and subsequent catabolism of the derived aromatic monomers (2). p-Coumarate, or 4-hydroxycinnamic acid, is one of the main aromatic monomers (3). The biodegradation of p-coumarate can take place not only under oxygen-rich environments but also under anoxic environments as in aquifers, aquatic sediments, and submerged soils.
The purple, non-sulfur phototrophic bacterium Rhodopseudomonas palustris has served as a model organism for studies of anaerobic aromatic compound degradation (4,5). R. palustris can grow aerobically by respiration or anaerobically using photophosphorylation. Under anaerobic conditions R. palustris generates energy from light and uses organic compounds, including aliphatic and aromatic compounds, as its source of carbon for growth. Structurally diverse single aromatic ring compounds, such as p-coumarate and p-hydroxybenzoate, are proposed to be degraded to a central intermediate, benzoyl-CoA, via various peripheral pathways (6,7). Benzoyl-CoA is then degraded to acetyl-CoA through a central pathway comprising ring reduction, ring cleavage, and β-oxidation (see Fig. 1) (6). Most enzymes in the benzoyl-CoA pathway have been identified and characterized. However, the peripheral pathway used by R. palustris to anaerobically convert p-coumarate to benzoyl-CoA has not been elucidated.
Different metabolic routes have been discovered for catabolizing ferulate (4-hydroxy-3-methoxycinnamic acid), which differs from p-coumarate by an additional methoxyl group on the aromatic ring. The two major routes are 1) a β-oxidation route found in Pseudomonas acidovorans (8) and Rhodotorula rubra (9) and 2) a non-β-oxidation route found in Pseudomonas fluorescens (10,11), Delftia acidovorans (12), and Pseudomonas sp. strain HR199 (13). Fig. 1 illustrates the two probable routes for p-coumarate catabolism in R. palustris. Both routes convert the intermediate p-coumaroyl-CoA to 4-hydroxybenzoyl-CoA, yielding an acetyl-CoA. However, the β-oxidation route uses a β-oxidation sequence of reactions for the conversion, whereas the non-β-oxidation route directly cleaves an acetyl-CoA from p-coumaroyl-CoA to generate 4-hydroxybenzaldehyde, which is then oxidized to 4-hydroxybenzoate in a CoA-independent manner. Both routes could easily function under anaerobic conditions. Furthermore, the genome sequence of R. palustris indicates the genetic potential for both routes. There are many β-oxidation genes in the R. palustris genome. Similarly, there are several candidate genes for the enoyl-CoA hydratase/lyase and aldehyde dehydrogenase needed for the non-β-oxidation route. In this study, we aimed to determine which of the two routes is likely used for anaerobic p-coumarate catabolism and to identify the probable genes for every step of this catabolic pathway. To this end, the global gene expression profile of R. palustris grown with p-coumarate as the sole organic carbon source was compared with those of R. palustris grown with succinate or benzoate (Figs. 1 and 2).
Gene expression activity can be measured at the mRNA level with transcriptomics and at the protein level with proteomics. In many past studies, proteomics measurements have largely been qualitative, with the aim of detecting the presence of as many proteins as possible from a proteome (14). However, it is not straightforward to correlate the presence of proteins detected by qualitative proteomics with the relative abundances of mRNAs quantified by transcriptomics (15). With the development of quantitative proteomics using stable isotope labeling, relative protein abundances can now be quantitatively measured for thousands of genes (16). In this study, transcriptomics and quantitative proteomics were integrated for global gene expression profiling. The comparison of the p-coumarate growth condition with the succinate growth condition yielded the relative expression level for 1680 genes at both the mRNA level and the protein level. Of these, 1324 genes showed no significant expression change at either level, and 58 genes were significantly up-regulated at both levels; some of the latter are known to be involved in aromatic compound degradation. Interestingly, 41 genes were significantly up-regulated at the protein level but not at the mRNA level, which warrants further investigation of post-transcriptional regulation and experimental artifacts. Similar trends were also observed in the other comparisons in this study.

MATERIALS AND METHODS
Bacterial Growth and Metabolic Stable Isotope Labeling-R. palustris strain CGA010 (17), a derivative of the sequenced strain CGA009 (18), was grown anaerobically in defined mineral growth medium (19) at 30°C with ample incandescent light illumination. (NH4)2SO4 was the sole nitrogen source for bacterial assimilation and was provided as (14NH4)2SO4 for the unlabeled culture and as (15NH4)2SO4 for the 15N-labeled culture (>98 atom % excess, Sigma-Aldrich). p-Coumarate (3 mM) was supplied as the sole organic carbon source for the unlabeled p-coumarate culture. Benzoate (3 mM) and succinate (10 mM) were supplied as the sole organic carbon sources for the 15N-labeled benzoate and succinate cultures, respectively. Duplicate cultures were prepared for each of the three growth conditions as biological replicates. Cell growth was monitored spectrophotometrically at 660 nm, and cells were harvested in midlog phase at an OD660 of 0.6 by centrifugation and washed twice with ice-cold wash buffer (50 mM Tris-HCl buffer at pH 7.5 with 10 mM EDTA). The harvested cell pellet from each culture was divided for microarray analysis and quantitative proteomics measurements (Fig. 2).
Transcriptomics Analysis-RNA was isolated from the cell pellets of the duplicate unlabeled p-coumarate cultures, the duplicate 15N-labeled succinate cultures, and the duplicate 15N-labeled benzoate cultures as described previously (Fig. 2) (20). Briefly, cells were disrupted by bead beating, and RNAs were purified with RNeasy minikits (Qiagen), including DNase treatment on the columns. The quality of RNA (integrity and DNA contamination) was determined with an Agilent 2100 bioanalyzer (Agilent, Palo Alto, CA) and by RT-PCR using the 16S rRNA-targeted primer set of 27F (Escherichia coli positions 8-27) and 519R (positions 536-519) (21). A high density oligonucleotide microarray (Affymetrix GeneChip) was custom-designed and manufactured by Affymetrix based on the sequences of ORFs and intergenic regions. cDNA synthesis, fragmentation, labeling, hybridization, and processing of custom-designed R. palustris GeneChips were carried out as recommended by the manufacturer. The Affymetrix GeneChip Operating Software Version 1.4 was used for initial data acquisition and processing. Transcript data were further analyzed by using the Cyber-T program (22). The mRNA abundance of a gene was considered significantly up- or down-regulated if the log ratio (log2 abundance ratio) of the mRNA was greater than 1 (up-regulated) or less than -1 (down-regulated), the p value was less than 0.001, and the posterior probability of differential expression was greater than 0.97. The microarray data were deposited in the NCBI Gene Expression Omnibus database under accession number GSE6221.
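The three-part significance rule for transcripts can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Cyber-T implementation: the function name `classify_mrna` and its argument names are hypothetical, and the p value and posterior probability of differential expression (PPDE) are assumed to be supplied by the upstream analysis.

```python
def classify_mrna(log2_ratio, p_value, ppde):
    """Classify a transcript as 'up', 'down', or 'null'.

    Thresholds follow the text: log2 ratio > 1 or < -1, p value < 0.001,
    and posterior probability of differential expression (PPDE) > 0.97.
    """
    significant = p_value < 0.001 and ppde > 0.97
    if significant and log2_ratio > 1:
        return "up"
    if significant and log2_ratio < -1:
        return "down"
    return "null"


# Hypothetical values, for illustration only (not data from this study).
print(classify_mrna(2.4, 1e-5, 0.99))   # large, well-supported increase
print(classify_mrna(2.4, 0.01, 0.99))   # fails the p value threshold
print(classify_mrna(-0.5, 1e-5, 0.99))  # change too small to count
```

All three conditions must hold at once, so a large fold change with weak statistical support is still reported as "null".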
Proteome Sample Preparation-Duplicate cell mixtures of the unlabeled p-coumarate culture and the 15N-labeled succinate culture were prepared by mixing cell pellets of equal weights from duplicate cultures (Fig. 2). Duplicate cell mixtures of the unlabeled p-coumarate culture and the 15N-labeled benzoate culture were prepared similarly. The cell mixtures were lysed by sonication in ice-cold wash buffer, and unbroken cells were removed by centrifugation at 5,000 × g for 10 min. The cell lysates were fractionated by ultracentrifugation at 100,000 × g for 1 h. The resulting supernatants were labeled as the soluble protein fractions. The pellets were resuspended by sonication and labeled as the membrane protein fractions. The protein concentration of each sample was determined using Lowry analysis (23). All samples were digested using the following protocol. The proteins were denatured and reduced with 6 M guanidine and 10 mM DTT (Sigma) at 60°C for 1 h. The samples were then diluted 6-fold with 50 mM Tris, 10 mM CaCl2 (pH 7.6), and sequencing grade trypsin was added at 1:100 (w/w). The first digestion was run overnight at 37°C, and after adding additional trypsin, the second digestion was run for 5 h at 37°C. Finally, the samples were reduced with 20 mM DTT for 1 h at 60°C and desalted using C18 solid-phase extraction (Sep-Pak Plus, Waters, Milford, MA).
Quantitative Proteomics Measurement-All samples were examined with LC-MS using a 12-step, split-phase MudPIT (24,25) technique. MudPIT measurements were repeated for every sample as technical replication. The samples were loaded via a pressure cell (New Objective, Woburn, MA) onto a 250-µm-inner diameter back column packed with 2 cm of C18 reverse-phase resin (Aqua, Phenomenex, Torrance, CA) and 2 cm of strong cation exchange resin (Luna, Phenomenex). The back column was connected to a 15-cm-long, 100-µm-inner diameter C18 reverse-phase PicoFrit column (New Objective) and placed in line with an Ultimate quaternary HPLC system (LC Packings, a division of Dionex, San Francisco, CA). The two-dimensional LC separation was performed with 12 salt pulses, each of which was followed by a 2-h reverse-phase gradient elution. The LC eluent was directly electrosprayed into a linear trapping quadrupole (LTQ) linear ion trap mass spectrometer (ThermoFinnigan, San Jose, CA). Each full scan (400-1700 m/z) was followed by three data-dependent MS/MS scans at 35% normalized collision energy with dynamic exclusion enabled. The full scans were averaged from five microscans, and the MS/MS scans were averaged from two microscans.
Quantitative Proteomics Data Analysis-A protein database from the annotated R. palustris genome containing a total of 4836 protein entries (Version 3; www.genome.ornl.gov) was used for protein identification with database searching. No redundant protein entry was present in the protein database. An in-house data preprocessing pipeline based on Xcalibur Development Kit was used to generate DTA files in the same way as the ThermoFinnigan Bioworks software. All DTA files were searched in two iterations against an R. palustris protein sequence database (18) using the SEQUEST program Version 27 (26) (enzyme type, trypsin; parent mass tolerance, 3.0; fragment ion tolerance, 0.5; up to four missed cleavages allowed; fully tryptic peptides only). In the first iteration, unlabeled amino acids were used. In the second iteration, 15N-labeled amino acids were used (0.98 Da added to glycine, alanine, serine, proline, valine, threonine, cysteine, leucine, isoleucine, aspartic acid, glutamic acid, methionine, phenylalanine, and tyrosine; 1.96 Da added to asparagine, glutamine, lysine, and tryptophan; 2.94 Da added to histidine; and 3.92 Da added to arginine). The peptide identifications (OUT files) from the two iterations were combined. The DTASelect 1.9 program (27) was used to filter peptide identifications and assemble peptides into proteins using the following parameters: retaining the duplicate MS/MS scans for each peptide sequence (DTASelect option: -t 0); fully tryptic peptides only; a ΔCn threshold of at least 0.08; cross-correlation score (Xcorr) thresholds of at least 1.8 (+1), 2.5 (+2), and 3.5 (+3); and a minimum of two identified peptides for a protein. These widely accepted SEQUEST-DTASelect filtering criteria have been shown to provide a high identification confidence with a maximum false-positive rate of 1-2% (27)(28)(29). Identification results from the DTASelect program are provided in supplemental Table 1.
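The 15N mass additions listed for the second search iteration are multiples of 0.98 Da, one per nitrogen atom in each residue. The sketch below makes that arithmetic explicit for a whole peptide. It is an illustration only, not part of SEQUEST: the names `NITROGEN_COUNT`, `SHIFT_PER_NITROGEN`, and `n15_mass_shift` are hypothetical, and the peptide strings are arbitrary examples.

```python
# Nitrogen atoms per amino acid residue (one-letter codes), grouped as in
# the search parameters: 1 N (0.98 Da), 2 N (1.96 Da), 3 N (2.94 Da),
# and 4 N (3.92 Da).
NITROGEN_COUNT = {
    **dict.fromkeys("GASPVTCLIDEMFY", 1),
    **dict.fromkeys("NQKW", 2),
    "H": 3,
    "R": 4,
}

SHIFT_PER_NITROGEN = 0.98  # Da per nitrogen, the value used in the text


def n15_mass_shift(peptide):
    """Return the total mass added (Da) to a fully 15N-labeled peptide."""
    n_atoms = sum(NITROGEN_COUNT[residue] for residue in peptide)
    return round(n_atoms * SHIFT_PER_NITROGEN, 2)


# Arbitrary example peptides (not identifications from this study).
for peptide in ("GAR", "HK"):
    print(peptide, n15_mass_shift(peptide))
```

Because the shift depends only on nitrogen content, the labeled and unlabeled forms of a peptide differ by a mass that can be computed from its sequence alone.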
Selected ion chromatogram extraction, peptide abundance ratio estimation, and protein abundance ratio estimation were completed with the program ProRata 1.1 using default parameters as described previously (30,31). Briefly, for each biological replicate of a direct comparison, identified peptides were quantified and filtered with a minimum profile signal-to-noise ratio cutoff of 2. Proteins with at least two quantified peptides were then evaluated for quantification. A 90% confidence interval was calculated for every protein as an error bar on its abundance ratio estimate. Two biological replicates of a direct comparison were combined, and proteins were further filtered with a maximum confidence interval width cutoff of 3 (supplemental Tables 2 and 3). The two direct comparisons were combined for the succinate-versus-benzoate indirect comparison, and proteins were also filtered with a maximum confidence interval width cutoff of 3 (supplemental Table 4). The protein abundance of a gene was considered significantly up- or down-regulated if the log ratio (log2 abundance ratio) of the protein was greater than 1 (up-regulated) or less than -1 (down-regulated) and the 90% log ratio confidence interval excluded zero. The MS raw data and the data analysis configuration files and results are available upon request.
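The protein-level decision rule can be written compactly. The following Python sketch applies the confidence interval width filter and the significance criteria stated above. It is not ProRata code: the function name `classify_protein`, its arguments, and the use of `None` for a filtered-out protein are assumptions made for illustration.

```python
def classify_protein(log2_ratio, ci_lower, ci_upper, max_ci_width=3.0):
    """Classify a protein as 'up', 'down', or 'null', or return None.

    ci_lower and ci_upper bound the 90% confidence interval of the log2
    abundance ratio. Proteins whose interval is wider than max_ci_width
    are treated as not quantified (None).
    """
    if ci_upper - ci_lower > max_ci_width:
        return None  # estimate too imprecise; filtered out
    ci_excludes_zero = ci_lower > 0 or ci_upper < 0
    if ci_excludes_zero and log2_ratio > 1:
        return "up"
    if ci_excludes_zero and log2_ratio < -1:
        return "down"
    return "null"


# Hypothetical values, for illustration only.
print(classify_protein(1.8, 1.1, 2.5))    # interval entirely above zero
print(classify_protein(1.2, -0.2, 2.6))   # interval spans zero
print(classify_protein(2.0, 0.1, 4.0))    # interval too wide
```

Requiring the interval to exclude zero guards against calling a change significant when the point estimate is large but poorly constrained.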

EXPERIMENTAL DESIGN AND RESULTS
Overview-Both transcriptomics and quantitative proteomics measure gene expression changes between a reference condition and a treatment condition. The treatment condition in this study was anaerobic photosynthetic cell growth with p-coumarate as the sole organic carbon source. To identify the genes activated for p-coumarate catabolism, two reference conditions were selected in which succinate or benzoate replaced p-coumarate as the sole organic carbon source. As succinate is a simple dicarboxylic acid, the succinate growth condition provides a baseline gene expression profile of R. palustris without aromatic degradation activity; thus, comparison of the p-coumarate condition with the succinate condition should help elucidate the entire pathway for p-coumarate catabolism (Fig. 1). The comparison between the benzoate condition and the p-coumarate condition was expected to yield more focused information on the peripheral pathway from p-coumarate to benzoyl-CoA (Fig. 1).
The experimental scheme is shown in Fig. 2. At the cell culture step, R. palustris was grown with 15N metabolic labeling for the two reference conditions. Two biological replicates were prepared for each condition and analyzed independently to capture biological variability. Each biological replicate was divided for microarray and quantitative proteomics analysis. We found that the correlation between transcriptomics results and quantitative proteomics results was improved by dividing the same sample for the two measurements as compared with using separate samples for the two measurements (data not shown).
Microarray analysis was performed with a custom-designed GeneChip, which contained probes for all 4836 predicted genes and 3190 intergenic regions in the R. palustris genome. Each biological replicate was analyzed with a GeneChip microarray. The reproducibility of raw signal intensities between two biological replicates (measured as the Pearson correlation coefficient) was above 0.97 for all three growth conditions. The duplicate results for each comparison were combined with the Cyber-T program (Fig. 2). Note that RNA molecules from the succinate and benzoate conditions contained 15N-enriched nitrogen as a result of 15N metabolic labeling.
The unlabeled p-coumarate culture was mixed with the 15N-labeled succinate or benzoate cultures for quantitative proteomics analysis (Fig. 2). Direct comparisons between the p-coumarate condition and the two reference conditions were obtained with biological replication at the cell growth stage and technical replication at the LC-MS/MS measurement stage. The ProRata program was used to merge technical replicates for each biological replicate, then to combine the two biological replicates for each direct comparison, and finally to integrate the two direct comparisons for the indirect comparison. The numbers of identified proteins and quantified proteins, before and after combination, are shown in Table I. For the combination of biological replicates, proteins identified in only one replicate were filtered out, and proteins quantified with inconsistent log ratio confidence intervals between the two replicates were also filtered out. The quantification reproducibility between the two biological replicates of a direct comparison was measured with the Pearson correlation coefficient (r) between the replicate protein abundance ratios in log2 scale (protein log ratio) (Fig. 3). The level of reproducibility was comparable between transcriptomics and quantitative proteomics.
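As an illustration of the reproducibility measure, the sketch below computes the Pearson correlation coefficient between the protein log ratios of two biological replicates, keeping only proteins quantified in both. The function names `pearson_r` and `replicate_reproducibility`, the dictionary layout (locus mapped to log2 ratio), and the placeholder loci are all assumptions for illustration; this is not how ProRata itself is invoked.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def replicate_reproducibility(rep1, rep2):
    """Correlate log2 ratios for loci present in both replicates."""
    shared = sorted(rep1.keys() & rep2.keys())  # drop single-replicate loci
    return pearson_r([rep1[k] for k in shared], [rep2[k] for k in shared])


# Hypothetical log2 ratios keyed by placeholder locus names.
rep1 = {"locus_1": 2.1, "locus_2": -0.3, "locus_3": 0.8, "locus_4": 1.5}
rep2 = {"locus_1": 1.9, "locus_2": -0.1, "locus_3": 1.0}
print(replicate_reproducibility(rep1, rep2))
```

The correlation is computed on log2 ratios rather than raw ratios so that up- and down-regulation of equal magnitude contribute symmetrically.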
Integration of mRNA Abundance Profiles and Protein Abundance Profiles-The transcriptomics results and the quantitative proteomics results were cross-matched by gene locus (supplemental Tables 2-4). Due to the incomplete genome coverage by proteomics, "not available" (N/A) values were assigned to undetermined protein log ratios. Correlations between the mRNA log ratios and the protein log ratios of cross-matched genes are shown in Fig. 4, left column; all have positive Pearson correlation coefficients. The majority of genes were distributed around the center within the square area of mRNA log ratio range (-1, 1) and protein log ratio range (-1, 1). This shows that there is little correlation between minor mRNA abundance changes and minor protein abundance changes. The relatively lower correlation coefficient of the p-coumarate-versus-benzoate comparison is a result of the lower number of genes with large expression changes. To quantify the discrepancy between mRNA log ratio and protein log ratio of individual genes, the differences between the mRNA log ratio and protein log ratio (Δlog ratio) of every gene were calculated. Histograms of the log ratio differences are shown for the three comparisons in Fig. 4, right column.
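The cross-matching step and the Δlog ratio calculation can be sketched as below. This is a minimal illustration with hypothetical names (`cross_match`, placeholder loci) and made-up values. The sign convention, mRNA log ratio minus protein log ratio, is an assumption, and `None` stands in for an N/A protein log ratio.

```python
def cross_match(mrna_log_ratios, protein_log_ratios):
    """Join mRNA and protein log2 ratios by gene locus.

    Returns {locus: (mrna, protein, delta)}. Loci without a protein
    measurement get protein = None and delta = None (N/A).
    Assumed convention: delta = mRNA log ratio - protein log ratio.
    """
    table = {}
    for locus, mrna in mrna_log_ratios.items():
        protein = protein_log_ratios.get(locus)  # None when not quantified
        delta = None if protein is None else mrna - protein
        table[locus] = (mrna, protein, delta)
    return table


# Hypothetical values keyed by placeholder locus names.
mrna = {"locus_1": 2.5, "locus_2": 0.25, "locus_3": -1.5}
protein = {"locus_1": 1.0, "locus_2": 0.75}
for locus, row in cross_match(mrna, protein).items():
    print(locus, row)
```

Iterating over the transcriptomics table preserves its near-full genome coverage, so genes missing from the proteomics data remain visible as N/A rather than being dropped.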
The log ratios measured by transcriptomics and quantitative proteomics were statistically assessed to categorize genes by their regulation directions, namely, up-, down-, or null-regulation at the mRNA level or the protein level. Fig. 5 shows the number of genes in each category. The categories were color-coded to illustrate the complementarity between transcriptomics and proteomics. The categories in yellow highlight the advantage of transcriptomics, near-full coverage of the R. palustris genome, as opposed to ~30% genome coverage by proteomics. Membrane proteins and low abundance proteins are particularly difficult for proteomics to measure. The categories in red contain genes with concordant results from transcriptomics and proteomics. The agreement between the two independent measurements enhances the confidence in the expression change of these genes and alleviates the need for validating the expression change of these genes individually with other biochemical experiments, such as RT-PCR and Western blotting. The categories in green contain genes with inconsistent regulation directions between the mRNA level and the protein level. Only a few genes were up-regulated at the mRNA level and down-regulated at the protein level or vice versa. However, many genes showed significant abundance change at one level but insignificant abundance change at the other level. This indicates that one must be cautious in concluding that the presence or absence of a significant mRNA abundance change of a gene detected by transcriptomics necessarily corresponds to the presence or absence of a significant protein abundance change detected by proteomics. This is especially true when considering relatively small changes in mRNA abundance. The discrepancy between mRNA log ratio and protein log ratio of a gene can stem from measurement errors, sustained protein presence from transient transcriptional induction, post-transcriptional regulation, or any combination of these causes (32).
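The categorization behind Fig. 5 can be illustrated with a short tally. The sketch below assumes each gene already carries an mRNA call and a protein call ("up", "down", or "null", with `None` for a protein that was not quantified). The group labels and the function names `agreement` and `tally` are hypothetical and only loosely mirror the yellow, red, and green categories described above.

```python
from collections import Counter


def agreement(mrna_call, protein_call):
    """Group a gene by how its two measurements relate."""
    if protein_call is None:
        return "transcriptomics only"  # no protein measurement (N/A)
    if mrna_call == protein_call:
        return "concordant"            # same call at both levels
    return "discordant"                # calls differ between levels


def tally(calls):
    """Count genes per group; calls maps locus -> (mrna, protein)."""
    return Counter(agreement(m, p) for m, p in calls.values())


# Hypothetical calls keyed by placeholder locus names.
calls = {
    "locus_1": ("up", "up"),
    "locus_2": ("null", "up"),
    "locus_3": ("down", None),
    "locus_4": ("null", "null"),
}
print(tally(calls))
```

Under this grouping, a gene that is significant at only one level counts as discordant, which matches the observation that such genes are far more common than genes regulated in opposite directions.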

Measurement of Up-regulated Expression of 4-Hydroxybenzoyl-CoA Degradation Enzymes during p-Coumarate Catabolism-Previous studies have suggested that p-coumarate is converted to 4-hydroxybenzoyl-CoA, which is then degraded through the benzoyl-CoA pathway (33). The known steps for the 4-hydroxybenzoyl-CoA degradation are shown in Fig. 1 with solid arrows. Expression of the genes involved in each step was compared among the three R. palustris growth conditions: p-coumarate, benzoate, and succinate (Table II). Both mRNA abundances and protein abundances of all these genes were greatly up-regulated in the p-coumarate condition compared with the succinate condition, showing the induction of 4-hydroxybenzoyl-CoA degradation activity during p-coumarate catabolism. This supports the part of the proposed p-coumarate pathway downstream of 4-hydroxybenzoyl-CoA as shown in Fig. 1.
The comparison between the p-coumarate condition and the benzoate condition is also interesting. 4-Hydroxybenzoyl-CoA reductase (RPA0670-0672) for dehydroxylation of 4-hydroxybenzoyl-CoA to benzoyl-CoA was more abundant in the p-coumarate condition than in the benzoate condition; this agrees with published data showing that this enzyme is not required for benzoate degradation (34) (Fig. 1). The degradation of both p-coumarate and benzoate is known to proceed through the benzoyl-CoA pathway. The comparison shows that this pathway was insignificantly or moderately up-regulated according to transcriptomics, whereas it was significantly down-regulated according to quantitative proteomics (Table II). Because the discrepancy between mRNA abundance change and protein abundance change was found for nine genes across multiple operons, it is probably of biological significance. Genes encoding the two forms of ribulose-bisphosphate carboxylase (RPA1559, RPA1560, and RPA4641), along with their associated genes for carbon dioxide fixation (RPA1561 and RPA4642-4645), were induced in both p-coumarate and benzoate conditions (supplemental Tables 1 and 3). Six reducing equivalents of [H] are generated when a molecule of benzoyl-CoA is degraded into acetyl-CoA. Carbon compounds like p-coumarate and benzoate, which are electron-rich relative to cell material, cannot be fully assimilated into biomass unless an external electron acceptor like carbon dioxide is available. Thus, it makes sense that the main carbon dioxide-assimilating enzymes of the Calvin cycle were expressed at higher abundances to serve as a reducing equivalent sink during growth on aromatic compounds.
FIG. 4. Comparison of mRNA log ratios and protein log ratios. The protein log ratios and mRNA log ratios of quantified genes are shown as scatter plots for the three comparisons. The histograms of the differences between the protein log ratio and the mRNA log ratio (Δlog ratio) were also constructed for the three comparisons.
The global gene expression profiling also revealed other induced cellular activities that are indirectly related to aromatic degradation in R. palustris. Several methyl-accepting chemotaxis proteins (RPA0139, RPA0142, RPA1678, RPA3185, RPA4302, and RPA4639) were induced in both aromatic degradation conditions (supplemental Tables 1 and 3). R. palustris is a motile bacterium, and these proteins could serve as chemoreceptors that enable cells to sense and swim toward plant-derived aromatic compounds (35). A multitude of different types of membrane transporters were also induced in both aromatic degradation conditions (Table III and supplemental Tables 1 and 3). Some of them may facilitate the uptake of aromatic compounds. In addition, R. palustris often transiently excretes partially degraded aromatic intermediates into the growth media (36). Active transport may be required for the reuptake of these compounds into cells.
Identification of Key Genes for the Anaerobic Conversion of p-Coumarate to 4-Hydroxybenzoyl-CoA-A primary aim of this study was to infer whether the β-oxidation route or the non-β-oxidation route is likely used to convert p-coumaroyl-CoA to 4-hydroxybenzoyl-CoA (Fig. 1). The β-oxidation route would consist of 1) an enoyl-CoA hydratase, 2) a 3-hydroxyacyl-CoA dehydrogenase, and 3) an acyl transferase. The non-β-oxidation route would consist of 1) an enoyl-CoA hydratase/lyase (active for both hydration of the enoyl-CoA C-C double bond and cleavage of the resultant C-C single bond to release an acetyl-CoA (11)), 2) an aldehyde dehydrogenase, and 3) the known 4-hydroxybenzoate-CoA ligase (RPA0669). Whichever route is used, the genes encoding the enzymes in that route were expected to be up-regulated in the p-coumarate condition in comparison with both the benzoate condition and the succinate condition. Table III lists all the genes measured by quantitative proteomics to be significantly up-regulated at the protein level in the p-coumarate condition in comparison with the other two reference conditions.
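The selection criterion described for Table III amounts to a set intersection: a gene qualifies only if its protein is significantly up-regulated in the p-coumarate condition relative to both reference conditions. A minimal Python sketch follows. The function name `up_in_both`, the dictionary layout (locus mapped to a protein-level call), and the placeholder loci are assumptions for illustration, not the authors' procedure or data.

```python
def up_in_both(vs_succinate, vs_benzoate):
    """Return loci called 'up' at the protein level in both comparisons.

    Each argument maps locus -> 'up', 'down', or 'null' for the
    p-coumarate condition relative to one reference condition.
    """
    up_succinate = {k for k, call in vs_succinate.items() if call == "up"}
    up_benzoate = {k for k, call in vs_benzoate.items() if call == "up"}
    return sorted(up_succinate & up_benzoate)


# Hypothetical calls keyed by placeholder locus names.
vs_succinate = {"locus_1": "up", "locus_2": "up", "locus_3": "null"}
vs_benzoate = {"locus_1": "up", "locus_2": "null", "locus_3": "up"}
print(up_in_both(vs_succinate, vs_benzoate))
```

Genes of the shared benzoyl-CoA pathway would be expected to pass the succinate comparison but not the benzoate comparison, so the intersection narrows the list toward the peripheral pathway.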
Based on the R. palustris genome sequence, the two most likely gene sets for the β-oxidation route, if this route is used, are RPA1703-1706 and RPA0674-0676. Both gene sets could encode all β-oxidation enzymes. More importantly, the gene set RPA1703-1706 lies immediately next to a possible p-coumarate-CoA ligase, RPA1707. It seems reasonable to arrange these genes into an operon. The gene set RPA0674-0676 is in a region of the R. palustris genome where many known aromatic degradation genes are located (see the loci of the aromatic degradation genes listed in Table II). However, neither gene set was found to be significantly up-regulated in the p-coumarate condition (supplemental Tables 1 and 2). Furthermore, among the genes up-regulated in the p-coumarate condition, we did not find a β-oxidation gene set that is likely to be used for p-coumarate degradation.
On the other hand, every enzyme in the non-β-oxidation route has a candidate gene identified from Table III with enhanced protein abundance in the p-coumarate condition (Fig. 6). Due to the similarity between p-coumarate and ferulate, their CoA ligases have a high sequence similarity, and three loci are annotated as ferulate- or p-coumarate-CoA ligases in the R. palustris genome: RPA1707, RPA1787, and RPA4421. Among the three loci, RPA1787 is the only one up-regulated in the p-coumarate condition, making it the probable gene responsible for the p-coumarate-CoA ligation.
The similarly up-regulated RPA1786 lies only 9 base pairs upstream of RPA1787 in the R. palustris genome. The short intergenic distance between the two loci suggests their co-transcription and a close functional link. RPA1786 belongs to a group of enzymes, the enoyl-CoA hydratases/isomerases (pfam00378), whose members have diverse activities. Thus, it is difficult to know with certainty what the specific function of RPA1786 might be based on its deduced amino acid sequence. We suggest that the enoyl-CoA hydratase/lyase in the non-β-oxidation route could be encoded by RPA1786 (Fig. 6).
It is known that R. palustris readily oxidizes 4-hydroxybenzaldehyde to 4-hydroxybenzoate (33). RPA1206 is the only annotated aldehyde dehydrogenase with significantly up-regulated protein abundance in the p-coumarate condition. Interestingly, this locus showed insignificantly down-regulated mRNA abundance in the p-coumarate condition. It is unclear whether RPA1206 has the substrate specificity for 4-hydroxybenzaldehyde. But given the expression data, this locus is a possible aldehyde dehydrogenase that could be used in the non-β-oxidation route (Fig. 6). 4-Hydroxybenzoate-CoA ligase is encoded by RPA0669 (37) (Fig. 6). The substantial up-regulation of this enzyme in the p-coumarate condition suggests that a large flux of 4-hydroxybenzoate exists during p-coumarate catabolism. We have observed previously the accumulation of 4-hydroxybenzoate in culture media during the growth of R. palustris on p-coumarate (38). It is possible for the β-oxidation route to generate a flux of 4-hydroxybenzoate by hydrolyzing 4-hydroxybenzoyl-CoA and then religating 4-hydroxybenzoate with free CoA. But this is less plausible than the explanation with the non-β-oxidation route in which 4-hydroxybenzoate is an intermediate metabolite.
Taken together, the gene expression data support the conclusion that the anaerobic catabolism of p-coumarate in R. palustris proceeds through the non-β-oxidation route and then through the central benzoyl-CoA pathway (Fig. 1). Furthermore, putative loci were identified for every enzyme in the non-β-oxidation route (Fig. 6). Formulation of such a testable pathway in its entirety demonstrates the value of the integrated transcriptomics and quantitative proteomics approach.

CONCLUSIONS
Quantitative proteomics measurements were conducted to provide a global view of the cellular pathways involved in p-coumarate catabolism for R. palustris by using metabolic stable isotope labeling, MudPIT analysis, and the ProRata data analysis program. By characterizing technical and biological replicates, over 1600 proteins were confidently quantified from R. palustris cultures grown with succinate, benzoate, or p-coumarate as the sole carbon sources. The same R. palustris cultures were also examined by transcriptomics. The transcriptomics results and the quantitative proteomics results were integrated to reconstruct global gene expression profiles of R. palustris at the mRNA level and the protein level. This integrated gene expression data set provided evidence that the anaerobic degradation of p-coumarate proceeds through a non-β-oxidation route, rather than an alternative β-oxidation route, and then through the central benzoyl-CoA pathway. RPA1787, RPA1786, and RPA1206 were hypothesized to be the probable p-coumarate-CoA ligase, p-coumaroyl-CoA hydratase/lyase, and 4-hydroxybenzaldehyde dehydrogenase, respectively, in the non-β-oxidation route.