Arabinose-Induced Catabolite Repression as a Mechanism for Pentose Hierarchy Control in Clostridium acetobutylicum ATCC 824

Clostridium acetobutylicum can ferment a wide variety of carbohydrates to the commodity chemicals acetone, butanol, and ethanol. Recent advances in genetic engineering have expanded the chemical production repertoire of C. acetobutylicum using synthetic biology. Due to its natural properties and genetic engineering potential, this organism is a promising candidate for converting biomass-derived feedstocks containing carbohydrate mixtures to commodity chemicals via natural or engineered pathways. Understanding how this organism regulates its metabolism during growth on carbohydrate mixtures is imperative to enable control of synthetic gene circuits in order to optimize chemical production. The work presented here unveils a novel mechanism via transcriptional regulation by a predicted Crh that controls the hierarchy of carbohydrate utilization and is essential for guiding robust genetic engineering strategies for chemical production.

binds to CcpA, generating the functional repressive complex (22). Catabolite regulation by CcpA has been characterized in C. acetobutylicum, and genes encoding HPr and HPr kinase are present in the genome (23,24). In some other Gram-positive bacteria, a second protein, Crh (catabolite repression HPr), can be phosphorylated by HPr kinase and activate CcpA (25). In contrast to HPr, Crh proteins cannot phosphorylate enzyme I of the PTS because of a glutamine residue present at the N-terminal histidine phosphorylation site that is required for PTS phosphorylation (26). CRE sites are found within the promoter or coding sequence of operons for nonpreferred carbohydrates and allow PTS-mediated inhibition of their transcription (27). CcpA binding sites in the C. acetobutylicum genome have been identified computationally both within and upstream of coding regions for xylose and arabinose metabolic genes using the Bacillus subtilis CRE consensus sequence as well as by electrophoretic mobility shift assay (EMSA) (23,28). C. acetobutylicum transcriptional profiles confirm that genes with predicted CcpA binding sites are significantly repressed in the presence of glucose (14,19). It is unknown if pentose metabolism in C. acetobutylicum can generate similar regulation.
In C. acetobutylicum, pentoses are thought to be transported via proton symporters and therefore are not expected to interact with components of the PTS (29). The regulators of xylose and arabinose catabolic genes, XylR and AraR, respectively, act as repressors of transcription in C. acetobutylicum (9,30). Although two proteins have been proposed to perform the function of the xylose repressor XylR (CA_C3673 and CA_C2613), it is most likely that CA_C2613 is the primary xylose repressor based on experimental findings and data presented here (10,28,30). XylR binding sites have been identified both computationally and experimentally upstream and within the predicted CA_C2612-CA_C2611-CA_C2610 operon and computationally for the CA_C3451-CA_C3452 operon (10,28). Elevated transcript levels for these genes have been recorded for cells grown on xylose compared to those grown on arabinose or glucose, suggesting that the predicted XylR and CRE sites are functional (14,19). Rodionov et al. identified AraR (CA_C1340) binding sites upstream of CA_C1339, CA_C1340, the predicted CA_C1341-CA_C1342 operon, CA_C1343, the predicted CA_C1344-CA_C1349 operon, and the predicted CA_C1529-CA_C1530 operon (28). The binding of AraR to these sites was confirmed by EMSA, and mRNA levels of these genes were found to be higher in cells grown on arabinose than in xylose or glucose cultures (9). One notable exception is CA_C1339, a proposed xylose importer gene, which has a unique transcription profile with high mRNA levels in cells grown on xylose, moderate levels for those on arabinose, and extremely low levels on glucose (Fig. 1B) (9,14,19,28).
(B) Map of arabinose and xylose metabolism-associated genes: xylose associated (yellow arrows), arabinose associated (blue arrows), PPP genes (green arrows), and phosphoketolase (red arrow). The pattern within each arrow illustrates operon grouping. Turquoise bars indicate AraR binding sites, purple bars indicate XylR binding sites, and gray bars indicate CcpA binding sites (CRE). Striped bars indicate color-respective repressor sites proposed in this article. Asterisk indicates gene previously identified as an XylR, but most likely incorrectly annotated. Data from references 9, 10, 23, and 28 were used and expanded upon to construct these schematics.
Pentose metabolism has been studied extensively in E. coli, a Gram-negative bacterium, where arabinose has been shown to be preferentially utilized over xylose (31-34); however, this phenomenon has not been extensively explored in Gram-positive bacteria. We previously observed a significant disparity in the growth rate of C. acetobutylicum ATCC 824 when fermenting arabinose compared to xylose, with xylose-fermenting cultures growing significantly slower and having a substantially longer lag phase (19). This extended lag phase was also observed in Fig. S5 in the supplemental material from Aristilde et al. (17), in which the carbohydrate was almost completely consumed in an arabinose-fed culture before the culture fed with xylose began to appreciably utilize the sugar. Additionally, Aristilde used 13C flux analysis to show that there was a preference for arabinose over xylose when the organism was provided both pentoses simultaneously (21).
Although the preferential utilization of arabinose over xylose in C. acetobutylicum has been documented, the mechanism (or mechanisms) driving this preference is not known. A better understanding of this mechanism would help identify methods to fully utilize all the fermentable sugars present in readily available lignocellulosic biomass. We hypothesized that active repression of xylose utilization via transcriptional control would occur in the presence of arabinose. To test this hypothesis, we examined the effects of addition of glucose, arabinose, or xylose independently to C. acetobutylicum cultures actively growing on xylose. In addition to performing transcriptome sequencing (RNA-Seq) at various time points after addition of supplemental sugar, we examined growth, accumulation of organic end products, and carbohydrate utilization.

RESULTS
(i) Growth. C. acetobutylicum ATCC 824 was grown on xylose with different carbohydrate supplementations during early exponential growth to measure the effects of such additions. Noticeable effects on the cultures' growth profiles at the beginning of the exponential growth phase were observed when the cultures were initially grown on 0.5% xylose and subsequently supplemented with 0.25% arabinose [(+)Ara], 0.25% glucose [(+)Glu], 0.25% xylose [(+)Xyl], or no additional sugar [(+)None] as shown in Fig. 2A. (+)None and (+)Xyl cultures had similar doubling times, and addition of xylose extended exponential growth in (+)Xyl by approximately 2 h beyond that seen in (+)None. The growth rates in the (+)Ara cultures were significantly increased over the other three culture conditions and reached their maximum by the 2nd hour after sugar supplementation, whereas the (+)Glu cultures did not reach their maximum growth rate until between hours 3 and 5 (Fig. 2A).
(ii) Sugar consumption. Addition of xylose increased the amount of time that it took (+)Xyl to be completely depleted of xylose compared to (+)None, which was expected (Fig. 2B). The (+)Ara cultures depleted arabinose and xylose in the medium after approximately 4 h and 8 h, respectively, while the (+)Glu cultures depleted glucose from the medium after 5 h and still had greater than 25% of the initial xylose after 10 h of fermentation (Fig. 2B). Since there were differences in growth rates between the cultures on the various sugars, we normalized xylose consumption based on optical density at 600 nm (OD600) and expressed it as the change in xylose concentration per the average OD600 (average of OD600 at beginning and end of time corresponding to ΔmM measurements) per hour, designated ΔmM/OD600(avg)/h (Fig. 2C). Xylose in (+)None and (+)Xyl was depleted initially at a rate of between 3.0 and 4.0 ΔmM/OD600(avg)/h during the first 2 h postaddition and stayed above 2.0 ΔmM/OD600(avg)/h through most of the exponential growth phase. Xylose consumption in (+)Glu was dramatically slowed, dropping to below 1.0 ΔmM/OD600(avg)/h by hour 3 and reaching 0.36 ΔmM/OD600(avg)/h at hour 5, at which point the glucose was exhausted from the medium. Xylose consumption in (+)Ara was even lower than was observed in (+)Glu after 2 h [1.24 versus 1.72 ΔmM/OD600(avg)/h] and equivalent to it after 3 h [0.87 ΔmM/OD600(avg)/h]. Once arabinose was depleted at hour 4, the rate of xylose consumption matched that seen in (+)None (Fig. 2C).
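The OD-normalized consumption rate described above can be illustrated with a minimal sketch. The function name and the example readings are hypothetical and are not taken from the study; the calculation simply divides the drop in sugar concentration by the mean of the starting and ending OD600 and by the elapsed time.

```python
def normalized_consumption_rate(conc_start_mM, conc_end_mM, od_start, od_end, hours):
    """Return sugar consumed per average OD600 per hour (delta-mM/OD600(avg)/h)."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    # Average of the OD600 readings at the beginning and end of the interval.
    od_avg = (od_start + od_end) / 2.0
    if od_avg <= 0:
        raise ValueError("average OD600 must be positive")
    # Positive value means the sugar concentration fell over the interval.
    delta_mM = conc_start_mM - conc_end_mM
    return delta_mM / od_avg / hours


# Hypothetical readings for a 1-h interval (illustrative values only).
rate = normalized_consumption_rate(30.0, 27.0, od_start=0.4, od_end=0.6, hours=1.0)
print(f"{rate:.2f} delta-mM/OD600(avg)/h")
```

Normalizing to the average OD600 rather than to the starting value keeps the rate from being inflated in fast-growing cultures, where biomass changes appreciably within a single sampling interval.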
(iii) Metabolic output. Both xylose and arabinose are known to be metabolized through the PPP (which favors butyrate production) and PKP (which favors acetate production), and the relative flux through the two pathways is reflected in the ratio and concentration of acetate and butyrate produced (17). Compared to the PPP, the PKP oxidizes a lower proportion of carbon to CO2, which is coupled to a decrease in the reduction of the electron carriers NAD+ and ferredoxin. The decreased reduction of NAD+ to NADH in the PKP lessens the need to use acetyl coenzyme A (acetyl-CoA) as an electron acceptor to reoxidize NADH to NAD+ through butyrate formation. This allows the cells to use acetyl-CoA for ATP formation via conversion to acetate, which yields 1 ATP/acetyl-CoA, compared to butyrate formation, which yields 0.5 ATP/acetyl-CoA (18).
(iv) Transcriptional analysis. To measure transcriptional responses to sugar supplementation of cultures actively growing on xylose, transcriptome sequencing (RNA-Seq) was performed for each condition over a 60-min time period after supplemental sugar addition. RPKM (reads per kilobase of transcript per million mapped reads) values were calculated, and genes that met the statistical cutoffs described in Materials and Methods were examined further. Hierarchical clustering of Euclidean distances for samples and genes was performed on the filtered gene set, and the results are presented in Fig. 3A. (+)Ara 15- and 30-min samples clustered most closely to the preaddition samples, and (+)Ara 60 min clustered most closely with the (+)Glu (15-, 30-, and 60-min) samples, with (+)Glu 60 min clustering closest to (+)Ara 60 min. Further analysis of the differentially expressed genes indicated that 190 and 278 genes had altered mRNA expression levels after addition of arabinose and glucose, respectively. Of the differentially expressed genes, 146 were common between arabinose and glucose addition and 89 of these genes have been shown to be controlled by CcpA (23) (Fig. 3B). Gene clustering showed that genes controlled by CcpA and AraR or XylR were clustered together, but CcpA-controlled genes that lacked AraR or XylR control did not show a distinct clustering pattern (Fig. 3A, columns on right-hand side). Two notable genes are xfp (CA_C1343) and CA_C0149, which did not cluster closely with other differentially expressed genes. Phosphoketolase is encoded by xfp, and this gene differs from other AraR-controlled genes because it appears to lack a CRE site. CA_C0149, which has a predicted CRE site, was the most differentially regulated gene, with mean RPKM values of 69,182 and 61,658 for (+)Glu and (+)Ara preaddition samples, respectively, and 1,697 and 2,040 in the (+)Glu 60 min and (+)Ara 60 min samples, respectively (Fig. 3C).
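To put the CA_C0149 values in perspective, the mean RPKM values reported above can be converted to a log2 fold change relative to the preaddition samples. The following is a minimal sketch only: the function and dictionary names are hypothetical, and it uses a simple ratio of means rather than the DESeq2 model used for the formal differential expression calls.

```python
import math


def log2_fold_change(rpkm_after, rpkm_before):
    """log2 ratio of expression after sugar addition to expression before it."""
    if rpkm_after <= 0 or rpkm_before <= 0:
        raise ValueError("RPKM values must be positive to take a log ratio")
    return math.log2(rpkm_after / rpkm_before)


# Mean RPKM values for CA_C0149 as reported in the text (Fig. 3C).
ca_c0149_mean_rpkm = {
    "(+)Glu": {"pre": 69182, "60min": 1697},
    "(+)Ara": {"pre": 61658, "60min": 2040},
}

changes = {
    condition: log2_fold_change(values["60min"], values["pre"])
    for condition, values in ca_c0149_mean_rpkm.items()
}

for condition, lfc in changes.items():
    # 2 ** -lfc converts a negative log2 change back to a fold reduction.
    print(f"{condition}: log2FC = {lfc:.2f} (about {2 ** -lfc:.0f}-fold lower)")
```

By this rough calculation, CA_C0149 transcript abundance fell roughly 40-fold after glucose addition and roughly 30-fold after arabinose addition within 60 min.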
All of the differentially expressed genes were divided into groups based on the presence and/or absence of a CcpA, AraR, and XylR binding site (predicted and known; see Materials and Methods), and the log2 fold change (compared to preaddition samples) of these genes was plotted for all time points to capture the temporal response to sugar addition (Fig. 4). For (+)Xyl and (+)None, there was very little change in expression over the course of the experiment, regardless of presence or absence of CcpA, AraR, or XylR binding sites (see Fig. S1 in the supplemental material). There were many genes without known or predicted CcpA, AraR, or XylR binding sites that were differentially expressed in response to glucose and/or arabinose addition. We removed these genes from further analysis to focus on differential expression due to the interplay of known transcription factors. There was a single differentially expressed gene, CA_C1343 (xfp), with an AraR binding site and lacking CcpA and XylR binding sites that showed substantial changes in expression levels upon addition of arabinose (Fig. 4A, orange line) and glucose (Fig. 4A, yellow line). Differentially expressed genes with a CcpA binding site, but lacking AraR and XylR binding sites, showed significant changes in expression upon the addition of glucose, as expected (Fig. 4B, yellow lines), and many of these genes were also modulated by the addition of arabinose; however, the response to arabinose addition was delayed in some cases (Fig. 4B, orange lines). Differentially expressed genes with CcpA and XylR binding sites and lacking an AraR binding site showed decreased expression over time upon addition of glucose (Fig. 4C, yellow lines) and arabinose (Fig. 4C, orange lines). Differentially expressed genes with CcpA and AraR binding sites but lacking a XylR binding site were initially induced by the addition of arabinose, but expression levels returned to near preaddition levels by the 60-min time point (Fig. 4D, orange lines), whereas the expression of all of these genes was repressed upon addition of glucose (Fig. 4D, yellow lines). The final group analyzed were those differentially expressed genes with CcpA, AraR, and XylR binding sites (Fig. 4E). A single differentially expressed gene, CA_C1339, with all three binding sites had reduced expression upon addition of glucose and arabinose, with a notable delay in response to arabinose addition. The two other potential combinations of the binding sites, CcpA(−)/AraR(+)/XylR(+) and CcpA(−)/AraR(−)/XylR(+), did not have any genes that met the criteria for differential gene expression and consequently were not included in this figure.
(v) Putative catabolite repression HPr (Crh). CA_C0149, encoding a putative Crh of C. acetobutylicum, was the most differentially regulated gene upon addition of glucose or arabinose (Fig. 3C). This gene was repressed in the presence of both sugars, with a delayed response noticeable with addition of arabinose. Alignment of the CA_C0149-derived amino acid sequence with Crh sequences from several Bacillus species (Fig. 5A) shows similarity between these proteins. In the alignment, the CA_C0149-encoded protein has a glutamine residue at position 26, which aligns with the glutamine residues of the other Crh sequences and the histidine of the putative HPr of C. acetobutylicum at the same position. This glutamine residue is a key difference between Crh and HPr, as the histidine at this position is required for the HPr-mediated phosphate transfer to the PTS (26). In addition, the CA_C0149-derived sequence contains a serine at position 54, which aligns with serine residues of other Crh sequences and the putative HPr of C. acetobutylicum. HPr kinase phosphorylates this serine residue, thereby activating HPr/Crh, in turn leading to CcpA-mediated catabolite repression in other organisms.
Transcriptional levels of CA_C0149 during growth on 11 different carbohydrates were obtained from a previous study and show that expression levels are higher during growth on less preferred carbohydrates (Fig. 5B) (19).

DISCUSSION
It has been demonstrated that C. acetobutylicum has a hierarchy of pentose utilization, with arabinose being utilized preferentially to xylose when both sugars are present at the beginning of the fermentation (17). This hierarchy could be due to inherent differences in growth rates on the two pentoses as there is an extremely long lag phase in C. acetobutylicum fermentations of xylose compared to arabinose, but there may be other levels of regulation that play a role in arabinose preference. To further the understanding of this pentose hierarchy, we monitored exponential-growth-phase cultures utilizing xylose that were supplemented with one of the following sugars: arabinose, glucose, or xylose. Measurements of growth, sugar utilization, and mRNA transcript levels revealed that, as with glucose, addition of arabinose inhibits xylose utilization and modulates transcription of xylose utilization genes and other CcpA-controlled genes. Additionally, increased acetate/butyrate ratios after arabinose addition indicated that pentose metabolism likely shifted from the PPP to the PKP.
Previous studies showed that growth on glucose and arabinose is faster than on xylose, and the experiments shown here confirm and expand upon that knowledge (19). When arabinose was added to a xylose fermentation, the arabinose was rapidly utilized and the growth rate increased. Glucose addition had a different effect, instead maintaining growth comparable to control in the first 2 h with an increased growth rate between hours 3 and 5 (Fig. 2A). This was contrary to what was expected given the growth rates on glucose or arabinose alone. The difference in the time to utilize the two preferred carbohydrates could be due to the metabolic state of the cells upon addition of arabinose or glucose. The machinery for glucose utilization is present and only needs to be activated, which upon glucose addition could lead to a rapid influx of glycolytic intermediates or generation of toxic by-products such as methylglyoxal, both of which have been shown to inhibit metabolism (35,36). In contrast, when arabinose is added, importers and enzymes converting arabinose to xylulose-5-P must be synthesized, causing a graded increase in arabinose uptake and utilization. We speculate that this gradual increase in arabinose import counterintuitively leads to rapid growth since it avoids a sudden production of toxic intermediates.
Addition of glucose or arabinose reduced xylose utilization (Fig. 2B and C), thus confirming that the presence of arabinose in the medium inhibits xylose utilization via a mechanism that cannot entirely be due to a lower growth rate on xylose. Xylose utilization in the (+)Ara cultures resumed rapidly upon depletion of arabinose. In contrast, xylose utilization in the (+)Glu cultures took much longer to recover after glucose was depleted (Fig. 2B). The difference in the time of resumption of xylose utilization between the (+)Ara and (+)Glu cultures is probably due to more metabolic machinery for xylose utilization being present in the (+)Ara cultures than in the (+)Glu cultures since arabinose and xylose are both metabolized via the PKP and PPP. These observations are consistent with a recent report showing that when cells are fed glucose-xylose mixtures, the xylose is utilized only to produce PPP intermediates for biosynthetic reactions (21). We cannot rule out the possibility that there are subpopulations that are consuming different sugars, as this has been observed in other bacteria and may play a role in clostridia (31,37,38).
subtilis Crh, CRH_BLI = B. licheniformis Crh, CRH_BHA = B. halodurans Crh, CRH_BCL = B. clausii Crh. The blue box highlights the conserved N-terminal histidine residue of HPr required for phosphotransfer to the PTS that is not present in the Crh proteins. The red box highlights the conserved C-terminal serine residue that when phosphorylated promotes activation of CCR through interaction with CcpA. (B) Fold expression of CA_C0149 in the indicated carbohydrate source relative to glucose expression in C. acetobutylicum obtained from a previous transcriptomic study (19).
Recent publications show that arabinose favors carbon flux through the PKP compared to the PPP and that the reduced oxidation of carbon to CO2 when the PKP is utilized results in an increase in acetate production (15-18). Xylose flux through the PKP is concentration dependent, and the concentrations of xylose used in this study were not expected to cause increased flux of xylose via the PKP (16,18). This difference in metabolism is reflected in the arabinose-supplemented culture, which has significantly higher acetate production than any other condition, indicating that arabinose addition shifted carbon flux from the PPP to the PKP (Fig. 2D). This is consistent with a recent report by Aristilde et al., which showed high levels of flux through the PKP when cells were fed xylose-arabinose mixtures (17). The shift to PKP metabolism is further supported by increased expression of the xfp gene upon addition of arabinose, which was not seen under any of the other conditions (Fig. 4A). Final concentrations of acetate and butyrate in the (+)Glu and (+)Xyl cultures are similar, which was expected because molar equivalent amounts of carbon from glucose and xylose were converted to acetyl-CoA and CO2 through the Embden-Meyerhof-Parnas (EMP) pathway and the PPP coupled to the EMP, respectively, resulting in nearly equivalent production of ATP and reduced electron carriers (15).
Expression of XylR- and AraR-controlled genes was severely reduced by 15 min after glucose addition and continued to drop through the hour (Fig. 4). These results were expected, since glucose-mediated CCR is well documented in C. acetobutylicum, and the presence of CRE sequences associated with AraR- and XylR-controlled genes is the likely mechanism for inhibition of xylose metabolism in the (+)Glu culture. Several genes, including the PKP gene xfp (CA_C1343), lack identified CRE sites but were repressed by glucose addition, indicating that these genes have unidentified CRE sites or some other form of regulation (Fig. 4A). CCR mediated by glucose metabolism is well understood, but it was not obvious how arabinose mediated repression of xylose utilization in these cultures.
Sample clustering indicated that as time progressed, gene expression in the (+)Ara and (+)Glu cultures converged (Fig. 3A). There was a large overlap in the differentially expressed genes at the 60-min time point, and many of these genes have been shown to be CcpA controlled (Fig. 3B). The time delay between the glucose and arabinose responses is evident in Fig. 4C when comparing the yellow and orange lines for glucose and arabinose, respectively. The overlap between the samples indicated that arabinose was activating CCR mediated by CcpA and Crh via the model proposed in Fig. 6. The proposed model for arabinose activation of CCR is founded on the fact that in C. acetobutylicum arabinose metabolism is more rapid than xylose metabolism. An increased metabolic rate would result in more rapid flux to the EMP pathway via glyceraldehyde 3-phosphate (G3P), leading to higher fructose 1,6-bisphosphate (FBP) levels and activation of HPr kinase (39) (Fig. 6). The aldolase reaction that converts FBP into dihydroxyacetone phosphate (DHAP) and G3P has been shown to be highly reversible in C. acetobutylicum, indicating that an increase of G3P would increase FBP levels (40). The result would be phosphorylation of HPr and/or the putative Crh, subsequently leading to CcpA-mediated CCR. The delay in repression by arabinose relative to glucose was probably due to the need to produce CA_C1343 (xfp) in order to increase flux to the EMP pathway via the PKP. This is in contrast to the glucose utilization enzymes, which would have been present and immediately available to metabolize glucose, consequently increasing EMP pathway flux and resulting in CCR.
There have been very few Crh orthologues identified, and to our knowledge, the only one putatively identified in Clostridium is in Clostridium cellulolyticum (41). In the current and previous studies, transcript levels of the putative C. acetobutylicum Crh gene (CA_C0149) were increased during growth on less preferred carbohydrates (Fig. 5B). Active repression of CA_C0149 when preferred carbohydrates are added coupled with a putative CRE site upstream of CA_C0149 indicates that expression of the gene is modulated by CcpA. The putative C. acetobutylicum Crh identified in this study needs further investigation to verify a regulatory role of the protein and determine if it interacts with HPr kinase and CcpA. Additionally, if Crh is involved with catabolite repression, the different roles for HPr and Crh require elucidation.
having different regulatory schemes. Xylose to xylose: XylR binds xylose and XylR genes are derepressed, AraR is active and repressing AraR genes, CcpA is not active. Xylose to glucose: CcpA is activated, lack of intracellular arabinose and xylose causes repression via AraR and XylR. Xylose to arabinose 15 min: AraR binds arabinose resulting in derepression by AraR, XylR binds xylose, and XylR genes are derepressed, CcpA is not activated due to insufficient FBP levels to activate HPrK. Xylose to arabinose 60 min: AraR is derepressed due to arabinose binding, decreased import of xylose causes repression of XylR genes, and CcpA acts as a repressor due to increased FBP levels as a result of increased metabolic rate following phosphoketolase activation.
There is a possible second level of regulation for XylR-controlled genes. Addition of arabinose resulted in an increase in expression of both the PKP and PPP genes, likely due to the presence of AraR binding sites upstream of these operons/genes that enable transcription to proceed in the presence of arabinose. This would relieve any potential bottlenecks in xylose metabolism downstream of xylulose-5-P due to induction of the PKP, causing a drop in intracellular xylose concentrations. Decreased intracellular xylose concentration would alter the ratio of unbound to xylose-bound XylR repressor, which would promote XylR-mediated repression of the XylR-controlled genes. Reduction of intracellular xylose concentration could be further exacerbated if the predicted sugar/cation symporter CA_C1339 (araE1) transports both arabinose and xylose, as has been demonstrated for many bacterial pentose transporters, such as in B. subtilis, E. coli, and Salmonella enterica serovar Typhimurium (42,43). Previous transcriptomic data suggested that CA_C1339 (araE1) was a xylose-specific transporter; however, bioinformatic analysis found an AraR binding site in its promoter region, implying a role in arabinose transport as well (19). The RNA-Seq data in this study showed that CA_C3451 (xynT) and CA_C1339 (araE1) are highly expressed in xylose-grown cultures; however, only CA_C1339 (araE1) has increased expression in response to arabinose, suggesting that it is involved in transport of xylose and arabinose. This expression pattern prompted us to investigate the CA_C1339 promoter region. Emboss matcher (44) identified a site 84 bases upstream of the transcription start site that has homology to sites identified upstream of XylR genes and overlaps the predicted SigA binding site in its promoter region, indicating that transcription of this predicted pentose transporter is controlled by both AraR and XylR (45). Promiscuous pentose transporters in other bacteria have been demonstrated to have a higher affinity and transport rate for arabinose than xylose. If also true in C. acetobutylicum, this could contribute to the preference for arabinose by creating a bottleneck for xylose import and requires further investigation.
It is unclear what the metabolic advantage is, if one exists, for the hierarchy of arabinose over xylose, but it is present among many lineages of bacteria and is mediated by a variety of regulatory schemes. One possible explanation is the relative availability of xylose and arabinose in the environment. While there is a higher ratio of xylose to arabinose in lignocellulosic biomass overall, the opposite is true for one of its components, pectin, a polysaccharide component found in primary cell walls (46). Clostridia are well known for their pectinolytic nature in the degradation of plant matter (47), and a selective advantage may exist for C. acetobutylicum to target the arabinose-rich pectin before utilizing the more recalcitrant secondary cell wall components. This strategy would be even more advantageous for consumption of the rapidly degraded pectin-rich plant matter found in plant-derived foods, such as fruits and vegetables (48).
The data presented here demonstrate that transcriptional regulation in C. acetobutylicum is a critical component of pentose hierarchy, and provide insight into the underpinning biochemical regulation through arabinose activation of CCR via a newly identified putative Crh. A deeper understanding of this regulation allows informed engineering of the organism and modulation of fermentation conditions to fine-tune desired chemical production from C. acetobutylicum grown on pentose-rich biomass. The ability to modulate metabolism via the PKP and PPP would provide a means to control carbon flow and the redox state of the cells so that they match the requirements for chemical synthesis via natural or engineered pathways. CA_C0149 may represent a potential control point for pentose and hexose utilization in C. acetobutylicum to allow full, simultaneous utilization of these abundant sugars in lignocellulosic biomass. Additionally, the RNA-Seq data provide information about genes that are potentially involved in pentose hierarchy mechanisms and will provide guidance for future studies.

MATERIALS AND METHODS
Bacterial strain and growth conditions. Clostridium acetobutylicum ATCC 824 was utilized for all studies. All growth was conducted anaerobically in a Coy anaerobic chamber (Coy Lab Products) at 37°C (50).
Experimental growth and sample collection. All cultures were incubated at 37°C under anaerobic conditions open to the environment of the anaerobe chamber, without shaking. Spores were heat shocked at 80°C for 10 min and inoculated into CGM supplemented with 0.5% xylose. Starter cultures were subcultured into 0.5% xylose CGM and incubated overnight. At an OD600 of 0.2, the culture was aliquoted into milk dilution bottles (70 ml each; four conditions in three independent biological replicates) and incubated until the OD600 reached 0.4. Precarbohydrate supplementation RNA and metabolite analysis samples were collected. Cultures were then supplemented with arabinose (0.25% final), glucose (0.25% final), or xylose (0.25% final), from a 50% stock solution of the respective sugars in water, or an equal volume of water (control). Separate samples for RNA, metabolites, and OD600 readings were taken at 0 min (no RNA collected for this time point), 15 min, 30 min, and 60 min. Cultures were monitored for an additional 9 h, and metabolite samples and OD600 readings were taken every hour. Final OD600 readings and metabolite samples were taken the following day at 21 h after supplemental sugar addition. Samples collected for RNA isolation were immediately incubated with 0.03 mg/ml rifampin (final concentration) on ice, pelleted, treated with RNAprotect (Qiagen, Valencia, CA) according to the manufacturer's protocol, and stored at −80°C until RNA extraction. Metabolite samples were filter sterilized and stored at −20°C until analysis.
RNA extraction, purification, and rRNA depletion. Total RNA was isolated using the miRNeasy minikit (Qiagen; 217004) according to the manufacturer's protocol, with an additional homogenization and mechanical disruption step using a bead beater (BioSpec, Bartlesville, OK, USA) with zirconia-silica beads (BioSpec; 11079101z). RNA quality was assessed using the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA), RNA was quantified using a spectrophotometer (DeNovix, Wilmington, DE, USA), and it was stored at −80°C until DNase treatment. DNA was removed using the Turbo DNA-free kit (Life Technologies; AM1907) according to the manufacturer's protocol. RNA was quantified and quality assessed, as stated above, and genomic DNA depletion was confirmed using quantitative PCR (qPCR) with 16S rRNA gene primers (19) and iQ SYBR Green Supermix (Bio-Rad; 170-8882) according to the manufacturer's instructions. rRNA was removed using the Ribo-Zero rRNA removal kit (Gram-positive bacteria) (Illumina; MRZGP126) according to the manufacturer's protocol. The quality of rRNA-depleted samples was assessed using a 2100 Bioanalyzer prior to processing for sequencing library generation.
Sequencing library preparation. The TruSeq stranded mRNA sample preparation kit (Illumina; RS-122-2101) was used to prepare the rRNA-depleted RNA for sequencing according to the manufacturer's protocol with the adaptations suggested for purified mRNA input. Libraries were quantified using the Kapa library quantification kit (Kapa Biosystems; KK4854) according to the manufacturer's instructions and then normalized and pooled for sequencing according to the Denature and Dilute Libraries Guide for the NextSeq 500 (Illumina; document no. 15048776 v02). Pooled libraries were sequenced (1 × 75 bp) on a NextSeq 500 (Illumina, San Diego, CA, USA) in two sequencing runs.
RNA-Seq data analysis. Samples had an average of 14 million reads each, were assessed for quality using FastQC, and were trimmed to remove Illumina adaptors and low-quality bases using Trimmomatic (51,52). Samples were mapped to the Clostridium acetobutylicum ATCC 824 reference genome (GenBank accession numbers AE001437.1 and AE001438.3) using EDGE-pro (53), and differential gene expression was evaluated using DESeq2 with default parameters (54). (+)Ara, (+)Glu, and (+)Xyl were compared to (+)None at matching time points after supplemental sugar addition and analyzed for differential gene expression. For validation, the same comparisons were assessed for differential gene expression and operon prediction using Rockhopper (55), and there was good agreement between the two pipelines. Genes that met a P value cutoff of 0.05 and a fold change cutoff of 4 were considered differentially expressed.
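The differential expression criteria can be expressed as a simple filter. The Python sketch below is an illustration only and makes several assumptions that the text does not state: the fold change cutoff is applied in both directions (equivalent to an absolute log2 fold change of at least 2, since DESeq2 reports log2 fold changes), both cutoffs are inclusive, and missing P values are treated as failing the filter. The gene names and values are hypothetical placeholders, not results from this study.

```python
import math
from typing import NamedTuple


class DEResult(NamedTuple):
    gene: str
    log2_fold_change: float  # treatment vs. (+)None at the same time point
    p_value: float           # NaN stands in for a missing value


def is_differential(result: DEResult,
                    p_cutoff: float = 0.05,
                    fold_cutoff: float = 4.0) -> bool:
    """Apply the P value and fold change cutoffs (both assumed inclusive)."""
    # Comparisons with NaN are always False, so missing P values fail.
    passes_p = result.p_value <= p_cutoff
    passes_fold = abs(result.log2_fold_change) >= math.log2(fold_cutoff)
    return passes_p and passes_fold


# Hypothetical example rows.
results = [
    DEResult("gene_a", 3.1, 0.001),         # passes both cutoffs
    DEResult("gene_b", -2.0, 0.05),         # exactly 4-fold down, P at cutoff
    DEResult("gene_c", 1.5, 0.0001),        # fold change too small
    DEResult("gene_d", 4.2, 0.2),           # P value too large
    DEResult("gene_e", 2.5, float("nan")),  # missing P value
]
hits = [r.gene for r in results if is_differential(r)]
print(hits)
```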
Statistical analysis. To focus on highly expressed genes, we retained only genes that had a reads per kilobase per million mapped reads (RPKM) value greater than the overall third quartile RPKM value in at least one sample. We also removed putative phage Clo1 (CA_C1113-1197) and Clo2 (CA_C1878-1957) genes from the analysis (24). R v3.2.3 (56) was used for all analyses, and plots were generated with ggplot2 (57). Venn diagrams were generated with the VennDiagram package (58). Heat maps were generated with the heatmaply package using hierarchical clustering of Euclidean distances (59). We assigned transcriptional regulation sites (AraR and XylR) according to RegPrecise (60) predictions and included all downstream genes in operons (according to DOOR [61–63] predictions). We also included CcpA site predictions from the work of Ren et al. in addition to the genes that had differential expression after ccpA inactivation (23).
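The expression filter and phage gene exclusion can be sketched as follows. This Python example is a hedged illustration, not the R code used in the study. It assumes that the "overall" third quartile is computed over RPKM values pooled across all genes and samples, that the quartile is computed before phage genes are removed, and that quartiles use linear interpolation (Python's "inclusive" method, which matches R's default quantile type). The function names and the toy RPKM table are hypothetical.

```python
import re
from statistics import quantiles

# Locus tag ranges for the putative phage regions Clo1 and Clo2.
PHAGE_RANGES = ((1113, 1197), (1878, 1957))
CHROMOSOMAL_TAG = re.compile(r"^CA_C(\d+)$")


def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_phage(locus_tag: str) -> bool:
    """True if a chromosomal locus tag falls within a phage range."""
    match = CHROMOSOMAL_TAG.match(locus_tag)
    if match is None:  # e.g. plasmid genes (CA_P...) are never excluded here
        return False
    number = int(match.group(1))
    return any(low <= number <= high for low, high in PHAGE_RANGES)


def highly_expressed(rpkm_by_gene: dict) -> set:
    """Keep non-phage genes whose RPKM exceeds the pooled Q3 in any sample."""
    pooled = [v for values in rpkm_by_gene.values() for v in values]
    q3 = quantiles(pooled, n=4, method="inclusive")[2]
    return {
        gene
        for gene, values in rpkm_by_gene.items()
        if not is_phage(gene) and max(values) > q3
    }


# Toy table: locus tag -> RPKM in each of two samples (hypothetical values).
toy = {
    "CA_C0001": [1.0, 2.0],
    "CA_C0002": [10.0, 150.0],
    "CA_C1150": [100.0, 200.0],  # inside the Clo1 range, so excluded
    "CA_P0003": [5.0, 60.0],
}
kept = highly_expressed(toy)
print(sorted(kept))
```

In the toy table only CA_C0002 is retained: CA_C1150 exceeds the quartile but lies in the Clo1 range, and the other two genes never exceed the pooled third quartile.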
Metabolite analysis. High-performance liquid chromatography (HPLC) analysis of external metabolites (carbohydrates, acetate, butyrate, acetone, butanol, and ethanol) was performed on an Agilent 1200 equipped with a refractive index detector using an Aminex HPX-87H cation exchange column (300 mm by 7.8 mm inside diameter [i.d.] by 9 μm) from Bio-Rad Laboratories. Samples (20 μl) were injected into the HPLC system and eluted isocratically with a mobile phase of 3.25 mM H2SO4 at 0.6 ml/min and 65°C. Quantification was based on an external calibration curve using pure known components as standards (64).
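Quantification against an external calibration curve can be illustrated with a short Python sketch. It assumes a linear detector response over the calibrated range and fits the curve by ordinary least squares; the study does not state the form of the curve, and the standard concentrations and peak areas below are hypothetical values chosen for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to convert a peak area to a concentration."""
    return (area - intercept) / slope


# Hypothetical standards: concentrations (g/liter) and detector peak areas.
standard_conc = [1.0, 2.0, 5.0, 10.0]
standard_area = [2100.0, 4100.0, 10100.0, 20100.0]

slope, intercept = fit_line(standard_conc, standard_area)
unknown = concentration_from_area(6100.0, slope, intercept)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, unknown={unknown:.2f} g/liter")
```

A separate curve of this kind would be fitted for each analyte, and samples whose peak areas fall outside the range of the standards would need dilution or additional standards rather than extrapolation.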
Data availability. The raw RNA-Seq data sets and gene expression tables generated during this study are available through NCBI's GEO database (accession no. GSE107804).