Genome-wide motif predictions of BCARR-box in the amino-acid repressed genes of Lactobacillus helveticus CM4

A BCARR (branched-chain amino acid responsive repressor) identified in proteolytic gene expression in Lactobacillus helveticus is considered to negatively control transcription by binding to operator sites at the promoter regions in the presence of BCAAs. However, the distribution and regulatory potential of the BCARR across all genes repressed by BCAAs in CM4 remain unclear. A genome-wide search for the BCARR-box was conducted to clarify the contribution of BCARR to the regulation of amino acid metabolism in L. helveticus CM4. Among all 2174 genes of CM4, 390 genes repressed by amino acids were selected for the search of the BCARR-box. Of the 67 genes with a predicted BCARR-box, the 33 annotated genes were mainly linked to amino acid metabolism. The BCARR-boxes were mainly located adjacent to the −35 sequence of the promoter; however, the repressive effects at different locations were similar. Notably, the consensus BCARR-box motif, 5′-A1A2A3A4A5W6N7N8N9W10T11T12W13T14T15–3′, observed in highly repressed genes, showed more frequent A-T base pairing and a lower free energy than that in lowly repressed genes. A MEME analysis also supported the lower frequency of T at positions 12, 13, 14 and 15 in the BCARR-box sequence of the lowly repressed gene group. These results suggest that genes with a more stable palindromic structure might be preferable targets for BCARR binding, resulting in stronger repression of target gene expression. Our genome-wide search revealed the involvement of the proteolytic system, transporter system and some transcriptional regulator systems in BCARR-box regulation in L. helveticus CM4.


Background
The proteolytic system of lactic acid bacteria is crucial for cell growth in milk and important for the acceleration of ripening in cheesemaking and rapid yogurt manufacturing. The proteolytic system is activated at the beginning of fermentation to release peptides and amino acids for cell growth because of limited nitrogen sources in milk, but is negatively controlled by accumulated amino acids and peptides at the late phase of cell growth [1,2].
Generally, lactobacilli have stronger proteolytic activities and can release higher amounts of peptides and amino acids in fermented milk compared with lactococci [3]. Among Lactobacillus species, Lactobacillus helveticus has the highest proteolytic activity and can release antihypertensive peptides from casein during the milk fermentation process [3–6]. L. helveticus CM4 with the highest proteolytic activity can release the highest amount of these peptides [7,8]. However, the production of the antihypertensive peptides by L. helveticus CM4 was repressed by amino acids that accumulated in fermented milk because of the down-regulation of genes, such as pepO2, pepCE and pepE, that most likely encode enzymes involved in the processing of active peptides [7,9]. A novel type of regulator protein, a cystathionine β-synthase (CBS) domain protein involved in the regulation of the proteolytic system, was successfully identified in a previous study [10]. The CBS domain protein binds to a specific DNA sequence present at the promoter regions of the repressed proteolytic genes in response to intracellular BCAAs [10]. From a comparative sequence analysis of the promoter regions of the proteolytic genes, a gel shift assay and a footprinting analysis, a palindromic AT-rich motif, 5′-AAAAANNCTWTTATT-3′, was predicted as the consensus DNA motif for the branched chain amino acid responsive repressor (BCARR) protein binding box (BCARR-box). Therefore, the consensus DNA motif is thought to exist in many genes repressed by amino acids including those of the proteolytic enzymes of CM4 [9], but the contributions of BCARR via binding to the BCARR-box in the repressed genes of CM4 are unclear. In Lactococcus lactis and Bacillus subtilis most of the proteolytic genes are regulated by the CodY protein in response to branched chain amino acids (BCAAs) [1, 11–14].
CodY is activated by binding to accumulated BCAAs in the medium, which increases its affinity for its operator site, the CodY protein binding box (CodY-box) [11–14]. However, neither CodY nor a CodY-dependent regulatory system of the proteolytic enzymes has been reported in lactobacilli.
Genome-wide search is a powerful tool for understanding the contribution of a regulatory system to specific gene expression in response to metabolites [15–19]. In the present study, we searched for the BCARR-box previously predicted from six proteolytic genes that were down-regulated in response to amino acids in CM4 [10]. Then, we characterized the structural features of the BCARR-box (palindromic pairs, free energy and location relative to the promoter) to determine their influence on the BCARR-driven repressive effect. We also investigated the impact of the BCARR system on amino acid metabolism, which plays a crucial role in cell growth in milk, and on other metabolism through the selection of genes down-regulated by amino acids.

Results
Distribution of the repressed genes in the whole CM4 genome
Strategic steps to determine the contribution of BCARR to the regulation of specific gene expression by amino acids in L. helveticus CM4 and the brief outcomes obtained at each step are illustrated in Fig. 1. In the genome-wide transcriptomic analysis, 390 genes of L. helveticus CM4 were repressed by over 30% at 0.5 h after the addition of peptides to fermented milk (Additional file 1: Table S1). Various kinds of genes, such as protease, transporter, nuclease and regulatory protein genes, were repressed (detailed in Table S1). Among the 390 repressed genes, 185 genes (47.4%) encoded putative and unknown proteins. To visualize the genome-wide distribution of repressed gene expression in L. helveticus CM4, the locations of the repressed genes from the origin of the CM4 genome and the repressive effects of peptides are illustrated in Fig. 2. Genes with different repressive effects were distributed throughout the whole genome (Fig. 2b). Notably, highly repressed genes were located mainly in four large regions (I to IV in Fig. 2b), which are also marked on the previously reported whole-genome map of L. helveticus CM4 (Fig. 2a) [20].
Fig. 1 Strategic workflow of the BCARR motif search and analysis (a) and brief outcomes (b). a Four-step analytical process from a motif search in the repressed genes (Step 1) to a motif analysis for repressive effects (Step 4). b Brief outcomes of each workflow step are summarized

Prediction of BCARR-box in repressed genes
As a preliminary step toward a survey of all genes, the genome-wide survey focused on the 390 genes down-regulated by amino acids. Homologues of the AT-rich palindromic motif, 5′-AAAAANNCTWTTATT-3′, predicted as the consensus DNA motif from six proteolytic genes in a previous study [10], were surveyed in the promoter regions from −300 to +250 bp relative to the −35 sequence of the promoter in the 390 repressed genes by multiple sequence alignment with a CLUSTALW analysis (http://www.genome.jp/tools-bin/clustalw). In all, 67 predicted BCARR-boxes were found in the promoter regions of the repressed genes (Table 1). The corresponding genes, the observed BCARR-box sequences, the distances from the −35 sequence of the promoter and the repressive effects of peptides are listed in Table 1. The strand carrying each observed BCARR-box and the Waterman-Eggert score obtained by LALIGN analysis (http://www.ch.embnet.org/software/LALIGN_form.html) are also listed in Table 1. There were no significant differences in the repressive effects between BCARR-boxes located on the plus and minus strands. All six proteolytic genes had BCARR-box sequences, but only the pepO2, pepD, pepC2 and dppD genes, with repressive effects of 93.0%, 89.0%, 68.0% and 34.0%, respectively, are listed in Table 1; the pepV and pepO genes showed lower repressive effects (27% and 25%, respectively).
Fig. 2 Genome-wide distribution of the repressed genes of Lactobacillus helveticus CM4 in response to amino acids on the genome map (a) and locations from the origin with the repressive effects (b). a Location of the genes repressed by amino acids on a previously reported physical map of Lactobacillus helveticus CM4 chromosomal DNA [20]. Regions of highly repressed genes (I to IV) are shown on the physical map. Maps on the outside: Positions of tRNA (pale blue), rRNA operons (green), ORFs on the positive strand (blue), ORFs on the negative strand (red) and GC-skew plot (brown). Distances (kbp) from the origin are shown on a dotted line on the inside. b Locations of the repressed genes from the origin and the repressive effects by amino acids in CM4. Regions of highly repressed genes (I to IV) are also shown in these maps. The raw data is provided in Additional file 1:
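The survey described above relied on CLUSTALW alignment and LALIGN scoring. As a simplified illustration only, the following Python sketch scans a promoter sequence on both strands for sites resembling the consensus 5′-AAAAANNCTWTTATT-3′ while tolerating a few mismatches. The function names, the mismatch threshold and the example sequence are assumptions made for illustration and are not part of the original analysis.

```python
# IUPAC codes needed for the consensus motif quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT",    # weak: A or T
    "N": "ACGT",  # any base
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

BCARR_CONSENSUS = "AAAAANNCTWTTATT"  # consensus BCARR-box from the text [10]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(site: str, motif: str) -> int:
    """Count positions where the site base is not allowed by the motif code."""
    return sum(base not in IUPAC[code] for base, code in zip(site, motif))


def scan(seq: str, motif: str = BCARR_CONSENSUS, max_mismatch: int = 3):
    """Yield (offset, strand, site, n_mismatch) for candidate sites.

    offset is the 0-based start of the window on the given sequence;
    max_mismatch is an arbitrary, hypothetical tolerance.
    """
    seq = seq.upper()
    k = len(motif)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            n = mismatches(site, motif)
            if n <= max_mismatch:
                yield i, strand, site, n


if __name__ == "__main__":
    # Hypothetical promoter fragment, not taken from the CM4 genome.
    promoter = "GGGCAAAAAGGCTATTATTCCGG"
    for hit in scan(promoter, max_mismatch=1):
        print(hit)
```

An exact-position scan like this is stricter than an alignment-based homologue search, so it would be expected to report a different set of candidates from the one in Table 1.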

Predicted genes with a BCARR-box
To understand the role of the BCARR in the regulatory system by amino acids, annotated genes with observed BCARR-boxes (Table 1) are summarized in Fig. 3. Among the 67 predicted BCARR-boxes in the promoter regions of the 390 repressed genes, 34 belonged to uncharacterized or non-annotated genes listed in Table 1. About half of the 33 annotated genes were linked to amino acid metabolism, such as transporters, the proteolytic system and purine synthesis. Nine genes (potE, ddpA, himM, hisM, potE, dppD, sunT, mdlB and hflC) were transporter genes. Nine genes (pepO, pepO2, pepE, pepV, pepQ, pepC2, pepN2, prtH&P and pepD) (reviewed in ref. [7]) encoded proteolytic enzymes. LytT, lytR and deoR were regulator genes (see Fig. 3). The guaA and guaB genes for purine synthesis [21] may have a link to amino acid metabolism. Changes in transcriptional regulators in response to amino acids, such as lytL, lytR and deoR, are of interest because these regulators may affect the expression of many genes indirectly through BCARR action. The gene products of the transcriptional regulators lytL and lytR have been suggested to influence alaD gene expression [22]. DeoR, which is widely present in bacteria and acts as a repressor in sugar metabolism [23], may have an indirect effect on sugar metabolism. Various transporter genes have BCARR-motifs at the promoter regions. The sunT gene product has been suggested to function as an antibiotic transporter with a protease domain [24,25]. The mdlB gene product also has a protease domain and is likely involved in multidrug transport and bacteriocin export [26]. The hflC product also has a protease domain and is involved in protein secretion [27]. The cydC gene product co-expressed with the cydD gene in E. coli showed ATPase-like activity [28]. PrtD is one of the ATP binding cassette components with low ATPase activity involved in the protease secretion system [29]. CitT is a component of the two-component system that plays a crucial role in citrate utilization [30].

Localization and the repressive effects of the BCARR-box
To understand how the location of the BCARR-box at the promoter regions could interfere with RNA polymerase-promoter binding and the transcriptional activity of the underlying gene, the distances of the predicted boxes from the −35 sequence of the promoter and the repressive effects are summarized in Fig. 4. Most of the BCARR-boxes were present between −120 and +150 bp from the −35 sequence of the promoter, and the boxes were most frequently observed at 0 bp (−30 to 0 bp). Unexpectedly, the average repressive effects of boxes at different locations in the promoter regions were similar (Fig. 4). This result indicates that a BCARR-box anywhere within a wide promoter region, not only at the more frequent position adjacent to the −35 sequence, might be sufficient to exert the repressive effect on transcription.
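The comparison summarized above amounts to grouping boxes by distance from the −35 sequence and averaging the repressive effect per group. A minimal Python sketch of that grouping follows; the 30-bp bin width is inferred from the "0 bp (−30 to 0 bp)" labelling, the inclusive upper edge is an assumption, and the example values are invented rather than taken from Table 1.

```python
import math
from collections import defaultdict
from statistics import mean


def bin_repression(boxes, bin_width=30):
    """Average the repressive effect of boxes grouped by distance bin.

    boxes: iterable of (distance_bp, repression_pct) pairs, with the
    distance measured from the -35 sequence of the promoter.
    Each bin is labelled by its upper edge, so with bin_width=30 the
    label 0 covers -30 < distance <= 0 (an assumed convention).
    Returns {bin_label: (count, mean_repression)} sorted by label.
    """
    grouped = defaultdict(list)
    for distance, repression in boxes:
        label = math.ceil(distance / bin_width) * bin_width
        grouped[label].append(repression)
    return {label: (len(vals), mean(vals))
            for label, vals in sorted(grouped.items())}


if __name__ == "__main__":
    # Hypothetical (distance, repression %) pairs for illustration only.
    example = [(-10, 90.0), (0, 70.0), (45, 40.0), (-120, 35.0)]
    for label, (count, avg) in bin_repression(example).items():
        print(label, count, avg)
```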

Comparison of highly and lowly repressive box sequences
No clear differences in average repressive effects that depended on the location of BCARR-boxes in the promoter regions were observed when the regions were limited to −300 to +250 bp from the −35 sequence (Fig. 4). Therefore, the location of each BCARR-box from the −35 sequence of the promoter and its repressive effect are illustrated in Fig. 5. BCARR-boxes were most frequently present at −120 to +150 bp from the −35 sequence of the promoter. Thus, BCARR-boxes from −120 to +150 bp with high and low repression were selected for structural feature analysis. As listed in Table 2, the repressive effect for the highly repressive Group A (repression over 80%, shown by the green box) was 88.9 ± 3.9%, and that for the lowly repressive Group B (repression less than 50%) was 36.3 ± 3.8%. The repressive effect for Group A was significantly higher than that for Group B (P < 0.001).
To determine the reason for the different repressive effects in the two groups, the number of base pairs in the palindromic sequence and the free energy for each BCARR-box were analyzed (Table 2). The average number of palindromic pairs was significantly higher in Group A (4.1 ± 0.8) than in Group B (3.3 ± 1.0) (P < 0.05) because of fewer Ts at positions 12, 13, 14 and 15 in Group B than in Group A. The free energy, represented as the ΔG value for a BCARR-box, was evaluated by Mfold analysis (http://unafold.rna.albany.edu/?q=mfold/DNA-Folding-Form) and compared between the two groups. In the ΔG analysis, all BCARR-box sequences in Group A could be evaluated with the Mfold tool, but approximately half of the sequences in Group B could not (Table 2). As expected from the number of base pairs shown in Table 2, the ΔG values, which reflect the stability of the palindromic pair, were significantly lower in Group A (−0.25 ± 1.67) than in Group B (0.85 ± 0.90) (P < 0.05). These findings suggest that the predicted palindromic structure might be more stable in Group A than in Group B.
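The text does not state how palindromic pairs were counted. One plausible reading, sketched below in Python, pairs each position i of the 15-bp box with its mirror position 16 − i around the central base and counts Watson-Crick complementary pairs. This mirror-pairing rule and the function name are assumptions; the sketch does not reproduce the Mfold ΔG calculation.

```python
# Watson-Crick base pairs.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def palindromic_pairs(box: str) -> int:
    """Count complementary pairs between each position and its mirror.

    For a 15-bp box, position 1 is compared with position 15,
    position 2 with position 14, and so on; the central base
    (position 8) is left unpaired.
    """
    box = box.upper()
    half = len(box) // 2
    return sum((box[i], box[-1 - i]) in PAIRS for i in range(half))


if __name__ == "__main__":
    # One concrete instance of the consensus 5'-AAAAANNCTWTTATT-3'.
    print(palindromic_pairs("AAAAAGGCTATTATT"))
```

Under this rule, replacing a T at positions 12, 14 or 15 with another base removes a pair with the A at positions 4, 2 or 1, respectively, which agrees with the explanation given for the lower pair count in Group B.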
To further examine the different repressive effects between the two groups, the predicted BCARR-box motifs from 18 sequences of Group A and 18 sequences of Group B were compared by MEME analysis. As shown in Fig. 6, the motifs for Group A were relatively conserved AT-rich palindromic sequences, but Group B contained fewer Ts than Group A at positions 12 (67%), 13 (33%), 14 (67%) and 15 (67%). This result suggests that fewer Ts at positions 12, 13, 14 and 15 in the BCARR-box of Group B might make the palindromic structure less stable than that of Group A. The more stable palindromic structure of the BCARR-box in Group A genes might confer a higher affinity for the BCARR protein and, thus, a higher repressive effect. Considering the above results, the repressive effects of amino acids through the BCARR system might depend more on an AT-rich stable palindromic structure than on the location within the promoter region.
Fig. 5 All predicted motifs listed in Table 1 are illustrated in this figure. a: A highly repressed motif located −120 to +150 bp from the −35 sequence of the promoter and with repression over 80% (green box). b: A lowly repressed motif located −120 to +150 bp from the −35 sequence of the promoter and with repression from 50% to 30% (red box)
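The per-position T frequencies quoted for Group B correspond to one column of a position frequency matrix. The following Python sketch shows how such percentages could be computed from a set of aligned 15-bp sites; it is not the MEME algorithm, and the three example sites are invented for illustration rather than taken from Table 2.

```python
from collections import Counter


def base_frequency(sites, base="T"):
    """Return {position: percentage of `base`} for aligned sites.

    Positions are 1-based to match the motif numbering in the text.
    All sites must have the same length.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    freqs = {}
    for pos in range(length):
        column = Counter(site[pos].upper() for site in sites)
        freqs[pos + 1] = 100.0 * column[base] / len(sites)
    return freqs


if __name__ == "__main__":
    # Hypothetical aligned sites, for illustration only.
    sites = ["AAAAAGGCTATTATT", "AAAAATTCAATAAAT", "AAAAACGCTTTTTTA"]
    t_freq = base_frequency(sites, "T")
    for pos in (12, 13, 14, 15):
        print(pos, round(t_freq[pos]))
```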

CodY-box in BCARR-box regions
In lactococci, CodY plays a crucial role in the negative regulation of proteolytic gene expression by binding to the CodY-box in the presence of amino acids. However, no codY gene has been observed in the Lactobacillus genome, including that of CM4, and there is no information in the literature about a CodY-box sequence, 5′-AATTTTCWGAAAATT-3′, in lactobacilli. On the other hand, B. subtilis has both CodY and BCARR genes, suggesting a regulatory system responsive to BCAAs [14]. Therefore, to investigate the evolutionary selection of the regulatory system in lactobacilli, a CodY-box sequence was searched for in the promoter regions of the 67 genes with BCARR-box sequences in CM4 (Table 1). A CLUSTALW analysis showed that most of the upstream DNA sequences had no CodY-box sequence; however, both the BCARR-box and CodY-box sequences were observed in the promoter regions of the deoR, nlpA and dppD genes and three putative genes (Fig. 7).

Discussion
A novel transcriptional regulator protein, BCARR, identified by purification, was found to have an affinity for the upstream regions of six proteolytic genes that were repressed in response to BCAAs in L. helveticus [10]. BCARR is thought to down-regulate proteolytic gene expression by binding to the BCARR-box, 5′-AAAAANNCTWTTATT-3′, in the presence of BCAAs [10]. BCARR, first found in the proteolytic system of L. helveticus, seems to be a global regulator of amino acid metabolism because the expression of many genes is broadly repressed by amino acids in CM4 [9]. However, the contribution of BCARR to the repression of all genes in the presence of amino acids remains unclear. Various approaches have been used to study global regulatory genes in bacteria. Specific DNA sequences for regulatory protein binding have been searched for genome-wide in Escherichia coli [15], Sulfolobus acidophilus [16], Bacillus anthracis [17], Bacillus subtilis [19, 21, 31–34], Clostridium difficile [35], L. lactis [18, 36, 37] and Streptococcus thermophilus [38]. Homologous sequences to the Cre-box sequence were searched for in the whole Bacillus genome and mapped on the genome, and the consensus sequence was newly deduced [33]. Currently, Cre-boxes are classified as high or low affinity sequences depending on the response at low and high levels of CcpA characterized in B. subtilis [33].
Fig. 6 Search for consensus motifs of BCARR-boxes for highly (a) and lowly (b) repressive groups. The weight matrix shows the frequency of A, C, T or G nucleotides (as indicated in the legend) at each position of the motif. The frequencies of A and T (%) are shown in green and red, respectively, below the matrix. A graphical representation of the identified motif was obtained at the Weblogo website (http://weblogo.berkeley.edu/logo.cgi)
A genome-wide analysis of BCARR-boxes was performed to understand the impact of BCARR on the 390 genes down-regulated by amino acids. For a more strategic analysis in the present study, BCARR-boxes located far from the promoters were considered to be less effective in repressing gene expression because BCARR can influence promoter activity by covering the surrounding promoter region over approximately 200 bp of DNA [14]. Therefore, the BCARR-box was searched for in the regions from −300 to +250 bp around the promoter of the 390 down-regulated genes. In this search, based on CLUSTALW analysis, 67 putative BCARR-boxes were found, especially in the promoter regions of the proteolytic system, transporter and some transcriptional regulator genes (Table 1).
Among the 67 predicted genes with BCARR-boxes, 33 were annotated, and 19 of these (57.6%), shown on the right in Fig. 3, are linked to amino acid metabolism: they are involved in the proteolytic system, the amino acid and peptide transport system, transporters with a protease domain, and purine synthesis. These cell responses to excess intracellular amino acids seem to be a catabolite repression-like regulation because there is no need for further amino acid supply via these actions under nutrient-rich conditions. These results suggest that the BCARR system in L. helveticus might be the main regulatory system linking the proteolytic system and transporters to amino acid metabolism, as reported for the CodY system in L. lactis [12–14, 36, 37], Bacillus subtilis [11, 34] and Streptococcus thermophilus [38].
The right side of Fig. 3 shows that DapF, involved in L-lysine biosynthesis from L-aspartate [39], may be controlled to decrease L-lysine production. GuaA and GuaB [31,40] are involved in the synthesis of GMP with conversion of Gln to Glu. So, BCARR may repress the supply of Glu through repression of guaA and guaB gene expression under amino acid rich conditions. For genes shown on the left side of Fig. 3, the reason for the repression of gene expression remains unclear. Repression of some transcriptional regulator genes, such as deoR, lytL and lytR [22,41], is of interest because it may have a wide impact on the expression of many kinds of genes. DeoR [22,41] is widely present in Gram-positive and Gram-negative bacteria and acts as a repressor in sugar metabolism. The transcriptional regulators LytL and LytR were reported to be linked to alaD gene expression and may be involved in amino acid metabolism, but their role remains unclear. The cydC [28,42], prtD [29] and uup [43] gene products with ATPase activity have been suggested to contribute to membrane control against environmental stress. However, there is no clear evidence to explain a link between amino acid metabolism and these gene products.
To discern the repressive effects of the BCARR, all predicted motifs listed in Table 1 were mapped by location from the promoter and repressive effect (Fig. 5). The different repressive effects were thought to be caused by the location relative to the promoter, which influences RNA polymerase binding to the promoter, and/or by the structural motif preferred by BCARR. Among the 67 genes, which included proteolytic, transporter and some regulatory genes, a BCARR-box was most frequently found adjacent to the −35 sequence of the promoter region (Fig. 4). Unexpectedly, the average repressive effects on gene expression through BCARR-boxes at different locations were similar when the data were collected from locations ranging from −300 to +250 bp from the promoter (Fig. 4). A footprinting analysis in a previous study revealed that binding of the BCARR in the presence of amino acids protected a wide region of over 200 bp of DNA around the BCARR-box at the promoter [10]. Therefore, the binding of a BCARR to a BCARR-box located between −300 and +250 bp could be sufficient to interfere with the binding of RNA polymerase to the promoter and repress the subsequent transcription of the corresponding genes.
The structural features of the motif appeared to be more important than its location relative to the promoter because the distances of motifs from promoters did not influence the repressive effects. To clarify the influence of the BCARR-box sequence on repressive effects, the structural features of BCARR-box sequences with high (Group A) and low (Group B) repressive effects were compared, and the number of base pairs in the expected palindromic structures was counted (Table 2). The average number of palindromic pairs was significantly higher in Group A (4.1 ± 0.8) than in Group B (3.3 ± 1.0) because of fewer Ts at positions 12 (67%), 13 (33%), 14 (67%) and 15 (67%) in Group B than in Group A. Moreover, the ΔG values were significantly lower in Group A (−0.25 ± 1.67) than in Group B (0.85 ± 0.90) (P < 0.05). These results suggest a more stable palindromic structure in the highly repressive Group A. This idea was also supported by the MEME analysis for all Group A and Group B sequences. The predicted motif for the less effective Group B, 5′-A1A2A3A4A5(W)6N7N8N9W10T11(T)12W13(T)14(T)15-3′, showed more variable and fewer Ts at positions 12, 13, 14 and 15 (Fig. 6). These results suggest that BCARR might bind more frequently to a stable palindromic structure. For a more precise analysis of the structure preferred by BCARR, a binding assay with purified BCARR toward each motif and a reporter assay involving each BCARR must be performed.
The genome-wide search revealed that stronger gene repression by amino acids might be concentrated at a limited number of loci, shown as I to IV in Fig. 2, whereas the predicted BCARR-boxes were distributed evenly over the whole genome in the present study (data not shown). The concentration of potent repression in limited regions may allow more effective repression of neighboring genes. A wide range of unknown regulatory actions by amino acids may be involved in the gene repression. A comparative analysis between the repressive effects measured as the whole-cell response in the present study and a reporter assay containing the corresponding promoter regions may support this suggestion in L. helveticus CM4.

Conclusion
The genome-wide search for the BCARR-box based on amino acid repressed genes in L. helveticus CM4 revealed frequent involvement in amino acid linked metabolism, such as the proteolytic system, transporter system, and some transcriptional regulator systems. The genes with more stable palindromic structures evaluated by BCARR-box motif analysis were preferable targets for BCARR binding and resulted in higher repressions in the target gene expressions. These results revealed that the BCARR system in L. helveticus might be the regulatory system of amino acid metabolism.

Strategic steps of genome-wide analysis
A transcriptome analysis was performed using CM4 cells collected 0.5 h after the addition of casein hydrolysate and the cells were compared to those without the peptides in the fermented milk. Genes down-regulated by peptides over 30% compared to control cells cultured without peptides were selected for the analysis of the BCARR-box in the corresponding genes (Step 1). Then, to select specific genes that were repressed by the binding of a BCARR to the BCARR-box, a homolog of the predicted BCARR-box was searched by CLUSTALW and LALIGN analyses in the repressed genes (Step 2). Next, the BCARR-box found in the promoter region of the repressed genes was analyzed for the relationship between the location at the promoter region and the repressive effect (Step 3). Then, the palindromic structures of the predicted BCARR-boxes in highly and lowly repressed groups were compared (Step 4).