Comparative genomics of Fructobacillus spp. and Leuconostoc spp. reveals niche-specific evolution of Fructobacillus spp.

Fructobacillus spp. in fructose-rich niches belong to the family Leuconostocaceae. They were originally classified as Leuconostoc spp., but were later grouped into a novel genus, Fructobacillus, based on their phylogenetic position, morphology and specific biochemical characteristics. These unique traits, the so-called fructophilic characteristics, had not previously been reported among lactic acid bacteria, suggesting unique evolution at the genome level. Here we studied four draft genome sequences of Fructobacillus spp. and compared their metabolic properties against those of Leuconostoc spp. Fructobacillus species possess significantly fewer protein coding sequences in their smaller genomes. The number of genes was significantly smaller in carbohydrate transport and metabolism. Several other metabolic pathways, including the TCA cycle, ubiquinone and other terpenoid-quinone biosynthesis, and phosphotransferase systems, were characterized as discriminative pathways between the two genera. The adhE gene for bifunctional acetaldehyde/alcohol dehydrogenase and the genes for subunits of the pyruvate dehydrogenase complex were absent in Fructobacillus spp. The two genera also show different GC contents, mainly due to different GC contents at the third codon position. The present genome characteristics of Fructobacillus spp. suggest that reductive evolution took place during adaptation to specific niches.


Background
Lactic acid bacteria (LAB) are found in a variety of environments, including dairy products, fermented food or silage, and the gastrointestinal tracts of animals. Their broad habitats present different stress conditions and nutrients, forcing the microbes to develop specific physiological and biochemical characteristics, such as proteolytic and lipolytic activities to obtain nutrients from milk [1], tolerance to phytoalexins in plants [2], or tolerance to bile salts to survive in the gastrointestinal tract [3]. Fructobacillus spp. in the family Leuconostocaceae are found in fructose-rich environments such as flowers, (fermented) fruits, or bee guts, and are characterized as fructophilic lactic acid bacteria (FLAB) [4][5][6].
The genus Fructobacillus is comprised of five species: Fructobacillus fructosus (type species), F. durionis, F. ficulneus, F. pseudoficulneus and F. tropaeoli [6,7]. Four of the five species formerly belonged to the genus Leuconostoc, but were later reclassified as members of a novel genus, Fructobacillus, based on their phylogenetic position, morphology, and biochemical characteristics [8].
Fructobacillus is distinguished from Leuconostoc by the preference for fructose over glucose as the carbon source and the need for an electron acceptor (e.g. pyruvate or oxygen) during glucose assimilation. Fructobacillus is further differentiated from Leuconostoc by the production of acetic acid instead of ethanol when glucose is metabolized. We previously compared these microorganisms with special attention to the activities of alcohol and acetaldehyde dehydrogenases; Fructobacillus lacks the bifunctional acetaldehyde/alcohol dehydrogenase gene (adhE) [9] and its enzyme activities. They are the only obligately heterofermentative LAB without adhE to date, suggesting that niche-specific evolution occurred at the genome level. Recent comparative genomic studies also revealed niche-specific evolution of several LAB, including vaginal lactobacilli and strains used as dairy starter cultures [10][11][12].
This is the first study to compare the metabolic properties of the draft genome sequences of four Fructobacillus spp. with those of Leuconostoc spp., with a special focus on fructose-rich niches. Results obtained confirm the general trend of reductive evolution, especially metabolic simplification based on sugar availability.
Draft genome sequencing and de novo assembly
[15] with manual verification. In the pipeline, protein coding sequences (CDSs) were predicted by MetaGeneAnnotator 1.0 [16], tRNAs were predicted by tRNAscan-SE 1.23 [17], rRNAs were predicted by RNAmmer 1.2 [18], and functional annotation was finally performed based on homology searches against the RefSeq, TrEMBL, and Clusters of Orthologous Groups (COG) protein databases.
Genomic data of Fructobacillus durionis and Leuconostoc spp.
The draft genome sequence of Fructobacillus durionis DSM 19113T was obtained from the JGI Genome Portal (http://genome.jgi.doe.gov/) [19] and annotated using MiGAP in the same way as the other Fructobacillus spp. Annotated genome sequences for nine of the twelve Leuconostoc species were obtained from the GenBank or RefSeq databases at NCBI. Genomic data for Leuconostoc holzapfelii, Leuconostoc miyukkimchii and Leuconostoc palmae were not available at the time of analysis (December 2014), so these species were not included in the present study. When multiple strains were available for a single species, the most complete genome was chosen. GenBank accession numbers of the strains used are listed in Table 1.

Quality assessment of the genomic data
The completeness and contamination of the genomic data were assessed by CheckM (Version 1.0.4) [20], which inspects the existence of gene markers specific to the Leuconostocaceae family, a superordinate taxon of Fructobacillus and Leuconostoc.

Comparative genome analysis and statistical analysis
To estimate the number of conserved genes, all protein sequences were grouped into orthologous clusters by GET_HOMOLOGUES software (version 1.3) based on the all-against-all bidirectional BLAST alignment and the MCL graph-based algorithm [21]. Conserved genes are defined as gene clusters that are present in all analyzed genomes (note the difference from the definition of genus-specific genes). The rarefaction curves for conserved and total genes were drawn from 100 iterations of adding genomes one by one in a random order. Two genomes (L. fallax and L. inhae) were excluded from this analysis to avoid underestimating the number of conserved genes, since they contained many frameshifted genes, probably due to the high error rate of Roche 454 sequencing technology at homopolymer sites.
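The rarefaction procedure described above can be sketched as follows. This is a minimal illustration, not the GET_HOMOLOGUES implementation: the function name `rarefaction`, the input layout (a mapping from genome name to the set of cluster IDs it contains) and the toy data are all assumptions made for the example.

```python
import random

def rarefaction(presence, n_iter=100, seed=0):
    """Average conserved (core) and total (pan) cluster counts as genomes are added.

    presence: dict mapping genome name -> set of orthologous cluster IDs (assumed layout).
    Returns two lists; element k is the mean count after adding k + 1 genomes.
    """
    rng = random.Random(seed)
    genomes = list(presence)
    core_sums = [0] * len(genomes)
    pan_sums = [0] * len(genomes)
    for _ in range(n_iter):
        order = genomes[:]
        rng.shuffle(order)                 # random order of genome addition
        core = set(presence[order[0]])
        pan = set()
        for i, genome in enumerate(order):
            core &= presence[genome]       # clusters present in every genome so far
            pan |= presence[genome]        # clusters present in at least one genome so far
            core_sums[i] += len(core)
            pan_sums[i] += len(pan)
    core_curve = [s / n_iter for s in core_sums]
    pan_curve = [s / n_iter for s in pan_sums]
    return core_curve, pan_curve

# Toy input with made-up cluster IDs (not data from this study).
toy = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {1, 2, 6}}
core_curve, pan_curve = rarefaction(toy)
```

The last point of each curve does not depend on the order of addition, so it equals the number of clusters shared by all genomes and the number of clusters found in any genome, respectively.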
For functional comparison of the gene contents between Fructobacillus spp. and Leuconostoc spp., the CDSs predicted in each strain were assigned to Clusters of Orthologous Groups (COG) functional categories using the COGNITOR software [22]. Metabolic pathways in each strain were also predicted using the KEGG Automatic Annotation Server (KAAS) by assigning KEGG Orthology (KO) numbers to each predicted CDS [23]. The numbers of genes assigned to each COG functional category are summarized in Table 2. In the present study, Fructobacillus-specific genes were defined as those conserved in four or more Fructobacillus spp. (out of five) and in two or fewer Leuconostoc spp. (out of nine). Leuconostoc-specific genes were defined as those conserved in seven or more Leuconostoc spp. and in one or no Fructobacillus spp.
The Mann-Whitney U test was applied to compare genome features and gene contents of Fructobacillus spp. and Leuconostoc spp. A p value below 0.05 was considered statistically significant. Statistical analysis was performed using IBM SPSS Statistics for Windows (Version 21.0. Armonk, NY: IBM Corp.).
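For readers who want to see what such a two-group comparison involves, the following is a minimal sketch of a two-sided exact Mann-Whitney U test in plain Python. It is not the SPSS procedure used in this study. The function names and the toy values are assumptions for illustration only, and the exhaustive enumeration is practical only for small groups such as five versus nine genomes.

```python
from itertools import combinations

def midranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1          # mean of ranks i + 1 .. j + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def mann_whitney_exact(x, y):
    """Return (U for group x, two-sided exact p value) by enumerating all groupings."""
    ranks = midranks(list(x) + list(y))
    n1, n2 = len(x), len(y)
    offset = n1 * (n1 + 1) / 2
    centre = n1 * n2 / 2                   # expected U under the null hypothesis
    u_obs = sum(ranks[:n1]) - offset
    deviation = abs(u_obs - centre)
    hits = total = 0
    for idx in combinations(range(n1 + n2), n1):
        u = sum(ranks[i] for i in idx) - offset
        total += 1
        if abs(u - centre) >= deviation - 1e-9:
            hits += 1                      # grouping at least as extreme as observed
    return u_obs, hits / total

# Made-up G + C percentages for two hypothetical groups (not data from this study).
group_a = [44.1, 44.4, 44.6, 44.2, 44.7]
group_b = [36.1, 37.0, 38.1, 38.9, 39.4, 40.2, 36.6, 37.7, 41.0]
u_stat, p_value = mann_whitney_exact(group_a, group_b)
```

With complete separation of a group of five from a group of nine, only 2 of the 2,002 possible groupings are as extreme as the observed one, which is the smallest two-sided p value attainable with these group sizes.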

Phylogenetic analysis
Orthologous clusters that were conserved among all Fructobacillus spp., all Leuconostoc spp. and Lactobacillus delbrueckii subsp. bulgaricus ATCC 11842 (as the outgroup) were determined by GET_HOMOLOGUES as described above. For phylogenetic reconstruction, 233 orthologs that appeared exactly once in each genome were selected. The amino acid sequences within each cluster were aligned using MUSCLE (version 3.8.31) [24]. Poorly aligned or divergent regions were trimmed using Gblocks [25], and conserved regions were then concatenated using FASconCAT-G [26]. A partitioned maximum likelihood analysis was performed to construct the phylogenetic tree with RAxML (version 8.1.22) [27] using the best-fit evolutionary model predicted for each alignment by ProtTest [28]. Bootstrap support was estimated from 1,000 replicates.
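Two of the steps above, selecting single-copy orthologs and concatenating the trimmed alignments into partitions, can be sketched as follows. This is an illustration only and does not reproduce GET_HOMOLOGUES or FASconCAT-G. The function names, the input layouts and the toy sequences are assumptions.

```python
def single_copy_clusters(clusters, genomes):
    """Return IDs of clusters with exactly one member in every genome.

    clusters: dict cluster ID -> list of (genome, protein ID) pairs (assumed layout).
    """
    selected = []
    for cluster_id, members in clusters.items():
        counts = {}
        for genome, _protein in members:
            counts[genome] = counts.get(genome, 0) + 1
        if set(counts) == set(genomes) and all(c == 1 for c in counts.values()):
            selected.append(cluster_id)
    return selected

def concatenate(alignments, genomes):
    """Join per-gene alignments into one supermatrix and record partition ranges.

    alignments: list of dicts, genome -> aligned sequence of equal length.
    Returns (genome -> concatenated sequence, list of 1-based (start, end) ranges).
    """
    parts = {g: [] for g in genomes}
    partitions = []
    start = 1
    for alignment in alignments:
        length = len(alignment[genomes[0]])
        if any(len(alignment[g]) != length for g in genomes):
            raise ValueError("sequences in one alignment must have equal length")
        for g in genomes:
            parts[g].append(alignment[g])
        partitions.append((start, start + length - 1))
        start += length
    return {g: "".join(p) for g, p in parts.items()}, partitions

# Toy clusters and alignments with made-up identifiers and sequences.
genomes = ["A", "B", "C"]
clusters = {
    "c1": [("A", "a1"), ("B", "b1"), ("C", "c1")],              # single copy in all
    "c2": [("A", "a2"), ("A", "a3"), ("B", "b2"), ("C", "c2")],  # duplicated in A
    "c3": [("A", "a4"), ("B", "b3")],                            # missing from C
}
kept = single_copy_clusters(clusters, genomes)
alignments = [
    {"A": "MK-L", "B": "MKAL", "C": "MR-L"},
    {"A": "GG", "B": "GA", "C": "G-"},
]
supermatrix, partitions = concatenate(alignments, genomes)
```

Recording the partition ranges matters because a partitioned likelihood analysis assigns a separate substitution model to each gene block.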

Polysaccharides production and reaction to oxygen
Polysaccharide production from sucrose was determined by the methods described previously [29]. Briefly, the strains were inoculated on agar medium containing sucrose as the sole carbon source and incubated aerobically at 30°C for 48 h.
To study the effect of oxygen on growth, the cells were streaked onto GYP agar [8], which contained D-glucose as the sole carbon source, and cultured under anaerobic and aerobic conditions at 30°C for 48 h as described previously [4]. The anaerobic conditions were provided by means of a gas generating kit (AnaeroPack, Mitsubishi Gas Chemical, Japan). These studies were conducted for the type strains of the five Fructobacillus species, Leuconostoc mesenteroides subsp. mesenteroides NRIC 1541T, Leuconostoc citreum NRIC 1776T and Leuconostoc fallax NRIC 0210T.

Results and discussion
General genome features of Fructobacillus spp. and Leuconostoc spp.
The DNA G + C contents of the two genera are significantly different (p < 0.001): median ± SD is 44.4 % ± 0.30 % in Fructobacillus and 38.1 % ± 2.05 % in Leuconostoc (Fig. 1c). The difference in G + C content is mainly caused by the composition at the third codon position (GC3): 46.0 % ± 1.02 % in Fructobacillus and 30.9 % ± 4.12 % in Leuconostoc. The low GC3 value in Leuconostoc spp. contrasts with the high GC3 value in Lactobacillus delbrueckii subsp. bulgaricus [11]. In Lactobacillus delbrueckii subsp. bulgaricus, the changes in GC3 are attributed to ongoing evolution [11], and a similar selection pressure might be responsible here. Overall, these distinct genomic features strongly support the reclassification of Fructobacillus spp. from the genus Leuconostoc.
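The GC3 statistic can be illustrated with a short sketch that counts G and C at every position and at the third position of each codon. This is not the code used in the study: the function name `gc_and_gc3` is an assumption, the toy sequences are made up, and the overall value here is computed over coding sequences only, whereas a genomic G + C content is computed over the whole genome.

```python
def gc_and_gc3(cds_list):
    """Return (overall GC %, GC % at third codon positions) for in-frame CDSs."""
    gc = total = gc3 = total3 = 0
    for cds in cds_list:
        seq = cds.upper()
        seq = seq[: len(seq) - len(seq) % 3]   # drop an incomplete trailing codon
        for i, base in enumerate(seq):
            if base not in "ACGT":
                continue                       # skip ambiguous bases such as N
            is_gc = base in "GC"
            total += 1
            gc += is_gc
            if i % 3 == 2:                     # third position of the codon
                total3 += 1
                gc3 += is_gc
    if total == 0 or total3 == 0:
        raise ValueError("no unambiguous bases to count")
    return 100.0 * gc / total, 100.0 * gc3 / total3

# Two made-up coding sequences (not taken from any genome in this study).
overall_gc, third_gc = gc_and_gc3(["ATGGCC", "ATGAAA"])
```

Because many third-position changes are synonymous, GC3 can shift considerably while the encoded proteins remain the same, which is why it can dominate a difference in overall G + C content.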
Since most of the genomes analyzed in this study were in draft status, quality assessment of the genomes was conducted using CheckM. The average completeness values for Fructobacillus and Leuconostoc genomes were 94.3 and 98.7 %, respectively (Table 1). Except for the genome of L. inhae, which exhibited a contamination value of 5.4 %, all genomes satisfied the criteria for a near-complete genome with low contamination (≥ 90 % completeness and ≤ 5 % contamination) [20]. The lower completeness values for Fructobacillus genomes might be attributable to the insufficiency of the reference marker gene set, in which Fructobacillus genomes were not reflected at the time of writing this paper (December 2014), rather than to lower quality of these genomes. In addition, the lower completeness may indicate specific gene losses in the genus Fructobacillus: closer investigation of the CheckM results showed that seven gene markers were consistently absent from all five Fructobacillus genomes, while on average 14.6 of the 463 Leuconostocaceae-specific gene markers were absent.
Conserved genes in Fructobacillus spp. and Leuconostoc spp.
The numbers of conserved genes in the nine genomes of Leuconostoc and the five genomes of Fructobacillus were estimated as 1,026 and 862, respectively. They account for 52 % and 62 % of the average CDS numbers of each genus (Fig. 2a). The difference in the average CDS numbers reflects their genomic history, including ecological differences between the two genera. A previous study also reported 1162 conserved genes in three genomes of Leuconostoc species [30]. The smaller number and the higher ratio of fully conserved genes in Fructobacillus spp. are probably due to a less complex and more consistent habitat that offers only specific sugars such as fructose, a major carbohydrate in the habitats of Fructobacillus spp., e.g. flowers, fruits and associated insects. On the other hand, Leuconostoc spp., which are usually found in a wide variety of habitats, including the guts of animals, dairy products, plant surfaces, fermented foods and soils, possess a larger number of conserved genes. Figure 2b shows the distribution of gene clusters in the two genera. The frontmost peak (721 gene clusters) represents conserved genes that are shared by both Leuconostoc and Fructobacillus spp. Genus-specific conserved genes are indicated as the leftmost and right peaks in Fig. 2b. The leftmost peak (159 gene clusters) represents genes that are present in all Leuconostoc genomes but absent in all Fructobacillus genomes, and the right peak (24 gene clusters) represents the reverse. The much smaller size of the right peak compared with the left indicates that Fructobacillus spp. have lost more genes or acquired fewer genes than Leuconostoc spp. during diversification after the two groups separated. In addition, the number of gene clusters located near the center of the figure was small, which indicates that the exchange of genes between the two genera is infrequent and that they have distinct gene pools. This supports the validity of the classification of Fructobacillus as a distinct genus [8].
Comparison of gene contents between Fructobacillus spp. and Leuconostoc spp.
The identified genes were associated with COG functional categories by the COGNITOR software at the NCBI. The sizes of the COG classes for each strain are summarized in Table 2, and for each genus in Additional file 1: Figure S1. In addition, the ratio of genes assigned to each COG category against the total number of genes in all COGs was determined for each genus and is shown in Fig. 3. Fructobacillus spp. have fewer genes for carbohydrate transport and metabolism than Leuconostoc spp. (Class G in Fig. 3 and Additional file 1: Figure S1): Class G ranked 9th largest in Fructobacillus whereas it ranked 3rd in Leuconostoc. Similarly, the number of genes in Class C (energy production and conversion) was significantly smaller in Fructobacillus. When compared based on the ratio of genes (Fig. 3), Class D (cell cycle, cell division and chromosome partitioning), Class J (translation, ribosomal structure and biogenesis), Class L (replication, recombination and repair) and Class U (intracellular trafficking, secretion and vesicular transport) were overrepresented in Fructobacillus spp. compared with Leuconostoc spp. However, the numbers of genes classified in these four classes were comparable between the two genera (Additional file 1: Figure S1). The conservation of genes in these classes despite the genome reduction may indicate that their functions are essential for reproduction, and the class names roughly correspond to housekeeping mechanisms.
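The distinction drawn above between absolute gene numbers and ratios can be made concrete with a small sketch. It is only an illustration: the function name `cog_ratios` and the counts are made up and do not come from Table 2.

```python
def cog_ratios(counts):
    """Return (percentage of genes per COG class, rank of each class by size)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genes assigned to any COG class")
    ratios = {cat: 100.0 * n / total for cat, n in counts.items()}
    # Largest class gets rank 1; ties are broken alphabetically by class letter.
    ordered = sorted(counts, key=lambda cat: (-counts[cat], cat))
    ranks = {cat: position + 1 for position, cat in enumerate(ordered)}
    return ratios, ranks

# Made-up gene counts per COG class for one hypothetical genome.
toy_counts = {"G": 60, "J": 140, "C": 40, "L": 100, "E": 160}
ratios, ranks = cog_ratios(toy_counts)
```

A class can keep a similar absolute number of genes in a smaller genome and still show a higher ratio, simply because the denominator has shrunk.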
To understand the gene contents involved in metabolic/biosynthesis pathways in more detail, ortholog assignment and pathway mapping against the KEGG Pathway Database were performed using the KAAS system. The number of mapped genes was significantly smaller for Fructobacillus spp. than for Leuconostoc spp. (Table 3). Firstly, Fructobacillus spp. lack respiration genes. Although oxygen is known to enhance their growth [8], the strains have lost the genes for the TCA cycle and keep only one gene for ubiquinone and other terpenoid-quinone biosynthesis (Table 3). Presumably they do not perform respiration and use oxygen only as an electron acceptor. This characteristic does not apply to certain Leuconostoc species: L. gelidum subsp. gasicomitatum [31], formerly classified as L. gasicomitatum [32], has been reported to conduct respiration in the presence of heme and oxygen [33].
Secondly, Fructobacillus spp. lack pentose and glucuronate interconversions (Table 3). They have lost genes for pentose metabolism, unlike other obligately heterofermentative LAB, which usually metabolize pentoses [34]. They do not metabolize mannose, galactose, starch, sucrose, amino sugars or nucleotide sugars, either [7,8]. Moreover, the species possess none or at most one enzyme gene for the phosphotransferase systems (PTS), significantly fewer than the number of respective genes in Leuconostoc spp. (13 ± 3.13, average ± SD). This agrees with the observation that Leuconostoc spp. metabolize various carbohydrates whereas Fructobacillus spp. do not [8] (Fig. 4). However, the genome-based prediction does not always coincide with observed metabolism: Fructobacillus species do not metabolize ribose [8], contrary to the metabolic prediction (Fig. 4). The discrepancy is due to the absence of an ATP-dependent ribose transporter. On the other hand, some Leuconostoc spp. have the transporter and metabolize ribose.
Fig. 3 Comparison of ratio (%) of gene content profiles obtained for the genera Fructobacillus and Leuconostoc. The Mann-Whitney U test was done to compare Fructobacillus spp. and Leuconostoc spp., and significant differences (P < 0.05) are denoted with an asterisk (*)
Thirdly, Fructobacillus spp. have more genes encoding phenylalanine, tyrosine and tryptophan biosynthesis compared to Leuconostoc spp. (Table 3), although this difference is not statistically significant (p = 0.165). The difference is mainly due to the presence or absence of tryptophan metabolism and the production of indole and chorismate. This is important to wine lactobacilli [35]. The reason for the sporadic conservation of indole biosynthesis in Fructobacillus remains unknown.

Comparison of genus-specific genes
To further investigate their differences, we defined genes as Fructobacillus-specific when they are conserved in four or more Fructobacillus species (out of five) and in two or fewer of the nine Leuconostoc species. Conversely, genes are Leuconostoc-specific when they are possessed by seven or more Leuconostoc species (out of nine) and by zero or one of the five Fructobacillus species. According to this definition, 16 genes were identified as Fructobacillus-specific and 114 as Leuconostoc-specific (Additional file 2: Table S1). These numbers are smaller than the numbers of fully conserved genes in each genus (24 for Fructobacillus and 159 for Leuconostoc), because we defined genus-specific genes after mapping them to the KEGG Orthology (KO) database; genes without any KO entry were excluded from the analysis.
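The thresholds in this definition translate directly into a small classification routine. The sketch below is a hypothetical illustration: the function name, the genome labels (F1 to F5, L1 to L9) and the gene identifiers are placeholders, not real KO numbers or strains from this study.

```python
def classify_genes(presence, fructobacillus, leuconostoc):
    """Label each gene by the genus-specificity thresholds stated in the text.

    presence: dict gene ID -> set of genomes that carry it (assumed layout).
    Thresholds assume five Fructobacillus and nine Leuconostoc genomes.
    """
    labels = {}
    for gene, genomes in presence.items():
        n_fructo = len(genomes & fructobacillus)
        n_leuco = len(genomes & leuconostoc)
        if n_fructo >= 4 and n_leuco <= 2:
            labels[gene] = "Fructobacillus-specific"
        elif n_leuco >= 7 and n_fructo <= 1:
            labels[gene] = "Leuconostoc-specific"
        else:
            labels[gene] = "not genus-specific"
    return labels

# Placeholder genome labels and gene IDs for illustration only.
fructo = {"F1", "F2", "F3", "F4", "F5"}
leuco = {"L1", "L2", "L3", "L4", "L5", "L6", "L7", "L8", "L9"}
toy_presence = {
    "gene_x": {"F1", "F2", "F3", "F4", "L1"},
    "gene_y": {"L1", "L2", "L3", "L4", "L5", "L6", "L7", "L8", "F1"},
    "gene_z": fructo | leuco,
}
labels = classify_genes(toy_presence, fructo, leuco)
```

Allowing one or two exceptions in each genus makes the definition tolerant of draft assemblies, in which a gene may be missed in a single genome.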

The adhE gene, which encodes the bifunctional acetaldehyde/alcohol dehydrogenase, was characterized as Leuconostoc-specific. There was no alternative acetaldehyde dehydrogenase gene in Fructobacillus. These results are consistent with our previous study reporting the lack of the adhE gene and of acetaldehyde dehydrogenase activity in Fructobacillus spp. [9], and with their obligately heterofermentative nature with no ethanol production [6,8]. The lack of ethanol production is due to the absence of acetaldehyde dehydrogenase activity, but it conflicts with NAD/NADH recycling. Therefore, there must be a different electron acceptor in glucose metabolism [4,6,9]. An NAD(P)H dehydrogenase gene was found to be Fructobacillus-specific (Additional file 2: Table S1). This is the only gene assigned to the quinone pool in Fructobacillus spp., suggesting that it does not contribute to respiration. Rather, it is used for oxidation of NAD(P)H in the presence of oxygen. This helps to maintain the NAD(P)/NAD(P)H balance, since their sugar metabolism produces an imbalance in NAD(P)/NAD(P)H cycling as described above. Indeed, Fructobacillus spp. can be easily differentiated from Leuconostoc spp. based on the reaction to oxygen [8]. In our validation study, Fructobacillus spp. grew well under aerobic conditions but poorly under anaerobic conditions on GYP medium (Fig. 5). The presence of oxygen had a smaller impact on the growth of Leuconostoc spp., although they formed larger colonies under anaerobic than under aerobic conditions. Genes for subunits of the pyruvate dehydrogenase complex were not detected in the genomes of Fructobacillus but were found in Leuconostoc. Fructobacillus also lacks TCA cycle genes. This suggests that, in Fructobacillus, pyruvate produced from the phosphoketolase pathway is not dispatched to the TCA cycle but is metabolized to lactate by lactate dehydrogenase.
Table note: The values indicate means and standard deviations of the number of genes used for the pathways.
The lack of the pyruvate dehydrogenase complex was also reported in Lactobacillus kunkeei [35], which is also a member of the FLAB found in fructose-rich environments [4,36]. The levansucrase gene was also characterized as Fructobacillus-specific (Additional file 2: Table S1). The enzyme is known to function in the production of oligosaccharides in LAB [36,37] and in biofilm production in other bacteria [38]. However, production of polysaccharides was not observed in Fructobacillus spp. when cultured with sucrose. The reason for this discrepancy is as yet unknown. The inability of Fructobacillus spp. to metabolize sucrose, including the lack of dextran production, has been reported [7,8], and systems to metabolize sucrose, e.g. genes for sucrose-specific PTS, sucrose phosphorylase and dextransucrase, were not detected in their genomes. On the other hand, L. citreum NRIC 1776T and L. mesenteroides NRIC 1541T produced polysaccharides, possibly dextran. Production of dextran from sucrose in the genus Leuconostoc is strain/species dependent [39], and the dextransucrase gene was identified in six Leuconostoc genomes (out of nine) in this study. A number of genes coding for peptidases and for amino acid transport/synthesis/metabolism were also found to be Leuconostoc-specific (Additional file 2: Table S1), suggesting that Leuconostoc spp. can survive in various environments with different amino acid compositions. Several PTS-related genes and genes for teichoic acid transport were also characterized as Leuconostoc-specific. LAB cells usually contain two distinct types of teichoic acid, wall teichoic acid and lipoteichoic acid. The identified genes are involved in the biosynthesis of wall teichoic acid in Bacillus subtilis [40]. Few studies have been reported on wall teichoic acid in Leuconostoc spp. and none in Fructobacillus spp.
Fig. 4 Predicted sugar metabolic pathways in Fructobacillus spp. and Leuconostoc spp. The orange and blue lines represent the pathways that exist in Leuconostoc spp. and Fructobacillus spp., respectively. The bold lines represent conserved genes among each genus (core) and the narrow lines represent dispensable genes that exist in some but not all species in each genus. The dotted lines represent electron flow

Phylogenetic analysis
To confirm the phylogenetic relationship between Fructobacillus spp. and Leuconostoc spp., a phylogenetic tree was produced based on concatenated sequences of 233 orthologous genes that were conserved as a single copy within the tested strains. The tree showed a clear separation of the two genera (Fig. 6), indicating that Fructobacillus spp. occupy a phylogenetic position distinct from that of Leuconostoc spp. This agrees well with previous reports using the 16S rRNA gene or housekeeping genes [7,8].