The complete genome sequence of Eubacterium limosum SA11, a metabolically versatile rumen acetogen

Acetogens are a specialized group of anaerobic bacteria able to produce acetate from CO2 and H2 via the Wood–Ljungdahl pathway. In some gut environments acetogens can compete with methanogens for H2, and as a result rumen acetogens are of interest in the development of microbial approaches for methane mitigation. The acetogen Eubacterium limosum SA11 was isolated from the rumen of a New Zealand sheep and its genome has been sequenced to examine its potential application in methane mitigation strategies, particularly in situations where hydrogenotrophic methanogens are inhibited resulting in increased H2 levels in the rumen. The 4.15 Mb chromosome of SA11 has an average G + C content of 47 %, and encodes 3805 protein-coding genes. There is a single prophage inserted in the chromosome, and several other gene clusters appear to have been acquired by horizontal transfer. These include genes for cell wall glycopolymers, a type VII secretion system, cell surface proteins and chemotaxis. SA11 is able to use a variety of organic substrates in addition to H2/CO2, with acetate and butyrate as the principal fermentation end-products, and genes involved in these metabolic pathways have been identified. An unusual feature is the presence of 39 genes encoding trimethylamine methyltransferase family proteins, more than any other bacterial genome. Overall, SA11 is a metabolically versatile organism, but its ability to grow on such a wide range of substrates suggests it may not be a suitable candidate to take the place of hydrogen-utilizing methanogens in the rumen. Electronic supplementary material The online version of this article (doi:10.1186/s40793-016-0147-9) contains supplementary material, which is available to authorized users.


Introduction
Methane produced by methanogenic archaea during the fermentation of plant material in the rumen is widely regarded as a significant contributor to anthropogenic greenhouse gas emissions from ruminant livestock. Several approaches to reduce CH 4 emissions from farmed animals are currently being investigated, and the genomes of several rumen methanogens have been sequenced to support strategies designed to reduce the number or metabolic activity of methanogens in the rumen [1]. Hydrogen is necessary for methanogenesis and this has led to proposals that organisms which compete with methanogens for H 2 could be used to reduce CH 4 production [1][2][3][4]. Anaerobic bacteria capable of reductive acetogenesis are of particular interest as these organisms use the Wood-Ljungdahl pathway to synthesize acetyl-CoA by the reduction of CO or CO 2 and H 2 with the resulting acetate available to the animal [5]. Thus an additional strategy proposed is the use of acetogens in conjunction with methanogen inhibition so that hydrogen does not accumulate and inhibit fermentation.
In some gut environments acetogens can compete with methanogens for H 2 , although the process is not energetically favoured by conditions found in the mature rumen [6]. Nevertheless, reductive acetogenesis has been shown to occur in batch cultures when methanogenesis is inhibited and acetogens are added [7,8]. Acetogenic bacteria are thought to be the dominant hydrogenotrophs in early rumen microbiota [9,10], and understanding their ecology in the developing digestive tract of ruminants may reveal key features that lead to the prevalence of methanogens and the restriction of homoacetogens in the adult rumen.
Consequently, rumen acetogens are of interest in the development of microbial approaches to methane mitigation. Several acetogens have been isolated from the rumen [2], and analyses of sequences of formyltetrahydrofolate synthetase, a key enzyme of the Wood-Ljungdahl pathway, indicate that additional species remain uncultured [11,12]. Here we present the genome sequence of E. limosum strain SA11 isolated from the rumen of a sheep [2].

Classification and features
Eubacterium limosum SA11 was isolated from the rumen of a New Zealand sheep grazing fresh forage [2], and was originally described as sheep acetogen SA11 but not characterized further. Cells of SA11 are Gram positive non-motile rods occurring singly and in pairs (Fig. 1). The 16S rRNA from SA11 is 97 % similar to the E. limosum type strain ATCC 8486 T which was isolated from human faeces, and as such SA11 can be considered as a rumen strain of E. limosum (Fig. 2). Strains of E. limosum have been isolated from various anaerobic environments including the gastrointestinal tract of various animals, sewage and mud [13,14]. E. limosum was the first rumen acetogen to be isolated [13], and this strain (RF) was characterized [15,16] and used in co-culture studies with the pectin-degrading rumen bacterium Lachnospira multipara [17]. These studies showed E. limosum to be a metabolically versatile bacterium able to grow on a wide variety of compounds including CO, CO 2 /H 2 , hexoses, pentoses, alcohols, methyl-containing compounds, formate, lactate, and some amino acids. Acetate and butyrate are the main fermentation endproducts, although butyrate production is low when grown on CO 2 /H 2 [13]. Additional characteristics of strain SA11 are shown in Table 1.

Genome sequencing information
Genome project history Eubacterium limosum SA11 was selected for genome sequencing as an example of a rumen acetogen isolated in New Zealand with potential application in methane mitigation strategies. A summary of the genome project information is shown in Table 2 and Additional file 1: Table S1 .

Growth conditions and genomic DNA preparation
Strain SA11 was able to grow in CO 2 -containing media with the following energy sources (all tested at 10 mM): hydrogen, formate, D-glucose, D-fructose, D-xylose, Dribose, maltose, pyruvate, L-lactate, methanol, vanillate, syringate, and 3,4,5-trimethoxybenzoate. Growth was assessed as an increase in culture density compared to cultures that contained none of the added energy sources. The following did not support growth: D-mannose, D-galactose, Larabinose, L-rhamnose, D-cellobiose, sucrose, lactose, melibiose, raffinose, D-mannitol, D-sorbitol, glycerol, succinate, ethanol, ethylene glycol, 2-methoxyethanol, gallate, ferulate, aesculin, glycine, L-glutamate, and betaine. Glucose and methanol are the best substrates and support the growth of SA11 to a high cell density. Strain SA11 grew most rapidly at pH values of 6.5 to 7.0 ( Fig. 3a) and at a temperature of about 40°C (Fig. 3b). These are typical of its rumen environment.
Cells of SA11 grown with hydrogen or glucose were resuspended in fresh medium and 5000 Pa hydrogen was added to the culture headspace. Cells grown with both substrates were able to used gaseous hydrogen to a threshold concentration of 347 to 375 Pa (Fig. 4), at which point hydrogen use stopped. These concentrations are equivalent to 2.10 to 2.25 μM dissolved hydrogen. Normal ruminal hydrogen concentrations can exceed this directly after feeding, but are also below this over the animal feeding cycle [18], meaning that strain SA11 probably can grow as a hydrogen-dependent homoacetogen at times when hydrogen concentrations are high in the rumen. SA11 cells for genome sequencing were grown in RM02 medium [19] with 10 mM glucose and 0.1 % yeast extract but without rumen fluid. Culture purity was confirmed by Gram stain and sequencing of the 16S rRNA gene. Genomic DNA was extracted from freshly grown cells by standard cell lysis methods using lysozyme, proteinase K and sodium dodecyl sulphate, followed by phenol-chloroform extraction, and purified using the Qiagen Genomic-Tip 500 Maxi kit (Qiagen, Hilden, Germany). Genomic DNA was precipitated by the addition of 0.7 vol isopropanol, and collected by centrifugation at 12,000 × g for 10 min at room temperature. The supernatant was removed, and the DNA pellet was washed in 70 % ethanol, re-dissolved in TE buffer (10 mM Tris-HCl, 1 mM EDTA pH 7.5) and stored at -20°C until required.

Genome sequencing and assembly
The complete genome sequence of SA11 was determined using pyrosequencing of a paired-end 454 GS-FLX sequence library with Titanium chemistry (Macrogen, Korea). Pyrosequencing reads provided 43× coverage of the genome and were assembled using the Newbler assembler version 2.0 (Roche 454 Life Sciences, USA). The assembly process resulted in 39 contigs across 1 scaffold. Gap closure was managed using the Staden package [20] and gaps were closed using additional Sanger sequencing by standard and inverse PCR based techniques.

Genome annotation
Genome annotation of the SA11 genome was managed as described previously [21]. The genome sequence was prepared for NCBI submission using Sequin [22], and the adenine residue of the start codon of the chromosomal replication initiator protein DnaA (ACH52_0001) gene was chosen as the first base for the genome.

Genome properties
The genome of E. limosum SA11 consists of a single 4,150,332 basepair (bp) circular chromosome with an average G + C content of 47.4 %. A total of 3902 genes were predicted, of which 3805 were protein-coding genes. The properties and statistics of the SA11 genome are summarized in Tables 3 and 4, and the nucleotide sequence has been deposited in Genbank under accession number CP011914. The genome atlas for E. limosum SA11 is shown in Fig. 5. Three other E. limosum strains have had their genome sequences determined. These are the closed genome of strain KIST612 (4,276,902 bp) isolated from an anaerobic digester [23], the draft genome of the type strain ATCC 8486 T (4,370,113 bp) isolated from human faeces [24], and the draft genome of strain 32_A2 isolated from a deep subsurface shale carbon reservoir (Project ID: Gp0114934).

Cell envelope
Chemical analysis of the cell wall of the type strain of E. limosum (ATCC 8486 T ) shows the presence of the amino sugars N-acetylmuramic acid (2.9 % dry weight), N-acetylglucosamine (2.1 %) and N-acetylgalactosamine (3.9 %) together with larger amounts of rhamnose (20.4 %), glucose and galactose (together 14.9 %). Amino Fig. 2 Phylogenetic tree highlighting the position of E. limosum SA11 relative to the type strains of the other Eubacterium species. The evolutionary history was inferred using the Neighbor-Joining method [43]. The optimal tree with the sum of branch length = 0.83983608 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches [44]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Kimura 2-parameter method [45] and are in the units of the number of base substitutions per site. The rate variation among sites was modeled with a gamma distribution (shape parameter = 1). The analysis involved 16 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 1214 positions in the final dataset. Evolutionary analyses were conducted in MEGA6 [46]. Species with strain genome sequencing projects registered in the Genomes Online Database (GOLD) [47] are labeled with an asterisk acids identified as present in peptidoglycan were alanine (3.6 %), glutamic acid (8.0 %), lysine (9.0 %), ornithine (12.1 %) and serine (3.4 %) and a putative structure of the peptidoglycan was proposed [25]. In strain SA11 the genes for peptidoglycan biosynthesis are similar to those from other Gram positive bacteria but without the mreBCD genes predicted to control cell shape. The SA11 genome contains a large number of genes predicted to be involved in the synthesis of cell wall glycopolymers. These are ordered in six clusters, (ACH52_0663-687 which contains rhamnose biosynthesis genes, ACH52_1029-1040*, ACH52_1350-1371* which contains sialic acid biosynthesis genes, ACH52_1470-1484, ACH52_1620-1630* and ACH52_2094-2105*). Four of these clusters (marked *) are located next to transposase genes. There are also numerous cell surface proteins which contain a variety of domains. SA11 has one cluster of genes (ACH52_2223-2229) predicted to be involved in the biosynthesis and export of a nonribosomally synthesised peptide of unknown function. The non-ribosomal peptide synthetase gene (ACH52_2225) encodes a 2442 amino acid protein which shows 90 % identity with a similar protein (also 2442 amino acids) from E. limosum KIST612. The genomic location of the non-ribosomal peptide synthetase gene differs in the two strains.
Mobile elements SA11 has a 55 kb prophage (Fig. 6) integrated into the genome (ACH52_1707-1805) adjacent to a serine tRNA. Strain KIST612 does not have a prophage at this location but has three prophages at other sites on the chromosome. In terms of phage defense systems the SA11 chromosome has one cluster of CRISPR genes and two spacer regions at the same locations as found in strain KIST612, but does not contain genes for components of restriction/modification systems. However, there is a gene for a restriction alleviation protein (ACH52_1751) located in the prophage. In addition to the prophage several other gene clusters appear to have been acquired by horizontal transfer. These include all six of the cell wall glycopolymer gene clusters as well as genes for a type VII secretion system (ACH52_0209-0234), cell surface proteins (ACH52_0843-0846), and genes of unknown function ACH52_1057-1076, ACH52_1256-1271, and ACH52_3658-3696). SA11 also has chemotaxis  genes (ACH52_0307-0324 and ACH52_3642-3645) which are not present in strain KIST612, but the function of these is unknown as no flagella genes are found in either genome.

Metabolism
SA11 has a large repertoire of genes involved in central metabolism and grew with hydrogen, formate, some sugars, some compounds containing methoxyl-groups such as methanol and methoxylated benzoates, lactate and pyruvate. These are all typical energy sources for homoacetogenic bacteria.

The Wood-Ljungdahl pathway and energy conservation
The Wood-Ljungdahl pathway is central to the metabolism of acetogens and the genes encoding this pathway are found in three distinct clusters in SA11 (ACH52_291-295, ACH52_2912-2912, ACH52_3087-3089) as has been reported for strain KIST612 [26]. SA11 produced only acetate from hydrogen plus carbon dioxide and from glucose, consistent with the use of the Wood-Ljungdahl pathway. Energy conservation in the Wood-Ljungdahl pathway and in acetogens in general has been the focus of extensive study but is not yet fully understood [27]. Key elements of energy conservation systems in E. limosum are the membrane-bound Na + -translocating Rnf (ACH52_1410-1415) and ATP synthase complexes [26]. As reported for strain KIST612 [26], SA11 has two sets of ATP synthase genes which show different gene orders (ACH52_1610-1617 and ACH52_1920-1928).

Polysaccharides
In contrast to most rumen bacteria, SA11 has very few genes encoding glycoside hydrolases. There are two  genes encoding GH3 family proteins, one of which (ACH52_0577) has a signal peptide and probably also has a role in cell wall biosynthesis. The gene for a secreted GH4 family protein is located next to an alpha-glucoside specific PTS transport system protein (sa1_0874-0875). SA11 has six genes encoding GH13 family proteins, all of which are predicted to be intracellular and one of which is part of a gene cluster involved in glycogen biosynthesis and degradation (ACH52_0652-0657).
Purines SA11 has a large conserved genetic region associated with selenium-dependent molybdenum hydroxylases (ACH52_1581-1608) [28] which ends with the molybdate ABC transporter genes. The role of these genes in SA11 is not known but it is likely that they encode the selenium-containing xanthine dehydrogenase characterized from the closely related Eubacterium barkeri [29].

1,2 propanediol
Rhamnose and fucose are common components of plant cell walls and bacterial exopolysaccharides, and their degradation in the rumen results in lactaldehyde, which is reduced by lactaldehyde reductase to 1,2 propanediol (1,2-PD). There is no literature on the metabolism of 1,2-PD by E. limosum, but the acetogen Acetobacterium woodii can grow on 1,2-PD producing propionate and propanol as end products [31]. This process occurs independently of acetogenesis. The 1,2-PD degradative pathway has been determined in Salmonella enterica and, because the propionaldehyde intermediate is highly toxic to the cell, the process occurs within an organelle called a bacterial microcompartment (BMC) [32]. The BMC consists of a thin protein shell made up of several thousand copies of polypeptides with conserved domains described by the Pfams PF00936 (found in 7 proteins in SA11) and PF03319 (1 protein in SA11). SA11 has a cluster of 19 pdu genes encoding degradative enzymes and BMC production (ACH52_0472-490). The gene arrangement is identical to A. woodii [31], except the pduO' gene (Awo_c25780) is not present.

Methyl-containing compounds
Pectins make up a significant proportion of plant cell walls and their complex structures are often highly methylated so that action of the enzyme pectin methyl esterase produces methanol in the rumen [33]. E. limosum grows well on methanol [15] and has a methanol:corrinoid methyltransferase (ACH52_2073) as part of a larger gene cluster. Phenyl methyl ethers are degradation products of lignin, and their methyl groups can be utilized as carbon and energy sources by acetogens including E. limosum [16] and the closely related E. callanderi [34]. The ether cleavage is mediated by the Odemethylases, which consist of four different proteins: two methyltransferases, a corrinoid protein, and an activating enzyme. SA11 has several genes similar to those described from other bacteria [35] and one gene cluster (ACH52_0344-0347), which is not present in the KIST612 strain, may be involved in the metabolism of these compounds. An unusual feature of the SA11 genome is the presence of multiple copies of genes encoding trimethylamine methyltransferase family proteins (COG05598). SA11 has 39 genes in this category, more than any other bacterial  genome, and this seems to be a characteristic of the species as the KIST612 strain has 31 examples. These genes are restricted to the two-thirds of the genome closest to the origin of replication with none found between ACH52_1572 and ACH52_2975. All of these genes are similar in size and predicted to encode proteins between 458 and 492 amino acids. They are usually associated with genes for cobalamin B12-binding proteins (COG05012), BCCT (betaine/carnitine/choline transporter, COG01292, [36]) and MFS family transporters and GntR family transcriptional regulators. Their substrate is not known. E. limosum is known to have the ability to demethylate, and thereby increase the bioactivity of, a range of plant isoflavonoids [37][38][39]. This has led to it being linked with possible health benefits and longevity [40].

Lactate
Lactate is used for growth by E. limosum SA11, and the mechanism of lactate utilization in acetogens has recently been determined in Acetobacterium woodii [41].
In this species a stable complex is formed between lactate dehydrogenase and the two subunits of an electrontransferring flavoprotein. This complex uses flavin-based electron bifurcation for energetic coupling. The genes for this complex have been identified in A. woodii, and a similar gene cluster is found in E. limosum KIST612 [41] and also in SA11 (ACH52_2109-2113).

Butyrate
Butyrate is produced when E. limosum is grown on a range of substrates and butyrate production by strain KIST612 grown on CO has been studied [26]. The genes for the pathway from acetyl-CoA to butyryl-CoA have been identified and are also found in SA11 (ACH52_3484-3489). The cluster of butyrate genes also includes the two subunits of an electron-transferring flavoprotein (EtfAB) and it is proposed that butyryl-CoA dehydrogenase forms a complex with EtfAB and also uses flavin-based electron bifurcation as reported in Clostridium kluyveri [42]. E. limosum does not have a butyrate kinase and uses the alternative pathway that transfers the CoA moiety from butyryl-CoA onto acetate (butyryl-CoA:acetate CoAtransferase, ACH52_2647) as the final step in butyrate formation. The SA11 genome contains three EtfAB pairs (ACH52_238-239, ACH52_3175-3176 and ACH52_3178-3179) additional to the ones involved in lactate and butyrate metabolism but the function of these is not known, and is not apparent from their genome context.

Conclusion
The genome sequence of Eubacterium limosum SA11 provides insights into the metabolism of this versatile rumen acetogen. SA11 can grow autotrophically using CO 2 /H 2 or heterotrophically using a diverse range of substrates with the best growth on glucose or methanol. If autotrophic growth could be encouraged, and hydrogenotrphic methanogens inhibited, then SA11 could be a useful addition to methane mitigation strategies. However, it is apparent that in the rumen SA11 would have a number of different substrates to select from and that autotrophic growth is unlikely to be the norm. Consequently, it is unlikely to be a suitable candidate to take the place of hydrogenotrophic methanogens in the rumen. SA11 does grow well on methanol and it would be interesting to determine if it is able to compete with the methylotrophic methanogens such as Methanosphaera species and members of the order Methanomassiliicoccales that are present in the rumen.