Complete Genome sequence of Burkholderia phymatum STM815T, a broad host range and efficient nitrogen-fixing symbiont of Mimosa species

Burkholderia phymatum is a soil bacterium able to develop a nitrogen-fixing symbiosis with species of the legume genus Mimosa, and is frequently found associated with Mimosa pudica. The type strain of the species, STM815T, was isolated from a root nodule in French Guiana in 2000. The strain is an aerobic, motile, non-spore-forming, Gram-negative rod. It is highly competitive for nodulation compared with other Mimosa symbionts and also nodulates a broad range of other legume genera and species. The 8,676,562 bp genome is composed of two chromosomes (3,479,187 and 2,697,374 bp), a megaplasmid (1,904,893 bp) and a plasmid hosting the symbiotic functions (595,108 bp).


Introduction
Rhizobia are a functional class of bacteria able to enter into nitrogen-fixing symbioses with legumes. The bacterial symbiont induces the formation of nodules on the roots of the plant, within which the bacteria differentiate into nitrogen-fixing bacteroids. The bacteroids then supply combined nitrogen to the plant, which in return provides them with energy derived from photosynthesis. This symbiosis confers agricultural advantages on legumes by reducing the need for fertilization, and allows them to act as pioneer plants on degraded or contaminated soils.
Rhizobia are polyphyletic and are placed within two classes of Proteobacteria, the Alphaproteobacteria and the Betaproteobacteria. They are closely related to non-symbiotic species, including important human, animal or plant pathogens and saprophytes. Most research has focused on the α-rhizobia, since the β-rhizobia were only recently discovered [1,2]. The α-rhizobia include 10 genera (Sinorhizobium, Mesorhizobium, Rhizobium, Methylobacterium, Devosia, Azorhizobium, Bradyrhizobium, Ochrobactrum, Bosea and Phyllobacterium) and have a worldwide distribution in association with a diversity of legume species (from herbs to trees). To date, the β-rhizobia include only two genera, Burkholderia and Cupriavidus (ex Ralstonia), and a dozen species (for review see [3], updated in [4]). They are found preferentially associated with Mimosa species (at least 68 nodulated species, especially M. pudica, M. pigra, and M. bimucronata) in Asia, Australia, and Central and South America [5,6]. Based on a comparison of housekeeping and nodulation gene phylogenies, Burkholderia species have been postulated to be ancestral symbionts of South American Mimosa and Piptadenia species [4,5].
Standards in Genomic Sciences
Here we describe the genome sequence of one of the first described β-rhizobia, the type strain of Burkholderia phymatum, STM815T.

Classification and features
Burkholderia phymatum STM815T is a motile, Gram-negative rod (Figure 1) in the order Burkholderiales of the class Betaproteobacteria. It is fast growing, forming colonies within 3-4 days when grown on yeast-mannitol agar (YMA [7]) at 28°C. It is one of the first described members of the β-rhizobia. Strain STM815T, which is the type strain of the species, was isolated from nodules of Machaerium lunatum in French Guiana in 2000 [1], and the species, B. phymatum, was described based on this single isolate [8]. However, the species has subsequently been shown not to nodulate Machaerium species [9], but it can nodulate species in the large genus Mimosa [9,10]. Indeed, the symbiotic abilities of STM815T have been demonstrated on numerous Mimosa species, and this strain is now considered to be an efficient symbiont of a broad range of legumes, particularly in Mimosa and related genera in the sub-family Mimosoideae [9]. Strain STM815T is also able to fix nitrogen in free-living conditions [9]. Many isolates of B. phymatum have been sampled from Mimosa pudica in French Guiana [10], Papua New Guinea [9], China [11] and India [12]. Phylogenetic analyses of core and symbiotic genes have illustrated the ancestral status of Burkholderia species in symbioses with Mimosa [4,5]. Burkholderia phymatum STM815T is now considered to be a model system for studying the adaptive processes of Burkholderia in symbioses with legumes, in comparison with α-rhizobia. The species B. phymatum is phylogenetically related to symbiotic and non-pathogenic species, and is distant from the "cepacia" clade of Burkholderia, which contains many pathogenic species (Figure 2, Table 1).

Symbiotaxonomy
Burkholderia phymatum STM815T forms nodules (Nod+) and fixes N2 (Fix+) with a broad range of Mimosa species [6,9] as well as with other genera in the tribe Mimoseae of the legume sub-family Mimosoideae [9]. Nodulation data are compiled in Table 2.

Figure 2. Phylogenetic tree highlighting the position of Burkholderia phymatum strain STM815T relative to other type strains within the genus Burkholderia. The 16S rDNA sequences from type strains were obtained from the Ribosomal Database Project [13] and aligned with MUSCLE 3.6, and a neighbor-joining tree was built from a Kimura two-parameter corrected distance matrix using BioNJ on the www.phylogeny.fr server [14]. Numbers at nodes are bootstrap percentages from 1000 replicates (shown only if >50%). Accession numbers of the 16S rDNA sequences are indicated in parentheses for each strain. C. taiwanensis LMG19424T was used as the outgroup.

Table 1 (fragment). [1,9] MIGS-14 Pathogenicity: None. a Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [26].
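The Kimura two-parameter correction mentioned in the Figure 2 caption can be illustrated with a short sketch. This is not the code run on the phylogeny.fr server; it is a minimal stand-alone example of the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns containing gaps or ambiguous bases are skipped (an assumption
    of this sketch; real tools offer several gap-handling options).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable positions")

    p = transitions / compared
    q = transversions / compared
    # math.log / math.sqrt raise ValueError when the sequences are too
    # divergent for the correction to be defined (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy 20-bp alignment with one transition (A->G) and one transversion (C->A).
toy_a = "ACGTACGTACGTACGTACGT"
toy_b = "GAGTACGTACGTACGTACGT"
toy_distance = k2p_distance(toy_a, toy_b)
```

For the toy pair the uncorrected proportion of differences is 0.10, and the corrected distance is slightly larger because the model accounts for multiple substitutions at the same site.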

Genome sequencing information

Genome project history
The genome was selected by a consortium of researchers led by M. Riley to be sequenced by the DOE Joint Genome Institute as part of the "Recommendations for Sequencing Targets in Support of the Science Missions of the Office of Biological and Environmental Research". The strain was initially chosen to enrich the genome data available for the genus Burkholderia for comparative genomics. It was selected for genome determination because strain STM815T is a legume symbiont, in contrast to the large number of genome sequences available for opportunistic and human pathogens. The genome sequence was completed in 2007 and released for public access in April 2008. Automatic annotation was performed using the JGI-Oak Ridge National Laboratory annotation pipeline [28]. Additional automatic and manual sequence annotation, as well as comparative genome analysis, were performed using the MicroScope platform at Genoscope [29]. Table 3 presents the project information and its association with MIGS version 2.0 compliance [30].
http://standardsingenomics.org
Table 2 legend: O = no nodules formed; N = outgrowths on roots, superficially similar to nodules but ineffective; I = nodules formed are inefficient; F = nitrogen-fixing nodules formed (these may not all be fully effective, but plants gave acetylene reduction values at least twice that of non-nodulated control plants).
*This is taken to include Acacia subgenus Acacia, now thought to be closely related to tribe Mimoseae and given the generic name Vachellia by some.

Growth conditions and DNA isolation
The strain was grown in 50 ml of yeast-mannitol (YM) broth [7], and DNA was isolated using a CTAB (cetyltrimethylammonium bromide) bacterial genomic DNA isolation method [31].

Genome sequencing and assembly
The genome of Burkholderia phymatum STM815T was sequenced by Sanger technology at the Joint Genome Institute (JGI) using a combination of 3 kb, 8 kb and 40 kb (fosmid) DNA libraries. All general aspects of library construction and sequencing performed at the JGI can be found at the DOE JGI website [32]. Draft assemblies were based on 115,329 total reads and resulted in approximately 11.2× coverage of the genome. The Phred/Phrap/Consed software package was used for sequence assembly and quality assessment [33-35]. Gaps between contigs were closed by custom primer walks on gap-spanning clones or PCR products. A total of 1,282 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. The completed genome sequence of B. phymatum STM815T contains 115,487 reads, achieving an average of 11.2-fold sequence coverage per base with an error rate of less than 1 in 100,000.
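The coverage and error-rate figures above can be related to each other with simple arithmetic. The sketch below uses the usual definition of mean coverage (reads × mean read length / genome size) and the standard Phred scale (Q = -10 log10 of the error probability). The mean read length is not reported in the text, so it is back-calculated here from the stated values and should be read as an implied estimate only; the function names are illustrative.

```python
import math


def mean_coverage(n_reads, mean_read_length, genome_size):
    """Mean fold coverage: total sequenced bases divided by genome size."""
    return n_reads * mean_read_length / genome_size


def implied_read_length(n_reads, coverage, genome_size):
    """Mean read length implied by a reported coverage (rearranged formula)."""
    return coverage * genome_size / n_reads


def phred_quality(error_probability):
    """Phred-scaled quality for a given per-base error probability."""
    return -10 * math.log10(error_probability)


GENOME_SIZE = 8_676_562   # bp, from the text
FINAL_READS = 115_487     # reads in the finished assembly, from the text
COVERAGE = 11.2           # fold coverage, from the text

# Not stated in the text: derived from the three values above.
read_length = implied_read_length(FINAL_READS, COVERAGE, GENOME_SIZE)

# "Less than 1 error in 100,000 bases" corresponds to this Phred score or better.
finished_quality = phred_quality(1 / 100_000)

print(f"implied mean read length: {read_length:.0f} bp")
print(f"Phred quality at 1 error in 100,000: Q{finished_quality:.0f}")
```

The implied read length of roughly 840 bp is in the range expected for Sanger reads, which is consistent with the sequencing technology described.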

Genome annotation
Automatic annotation was performed using the Integrated Microbial Genomes (IMG) platform [36] developed by the Joint Genome Institute, Walnut Creek, CA, USA [28]. Additional automatic and manual sequence annotation, as well as comparative genome analysis, were performed using the MicroScope platform at Genoscope [29]. Gene calling in MicroScope resulted in the prediction of 940 additional protein-coding sequences compared to the 7,496 detected at IMG. These additional genes were mostly short coding sequences considered to be gene remnants or fragmented CDS, so the genome statistics presented here are those from the IMG platform.

Genome properties
The genome includes two chromosomes and two plasmids, for a total size of 8,676,562 bp (62.3% GC content). Chromosome 1 is 3.48 Mb in size (63.0% GC), chromosome 2 is 2.69 Mb (62.3% GC), plasmid 1 is 1.90 Mb (62.0% GC) and plasmid 2 is 0.59 Mb (59.2% GC). For chromosomes 1 and 2, 3,140 and 2,358 genes were predicted, respectively. For plasmids 1 and 2, 1,627 and 449 genes were predicted, respectively. A total of 7,496 protein-coding genes were predicted, of which 5,601 were assigned a putative function, with the remainder annotated as hypothetical proteins. In this genome, 5,630 protein-coding genes belong to COG families. The properties and statistics of the genome are summarized in Tables 4-6, and circular maps of each replicon are shown in Figure 3 (chromosomes) and Figure 4 (plasmids).

Plasmid 2 was identified as the symbiotic plasmid of STM815, as it carries the nod, nif and fix genes directly involved in symbiosis as well as several other genes coding for proteins indirectly linked to symbiotic interactions with plants. Among these are genes for the biosynthesis of the phytohormone indole acetic acid (iaaHM), an ACC deaminase gene (acdS), and genes involved in the biosynthesis of rhizobitoxine (rtxAC-like). A type 4 secretion system was also identified on this plasmid, while no type 3 secretion system could be detected in the whole genome.

Table footnotes: The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome. b Also includes 39 pseudogenes.
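The replicon figures reported for this genome can be checked for internal consistency with a few lines of arithmetic. The sketch below takes the exact replicon sizes from the abstract and the per-replicon GC contents and gene counts from this section, then recomputes the genome size and the size-weighted GC content. The variable names are illustrative.

```python
# (size in bp, GC %, predicted genes) for each replicon, as reported in the text.
REPLICONS = {
    "chromosome 1": (3_479_187, 63.0, 3_140),
    "chromosome 2": (2_697_374, 62.3, 2_358),
    "plasmid 1":    (1_904_893, 62.0, 1_627),
    "plasmid 2":    (595_108, 59.2, 449),
}

# Genome size is the sum of the replicon sizes.
total_size = sum(size for size, _, _ in REPLICONS.values())

# Genome-wide GC content is the size-weighted mean of the replicon values.
weighted_gc = sum(size * gc for size, gc, _ in REPLICONS.values()) / total_size

# Sum of the per-replicon gene predictions.
total_genes = sum(genes for _, _, genes in REPLICONS.values())

print(f"total size: {total_size:,} bp")
print(f"weighted GC content: {weighted_gc:.1f}%")
print(f"predicted genes over all replicons: {total_genes:,}")
```

The recomputed size and GC content agree with the reported 8,676,562 bp and 62.3%. The per-replicon gene counts sum to 7,574, which is 78 more than the 7,496 protein-coding genes; the text does not break this difference down.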

Venn diagram (family number)
Gene families specific to, or shared by, Burkholderia phymatum STM815T and 3 other Burkholderia species were determined using MICFAM (Figure 5). This tool is based on MicroScope gene families [39], which are computed using an algorithm implemented in the SiLiX software [40]: a single-linkage clustering of homologous genes sharing an amino-acid alignment coverage and identity above a defined threshold. This algorithm operates on the "the friends of my friends are my friends" principle of gene comparison: if two genes are homologous, they are clustered, and if one of them is already clustered with a third gene, all three are placed in the same MICFAM.

The three other strains were the Mimosa symbiont BR3459a [37,38], a soil bacterium (B. xenovorans LB400) and a human opportunistic pathogen (B. cenocepacia AU1054). The core genome shared by all four bacteria comprised 1,582 gene families. Each bacterium had more specific gene families (from 3,002 to 5,656, depending on the strain) than shared ones (1,582 core gene families). There were 418 gene families specific to the two Mimosa symbionts (STM815 and BR3459a), including nodulation genes (nod), nitrogen fixation genes (nif, fix), glutamine transporters, biosynthesis genes for the phytohormone indole acetic acid (IAA), and hydrogenase genes (hup, hyp).
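The single-linkage rule described for MICFAM can be illustrated with a small union-find sketch. This is not the SiLiX implementation: the function name, the input format (pairwise hits with identity and coverage as fractions) and the threshold values are assumptions made for the example, since the text only refers to "a defined threshold".

```python
def cluster_families(hits, min_identity=0.35, min_coverage=0.80):
    """Group genes into families by single-linkage clustering.

    hits: iterable of (gene_a, gene_b, identity, coverage) tuples.
    The two thresholds are placeholders for illustration only.
    Returns a sorted list of families, each a sorted list of gene names.
    """
    parent = {}

    def find(gene):
        # Register unseen genes as their own family, then walk to the root.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path compression
            gene = parent[gene]
        return gene

    for gene_a, gene_b, identity, coverage in hits:
        root_a, root_b = find(gene_a), find(gene_b)
        # "Friends of my friends": any single qualifying link merges families.
        if identity >= min_identity and coverage >= min_coverage:
            if root_a != root_b:
                parent[root_b] = root_a

    families = {}
    for gene in list(parent):
        families.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(members) for members in families.values())


# Hypothetical pairwise hits: g1-g2 and g2-g3 pass, g4-g5 fails on identity.
example_hits = [
    ("g1", "g2", 0.90, 0.95),
    ("g2", "g3", 0.60, 0.90),
    ("g4", "g5", 0.20, 0.90),
]
example_families = cluster_families(example_hits)
```

In the example, g1 and g3 are never compared directly but end up in the same family through g2, which is the behaviour that distinguishes single linkage from stricter clustering rules. Counting the families whose members come from every compared genome, or from only one, gives the kind of numbers shown in a Venn diagram.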

Conclusion
Burkholderia phymatum STM815T possesses a large genome composed of two chromosomes and two plasmids, one of which encodes the symbiotic functions. Further studies on the genome of this bacterium will help elucidate the high nodulation competitiveness [41], broad host range and symbiotic efficiency of this strain.