Genome sequence of Burkholderia mimosarum strain LMG 23256T, a Mimosa pigra microsymbiont from Anso, Taiwan

Burkholderia mimosarum strain LMG 23256T is an aerobic, motile, Gram-negative, non-spore-forming rod that can exist as a soil saprophyte or as a legume microsymbiont of Mimosa pigra (giant sensitive plant). LMG 23256T was isolated from a nodule recovered from the roots of M. pigra growing in Anso, Taiwan, and is highly effective at fixing nitrogen with this host. Here we describe the features of B. mimosarum strain LMG 23256T, together with genome sequence information and its annotation. The 8,410,967 bp high-quality-draft genome is arranged into 268 scaffolds of 270 contigs, contains 7,800 protein-coding genes and 85 RNA-only encoding genes, and is one of 100 rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project.


Introduction
Members of the versatile genus Burkholderia occupy a wide range of ecological niches: they are found in soil and in hospital environments, in association with plants as epiphytes, endophytes or pathogens, and as endosymbionts of phytopathogenic fungi or plant-associated insects [1]. As several Burkholderia strains are known to exert plant-beneficial and biocontrol effects, and also contribute to adaptation to environmental stresses, there is increased interest in the use of Burkholderia in agriculture [1,2]. In addition to the different groups of rhizobia from the Alphaproteobacteria, a number of Betaproteobacteria belonging to Burkholderia and Cupriavidus are now also known to be present in legume nodules; they are sometimes referred to as betarhizobia [3-5].
Several Burkholderia species have been described from root nodules of different Mimosa species: B. caribensis from M. pudica and M. diplotricha [4,6], B. mimosarum from M. pigra and M. scabrella [7], B. nodosa from M. bimucronata and M. scabrella [8], B. phymatum from M. invisa and Machaerium lunatum [6,9] and B. sabiae from M. caesalpiniifolia [10]. Moreover, several Burkholderia strains have been shown to enter into effective symbiosis with their host [11]. B. mimosarum was described for a collection of isolates obtained from M. pigra in Taiwan, Venezuela and Brazil and one strain from M. scabrella in Brazil [7]. Since its first description, B. mimosarum has also been isolated from M. pigra nodules in China and Australia [12,13], from M. diplotricha in Papua New Guinea [14] and from M. pudica in French Guiana [15].
M. pigra, M. pudica and M. diplotricha are all notoriously invasive species [16]. M. pudica (sensitive plant) is a small South American shrub that has become a pan-tropical weed, while M. pigra (giant sensitive plant, black mimosa, prickly wood weed, catclaw mimosa) is a shrub that thrives in floodplains, swamps and river banks, where it creates dense spiny thickets [17]. M. diplotricha (creeping sensitive plant, nila grass, giant sensitive plant) is a climbing shrub that scrambles up other plants, quickly producing dense growth [18]. The success of these invasive weeds may in part be due to their highly effective symbiotic associations.
B. mimosarum LMG 23256T (=BCRC 17516, CCUG 54296, NBRC 106338, PAS44) originates from nodules of M. pigra in Taiwan, where this legume weed is predominantly nodulated by B. mimosarum. Other Taiwanese Mimosa species are nodulated mainly by Cupriavidus taiwanensis. It has therefore been suggested that the Burkholderia strains were introduced to Taiwan along with the invasive M. pigra from its native South America, where Burkholderia strains have been isolated from Mimosa spp. more frequently than C. taiwanensis [7,19].
Here we present a summary classification and a set of features for B. mimosarum strain LMG 23256T (Table 1), together with the description of the complete genome sequence and its annotation. The strain is fast-growing, forming colonies within 3-4 days when grown on half strength Lupin Agar (½LA) [32], tryptone-yeast extract agar (TY) [33] or a modified yeast-mannitol agar (YMA) [34] at 28°C. Colonies on ½LA are white-opaque, slightly domed and moderately mucoid with smooth margins (Figure 1, Right). Minimum Information about the Genome Sequence (MIGS) is provided in Table 1. Figure 2 shows the phylogenetic neighborhood of B. mimosarum strain LMG 23256T in a 16S rRNA sequence-based tree. This strain shares 99% (1,121/1,124 bp) and 98% (1,101/1,125 bp) 16S rRNA sequence identity with the fully sequenced strain B. mimosarum STM3621 (Gi08839) and with B. nodosa Br3461T, respectively.
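The identity percentages quoted above follow directly from the matched and aligned base counts. The short Python sketch below recomputes them; the function name `percent_identity` and the dictionary layout are illustrative choices and not part of the original analysis. The unrounded values come out at roughly 99.7% and 97.9%.

```python
def percent_identity(matches: int, aligned_length: int) -> float:
    """Return pairwise sequence identity as a percentage of aligned positions."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return 100.0 * matches / aligned_length


# (matching bases, aligned bases) as reported in the text for the 16S rRNA gene
comparisons = {
    "B. mimosarum STM3621": (1121, 1124),
    "B. nodosa Br3461T": (1101, 1125),
}

for strain, (matches, length) in comparisons.items():
    print(f"{strain}: {percent_identity(matches, length):.2f}% identity")
```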

Symbiotaxonomy
B. mimosarum LMG 23256T was isolated from M. pigra growing in Anso, Taiwan, and nodulates its original host with high efficiency [19], as well as M. pudica and M. diplotricha [14]. Under flooded conditions, LMG 23256T was shown to outcompete other rhizobia to the point of exclusion for the nodulation of the invasive M. pigra, M. pudica and M. diplotricha. This predominance was negatively affected by increased nitrate levels in the soil, which thus appear to be a factor affecting rhizobial competition [14]. With regard to other plant growth-promoting properties, LMG 23256T displayed no antifungal activity against Fusarium oxysporum f. sp. phaseoli, did not solubilize calcium, iron or aluminum phosphates, and did not reduce acetylene (acetylene reduction assay, ARA) on N-free media containing fructose, lactate or mannitol as sole carbon source [39].

Table 1. Classification of B. mimosarum LMG 23256T (classification rows only).
Phylum: Proteobacteria, TAS [22]
Class: Betaproteobacteria, TAS [23,24]
Order: Burkholderiales, TAS [24,25]
Family: Burkholderiaceae, TAS [24,26]
Genus: Burkholderia, TAS [27-29]
Species: Burkholderia mimosarum, TAS [7]
Strain: LMG 23256T
Evidence codes: TAS, Traceable Author Statement; NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [31].

Figure 2. Phylogenetic neighborhood of B. mimosarum strain LMG 23256T in a 16S rRNA sequence-based tree. All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5 [35]. The tree was built using the Maximum-Likelihood method with the General Time Reversible model [36]. Bootstrap analysis [37] with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Brackets after the strain name contain a DNA database accession number and/or a GOLD ID (beginning with the prefix G) for a sequencing project registered in GOLD [38]. Published genomes are indicated with an asterisk.

Genome project history
This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Community Sequencing Program at the U.S. Department of Energy, Joint Genome Institute (JGI) for projects of relevance to agency missions. The genome project is deposited in the Genomes OnLine Database [38] and an improved-high-quality-draft genome sequence in IMG. Sequencing, finishing and annotation were performed by the JGI. A summary of the project information is shown in Table 2.

Growth conditions and DNA isolation
B. mimosarum strain LMG 23256T was cultured to mid-logarithmic phase in 60 ml of TY rich medium on a gyratory shaker at 28°C [40]. DNA was isolated from the cells using a CTAB (cetyltrimethylammonium bromide) bacterial genomic DNA isolation method.

Genome annotation
Genes were identified using Prodigal [45] as part of the DOE-JGI annotation pipeline [46]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. The tRNAscan-SE tool [47] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [48]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and RNase P, were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [49]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG-ER) platform [50].

Genome properties
The genome is 8,410,967 nucleotides long with 63.89% GC content (Table 3) and comprises 268 scaffolds of 270 contigs (the four largest scaffolds are shown in Figures 3a, 3b, 3c and 3d). Of a total of 7,885 genes, 7,800 were protein-encoding and 85 were RNA-only encoding genes. The majority of genes (75.13%) were assigned a putative function, whilst the remaining genes were annotated as hypothetical. The distribution of genes into COGs functional categories is presented in Table 4.

Figure 3a-d. Graphical maps of the four largest scaffolds. Tracks shown: genes on forward strand (colored by COG categories as denoted by the IMG platform), genes on reverse strand (colored by COG categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.
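The summary statistics in this section can be cross-checked with a few lines of Python. The sketch below is illustrative only and is not the JGI or IMG implementation: `gc_content` and `gc_skew` are hypothetical helper names, GC skew is computed with the common (G - C)/(G + C) definition, which is an assumption about how the Figure 3 track was derived, and the toy sequences stand in for the real scaffolds. The gene counts and the 75.13% figure are taken from the text.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def gc_skew(seq: str) -> float:
    """GC skew as (G - C) / (G + C); assumed definition, see the lead-in."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        raise ValueError("sequence contains no G or C")
    return (g - c) / (g + c)


# Gene counts reported in the text
TOTAL_GENES = 7885
PROTEIN_CODING = 7800
RNA_ONLY = 85
PCT_WITH_FUNCTION = 75.13

# The two gene classes should add up to the reported total
assert PROTEIN_CODING + RNA_ONLY == TOTAL_GENES

# Approximate number of genes with a putative function implied by the percentage
genes_with_function = round(TOTAL_GENES * PCT_WITH_FUNCTION / 100)
print(f"Genes with a putative function: about {genes_with_function}")

# Toy sequence only; a real check would read the scaffold FASTA records
print(f"GC content of toy sequence: {gc_content('GGCCAATT'):.2f}%")
```

For a real assembly, the per-scaffold sequences would be concatenated or weighted by length before calling `gc_content`, so that the result is comparable with the genome-wide value in Table 3.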