Draft genome sequence of Methylibium sp. strain T29, a novel fuel oxygenate-degrading bacterial isolate from Hungary

Methylibium sp. strain T29 was isolated from a gasoline-contaminated aquifer and proved highly capable of degrading common fuel oxygenates such as methyl tert-butyl ether, tert-amyl methyl ether and tert-butyl alcohol, along with other organic compounds. Here, we report the draft genome sequence of M. sp. strain T29 together with a description of its genome properties and annotation. The draft genome consists of 608 contigs with a total size of 4,449,424 bp and an average coverage of 150×. The genome has an average G + C content of 68.7 % and contains 4754 protein-coding genes and 52 RNA genes, including 48 tRNA genes. Of the protein-coding genes, 71 % could be assigned to COG (Clusters of Orthologous Groups) categories. A previously unknown circular plasmid, designated pT29A, was isolated and sequenced separately and found to be 86,856 bp long.


Introduction
Fuel oxygenates such as methyl tert-butyl ether (MTBE), ethyl tert-butyl ether (ETBE) and tert-amyl methyl ether (TAME) have been blended into gasoline for decades to boost octane ratings and to improve the efficiency of fuel combustion in engines. Being the most water-soluble components of gasoline, however, they have also become some of the most frequently detected pollutants in groundwater, posing a serious threat to drinking water supplies [1]. Moreover, recent studies have reported that they can be carcinogenic in humans [2], so remediation of sites polluted with these compounds has become an important issue. Several microbial consortia and individual bacterial strains capable of degrading them to various extents have been isolated so far [3,4]. However, only a few of them have been studied in detail, and there are even fewer cases where the genetic and enzymatic background of the degradation has been elucidated, even in part.
Methylibium petroleiphilum PM1, one of the first individual MTBE-degrading strains to be isolated, originated from a compost-filled biofilter in Los Angeles, California, USA [5]. To date it is the only representative of the genus identified at the species level [6,7]. In laboratory experiments it showed outstanding MTBE-degrading ability, and it was also tested in a bioaugmentation field study [8]. Subsequently, a number of bacteria closely related to M. petroleiphilum PM1 were detected on the basis of 16S rDNA sequences at MTBE-contaminated sites in different geographic locations, suggesting that the genus might play an important role in MTBE biodegradation [8,9]. Its complete genome sequence, published later, revealed that besides the 4 Mb circular chromosome, M. petroleiphilum PM1 possesses a ~600 kb megaplasmid carrying the genes involved in MTBE degradation [10]. At present, no genome sequence information is available for other members of the genus Methylibium. As part of a French-Hungarian project aiming to characterize novel fuel oxygenate-degrading bacteria at the genomic level, we have isolated a novel Methylibium strain. The MTBE-degrading capacity of the strain was as high as that of M. petroleiphilum PM1, but some of its genetic and metabolic characteristics were found to be significantly different.
Here we present the classification and features of Methylibium sp. T29 together with the description of the draft genome sequence and annotation compared to the reference strain M. petroleiphilum PM1.
Initial taxonomic assignment of the strain was established by comparing its 16S ribosomal RNA gene sequence to the nonredundant Silva SSU Ref database [13,14]. Phylogenetic analysis was conducted using MEGA 6 [15]. According to the phylogenetic analysis, strain T29 belongs to the genus Methylibium (Table 1). The closest relative of strain T29 is M. petroleiphilum PM1 (Fig. 2).
Despite its close relatedness based on 16S rDNA sequences, the new strain differs from the type strain M. petroleiphilum PM1 in several respects. For example, unlike M. petroleiphilum PM1, strain T29 is resistant to tetracycline, ampicillin [16] and mercury, and cannot grow on n-alkanes [10]. Moreover, PCR primers designed for mdpA and other known genes involved in MTBE degradation in M. petroleiphilum PM1 [17] failed to detect any related sequences in strain T29, suggesting that the genetic makeup of MTBE metabolism in this strain differs significantly from that in M. petroleiphilum PM1. Pulsed field gel electrophoresis of restriction enzyme-digested genomic DNA of strain T29 and M. petroleiphilum PM1 revealed major differences in the genomic sequences of the two strains (data not shown). Based on the evidence above, the new strain was designated Methylibium sp. T29.

Genome project history
The genome of M. sp. T29 was sequenced using Ion Torrent technology in our facility. The draft genome was assembled de novo using the overlap layout consensus methodology with the freely available software GS De Novo Assembler 2.9 (Roche). This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number AZND00000000. The version described in this paper is AZND01000000. The plasmid pT29A was isolated and sequenced separately using the same technology. Its assembly was performed with a different approach, using SPAdes 3.0 [18]. The sequence was circularized and finished by manual editing. The full sequence of the plasmid pT29A is also available in GenBank under the accession number NC_024957.1.
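Overlap layout consensus assembly starts by finding reads whose ends overlap. The toy sketch below illustrates only that first step with a naive suffix-prefix search. It is not how GS De Novo Assembler is implemented; the function names, the minimum overlap length and the example reads are hypothetical and chosen purely for illustration.

```python
def longest_overlap(left: str, right: str, min_length: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_length` bases exists.
    """
    max_len = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the best one.
    for k in range(max_len, min_length - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_reads(left: str, right: str, min_length: int = 3):
    """Merge two reads on their suffix-prefix overlap, or return None."""
    k = longest_overlap(left, right, min_length)
    if k == 0:
        return None
    return left + right[k:]


# Hypothetical reads, not taken from the T29 data set.
read_a = "ATGCGTAC"
read_b = "GTACCTGA"
print(longest_overlap(read_a, read_b))
print(merge_reads(read_a, read_b))
```

Real assemblers tolerate sequencing errors in the overlap and use indexed data structures instead of comparing every pair of reads, which this sketch does not attempt.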

Growth conditions and genomic DNA preparation
M. sp. T29 was isolated from a mixed bacterial culture enriched from gasoline-contaminated groundwater samples collected from the area of Tiszaújváros, Hungary, in November 2010. The strain was deposited into the National Collection of Agricultural and Industrial Microorganisms (NCAIM) [19] under the accession number NCAIM B.02561.
For genomic DNA preparation, bacteria were grown under aerobic conditions in a tightly sealed bottle at 28 °C for 14 days in mineral salts medium supplemented with 200 mg/l MTBE. Genomic DNA was isolated using the UltraClean Microbial DNA Isolation Kit (MO BIO) according to the manufacturer's protocol.

Genome sequencing and assembly
The genomic library was prepared using IonXpress Plus Fragment Library Kit (Life Technologies) and was sequenced using Ion PGM 200 Sequencing Kit v2 with an Ion Torrent PGM Sequencer. The raw data were processed using Torrent Suite 4.0.1. The number of usable reads was 3,100,682 with a total base number of [...] (Table 2).
The pT29A plasmid was purified using a modified plasmid miniprep method [20] and treated with Plasmid-Safe™ ATP-dependent DNase (Epicentre) before sequencing with Ion Torrent technology using the kits mentioned above. A total of 40,770 reads were obtained with a total base number of 8,500,697. The mean read length was 208.50 ± 51.50 bp, and the mode length was 234 bp.
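As a quick consistency check, the mean read length reported for the plasmid run can be recomputed from the read count and total base number given above. The sketch below also derives a rough sequencing depth for pT29A using the plasmid length reported in this paper. That depth rests on the assumption that every read originates from the plasmid, so it should be read as an upper bound rather than a reported result. The function names are illustrative.

```python
def mean_read_length(total_bases: int, read_count: int) -> float:
    """Average number of bases per read."""
    return total_bases / read_count


def mean_depth(total_bases: int, replicon_length: int) -> float:
    """Average depth if all sequenced bases map to the replicon (assumption)."""
    return total_bases / replicon_length


# Values taken from the text.
plasmid_reads = 40_770
plasmid_bases = 8_500_697
plasmid_length = 86_856  # bp, size of pT29A

print(round(mean_read_length(plasmid_bases, plasmid_reads), 2))  # 208.5
print(round(mean_depth(plasmid_bases, plasmid_length), 1))       # 97.9
```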

Genome annotation
The assembled draft genome and the pT29A sequences were annotated using Prokka 1.8 [21]. For the prediction of signal peptides and transmembrane domains SignalP 4.1 Server [22,23] and TMHMM Server v. 2.0 [24] were used, respectively. Assignment of genes to the COG database [25,26] and Pfam domains [27] was performed with WebMGA server [28].

Genome properties
The draft genome of M. sp. T29 consists of 608 contigs with a total size of 4,449,424 bp and an average G + C content of 68.7 %. It contains 4754 protein-coding genes and 52 RNA genes, including 48 tRNA genes, and 71 % of the protein-coding genes could be assigned to COG categories (Tables 3 and 4). The map of the draft genome of M. sp. T29 aligned to the full genome of its closest relative, M. petroleiphilum PM1, is shown in Fig. 3 and Fig. 4. The plasmid pT29A carries 90 protein-coding genes, of which 72.2 % have a functional prediction and 70 % could be assigned to COG categories (Table 5). The most abundant functional category was coenzyme transport and metabolism (Table 6). The map of the plasmid is shown in Fig. 5.
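The percentages given for pT29A can be converted back into gene counts. The short sketch below does this for the 90 protein-coding genes of the plasmid; the helper name is illustrative, and the counts are derived here from the rounded percentages in the text rather than quoted from the tables.

```python
def genes_from_percentage(total_genes: int, percent: float) -> int:
    """Nearest whole number of genes corresponding to a reported percentage."""
    return round(total_genes * percent / 100)


plasmid_genes = 90  # protein-coding genes on pT29A

with_function = genes_from_percentage(plasmid_genes, 72.2)
in_cogs = genes_from_percentage(plasmid_genes, 70.0)

print(with_function)  # 65
print(in_cogs)        # 63
```

Both values are consistent with the reported figures: 65 of 90 is 72.2 % and 63 of 90 is 70 %.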

Conclusions
On average, the draft genome of M. sp. T29 shows 97 % identity to the M. petroleiphilum PM1 chromosome and 85 % identity to a small part of the M. petroleiphilum PM1 megaplasmid at the nucleotide level, as measured by NUCmer [29] (Fig. 4), but significant differences were also found. Notably, most of the ~600 kb megaplasmid is missing from M. sp. T29. A pulsed field gel electrophoretic analysis to detect megaplasmids [30] revealed that, unlike M. petroleiphilum PM1, our isolate does not harbor the megaplasmid that carries the genes for MTBE degradation [10]. Instead, a ~87 kb plasmid is present (Fig. 5), which we named pT29A.
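To make the notion of nucleotide-level identity concrete, the sketch below computes percent identity for a single pair of already aligned sequences. It is a simplified illustration and not NUCmer's algorithm; the function name and the short example sequences are hypothetical. Gaps are written as `-` and count as mismatches.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignments, not T29 or PM1 sequence.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
print(percent_identity("ACG-T", "ACGAT"))            # 80.0
```

A genome-wide average such as the one quoted above would additionally need to combine many local alignments, for example weighted by alignment length, which this sketch leaves out.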
The fact that in M. petroleiphilum PM1 the genes for MTBE metabolism are located on the pPM1 megaplasmid suggested that in M. sp. T29 these genes might likewise be carried by the pT29A plasmid. Surprisingly, no known genes associated with MTBE degradation were found among the plasmid-encoded genes, apart from a cobalamin-synthesis operon that differs from the one in M. petroleiphilum PM1. Cobalt ions or cobalamin are required for complete MTBE degradation in some strains, specifically for the utilization of 2-hydroxyisobutyric acid (2-HIBA), a key intermediate in the metabolic pathway [31,32]. However, we were able to identify the putative components of the MTBE-degradation pathway in the whole genome of M. sp. T29, including orthologous genes coding for the MTBE monooxygenase [16] and the TBA monooxygenase [33], which show only 84 and 81 % identity at the amino acid level to their M. petroleiphilum PM1 counterparts, respectively (Table 7). In contrast to the high similarity across most of the two genomes, the markedly lower sequence conservation of the MTBE-degradation pathway components, together with the fact that these genes are not linked to the pT29A plasmid, indicates that the gene cluster for MTBE metabolism is probably located on a transposon that resides on the megaplasmid in M. petroleiphilum PM1 and on the chromosome in M. sp. T29.
Fig. 5 a [...] Barton et al. [30]. The arrows show the ~600 kb partially linearized megaplasmid of M. petroleiphilum PM1 described in [10], and the ~87 kb partially linearized pT29A plasmid described in this paper. b Circular representation of the pT29A plasmid of M. sp. T29 displaying relevant features. The circular map was visualized by CGView [36]. The features are the following from outside to center: genes on forward strand, genes on reverse strand (colored by COG categories), GC content and GC skew.
[...] strain in the field at MTBE-contaminated sites in Hungary.
The nucleotide sequences of other genes in the MTBE-degradation pathway can also be used to construct better oligonucleotide chips to detect the potentially active genes in environmental samples.