Draft genomic sequence of a selenite-reducing bacterium, Paenirhodobacter enshiensis DW2-9T

Paenirhodobacter enshiensis is a non-photosynthetic species that belongs to the family Rhodobacteraceae. Here we report the draft genome sequence of Paenirhodobacter enshiensis DW2-9T and a comparison with the available related genomes. The strain has a 3.4 Mbp genome sequence with a G + C content of 66.82 % and 2781 protein-coding genes. It lacks photosynthetic gene clusters and putative proteins necessary for the Embden-Meyerhof-Parnas (EMP) pathway, but contains proteins of the Entner-Doudoroff (ED) pathway instead. It shares 699 common genes with nine related Rhodobacteraceae genomes and possesses 315 specific genes.


Introduction
The family Rhodobacteraceae, which belongs to the Proteobacteria, was established by Garrity et al. [1] and contains 105 genera, including both chemoorganotrophic and photoheterotrophic bacteria. The type genus is Rhodobacter, which was first proposed by Imhoff et al. in 1984 [2] and comprises only photosynthetic species [3][4][5][6][7][8]. In 2013, we proposed Paenirhodobacter enshiensis DW2-9T to represent one of the non-photosynthetic genera of Rhodobacteraceae [9]. The main differences between Paenirhodobacter and its closest relative Rhodobacter are their photosynthetic characteristics and major polar lipid types [9]. Haematobacter is another non-photosynthetic genus of Rhodobacteraceae [10], and the main difference between Haematobacter and Paenirhodobacter is the cultivation condition [9][10][11].
So far, the genus Paenirhodobacter contains only one species, Paenirhodobacter enshiensis. The main characteristics of P. enshiensis DW2-9T are its non-photosynthetic nature and the possession of phosphatidylglycerol, phosphatidylethanolamine and aminophospholipid as the major polar lipids [9]. In addition, we found that strain P. enshiensis DW2-9T was able to reduce soluble selenite (Se4+) into insoluble elemental selenium nanoparticles (Se0). Since Se0 is less bioavailable, this strain could potentially be used in bioremediation of selenite-contaminated soil or water.
To provide genomic information for elucidating the mechanism of bacterial selenite reduction and for taxonomic study, we sequenced the genome of strain P. enshiensis DW2-9T, together with those of its close relatives Haematobacter missouriensis CCUG 52307T [10] and Haematobacter massiliensis CCUG 47968T [11]. In this study, we report the genomic features of P. enshiensis DW2-9T and compare them with those of the close relatives. This microorganism is not part of a larger genomic survey project.

Classification and features
Strain P. enshiensis DW2-9T was isolated from soil near a sewage outlet of the Bafeng pharmaceutical factory, Enshi city, Hubei province, PR China. The general features of P. enshiensis DW2-9T are shown in Table 1. The 16S rRNA gene based phylogenetic tree showing the phylogenetic relationships of P. enshiensis DW2-9T to other taxonomically classified type strains of the family Rhodobacteraceae can be found in our previous study [9].
Colonies are convex, circular, smooth and white after 2 days of incubation on modified Biebl & Pfennig's agar at 30°C [9]. The strain was able to reduce 0.2 mmol/L of sodium selenite (Na2SeO3) into Se0 within 2 days when grown in Luria-Bertani medium.

Growth conditions and genomic DNA preparation
Strain P. enshiensis DW2-9T was grown aerobically in LB medium at 28°C for 36 h. The DNA was extracted, concentrated and purified using the QIAamp kit according to the manufacturer's instructions (Qiagen, Germany).

Genome sequencing and assembly
The genome of P. enshiensis DW2-9T was sequenced by Illumina technology [19]. An Illumina standard shotgun library was constructed and sequenced using the Illumina MiSeq 2000 platform, which generated 3,128,974 reads totaling 941.8 Mbp.
All original sequence data can be found at the NCBI Sequence Read Archive [20]. The following steps were performed to remove low-quality reads: (1) adapters were removed from the reads, (2) 5' end bases other than A, T, G or C were trimmed, (3) reads with a quality score lower than 20 were discarded, (4) reads containing more than 10 % N were discarded, and (5) reads shorter than 25 bp after the previous four steps were removed. The processed reads were assembled by SOAPdenovo v1.05 [21]. The final draft assembly contained 153 contigs in 85 scaffolds. The total size of the genome is 3.4 Mbp and the final assembly is based on 764.6 Mbp of Illumina data, which provides an average 222× coverage of the genome. The simulated genome of P. enshiensis DW2-9T is a set of contigs ordered against the complete genome of Rhodobacter capsulatus SB1003 (NC_013034) using Mauve software [22].
Table footnotes: a The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome. b Also includes 19 pseudogenes, 10 RNA genes, 45 rRNAs and 1 ncRNA.
Fig. 2 A graphical circular map of the genome performed with CGview comparison tool [32]. From outside to center, rings 1 and 4 show protein-coding genes colored by COG categories on forward/reverse strand; rings 2 and 3 denote genes on forward/reverse strand; ring 5 shows G + C% content plot, and the innermost ring shows GC skew.
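The five read-cleaning steps listed in this section can be sketched in Python roughly as follows. This is a minimal illustration, not the pipeline actually used: the adapter sequence, the function name `clean_read`, and the reading of "quality score lower than 20" as a per-read mean Phred score are all assumptions, since the text does not specify them.

```python
from typing import List, Optional, Tuple

# Assumption: a common Illumina adapter prefix; the text does not name the adapter used.
ADAPTER = "AGATCGGAAGAGC"


def clean_read(
    seq: str,
    quals: List[int],
    adapter: str = ADAPTER,
    min_mean_quality: float = 20.0,
    max_n_fraction: float = 0.10,
    min_length: int = 25,
) -> Optional[Tuple[str, List[int]]]:
    """Return the cleaned (sequence, qualities) pair, or None if the read is discarded."""
    seq = seq.upper()

    # (1) Clip the read at the first exact adapter match, if any.
    hit = seq.find(adapter)
    if hit != -1:
        seq, quals = seq[:hit], quals[:hit]

    # (2) Trim leading (5' end) bases that are not A, T, G or C.
    start = 0
    while start < len(seq) and seq[start] not in "ATGC":
        start += 1
    seq, quals = seq[start:], quals[start:]

    if not seq:
        return None

    # (3) Discard reads whose mean Phred quality is below the threshold
    #     (assumption: "quality score" is interpreted as the per-read mean).
    if sum(quals) / len(quals) < min_mean_quality:
        return None

    # (4) Discard reads in which more than 10 % of the bases are N.
    if seq.count("N") / len(seq) > max_n_fraction:
        return None

    # (5) Discard reads shorter than 25 bp after the steps above.
    if len(seq) < min_length:
        return None

    return seq, quals
```

A real pipeline would more likely use a dedicated trimming tool with tolerant adapter matching; the exact-match search here only keeps the sketch short.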

Genome annotation
The draft genome of P. enshiensis DW2-9T was annotated through the RAST server version 2.0 [23] and the National Center for Biotechnology Information Prokaryotic Genome Annotation Pipeline, which combines the gene caller GeneMarkS+ [18] with a similarity-based gene detection approach.
Protein function classification was performed by WebMGA [24] with an E-value cutoff of 1e-10. The transmembrane helices were predicted by TMHMM Server v. 2.0 [25]. Internal gene clustering was performed by OrthoMCL using a Match cutoff of 50 % and an E-value Exponent cutoff of 1e-5 [26,27]. Signal peptides in the genome were predicted by SignalP 3.0 server [28]. The translated predicted CDSs were also used to search against the Pfam protein family database [29], KEGG [30] and the NCBI Conserved Domain Database through the Batch web CD-Search tool [31].

Genome properties
The whole genome of P. enshiensis DW2-9T is 3,439,591 bp in length, with an average GC content of 66.82 %, and is distributed in 112 contigs (>200 bp). The genome properties and statistics are summarized in Table 3 and Fig. 2. A total of 2781 protein-coding genes are identified and 78.99 % of them are distributed into COG functional categories (Table 4; the percentages are based on the total number of protein-coding genes in the annotated genome).
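As a rough illustration of how summary figures of this kind are derived, the sketch below computes contig count, total length and GC content from a list of contig sequences, and checks the reported coverage from the data volume and genome size. The function names `genome_stats` and `mean_coverage` are hypothetical, and the toy contigs are not from the actual assembly.

```python
from typing import Dict, Iterable


def genome_stats(contigs: Iterable[str], min_length: int = 200) -> Dict[str, float]:
    """Summarize contigs longer than min_length bp (count, total length, GC %)."""
    kept = [c.upper() for c in contigs if len(c) > min_length]
    total = sum(len(c) for c in kept)
    gc = sum(c.count("G") + c.count("C") for c in kept)
    return {
        "contigs": len(kept),
        "length_bp": total,
        "gc_percent": 100.0 * gc / total if total else 0.0,
    }


def mean_coverage(sequenced_bp: float, genome_bp: float) -> float:
    """Average sequencing depth: bases used in the assembly divided by genome size."""
    return sequenced_bp / genome_bp


# Toy input: two 300 bp contigs are kept, the 50 bp contig is dropped.
stats = genome_stats(["GC" * 150, "AT" * 150, "G" * 50])

# Figures reported in the text: 764.6 Mbp of data over a 3,439,591 bp genome.
coverage = mean_coverage(764.6e6, 3_439_591)
```

With the reported values, `coverage` comes out at about 222, which agrees with the 222× stated for the assembly.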

Profiles of metabolic network and pathway
Strain DW2-9T is facultatively anaerobic and can utilize a variety of sole carbon substrates, including acetate, propionate, pyruvate, fumarate, malate, citrate, succinate, D-glucose, D-fructose and maltose [9]. Genome analysis showed that this strain has the corresponding enzymes to utilize these sole carbon sources and to catabolize them via different pathways (mainly the TCA cycle and the pentose phosphate pathway). In glycolysis in particular, strain P. enshiensis DW2-9T lacks the key enzyme 6-phosphofructokinase, which is essential for the Embden-Meyerhof-Parnas (EMP) pathway. Instead, it contains 6-phosphogluconate dehydratase (KFI24690) and 2-keto-3-deoxyphosphogluconate aldolase (KFI24689), which are characteristic of the Entner-Doudoroff (ED) pathway.
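The reasoning about EMP versus ED can be expressed as a simple marker-enzyme check against the annotated product names. The sketch below is only illustrative: the marker sets contain just the enzymes named in the text, the function name `pathways_supported` is hypothetical, and exact matching on product names is a simplification of how annotation output would need to be handled.

```python
from typing import Dict, Iterable, Set

# Marker enzymes taken from the text; real pathway calls would use more evidence.
PATHWAY_MARKERS: Dict[str, Set[str]] = {
    "EMP": {"6-phosphofructokinase"},
    "ED": {
        "6-phosphogluconate dehydratase",
        "2-keto-3-deoxyphosphogluconate aldolase",
    },
}


def pathways_supported(products: Iterable[str]) -> Dict[str, bool]:
    """Report, per pathway, whether every marker enzyme appears among the products."""
    found = {p.strip().lower() for p in products}
    return {name: markers <= found for name, markers in PATHWAY_MARKERS.items()}


# Hypothetical annotation excerpt reflecting the situation described above.
result = pathways_supported(
    [
        "6-Phosphogluconate dehydratase",
        "2-keto-3-deoxyphosphogluconate aldolase",
        "citrate synthase",
    ]
)
```

Here `result` marks ED as supported and EMP as unsupported, mirroring the absence of 6-phosphofructokinase.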
All key genes necessary for fatty acid biosynthesis are present. All genes required for de novo synthesis of 15 common amino acids are present, whereas genes for biosynthesis of Ala, Asn, Met, Tyr and His are not present. Consistent with its non-photosynthetic nature, the known photosynthetic gene clusters, including the bch, puf and crt genes, were not found in the genome of P. enshiensis DW2-9T.
In this study, strain DW2-9T was found to be capable of reducing selenite into selenium nanoparticles. It has been reported that low-molecular weight thiols such as glutathione [33] and cysteine [34], nitrite reductase [35], fumarate reductase [36], glutathione reductase and thioredoxin reductase [37] could reduce selenite into elemental selenium. In the genome of strain DW2-9T, genes encoding all of the enzymes mentioned above were found (e.g. KFI26491, KFI30857, KFI28250, KFI28810, KFI29698, KFI24274 and KFI29723).
Fig. 3 A phylogenetic tree highlighting the phylogenetic position of P. enshiensis DW2-9T. The conserved proteins were analyzed by OrthoMCL with a Match Cutoff of 50 % and an E-value Exponent Cutoff of 1e-5 [26,27]. The phylogenetic tree was constructed based on the 699 single-copy conserved proteins shared among the ten genomes. The phylogenies were inferred by MEGA 5.05 with the NJ algorithm [38], and 1000 bootstrap repetitions were computed to estimate the reliability of the trees. The genome accession numbers of the strains are shown in parentheses.
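The selection of single-copy conserved proteins shared among all genomes, as used for the tree in Fig. 3, can be sketched as below. This assumes the ortholog clustering has already been done and is available as a mapping from group identifier to (genome, gene) pairs; that data layout and the function name `single_copy_core` are assumptions for illustration, not the OrthoMCL output format.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def single_copy_core(
    groups: Dict[str, List[Tuple[str, str]]],
    genomes: Iterable[str],
) -> List[str]:
    """Return IDs of ortholog groups with exactly one gene in every genome."""
    required = set(genomes)
    core = []
    for group_id, members in groups.items():
        per_genome = Counter(genome for genome, _gene in members)
        # Keep the group only if all genomes are represented exactly once.
        if set(per_genome) == required and all(n == 1 for n in per_genome.values()):
            core.append(group_id)
    return sorted(core)


# Toy example with three hypothetical genomes A, B and C.
toy_groups = {
    "g1": [("A", "a1"), ("B", "b1"), ("C", "c1")],             # single copy in all
    "g2": [("A", "a2"), ("A", "a3"), ("B", "b2"), ("C", "c2")],  # paralog in A
    "g3": [("A", "a4"), ("B", "b3")],                           # missing from C
}
core_groups = single_copy_core(toy_groups, ["A", "B", "C"])
```

Only `g1` passes in the toy data; in the study the analogous filter over ten genomes yielded the 699 proteins used for tree construction.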
In this study, we also sequenced the genomes of two members of the genus Haematobacter, strains H. missouriensis CCUG 52307T [10] and H. massiliensis CCUG 47968T [11]. The draft genome sequences were 3.9 and 4.1 Mbp, the G+C contents were 64.31 % and 64.56 %, and the numbers of predicted protein-coding genes were 3,612 and 3,806, respectively. Figure 5 shows the genome comparison results of strains P. enshiensis DW2-9T, H. missouriensis CCUG 52307T and H. massiliensis CCUG 47968T using the CGview comparison tool [32]. Table 5 presents the difference in gene number (in percentage) in each COG category between strains P. enshiensis DW2-9T, H. missouriensis CCUG 52307T and H. massiliensis CCUG 47968T.
Fig. 5 A graphical circular map of the comparison between reference strain Rhodobacter capsulatus SB 1003 and the three strains sequenced in this study. From outside to center, rings 1 and 4 show protein-coding genes colored by COG categories on forward/reverse strand; rings 2 and 3 denote genes on forward/reverse strand; rings 5, 6 and 7 show the CDS vs CDS BLAST results of Rhodobacter capsulatus SB 1003 with P. enshiensis DW2-9T, H. massiliensis CCUG 47968T and H. missouriensis CCUG 52307T, respectively; ring 8 shows G + C% content plot, and the innermost ring shows GC skew.
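A per-category comparison of the kind summarized in Table 5 can be computed by normalizing COG counts to the number of protein-coding genes in each genome and then subtracting. The sketch below uses made-up counts purely for illustration; the function names and numbers are hypothetical and do not reproduce the values in Table 5.

```python
from typing import Dict


def cog_percentages(counts: Dict[str, int], total_genes: int) -> Dict[str, float]:
    """Convert per-category gene counts into percentages of all protein-coding genes."""
    return {cat: 100.0 * n / total_genes for cat, n in counts.items()}


def cog_differences(ref: Dict[str, float], other: Dict[str, float]) -> Dict[str, float]:
    """Percentage-point difference (ref minus other) for every category in either genome."""
    categories = sorted(set(ref) | set(other))
    return {cat: ref.get(cat, 0.0) - other.get(cat, 0.0) for cat in categories}


# Hypothetical counts for two genomes (not the values from Table 5).
genome_a = cog_percentages({"C": 50, "E": 100}, total_genes=1000)
genome_b = cog_percentages({"C": 100, "K": 40}, total_genes=2000)
diff = cog_differences(genome_a, genome_b)
```

Normalizing before subtracting matters here because the three genomes differ in gene number (2781, 3,612 and 3,806 protein-coding genes), so raw counts would not be directly comparable.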

Conclusions
Genomic analysis of P. enshiensis DW2-9T revealed a high degree of consistency between genotype and phenotype, especially in sole carbon source utilization and non-photosynthetic nature. Genome sequencing of strain P. enshiensis DW2-9T provides additional support for its taxonomic classification. The genome sequence of strain DW2-9T also provides insights into the molecular mechanisms of selenite reduction. In addition, this strain could potentially be used for bioremediation of environmental selenite contamination.
The associated MIGS records are shown in Additional file 1: Table S1.

Additional file
Additional file 1: Table S1. Associated MIGS record.
Abbreviations
RAST: Rapid annotation using subsystem technology; KEGG: Kyoto encyclopedia of genes and genomes.