High quality draft genome sequences of Pseudomonas fulva DSM 17717T, Pseudomonas parafulva DSM 17004T and Pseudomonas cremoricolorata DSM 17059T type strains

Pseudomonas has the highest number of species out of any genus of Gram-negative bacteria and is phylogenetically divided into several groups. The Pseudomonas putida phylogenetic branch includes at least 13 species of environmental and industrial interest, plant-associated bacteria, insect pathogens, and even some members that have been found in clinical specimens. In the context of the Genomic Encyclopedia of Bacteria and Archaea project, we present the permanent, high-quality draft genomes of the type strains of 3 taxonomically and ecologically closely related species in the Pseudomonas putida phylogenetic branch: Pseudomonas fulva DSM 17717T, Pseudomonas parafulva DSM 17004T and Pseudomonas cremoricolorata DSM 17059T. All three genomes are comparable in size (4.6–4.9 Mb), with 4,119–4,459 protein-coding genes. Average nucleotide identity based on BLAST comparisons and digital genome-to-genome distance calculations are in good agreement with experimental DNA-DNA hybridization results. The genome sequences presented here will be very helpful in elucidating the taxonomy, phylogeny and evolution of the Pseudomonas putida species complex.


Introduction
During a taxonomic study of Pseudomonas strains isolated from rice, petroleum fields and oil-brine in Japan, Iizuka and Komagata [1] proposed two new species in 1963, Pseudomonas fulva and Pseudomonas straminea (as cited in Uchino et al. [2]). These new species produced a water-insoluble yellow pigment, but not a water-soluble fluorescent pigment. Later, seven P. fulva strains obtained from culture collections were re-characterized and compared with the strains of related species by Uchino and collaborators [2]. Phylogenetic analysis based on 16S rRNA sequences, experimental DNA-DNA hybridization results and phenotypic characteristics led to the proposal of two new species: Pseudomonas parafulva (2 strains) and Pseudomonas cremoricolorata (1 strain). Three of the four remaining strains were maintained in the species P. fulva, and the other strain was identified as P. straminea [2]. In a multilocus sequence analysis, the type strains of P. fulva, P. parafulva and P. cremoricolorata clustered in the Pseudomonas putida phylogenetic branch and are considered members of the P. putida group in the Pseudomonas fluorescens lineage [3]. The three species are taxonomically and ecologically closely related. Strains from these species have been isolated from rice paddy samples or from Japanese unhulled rice. P. fulva strains have also been studied for their endophytic growth in Scots pines and for their roles in plant growth promotion and protection against plant pathogenic fungi [4,5]. An antagonistic effect against plant pathogenic bacteria has also been demonstrated in other strains of Pseudomonas putida [6]. Additionally, P. fulva strains have been isolated from water collected from human-made container habitats of mosquitoes [7]. P. fulva was one of the most abundant species found in a survey of pseudomonads in human homes [8], and very recently P. fulva was identified as a member of a polymicrobial ventriculitis in humans [9].
The difficulty in identifying species closely related to P. putida in the clinical laboratory is highlighted by Rebolledo and collaborators [9]. Biosynthesis of medium-chain-length poly(3-hydroxyalkanoates) by a P. fulva strain that degrades volatile aromatic hydrocarbons has been proposed as a candidate for the biotechnological conversion of toxic petrochemical wastes to valuable biopolymers [10].
In the context of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) project [11], the permanent, high quality draft genomes of the type strains of P. fulva, P. parafulva and P. cremoricolorata are presented. The genome sequences have been annotated, and the results are discussed in relation to the taxonomy of members of the P. putida phylogenetic group.

Classification and features
The type strains of the three species, P. fulva DSM 17717 T (=JCM 11242 T =NRIC 0180 T ), P. parafulva DSM 17004 T (=AJ 2129 T =JCM 11244 T =NRIC 0501 T ) and P. cremoricolorata DSM 17059 T (=JCM 11246 T =NRIC 0181 T ), were obtained from the DSMZ. All strains were isolated by Iizuka and Komagata [1,2] from Japanese rice paddies and were initially proposed as members of the new species P. fulva due to the deep yellow color of their colonies. P. fulva was included in the Approved Lists of Bacterial Names [12]. Uchino and collaborators re-characterized several strains obtained as P. fulva from culture collections and proposed two new species: P. parafulva (2 strains) and P. cremoricolorata (1 strain) [2].
All three type strains shared the basic phenotypic traits of the genus Pseudomonas: Gram-negative rods, motility via polar flagella, a strictly respiratory type of metabolism, catalase and oxidase activity and phylogenetic placement in the genus Pseudomonas on the basis of 16S rRNA gene sequencing. None of the three species produced water-soluble fluorescent pigments, but all produced a characteristic water-insoluble yellow pigment. Colonies appear smooth, round, flat to convex and pale/creamy yellow on nutrient agar. The three species were differentiated from each other by several phenotypic tests: presence of the arginine dihydrolase pathway, growth at 37°C and assimilation of D-ribose, D-mannose, adonitol, 2-keto-D-gluconate, butyrate, valerate, caprate, isovalerate, itaconate, citraconate, glycerate, levulinate, Tween 80, p-hydroxybenzoate, inosine, glycine, L-ornithine, L-citrulline and nicotinate. An extensive list of phenotypic characteristics can be found in the original publication by Uchino et al. [2]. The classification and general features of P. fulva, P. parafulva and P. cremoricolorata type strains are given in Tables 1, 2 and 3.

Chemotaxonomic data
As reported by Uchino and collaborators [2] the DNA GC-content of the three type strains, as determined by chemical analysis, was 60.0 mol % in P. fulva and P. parafulva and 62.1 mol % in P. cremoricolorata. The percentages of G + C bases based on the genome analysis were 61.71 % for P. fulva DSM 17717 T , 62.42 % for P. parafulva DSM 17004 T and 63.47 % for P. cremoricolorata DSM 17059 T . The GC-contents determined by chemical analysis were slightly lower than those inferred from genome sequences. The predominant respiratory quinone was ubiquinone Q-9, but Q-8 and Q-10 were also detected in smaller amounts. The major cellular fatty acids were C16:0, C16:1 and C18:1, and the major 3-hydroxy fatty acids were C10:0 and C12:0 [2].
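The genome-derived G + C percentages quoted above come from simple base counting over the assembled scaffolds. The following is a minimal sketch of that calculation, assuming the scaffolds are already available as plain strings; the function name `gc_percent` and the toy sequences are hypothetical and are not taken from the annotation pipeline used in this study. Ambiguous bases such as N are excluded from the denominator, which is one common convention.

```python
def gc_percent(sequences):
    """Return the G+C content (%) over all unambiguous bases in the sequences."""
    gc = 0
    total = 0
    for seq in sequences:
        s = seq.upper()
        g_c = s.count("G") + s.count("C")
        a_t = s.count("A") + s.count("T")
        gc += g_c
        total += g_c + a_t  # ambiguous symbols (e.g. N) are ignored
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


# Toy scaffolds (hypothetical; not sequences from the genomes described here).
scaffolds = ["ATGCGC", "GGNNCA"]
print(round(gc_percent(scaffolds), 2))
```

For a real draft genome, the same function would be applied to every scaffold sequence in the assembly.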
For protein analysis, cells were cultured aerobically in Luria-Bertani broth with shaking at 30°C, harvested in the exponential growth phase and prepared for whole-cell MALDI-TOF MS analysis using an Autoflex III mass spectrometer (Bruker Daltonik, Germany) as recommended by the manufacturer. Protein profiles clearly distinguished the type strains in the P. putida phylogenetic group [3]. A list of major proteins that met a minimum intensity threshold of 700, a minimum signal-to-noise threshold of 15, and a mass to charge ratio (m/z) higher than 3,000 and lower than 10,000 is included in Additional file 1.
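The peak selection criteria stated above (intensity of at least 700, signal-to-noise ratio of at least 15, and m/z strictly between 3,000 and 10,000) can be expressed as a short filter. This is only an illustrative sketch: the `Peak` record, the `filter_peaks` function and the example values are hypothetical, and the peak picking performed by the instrument software is not reproduced here.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    mz: float         # mass to charge ratio
    intensity: float  # peak intensity (arbitrary units)
    snr: float        # signal-to-noise ratio


def filter_peaks(peaks, min_intensity=700, min_snr=15, mz_low=3000, mz_high=10000):
    """Keep peaks meeting the intensity, signal-to-noise and m/z criteria."""
    return [
        p for p in peaks
        if p.intensity >= min_intensity
        and p.snr >= min_snr
        and mz_low < p.mz < mz_high
    ]


# Hypothetical peaks, for illustration only (not data from Additional file 1).
example = [
    Peak(2500.0, 1200.0, 30.0),   # rejected: m/z too low
    Peak(4365.2, 950.0, 22.0),    # kept
    Peak(6120.7, 650.0, 40.0),    # rejected: intensity below 700
    Peak(7184.9, 800.0, 12.0),    # rejected: signal-to-noise below 15
    Peak(9540.3, 700.0, 15.0),    # kept: exactly at both minimum thresholds
    Peak(10000.0, 5000.0, 50.0),  # rejected: m/z not lower than 10,000
]
for peak in filter_peaks(example):
    print(peak.mz, peak.intensity, peak.snr)
```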

Extended feature descriptions
Phylogenetic trees were reconstructed using different methods, namely the maximum-likelihood, maximum-parsimony and neighbor-joining algorithms integrated in the MEGA version 6 bioinformatics package [13], and also using the FastME 2.0 phylogeny inference program [14]. All phylogenetic trees tested showed similar topologies and the same strain groupings. The derived phylogeny of the species in the P. putida phylogenetic group based on 16S rDNA gene sequencing had low resolution, and the bootstrap values of branches were low (Fig. 1a). Therefore, a phylogenetic tree based on a multilocus sequence analysis with the partial sequences of three housekeeping genes (16S rDNA, gyrB, and rpoD) was constructed as recommended by Mulet et al. [3] (Fig. 1b). Most branches were supported in the MLSA phylogenetic tree by high bootstrap values, and all type strains were clearly separated in the P. putida phylogenetic group. The strain groupings (P. putida/Pseudomonas monteilii/P. parafulva/P. fulva; Pseudomonas soli/Pseudomonas mosselii/Pseudomonas entomophila/

Genome sequencing information
Genome project history
Sequencing of the three type strains is part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG-I) project [16], a follow-up of the GEBA pilot project [11,17]. Project information is deposited in the Genomes on Line Database (GOLD) [18], and the high quality draft genome sequence is deposited in GenBank and in the Integrated Microbial Genomes database (IMG) [19]. Draft sequencing, initial gap closure and annotation were performed by the DOE Joint Genome Institute (JGI) using state-of-the-art sequencing technology [20]. A summary of the project information is shown in Table 4. GenBank IDs are as follows: JHYU00000000 for P. fulva DSM 17717 T , AUEB00000000 for P. parafulva DSM 17004 T and AUEA00000000 for P. cremoricolorata DSM 17059 T .
[21], simulated paired-end reads were created from Velvet contigs using wgsim and simulated read pairs were reassembled using Allpaths-LG (version r42328) [22].

Genome annotation
Protein-coding genes were identified using Prodigal [23], as part of the DOE-JGI genome annotation pipeline [24]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG) platform, which provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context [19]. Genome annotation procedures are detailed in Markowitz et al. [19] and references therein. Briefly, the predicted CDSs were

Genome properties
The assembly of the three genomes consisted of 4.7 Mb in 48 scaffolds for P. fulva, 4.9 Mb in 33 scaffolds for P. parafulva and 4.6 Mb in 27 scaffolds for P. cremoricolorata (Table 5). The G + C content for each strain was 61.72, 62.42 and 63.47 %, respectively. The majority of protein-coding genes (78.96, 80.59 and 79.68 %) were assigned a putative function. The properties and statistics of the genomes are summarized in Table 5, and the number of genes associated with general COG functional categories is shown in Table 6.

Insights from the genome sequence
Experimental DNA-DNA hybridizations were performed by Uchino et al. [2], following the fluorometric procedure proposed by Ezaki et al. [25]. Taxonomic genome comparisons were calculated by two different procedures: average nucleotide identity based on BLAST (ANIb) was calculated with the JSpecies program [26], and digital DDH (dDDH) similarities among the genomes of the three type strains were calculated using the GGDC web server version 2.0 [27] under recommended settings. The results are given in Table 7. In the gANI clique analysis [28], which is computed for all bacterial genomes in the Integrated Microbial Genomes system, P. fulva DSM 17717 T clustered in the same gANI clique with eight plant-associated genome-sequenced Pseudomonas sp. not yet classified at the species level, with an intra-clique ANI of 99.57 %, indicating that the 9 strains belong genomically to the same species, P. fulva. The strain P. fulva NBRC 16637 T is the equivalent type strain of the NITE (Biological Resource Center) and was included in the same clique. The GC-content variation within the clique was less than 1 % (61.58 %-61.88 %). This is consistent with the observation that GC-content varies by no more than 1 % within a species [29] and illustrates the value of draft genomes for taxonomy; all strains in the clique should be considered strains of the same genomic species [28].
Three additional genomes of strains identified as P. fulva, P. parafulva and P. cremoricolorata, available in the GenBank database on June 17, 2015, were also analyzed. The completely sequenced genome of P. cremoricolorata ND07 (CP009455) showed ANIb and dDDH values of 92 and 50 %, respectively, with P. cremoricolorata DSM 17059 T , indicating a close relationship that is below the species threshold. The complete genome of P. parafulva CRS01-1 (CP009747) showed an ANIb value of 81 % with the type strain of P. parafulva, the closest related type strain. Finally, as was previously documented, the completely sequenced genome of strain P. fulva 12-X (CP002727) demonstrated that it is clearly a distinct species, with an ANIb value of 75.24 % [30] with the P. fulva type strain. In all three cases, the genome comparisons did not support the species affiliation assigned to the strains.
The presence of genes related to carbohydrate and amino acid transport and metabolism is relevant for the fitness of environmental bacteria. These genes represent 12-13 % of the total genes detected in the three strains, and they also have taxonomic consequences. Substrate utilization is an essential criterion for Pseudomonas taxonomy, and several tests routinely used in Pseudomonas identifications have been employed in the present study.
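The species-level comparisons of the three additional genomes reported in this section can be illustrated with a small threshold check. The sketch below assumes the commonly used species boundaries of about 95 % for ANIb and 70 % for dDDH; these cutoffs are an assumption introduced here for illustration and are not stated in the text above, and the function `same_species` is hypothetical. The ANIb and dDDH values are those reported above for each strain against the corresponding type strain.

```python
# Assumed species boundaries (commonly used values; not stated in this paper).
ANIB_SPECIES_THRESHOLD = 95.0   # percent
DDDH_SPECIES_THRESHOLD = 70.0   # percent


def same_species(anib, dddh=None):
    """Return True if the available values are at or above the assumed cutoffs."""
    if anib < ANIB_SPECIES_THRESHOLD:
        return False
    if dddh is not None and dddh < DDDH_SPECIES_THRESHOLD:
        return False
    return True


# (ANIb %, dDDH % or None when no dDDH value is reported in the text)
comparisons = {
    "P. cremoricolorata ND07 vs DSM 17059 T": (92.0, 50.0),
    "P. parafulva CRS01-1 vs DSM 17004 T": (81.0, None),
    "P. fulva 12-X vs DSM 17717 T": (75.24, None),
}

for pair, (anib, dddh) in comparisons.items():
    verdict = "same species" if same_species(anib, dddh) else "different species"
    print(f"{pair}: ANIb={anib} %, dDDH={dddh} -> {verdict}")
```

Under these assumed cutoffs, none of the three strains falls within the species of the type strain it was compared with, in line with the conclusion drawn in the text.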
Catalase and superoxide dismutase are relevant enzymes for protecting the cell against reactive oxygen and are characteristic of most Pseudomonas. Catalase activity was detected by Uchino et al. [2] in P. fulva, P. parafulva, and P. cremoricolorata. Accordingly, 3 genes potentially coding for catalase were found in P. fulva and P. parafulva, but only 2 were found in P. cremoricolorata; three genes coding for superoxide dismutase were detected in the P. fulva and P. cremoricolorata genomes, but only two were found in P. parafulva. Testing for the presence of the arginine dihydrolase (or arginine deiminase) pathway, in combination with other biochemical tests, can also be of diagnostic value [31]. The arginine dihydrolase pathway transforms arginine to ornithine with ATP gain and allows limited growth in several Pseudomonas under anaerobic conditions. The arcA gene is present in the P. fulva and P. parafulva genomes but is absent in P. cremoricolorata, in accordance with the experimental data obtained by Uchino et al. [2]. All three strains were negative for nitrate reduction, nitrate respiration and PHB synthesis, and, accordingly, no gene related to these pathways was detected in any of the genomes. Cleavage of aromatic compounds was also tested using protocatechuate as a substrate; a gene coding for the protocatechuate 3,4-dioxygenase (3-oxoadipate pathway) was found in P. fulva and P. parafulva but was absent in P. cremoricolorata, confirming the ortho cleavage of the aromatic ring as reported by Uchino and collaborators [2]. All three strains possessed genes encoding key enzymes involved in glucose catabolism via the glycolysis, pentose-phosphate and 2-keto-3-deoxy-6-phosphogluconate pathways. The three species were recorded as amylase negative, but an alpha-amylase gene (amyA) was detected in all three genomes, indicating the potential ability to grow with starch as a substrate.
Bacterial secretion systems transport proteins across the cell envelope of Gram-negative bacteria to the external milieu and are considered critical for persistence in an ecological niche and for conquering a new one [32]. The type VI secretion system (TVISS) seems to be the most common and appears to be confined to proteobacteria. The TVISS consists of 13 essential conserved genes, many of which contain a number of functionally accessory elements. Several TVISS are often present in a single genome [33]. They have been mainly studied for their pathogenic role in the interaction between bacteria and hosts, but TVISS seems to play a role in mutualistic relationships between bacteria and eukaryotic cells or between bacteria, as well. A set of 15 conserved TVISS genes was found in P. fulva DSM 17717 T but was absent in the other two strains. P. fulva DSM 17717 T also has three copies of a Rhs element Vgr protein, not present in the other strains, that can be exported by the TVISS, but its exact function is still not known. The possible role of TVISS genes in the pathogenesis or in the interactions of P. fulva DSM 17717 T with the environment remains to be elucidated. Prophage-like elements in microbial genomes represent one of the main contributors of mobile DNA, also known as the mobilome [34], and are the main reason for bacterial intraspecies variability. The prophage contribution to the bacterial genome is highly variable. It can represent up to 8 % of the total chromosomal DNA [35], but phages may also be absent. The mobilomes of P. fulva, P. parafulva and P. cremoricolorata were predicted to contain 32, 30, and 44 genes, respectively.
In addition to transposases, integrases and regulatory elements, clusters of bacteriophage structural genes (6 to 13 genes in a cluster) were found in the 3 strains: 2 clusters in P. fulva DSM 17717 T (6 and 12 genes in each cluster), 2 in P. parafulva DSM 17004 T (12 and 9 genes) and 3 in P. cremoricolorata DSM 17059 T (9, 13 and 10). CRISPR elements were not detected.

Conclusions
Genome comparisons confirmed the distinct species status of the three type strains analyzed, as well as the close relationships between them. The complete genome analysis also revealed important taxonomic results, highlighting the relevance of the correct species assignment of strains and the need for the genome sequences of species type strains to build a phylogenomic taxonomy. No discrepancies were found between the genome insights and the phenotypic traits previously published for the species. However, the gene content revealed potential properties not yet detected, such as the presence of secretion systems, whose relevance remains to be explored. The genome sequences of the three type strains will be very helpful in elucidating the phylogeny and evolution of the P. putida species complex, a relevant coherent group of closely related species with important ecological and biotechnological implications.

Additional file
Additional file 1: Major protein profiles of P. fulva DSM 17717 T , P. parafulva DSM 17004 T and P. cremoricolorata DSM 17059 T type strains. The intensity value was determined as an average from all spectra containing that peak, and the relative intensities with respect to the base peak, in percent, are indicated in parentheses. (m/z: mass to charge ratio; + and -: presence or absence of the corresponding protein). (PDF 99 kb)