Complete genome sequence of Kosakonia oryzae type strain Ola 51T

Strain Ola 51T (=LMG 24251T = CGMCC 1.7012T) is the type strain of the species Kosakonia oryzae and was isolated from surface-sterilized roots of the wild rice species Oryza latifolia grown in Guangdong, China. Here we summarize the features of strain Ola 51T and describe its complete genome sequence. The genome contains one circular chromosome of 5,303,342 nucleotides with 54.01% GC content, 4,773 protein-coding genes, 16 rRNA genes, 76 tRNA genes, 13 ncRNA genes, 48 pseudogenes, and 1 CRISPR array.

Strain Ola 51T (=LMG 24251T = CGMCC 1.7012T) is the type strain of the species Kosakonia oryzae and was isolated from surface-sterilized roots of the wild rice species Oryza latifolia grown in Guangdong, China [3]. Here we summarize the features of K. oryzae type strain Ola 51T and present its complete genome sequence, which provides a reference for resolving the phylogeny and taxonomy of closely related strains and the genetic information needed to study its plant growth-promoting potential and its plant-associated lifestyle.
The 16S rRNA gene sequence of K. oryzae Ola 51T was deposited in GenBank under the accession number EF488759 [3]. A phylogenetic analysis of the 16S rRNA gene sequences from strains of the genus Kosakonia and from Escherichia coli ATCC 11775T (the type strain of the type species of the type genus of the family Enterobacteriaceae) showed that K. oryzae Ola 51T is most closely related to strains of the species K. radicincitans (Fig. 2) [3,8-11].

Chemotaxonomic data
Whole-cell fatty acids were extracted from cells grown aerobically at 28°C for 24 h on TSA medium according to the recommendations of the Microbial Identification System (MIDI Inc., Delaware, USA). The whole-cell fatty acid composition was determined using a 6890N gas chromatograph (Agilent Technologies, Santa Clara, USA), and the peaks of the profiles were identified using the TSBA50 identification library version 5.0 (MIDI). K. oryzae Ola 51T shows the typical cellular fatty acid profile of the genus Kosakonia [8]. The major fatty acids are C16:0, C18:1 ω7c, C16:1 ω7c/15:0 iso 2OH, C17:0 cyclo, and C14:0 3OH/16:1 iso I [8,11].

Genome sequencing information
Genome project history
K. oryzae Ola 51T was selected for sequencing based on its taxonomic significance. The genome sequence is deposited in GenBank under the accession number CP014007. A summary of the genome sequencing project information and its association with MIGS version 2.0 [15] is shown in Table 2.

Growth conditions and genomic DNA preparation
K. oryzae Ola 51T was grown aerobically in liquid Luria-Bertani medium at 30°C until early stationary phase. Genomic DNA was extracted from the cells using a TIANamp bacterial DNA kit (Tiangen Biotech, Beijing, China). DNA quality (OD260/OD280 = 1.8) and quantity (22 μg) were determined with a NanoDrop spectrophotometer (Thermo Scientific, Wilmington, USA).

Genome sequencing and assembly
The genomic DNA of K. oryzae Ola 51T was constructed into 8-11 kb insert libraries and sequenced using PacBio SMRT sequencing technology [16] at the Duke University Genome Sequencing & Analysis Core Resource. Sequencing was run on two SMRT cells and resulted in 124,997 high-quality filtered reads with an average length of 8,260 bp. High-quality reads were assembled by RS_HGAP_Assembly.3 in SMRT Analysis v2.3.0. The final assembly produced 128-fold coverage of the genome.

Classification of K. oryzae Ola 51T (property, term, evidence code):
Phylum: Proteobacteria, TAS [35]
Class: Gammaproteobacteria, TAS [36,37]
Order: "Enterobacteriales", TAS [38]
Family: Enterobacteriaceae, TAS [39,40]
Genus: Kosakonia, TAS [8]
Species: Kosakonia oryzae, TAS [3,8]
Type strain: Ola 51T, TAS [3]
Gram stain: Negative, TAS [3]
Evidence codes: [...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [41].

Genome annotation
Automated genome annotation was done using the NCBI Prokaryotic Genome Annotation Pipeline [17]. Genes with signal peptides were predicted using SignalP [21]. Genes with transmembrane helices were predicted using TMHMM [22].

Genome properties
The genome of K. oryzae Ola 51T consists of one circular chromosome (Fig. 3) of 5,303,342 nucleotides with 54.0% G + C content. The genome contains 4,926 predicted genes: 4,773 protein-coding genes, 105 RNA genes (16 rRNA genes, 76 tRNA genes, and 13 ncRNA genes), and 48 pseudogenes. It also contains 1 CRISPR array. Among the 4,773 protein-coding genes, 3,765 (78.88%) have been assigned functions, while 1,008 (21.12%) have been annotated as hypothetical or unknown proteins (Table 3). The distribution of genes into COG functional categories is presented in Table 4 and Fig. 3.
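The gene counts reported above can be cross-checked with simple arithmetic. The following Python sketch is only an illustrative consistency check; the variable names are ours, and the numbers are copied from the text.

```python
# Consistency check of the gene counts reported for K. oryzae Ola 51T.
# All numbers are taken from the text; variable names are illustrative.
protein_coding = 4773
rna_genes = {"rRNA": 16, "tRNA": 76, "ncRNA": 13}
pseudogenes = 48
reported_total_genes = 4926
reported_rna_total = 105

with_function = 3765
hypothetical = 1008


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


rna_total = sum(rna_genes.values())
total_genes = protein_coding + rna_total + pseudogenes

print("RNA genes:", rna_total)
print("Total genes:", total_genes)
print("Assigned function (%):", percent(with_function, protein_coding))
print("Hypothetical/unknown (%):", percent(hypothetical, protein_coding))
```

Under these numbers, the RNA genes sum to 105, the three gene classes sum to 4,926, and the two functional fractions match the reported 78.88% and 21.12%.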

Insights from the genome sequence
The genome sequences of K. cowanii JCM 10956T, K. radicincitans DSM 16656T (=D5/23T) [23], K. radicincitans UMEnt01/12 [24], K. radicincitans YD4 [25], K. sacchari SP1T [26], "K. pseudosacchari" JM-387T [11], K. oryzae KO348 [27], and Enterobacter sp. R4-368 [28], which is closely related to K. sacchari SP1T [26], have been deposited in the GenBank database. The genome ANIs (Additional file 1: Table S1) between Ola 51T and the other strains of the genus Kosakonia were calculated using the Orthologous Average Nucleotide Identity tool [29]. The ANI cut-off for the species boundary was set at 95-96% [30]. The ANI value between K. oryzae Ola 51T and K. radicincitans DSM 16656T (95.85%) falls within this 95-96% fuzzy zone. The digital DDH value between Ola 51T and DSM 16656T, calculated by the Genome-to-Genome Distance Calculator [31] with Formula 2, is 66.2%, below the 70% cut-off for the species boundary. Moreover, Ola 51T and DSM 16656T are differentiated by metabolic phenotypes [3,11] and ribosomal protein mass profiles [5]. Therefore, K. oryzae and K. radicincitans are closely related sister species.
Strain YD4 was closer to K. radicincitans DSM 16656T than to K. oryzae Ola 51T on the phylogenetic tree based on the 16S rRNA genes (Fig. 2). However, the ANI and digital DDH values between YD4 and K. radicincitans DSM 16656T are 95.56% and 64.4%, respectively, whereas those between YD4 and K. oryzae Ola 51T are 97.04% and 74.3%, respectively. Therefore, strain YD4 belongs to K. oryzae, not K. radicincitans.

Fig. 3 Circular map of the chromosome of Kosakonia oryzae strain Ola 51T. From outside to the center: CDS on forward strand colored according to their COG categories (oranges/reds: information storage and processing; greens/yellows: cellular processes and signaling; blues/purples: metabolism; grays: poorly characterized), CDS and RNA genes on forward strand, CDS and RNA genes on reverse strand, CDS on reverse strand colored according to their COG categories, GC content, and GC skew. The circular map was generated by CGView [44].

Strain KO348 grouped with K. sacchari SP1T, Enterobacter sp. R4-368, and "K. pseudosacchari" JM-387T on the phylogenetic tree based on the 16S rRNA genes (Fig. 2). The ANI value between KO348 and K. oryzae Ola 51T is 84.04%, so strain KO348 does not belong to K. oryzae. The ANI values between KO348 and Enterobacter sp. R4-368 [27], K. sacchari SP1T, and "K. pseudosacchari" JM-387T are 98.80%, 94.56%, and 94.05%, respectively. Therefore, KO348 and R4-368 belong to the same species, likely a novel species closely related to K. sacchari and "K. pseudosacchari".
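The species assignments above follow a simple threshold logic, which can be sketched in Python. This is a minimal illustration rather than the procedure actually used: the function name, the three-way outcome, and the rule of falling back on digital DDH only inside the 95-96% ANI zone are our assumptions, and the ANI and digital DDH values are those quoted in the text.

```python
from typing import Optional

# Thresholds quoted in the text: ANI species boundary at 95-96% [30],
# digital DDH species cut-off at 70%.
ANI_LOW, ANI_HIGH = 95.0, 96.0
DDH_CUTOFF = 70.0


def species_call(ani: float, ddh: Optional[float] = None) -> str:
    """Hypothetical decision rule: 'same', 'different', or 'ambiguous'."""
    if ani > ANI_HIGH:
        return "same"
    if ani < ANI_LOW:
        return "different"
    # ANI lies in the 95-96% fuzzy zone: use digital DDH if available.
    if ddh is None:
        return "ambiguous"
    return "same" if ddh >= DDH_CUTOFF else "different"


# (strain A, strain B) -> (ANI %, digital DDH % or None if not reported)
pairs = {
    ("Ola 51T", "DSM 16656T"): (95.85, 66.2),
    ("YD4", "DSM 16656T"): (95.56, 64.4),
    ("YD4", "Ola 51T"): (97.04, 74.3),
    ("KO348", "Ola 51T"): (84.04, None),
    ("KO348", "R4-368"): (98.80, None),
    ("KO348", "SP1T"): (94.56, None),
    ("KO348", "JM-387T"): (94.05, None),
}

calls = {pair: species_call(ani, ddh) for pair, (ani, ddh) in pairs.items()}
for (a, b), call in calls.items():
    print(f"{a} vs {b}: {call} species")
```

With these inputs the sketch places YD4 with Ola 51T and KO348 with R4-368, and separates every other pair, in line with the conclusions drawn in the text. It does not use the phenotypic and ribosomal protein evidence that the text also relies on.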
K. oryzae Ola 51T and YD4, K. radicincitans DSM 16656T and UMEnt01/12, K. sacchari SP1T, "K. pseudosacchari" JM-387T, and Kosakonia sp. KO348 and R4-368 were all isolated from plants. Their genomes contain genes encoding multiple enzymes that degrade plant cell wall polysaccharides and remove reactive oxygen species, likely facilitating endophytic colonization [32]. They all contain genes encoding the regulatory protein (Fha1), structural proteins (Lip, IcmF, DotU and ClpV), and secreted proteins (VgrG and Hcp) of the type VI secretion system, which may play a role in the plant-associated lifestyle [32]. Except for K. radicincitans DSM 16656T and UMEnt01/12, these strains contain genes for most of the structural proteins (YscCJRSTUVN) of the type III secretion system, which is not widespread among the previously studied endophytic bacteria [32].
These plant-associated Kosakonia strains contain genes contributing to multiple plant growth-promoting activities. They all contain the nif gene cluster (nifJHDKTYENXUSVWZMFLABQ) for Mo-Fe nitrogenase-dependent nitrogen fixation; the genes encoding indole-3-acetaldehyde dehydrogenase, aspartate aminotransferase, aromatic amino acid aminotransferase, and phenylpyruvate decarboxylase for producing the phytohormone auxin; and the budABC genes for producing volatile acetoin and 2,3-butanediol, which induce plant systemic resistance to pathogens [33]. In addition, K. oryzae Ola 51T and YD4, and K. radicincitans DSM 16656T and UMEnt01/12, also contain the anf gene cluster (anfHDGK) for Fe-Fe nitrogenase-dependent nitrogen fixation. In contrast, the clinical strain K. cowanii JCM 10956T does not contain the nif gene cluster.

Conclusions
The phylogeny of the members of the genus Kosakonia based on the 16S rRNA gene sequences is roughly in agreement with their overall genome relatedness. The complete genome sequence of K. oryzae Ola 51T provides the reference genome for genomic identification of strains belonging to K. oryzae. Analyses of the overall genome relatedness indices (ANI and digital DDH values) show that K. oryzae and K. radicincitans are closely related sister species and that strain YD4, which is close to K. radicincitans in the 16S rRNA gene-based phylogeny and was classified as K. radicincitans, belongs to K. oryzae. Like YD4, which is able to promote growth of yerba mate plants in low-fertility soils [14], K. oryzae Ola 51T contains both the nif gene cluster and the anf gene cluster for nitrogen fixation, as well as genes contributing to the production of auxin and of volatile acetoin and 2,3-butanediol. Therefore, K. oryzae Ola 51T may be able to promote plant growth. Genomic analyses also show that K. oryzae Ola 51T and YD4 may have type III and type VI secretion systems, which motivates study of the functions of these secretion systems in the interactions between beneficial Kosakonia bacteria and plants.

Additional file
Additional file 1: Table S1. Average nucleotide identities (ANIs) between genomes of the strains belonging to the genus Kosakonia.
